02 July, 2019

Database Architecture

Database is a systematic collection of data. Databases support storage and  manipulation of data. Databases make data management easy.
Microsoft SQL Server databases contain two primary file types required for proper functioning of a OS. They are: Data file and a Log file.


















Types of Database File Types:

Data Files:
Data files contain data and objects such as tables, indexes, stored procedures, and views. 
Based upon the requirements and usage, a Microsoft SQL Server database is further classified into two types of data files:

Primary data file
The Primary data file contains startup information for the database and points to other files in the database. User data and objects can be stored in this file and every database has one primary data file.
The recommended file extension for this file is .mdf.

Secondary data file
Secondary data files are optional, are user-defined, and store user data. Secondary files can be used to spread data across multiple disks by putting each file on a different disk drive. 
Additionally, if a database exceeds the maximum size for a single Windows file, you can use secondary data files so the database can continue to grow.
The recommended file name extension for secondary data files is .ndf.

Log File:
The transaction log files hold the log information that is used to recover the database. There must be at least one log file for each database. 
The recommended file name extension for transaction logs is .ldf.

Note: Above these operating system files, there are File-groups

File-groups:
File-groups work as a logical container for the data files and are used to increase database performance,simplify data placement and administrative tasks such as backup and restore operations.
There are two types of file groups in SQL Server, Primary and User-defined.

Primary File-group:
Every database has a primary file-group. This file-group contains the primary data file and any secondary files that are not put into other file-groups.
All pages for the system tables are allocated in the primary file group.

User-defined file-group:
User-defined file-groups can be created to group data files together for administrative, data allocation, and placement purposes.
We can have multiple file-groups per database. You can specify one file-group as the default, and objects not specifically assigned to a file-group will exist in the default. 
In a file-group, you can have multiple files. a file cannot be a member of more than one file-group.
Data files can be grouped together in file-groups for allocation and administration purposes. 
Log files are never part of a file-group because log space and data space are managed separately.

Why Should we Create Multiple File-groups?
There are two primary reasons for creating file-groups: performance and recovery.
File-groups that contain files created on specific disks can alleviate disk performance issues. 
For example, you may have one very large table in your database with a lot of read and write activity an orders table, perhaps. You can create a file-group, create a file in the file-group, and then move a table to the file-group by moving the clustered index.
If the file is created on a disk separate from other files, you are going to have better performance. This is similar to the logic behind separating data and log files in a database. Performance improves when you spread files across multiple disks because you have multiple heads reading and writing, rather than one doing all the work.

File-groups can be backed up and restored separately as well. This can enable faster object recovery in the case of a disaster. It can also help the administration of large databases.

File and File-group Fill Strategy:
When objects are created in the database without specifying which file-group they belong to, they are assigned to the default file-group.
At any time, exactly one file-group is designated as the default file-group. The files in the default file-group must be large enough to hold any new objects not allocated to other file-groups.
The PRIMARY file-group is the default file-group unless it is changed by using the ALTER DATABASE statement. Allocation for the system objects and tables remains within the PRIMARY file-group, not the new default file-group.

File-groups use a proportional fill algorithm\strategy across all the files within each file-group. As data is written to the file-group, the SQL Server Database Engine writes an amount proportional to the free space in the file to each file within the file-group, instead of writing all the data to the first file until full. It then writes to the next file. 
For example, if file f1 has 100 MB free and file f2 has 200 MB free, one extent is allocated from file f1, two extents from file f2, and so on. In this way, both files become full at about the same time, and simple striping is achieved.

As soon as all the files in a file-group are full, the SQL Server Database Engine automatically expands one file at a time in a round-robin manner to allow for more data, provided that the database is set to grow automatically. 
For example, a file-group is made up of three files, all set to automatically grow. When space in all the files in the file-group is exhausted, only the first file is expanded. When the first file is full and no more data can be written to the file-group, the second file is expanded. When the second file is full and no more data can be written to the file-group, the third file is expanded. If the third file becomes full and no more data can be written to the file-group, the first file is expanded again, and so on.

No comments:

Post a Comment