Deep understanding of SQL Server file storage pages and applications

Source: Internet
Author: User
Tags: microsoft sql server, microsoft sql server 2005

We use databases every day, and the relational databases our department uses most are SQL Server and Oracle. Have you ever wondered how these databases are stored in operating system files? To design an optimal table structure, write high-performance SQL Server scripts, and handle concurrent access to massive amounts of data, we need to understand the underlying principles. Out of personal interest I recently studied how SQL Server stores its files. Because my knowledge is limited, the following covers only the smallest storage unit in SQL Server: the page.

What are pages and extents?

Why is there a page concept at all? To the operating system, a file can be regarded as one large linear space. If capacity is allocated sequentially by address in variable-size pieces (that is, segment-based storage), a lot of external fragmentation can build up, leaving much of the capacity hard to reuse; the only way to reclaim it is to move data and consolidate the free space. For example, in the table below, if I now request 1024 bytes of space, neither of the two free blocks is large enough on its own. Together they would be sufficient, but they can only be used after data is moved and the free space is merged.

8 KB            | 512 B | 12 KB           | 512 B | 8 KB
Allocated space | Free  | Allocated space | Free  | Allocated space

Table 1
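The situation in Table 1 can be sketched in a few lines of Python. This is only an illustration of external fragmentation, not SQL Server code; the function name fits_contiguously is made up for the example.

```python
def fits_contiguously(free_blocks, request):
    """Return True if a single free block can hold the request without moving data."""
    return any(size >= request for size in free_blocks)


free_blocks = [512, 512]  # the two free gaps from Table 1, in bytes
request = 1024            # the space we want to allocate, in bytes

# No single gap is large enough ...
print(fits_contiguously(free_blocks, request))
# ... even though the total free space would be sufficient.
print(sum(free_blocks) >= request)
```

The first check fails while the second succeeds, which is exactly the gap between "enough free space in total" and "enough free space in one piece".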

To make better use of disk space, SQL Server borrows the idea behind operating system virtual memory and divides a file into N storage units of 8 KB each. Every allocation is then made in units of 8 KB, which solves the external fragmentation problem. In other words, the basic unit of data storage in SQL Server is the page, and a page is 8 KB in size. Each page begins with a 96-byte header that stores system information about the page, including the page number, the page type, the amount of free space on the page, and the allocation unit ID of the object that owns the page. The allocation from the example above then looks like the following table:

Actual allocation | 8 KB | 12 KB | 8 KB | XX KB
Page unit         | 8 KB | 8 KB  | 8 KB | free

Table 2
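As a rough sketch of page-granular allocation (an illustration only; pages_needed is a hypothetical helper, not a SQL Server function), every request is rounded up to a whole number of 8 KB pages:

```python
PAGE_SIZE = 8 * 1024  # bytes per page


def pages_needed(request_bytes):
    """Round a request up to whole pages (ceiling division)."""
    return -(-request_bytes // PAGE_SIZE)


for request in (1024, 8 * 1024, 12 * 1024):
    print(request, "bytes ->", pages_needed(request), "page(s)")
```

Because every unit has the same size, any free page can satisfy any request for one page, so gaps that are too small to use no longer appear between allocations. The price is some unused space inside a page when the request is smaller than 8 KB.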

Why is there also the concept of an extent? Isn't the page unit enough? Extents exist mainly to manage this space more efficiently: SQL Server groups every 8 physically contiguous pages into one extent (as shown in the following table). It is like carrying a large sum of money in hundred-yuan notes: when you go out to buy many things, hundred-yuan notes are far more convenient than a pile of one-yuan notes.

One extent
Page 1 | Page 2 | Page 3 | Page 4 | Page 5 | Page 6 | Page 7 | Page 8

Table 3
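The page and extent arithmetic can be sketched as follows. The helper names are invented for this example, and it assumes that pages in a data file are numbered from 0 and laid out back to back.

```python
PAGE_SIZE = 8192        # bytes per page
PAGES_PER_EXTENT = 8    # pages per extent


def page_byte_offset(page_number):
    """Byte offset of a page within the data file (assumes pages numbered from 0)."""
    return page_number * PAGE_SIZE


def extent_of(page_number):
    """Index of the extent that contains the given page."""
    return page_number // PAGES_PER_EXTENT


print(PAGE_SIZE * PAGES_PER_EXTENT)  # size of one extent in bytes
print(page_byte_offset(3))           # where page 3 starts in the file
print(extent_of(7), extent_of(8))    # page 7 is in extent 0, page 8 starts extent 1
```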

For a more concrete picture of a page, the following figure shows the structure of the page header:

Figure 1

How are rows stored in a page?

In what form is data stored in the database? SQL Server stores data row by row: each record in the table (each row is one block of data) is placed into the page in order. How, then, do we find a row, that is, its start and end addresses? SQL Server stores the start address of each row in a 2-byte entry at the end of the page (the row offset array). This lets us jump to the start of a row, and the start of the next record tells us where the current record ends, so the row's data can be extracted.

Figure 2

As Figure 2 shows, to fetch the second row we first read the page into memory, then read the 2-byte entry for that row from the offset array at the end of the page to find where row 2 begins. The next entry, read the same way, gives the position where row 2 ends.
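The lookup described above can be sketched on a synthetic page. This is a simplified model rather than a parser for real data files: it assumes 2-byte little-endian offsets, with the entry for the first row in the last two bytes of the page and later entries growing backward, and it assumes rows are packed one after another so that the next row's start marks the current row's end. The function names are made up for the example.

```python
import struct

PAGE_SIZE = 8192
HEADER_SIZE = 96   # rows start right after the page header
SLOT_SIZE = 2      # each row offset entry is 2 bytes


def build_page(rows):
    """Pack rows after the header and record each start offset at the page end."""
    page = bytearray(PAGE_SIZE)
    pos = HEADER_SIZE
    for slot, row in enumerate(rows):
        page[pos:pos + len(row)] = row
        slot_pos = PAGE_SIZE - SLOT_SIZE * (slot + 1)
        struct.pack_into("<H", page, slot_pos, pos)
        pos += len(row)
    return bytes(page), pos  # pos is the first free byte after the last row


def row_start(page, slot):
    """Read the 2-byte start offset of the row in the given slot."""
    (offset,) = struct.unpack_from("<H", page, PAGE_SIZE - SLOT_SIZE * (slot + 1))
    return offset


def read_row(page, slot, row_count, free_start):
    """Slice a row out of the page using its own start and the next row's start."""
    start = row_start(page, slot)
    end = row_start(page, slot + 1) if slot + 1 < row_count else free_start
    return page[start:end]


rows = [b"first row", b"second", b"third row!"]
page, free_start = build_page(rows)
print(read_row(page, 1, len(rows), free_start))  # the second row
```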

How are columns stored in a page?

Now that the row has been read and is in memory, how do we parse out the individual columns of the row? In other words, how are columns stored? Columns in a database table fall into just two cases: fixed-length columns and variable-length columns.

First, suppose there are only fixed-length columns. It is natural to store the columns in the same order in every row. Because the lengths are fixed, the length of each column (and therefore its offset) can be stored once in a separate place, and any particular column is then easy to locate, as the following table shows:

2 bytes | 3 bytes | 6 bytes | 10 bytes | 3 bytes | 2 bytes
1       | 23      | 55      | A        | C       | D

Table 4

To fetch the fifth column (the value highlighted in red in the original table), the positions are:

Start position = (row start position) + 2 bytes + 3 bytes + 6 bytes + 10 bytes.

End position = start position + 3 bytes.

The length of each column can be stored once in a separate table:

Column         | 1 | 2 | 3 | 4  | 5 | 6
Length (bytes) | 2 | 3 | 6 | 10 | 3 | 2

Table 5
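The offset calculation for fixed-length columns can be sketched as a running sum over the column lengths listed above. The name column_span is hypothetical, and columns are numbered from 1 as in the table.

```python
from itertools import accumulate

# Column lengths in bytes, in storage order (columns 1 to 6).
COLUMN_LENGTHS = [2, 3, 6, 10, 3, 2]


def column_span(row_start, column_index):
    """Return (start, end) byte positions of a 1-based column within a row."""
    offsets = [0, *accumulate(COLUMN_LENGTHS)]  # offset of each column from row start
    start = row_start + offsets[column_index - 1]
    end = start + COLUMN_LENGTHS[column_index - 1]
    return start, end


# Column 5 starts after 2 + 3 + 6 + 10 = 21 bytes and is 3 bytes long.
print(column_span(0, 5))
```

Since the lengths never change from row to row, the offsets list only needs to be computed once per table, not once per row.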

The detailed structure of a row is as follows:

Figure 3

Suppose the table structure is designed as follows:

Col1             | Col2     | Col3         | Col4
Char(5) NOT NULL | INT NULL | Char(3) NULL | Char(6) NOT NULL

Table 6

and the following row is stored in it:

Col1    | Col2 | Col3 | Col4
'ABCDE' | 123  | NULL | 'CCC'

Table 7

In the database file, this row is stored in the following form:

Figure 4
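Since the figure is not reproduced here, the following is a minimal sketch of how such a fixed-length row could be laid out, based on my understanding of the commonly described record format: two status bytes, a 2-byte offset to the end of the fixed-length data, the column data, a 2-byte column count, and a NULL bitmap. The function name build_fixed_row, the status byte values, and the choice to leave unused bitmap bits at 0 are assumptions made for illustration, so the output should not be expected to match a real page dump byte for byte.

```python
import struct


def build_fixed_row(col1, col2, col3, col4):
    """Lay out a row for CHAR(5) NOT NULL, INT NULL, CHAR(3) NULL, CHAR(6) NOT NULL."""
    values = [col1, col2, col3, col4]
    # Fixed-length columns always occupy their full width, even when NULL.
    fixed = (
        col1.encode("ascii").ljust(5)
        + struct.pack("<i", col2 if col2 is not None else 0)
        + (col3 or "").encode("ascii").ljust(3)
        + col4.encode("ascii").ljust(6)
    )
    # One bit per column; a set bit marks a NULL value.
    null_bitmap = 0
    for i, value in enumerate(values):
        if value is None:
            null_bitmap |= 1 << i
    # Assumed status bytes (0x10, 0x00), then the offset to the end of fixed data.
    header = struct.pack("<BBH", 0x10, 0x00, 4 + len(fixed))
    trailer = struct.pack("<HB", len(values), null_bitmap)
    return header + fixed + trailer


row = build_fixed_row("ABCDE", 123, None, "CCC")
print(len(row), row.hex(" "))
```

In this model the NULL in Col3 still takes up its 3 bytes in the row; only the bitmap at the end records that the value is NULL.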

What if there are variable-length columns? How is that structure stored? The key difference is that the length of a variable-length column varies (the same column can have a different length in each row), so the length cannot be stored once in a separate table. The only option is to record each column's length inside the row itself, which solves the problem of locating the actual data. Just as a page has the row offset array described above, SQL Server keeps a column offset array in each row for its variable-length columns.

If we define the table structure as follows:

Col1             | Col2               | Col3            | Col4               | Col5
Char (2) NOT NULL | Varchar (NOT NULL) | Varchar(5) NULL | Varchar (NOT NULL) | Smallint NULL

Table 8

and this row of data is:

Col1  | Col2                | Col3 | Col4  | Col5
'AAA' | REPLICATE('X', 250) | NULL | 'ABC' | 123

Table 9

The data is actually stored in the database in the following form:

Figure 5
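The idea of keeping per-column offsets inside the row can be sketched as follows. This is a simplified model of only the variable-length part of a row: a 2-byte column count, then one 2-byte end offset per variable-length column, then the data. Here the offsets are measured from the start of this section, which is an assumption made to keep the example short, and the function names are hypothetical.

```python
import struct


def pack_var_columns(values):
    """Pack variable-length values (bytes or None) as count + end offsets + data."""
    header_len = 2 + 2 * len(values)
    data = b""
    ends = []
    for value in values:
        data += value or b""                 # a NULL contributes no data bytes
        ends.append(header_len + len(data))  # end offset of this column
    return struct.pack(f"<H{len(values)}H", len(values), *ends) + data


def unpack_var_column(section, index):
    """Extract one variable-length column using the stored end offsets."""
    (count,) = struct.unpack_from("<H", section, 0)
    ends = struct.unpack_from(f"<{count}H", section, 2)
    start = 2 + 2 * count if index == 0 else ends[index - 1]
    return section[start:ends[index]]


# The three Varchar columns of the example row: Col2, Col3 (NULL), Col4.
section = pack_var_columns([b"X" * 250, None, b"ABC"])
print(len(section))
print(unpack_var_column(section, 2))
```

A NULL column ends where the previous one ended, so it takes no data bytes. In this sketch it is indistinguishable from an empty string; a real row would rely on its NULL bitmap to tell the two apart.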

Conclusion:

1. Avoid nullable columns in the database where possible. A NULL in a fixed-length column still occupies the column's full width in the row, and a nullable column cannot be part of a primary key. A WHERE clause containing IS NULL or IS NOT NULL may lead to a full table scan, and nullable values can easily cause null reference exceptions in application code.

2. When designing a column, use a fixed-length type only when the length really is fixed, such as an identity card number. In most other cases use a variable-length type such as varchar. This not only saves space but also lets each page hold many more rows, so the same amount of data requires fewer page reads, which reduces I/O and improves performance.

3. Figure 1 shows that a clustered index does not store data in strict physical order; the data is kept in logical order (most people misunderstand this).

4. Do not normally use varchar(max). Large values in such a column cannot fit on the row's page, so to handle them SQL Server stores only a pointer in the row and places the real data on separate pages. Reading that column then costs at least one extra I/O per row, which hurts performance.
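The effect described in conclusion 2 can be estimated with a rough calculation. This is only a sketch under stated assumptions: 8096 usable bytes per page (8192 minus the 96-byte header), 2 bytes of row offset array per row, and hypothetical row sizes of 200 and 50 bytes that are not taken from the article. It ignores fill factor and other per-row overhead.

```python
PAGE_SIZE = 8192
HEADER_SIZE = 96
SLOT_SIZE = 2  # row offset array entry per row


def rows_per_page(row_bytes):
    """How many rows of the given size fit on one page."""
    return (PAGE_SIZE - HEADER_SIZE) // (row_bytes + SLOT_SIZE)


def pages_for(row_count, row_bytes):
    """Pages needed to hold row_count rows (ceiling division)."""
    return -(-row_count // rows_per_page(row_bytes))


# Hypothetical row sizes: padded fixed-length rows vs. shorter variable-length rows.
for row_bytes in (200, 50):
    print(row_bytes, rows_per_page(row_bytes), pages_for(1_000_000, row_bytes))
```

Under these assumptions, shrinking the average row from 200 to 50 bytes cuts the number of pages for a million rows to roughly a quarter, which is where the I/O saving comes from.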

References:

(1) Inside Microsoft SQL Server 2005: The Storage Engine (Chinese edition)

(2) Microsoft MSDN: http://msdn.microsoft.com/zh-cn/library/ms190969(v=sql.105).aspx
