Description of pseudo columns and pseudo columns in the SQL Server database.

Source: Internet
Author: User

Description of pseudo columns and pseudo columns in the SQL Server database.

Pseudo columns in SQL Server

In the afternoon, someone in the QQ group was discussing (non-clustered) index storage. For clustered index tables, the non-clustered index storage includes the index key value + clustered index key value; for non-clustered index tables, the index stores the index key value + RowId, which should be a common sense and will not be elaborated in detail.

The RowId mentioned here is a bit of thought.

So what is this RowId? Can we look at RowId information more intuitively? What does it mean? Of course, this is also possible.

The table in Oracle has a pseudo-column concept, that is, when querying the table, addselect rowid,* from Table, The pseudo column is queried.

SQL Server also has such a pseudo column. in SQL Server, this pseudo column can be considered as the physical address of the Data row. Let's take a look at the meanings of this RowId and RowId.

Testing of pseudo Columns

Create a simple table. Use this table to view the description of pseudo columns.

CREATE TABLE Test( id int identity(1,1), name varchar(50))GOINSERT INTO Test VALUES (NEWID())GO 100

SQL Server has an undisclosed pseudo column "% physloc %", that is, this field can be added to any table during query. For example, you can check the pseudo columns of each row in the table.

The type of this pseudo column isbinary(8)That is, there are 8 bytes, referDATALENGTH(%%physloc%%) as Len, % Physloc % the physical address of the record returned. The first four bytes represent the page number, the middle two bytes represent the file number, and the last two bytes represent the slot number.
To more easily observe the meaning of pseudo columns, sqlserver provides an undisclosed system function sys. fn_PhysLocFormatter.sys.fn_PhysLocFormatterThis function continues to observe this pseudo column.

For example, here we can see the information in the pseudo column clearly.

For example, in the first line (1: 73: 0), the first four bytes represent the page number, the two middle bytes represent the file number, and the last two bytes represent the slot number, (1: 73: 0)sys.fn_PhysLocFormatterFormat the explicit result.

Put file number 1 at the beginning, 73 in the middle is the page number, and the last 0 is the sloc number ).

The following is a rough description of the meanings of these fields. Here, we only need to have a basic understanding of the storage of SQL Server. Otherwise, we will see the cloud in the fog.

1. What is the file number first?

For example, the file number is the data file number of the database. There is only one data file, and the file number is 1. The default value is used for table creation) built on a file with fileid = 1, the file with fileid = 2 is a log file.

2. The second is the page number. The page number is the page number allocated to the data page of the current table (minimum allocation unit of 8 KB). Let's take a look at the page of the Test table.

The dbcc ind command is used to query the Page Information allocated to the table. The 77 page is the same as the IMA page.

Page 73 is the page where data is actually stored, just like 73 in the 1: 73: 0 above.

  

3. Finally, let's take a look at the slot number. The concept of slot number should have a basic understanding of the data page of SQL Server. Here we steal a picture of a netizen.

The so-called slot number stores multiple rows of data on each page of the data page. It is used to mark the offset of each row of data, to put it bluntly, it is "the starting position of the address space for storing each row of Data", because the total length of each row is different (in the case of a Variable Length Column ), the storage space occupied by each row is also different. The slot number or row offset indicate the start position of each row in the page.

Howeversys.fn_PhysLocFormatterThe formatted explicit slot number is not the following offset, but the N information of the nth data row. Therefore, the slot number of the nth row is 1, the slot number of row 2nd is 2, and so on. When the first page is full, it is stored from the second page, and the slot number starts from 0 and accumulates.

  

  

So far, we have a simple understanding of the pseudo columns of SQL Server, that is to say, RowId.

It can be considered that in the SQL Server database, the pseudo-column RowId is the physical address of the Data row. It is not sure whether the pseudo-column (RowId) in other databases is a physical address (and probably also)

Here is a simple question:
Why does an SQL Server clustered table (with clustered indexes) store "index key value + clustered index key value" When storing data? For non-clustered index tables, the index stores the index key value + RowId?

Alternatively, why does the clustered index table store "index key value + clustered index key value" instead of "index key value + RowId"

As a common sense, clustered indexes should be stored in the order of clustered indexes. This means that the physical location of the row data in the clustered index table may change, for example, when the physical location of data rows changes in the well-known "page split, if the non-clustered index stores the index key value + RowId, then this RowId is bound to change. This change naturally consumes a certain amount of performance. To prevent this situation, the non-clustered index in the clustered table is stored as a relatively unchanged index key value + the clustered index key value, because when the physical location of the Data row changes, the clustered index key value remains unchanged, which is not hard to understand.

Of course, there is an exception. When the clustered index table is updated, the key value of the clustered index is directly updated, it may also change the physical location of the current data row in the clustered index table, which is also interesting and will not be described.

This is the same as a bypass code. It requires a basic understanding of clustered and non-clustered indexes in SQL Server and the storage structure.

Last high-energy warning

High-energy warning, not to mention my blind ratio misleading, the above-mentioned functions for parsing pseudo Columnssys.fn_PhysLocFormatterIt is an undisclosed function. An undisclosed function may have some potential problems. In fact, this function has a very serious bug.

This bug is a logical error when parsing the physical storage location. This problem has been carefully analyzed by people who have already done so.

Reference: http://www.bkjia.com/article/124109.htm

Currently, the test shows that there is still a bug in SQL Server 2014. N years ago, I learned that there was such a function, but I never wanted to mention it.sys.fn_PhysLocFormatterFor the reason of this function, do not perform a verification test on undisclosed functions, and declare again: this function has bugs. Please use it with caution.

Attach the source code of this function and refer to the conclusion in the original article.

create function sys.fn_PhysLocFormatter (@physical_locator binary (8)) returns varchar (128)as begin declare @page_id binary (4) declare @file_id binary (2) declare @slot_id binary (2) -- Page ID is the first four bytes, then 2 bytes of page ID, then 2 bytes of slot -- select @page_id = convert (binary (4), reverse (substring (@physical_locator, 1, 4))) select @file_id = convert (binary (2), reverse (substring (@physical_locator, 5, 2))) select @slot_id = convert (binary (2), reverse (substring (@physical_locator, 7, 2))) return '(' + cast (cast (@file_id as int) as varchar) + ':' + cast (cast (@page_id as int) as varchar) + ':' + cast (cast (@slot_id as int) as varchar) + ')' end

The problem lies in the reverse function.

The reverse function is used to reverse the character rather than the byte inversion. When a byte between 8-Fe is encountered, it is considered as a dual-byte character and is combined to participate in the inversion operation, resulting in an error.

Summary

This article briefly describes the pseudo columns in SQL Server and their meanings. It provides a simple understanding of non-clustered indexes and the storage structure of data rows through pseudo columns.

Well, the above is all the content of this article. I hope the content of this article will help you in your study or work. If you have any questions, you can leave a message, thank you for your support.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.