Do you understand the indexing principles of SQL?

Source: Internet
Author: User
The previous article roughly summarized the differences between SQL clustered indexes and non-clustered indexes, But it seems unclear. In this article, I will analyze the index principles again. An index exists for retrieval, which means that an index is not required by a table. Table indexes are composed of multiple pages. These pages form a tree structure, that is, we usually call B-tree. First

The previous article roughly summarized the differences between SQL clustered indexes and non-clustered indexes, But it seems unclear. In this article, I will analyze the index principles again. An index exists for retrieval, which means that an index is not required by a table. Table indexes are composed of multiple pages. These pages form a tree structure, that is, we usually call B-tree. First

The previous article roughly summarized the differences between SQL clustered indexes and non-clustered indexes, But it seems unclear. In this article, I will analyze the index principles again.

An index exists for retrieval, which means that an index is not required by a table. Table indexes are composed of multiple pages. These pages form a tree structure, that is, the B tree. First, let's look at the components of the following table indexes:

Root node, which points to the other two pages and logically divides the records of a table into Non-Leaf nodes Non-Leaf Level (branches ), it points to a smaller Leaf node Leaf Level (Leaf ). The root node, non-leaf node, and leaf node are all on the index page, collectively referred to as the index leaf node, which belongs to the category of the index page. These "branches" and "leaves" finally point to the data Page. A leaf between a root node and a leaf node is also called a data intermediate page. The Root node corresponds to the Root field of the sysindexes table and records the physical locations (that is, pointers) of non-leaf nodes. The non-leaf nodes are located between the Root node and the leaf node, records the pointer to the leaf node, and the leaf node finally points to the data page, which is the final B tree.

How does a database access table data:

First, no index is created for the table.

This type of table is called a heap table. Because all data pages are unordered and disorganized, You need to query data by record. Sometimes the first record can be found, the worst case is to find the data in the last record, but do not think that the SQL will return immediately as a result after the data is found. Even if the SQL finds the record, it will also traverse all the data once, which can be learned from the final execution plan, that is, the table scan, which can also be queried for tables without indexes, that is, the efficiency will be very low, if the data volume is small.

Q: How does SQL know that the table has no index?

When receiving a query request, SQL analyzes the value of the Index flag (INDID: Index ID) FIELD IN THE sysindexes table. If the value is 0, it indicates that this is a data table rather than an Index table, SQL uses the FirstIAM value of another field in the sysindexes table to find the IAM page chain of the table, which is the set of all data pages. You can search for IAM on the Internet.

Second, access the table that creates a non-clustered index.

Multiple non-clustered indexes can be created to form a B-tree structure. Leaf-level nodes do not contain data pages, but only index rows. If a table has only non-clustered indexes, each index ROW contains the key value of the non-clustered index and the ROW identifier (RID), which points to the data ROW with the key value. The RID consists of the file ID, page number, and number of the row on the page. When the value of INDID is between 2 and, it indicates that the table has a non-clustered index page. The value of the ROOT field called by SQL points to the ROOT of the non-clustered index B tree and finds the closest value to the queried value. Based on this value, the page number in the non-leaf node is found, find the RID Of the value on the page corresponding to the leaf node, locate the page and row in the Heap based on the RID, and return it to the query end.

In the previous article, a non-clustered index is created on cityid. When you execute Select * From student Where cityid = '000000', the query process is:

  1. If the indid value of the sysindexes table is 2, a non-clustered index exists;
  2. Starting from the root node, locate the nearest 0101 value (the branch node) in the non-leaf node and find the n page on the leaf level page;
  3. Search for the 0101 RID under page n on the leaf level page. The RID is displayed, indicates that the record named 0101 in the cityid field is located in row j on page I of the heap, and N indicates the ID value of the file.
  4. The record is returned to the client on page j of heap I.

Third, access the table that creates a clustered index.

In a clustered index, the data page of the data is at the leaf level, and the index page of the index data is at the non-leaf level. The principle is similar to that of the preceding non-clustered index query, because the records are sorted by the clustered index key value, that is, the clustered index key value is the specific data page. This is much easier than non-clustered indexes, because a node query is less than a non-clustered index.

A clustered index is created on the username field of the previous article. When Select * From student Where username = '1' is executed, the query process is:

  1. If the value of indid in the sysindexes table is 1, a clustered cable is created;
  2. Starting from the root node, locate the value closest to 1 in a non-leaf node, and find the n page on the leaf level page;
  3. Search for entries with a value of 1 on page n of the leaf level page, which is the data record itself;
  4. Return the record to the client.

For reference:

Fourth, how to access data tables with both clustered indexes and non-clustered indexes:

A clustered index is created for the username field and a non-clustered index is created for the cityid field. When Select * From student Where cityid = '2016' is executed, the query process is:

  1. If the indid value of the sysindexes table is 2, a non-clustered index exists;
  2. Starting from the root, the nearest 0101 entries are located in the non-leaf nodes of the cityid non-clustered index;
  3. The logic position of 0101 is found on the leaf level page of the noodle shop, which is a pointer to clustered indexes;
  4. Based on the indicated position of the pointer, go to the leaf-level page in the clustered index of username to find 0101 data records;
  5. Return the record to the client.

Based on the principle of accessing the index from the above database, we can easily explain the difference between clustered index and non-clustered index. The principle is the same. The key is what index is used in what scenarios, in the next article, I will summarize some of the indexes that are most suitable for different scenarios.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.