Basic SQL Server Index knowledge (2) -- clustered index, non-clustered index from Jun. Net)

Source: Internet
Author: User

As I needed to train my colleagues on database indexing knowledge, I collected and sorted out this series of blogs. This is a summary of index knowledge. After summing up, I found that many vague concepts were much clearer in the past.

Both clustered indexes and non-clustered indexes are implemented using the B + tree. Before learning about these two indexes, we need to understand the B + tree first. If you do not know about Tree B, you are advised to refer to the following articles:

What are btree, B-tree, B + tree, and B * tree?
Http://blog.csdn.net/manesking/archive/2007/02/09/1505979.aspx

Structure of the B + tree:

Features of the B + tree:

  • All the keywords appear in the linked list of the leaf node (dense index), and the keywords in the linked list are exactly ordered;
  • It is impossible to hit non-leaf nodes;
  • Non-leaf nodes are equivalent to leaf node indexes (sparse indexes), and leaf nodes are equivalent to data layers that store (keywords) data;

Adding or deleting a data in the B + tree requires processing in multiple situations. This is complicated and will not be detailed here.

Clustered Index)

  • The leaf node of the clustered index is the actual data page.
  • On the data page, data is stored in index order.
  • The physical location of the row is the same as that of the row in the index.
  • Each table can have only one clustered index.
  • The average size of clustered indexes is about 5% of the table size.

The following are two simple descriptions of clustered indexes:

Execute the following statement in the clustered index:

select * from table where firstName = 'Ota'

 

The clustered index graph of a relatively abstract point:

Unclustered Index)

  • A non-clustered index page is not a data page, but a page pointing to a data page.
  • If the index type is not specified, non-clustered indexes are used by default.
  • The order of leaf node pages is different from the physical storage order of tables.
  • Each table can have a maximum of 249 non-clustered indexes.
  • Create a clustered index before creating a non-clustered index (otherwise, index reconstruction is triggered)

Execute the following statement in a non-clustered index:

select * from employee where lname = 'Green'

A non-clustered index graph of a relatively abstract point:

 

What is bookmark lookup?

Although bookmark lookup is no longer mentioned in SQL 2005 (changing the changes), many of our searches use such a search process as follows:
Search in non-clustering and then in clustered index.

InHttp://www.sqlskills.com/In the example provided, we can see that bookmark lookup is slower than table scan. The example script is as follows:

USE CREDITgo-- These samples use the Credit database. You can download and restore the-- credit database from here:-- http://www.sqlskills.com/resources/conferences/CreditBackup80.zip-- NOTE: This is a SQL Server 2000 backup and MANY examples will work on -- SQL Server 2000 in addition to SQL Server 2005.--------------------------------------------------------------------------------- (1) Create two tables which are copies of charge:--------------------------------------------------------------------------------- Create the HEAPSELECT * INTO ChargeHeap FROM Chargego-- Create the CL TableSELECT * INTO ChargeCL FROM ChargegoCREATE CLUSTERED INDEX ChargeCL_CLInd ON ChargeCL (member_no, charge_no)go--------------------------------------------------------------------------------- (2) Add the same non-clustered indexes to BOTH of these tables:--------------------------------------------------------------------------------- Create the NC index on the HEAPCREATE INDEX ChargeHeap_NCInd ON ChargeHeap (Charge_no)go-- Create the NC index on the CL TableCREATE INDEX ChargeCL_NCInd ON ChargeCL (Charge_no)go--------------------------------------------------------------------------------- (3) Begin to query these tables and see what kind of access and I/O returns--------------------------------------------------------------------------------- Get ready for a bit of analysis:SET STATISTICS IO ON-- Turn Graphical Showplan ON (Ctrl+K)-- First, a point query (also, see how a bookmark lookup looks in 2005)SELECT * FROM ChargeHeap WHERE Charge_no = 12345goSELECT * FROM ChargeCL WHERE Charge_no = 12345go-- What if our query is less selective?-- 1000 is .0625% of our data... (1,600,000 million rows)SELECT * FROM ChargeHeap WHERE Charge_no < 1000goSELECT * FROM ChargeCL WHERE Charge_no < 1000go-- What if our query is less selective?-- 16000 is 1% of our data... (1,600,000 million rows)SELECT * FROM ChargeHeap WHERE Charge_no < 16000goSELECT * FROM ChargeCL WHERE Charge_no < 16000go--------------------------------------------------------------------------------- (4) What's the EXACT percentage where the bookmark lookup isn't worth it?--------------------------------------------------------------------------------- What happens here: Table Scan or Bookmark lookup?SELECT * FROM ChargeHeap WHERE Charge_no < 4000goSELECT * FROM ChargeCL WHERE Charge_no < 4000go-- What happens here: Table Scan or Bookmark lookup?SELECT * FROM ChargeHeap WHERE Charge_no < 3000goSELECT * FROM ChargeCL WHERE Charge_no < 3000go-- And - you can narrow it down by trying the middle ground:-- What happens here: Table Scan or Bookmark lookup?SELECT * FROM ChargeHeap WHERE Charge_no < 3500goSELECT * FROM ChargeCL WHERE Charge_no < 3500go-- And again:SELECT * FROM ChargeHeap WHERE Charge_no < 3250goSELECT * FROM ChargeCL WHERE Charge_no < 3250go-- And again:SELECT * FROM ChargeHeap WHERE Charge_no < 3375goSELECT * FROM ChargeCL WHERE Charge_no < 3375go-- Don't worry, I won't make you go through it all :)-- For the Heap Table (in THIS case), the cutoff is: 0.21%SELECT * FROM ChargeHeap  WHERE Charge_no < 3383goSELECT * FROM ChargeHeap WHERE Charge_no < 3384go-- For the Clustered Table (in THIS case), the cut-off is: 0.21%SELECT * FROM ChargeCL WHERE Charge_no < 3438SELECT * FROM ChargeCL WHERE Charge_no < 3439go

This example is the demo of Wu jiazhen at teched 2007.

Summary:

This blog simply uses several charts to introduce the index implementation methods: B + number, clustered index, non-clustered index, and bookmark lookup information.

References:

Table organization and index organization
Http://technet.microsoft.com/zh-cn/library/ms189051.aspx
Http://technet.microsoft.com/en-us/library/ms189051.aspx

How indexes work
Http://manuals.sybase.com/onlinebooks/group-asarc/asg1200e/aseperf/@Generic__BookTextView/3358

Bookmark Lookup
Http://blogs.msdn.com/craigfr/archive/2006/06/30/652639.aspx

Logical and physical operators reference
Http://msdn2.microsoft.com/en-us/library/ms191158.aspx

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.