Basic SQL Server Index knowledge (2) ---- clustered index, non-clustered Index
[From] http://blog.joycode.com/ghj/archive/2008/01/02/113291.aspx
As I needed to train my colleagues on database indexing knowledge, I collected and sorted out this series of blogs. This is a summary of index knowledge. After summing up, I found that many vague concepts were much clearer in the past.
Both clustered indexes and non-clustered indexes are implemented using the B + tree. Before learning about these two indexes, we need to understand the B + tree first. If you do not know about Tree B, we recommend that you refer to the following articles.Article:
What are btree, B-tree, B + tree, and B * tree?
Http://blog.csdn.net/manesking/archive/2007/02/09/1505979.aspx
Structure of the B + tree:
Features of the B + tree:
- All the keywords appear in the linked list of the leaf node (dense index), and the keywords in the linked list are exactly ordered;
- It is impossible to hit non-leaf nodes;
- Non-leaf nodes are equivalent to leaf node indexes (sparse indexes), and leaf nodes are equivalent to data layers that store (keywords) data;
Adding or deleting a data in the B + tree requires processing in multiple situations. This is complicated and will not be detailed here.
Clustered Index)
- The leaf node of the clustered index is the actual data page.
- On the data page, data is stored in index order.
- The physical location of the row is the same as that of the row in the index.
- Each table can have only one clustered index.
- The average size of clustered indexes is about 5% of the table size.
The following are two simple descriptions of clustered indexes:
Execute the following statement in the clustered index:
Select*From Table WhereFirstname ='OTA'
The clustered index graph of a relatively abstract point:
Unclustered Index)
- A non-clustered index page is not a data page, but a page pointing to a data page.
- If the index type is not specified, non-clustered indexes are used by default.
- The order of leaf node pages is different from the physical storage order of tables.
- Each table can have a maximum of 249 non-clustered indexes.
- Create a clustered index before creating a non-clustered index (otherwise, index reconstruction is triggered)
Execute the following statement in a non-clustered index:
Select*FromEmployeeWhereLname ='Green'
A non-clustered index graph of a relatively abstract point:
What is bookmark lookup?
Although bookmark lookup is no longer mentioned in SQL 2005 (changing the changes), many of our searches use such a search process as follows:
Search in non-clustering and then in clustered index.
In an example provided by the http://www.sqlskills.com/, we demonstrated that bookmark lookup is slower than table scan, and the example script is as follows:
Use Credit Go -- These samples use the credit database. You can download and restore -- Credit database from here: Http://www.sqlskills.com/resources/conferences/CreditBackup80.zip -- Note: This is a SQL Server 2000 backup and merge examples will work on -- SQL Server 2000 in addition to SQL Server 2005. ------------------------------------------------------------------------------- -- (1) create two tables which are copies of charge: ------------------------------------------------------------------------------- -- Create the heap Select * Into Chargeheap From ChargeGo -- Create the CL table Select * Into Chargecl From Charge Go Create Clustered Index Chargecl_clind On Chargecl (member_no, charge_no) Go ------------------------------------------------------------------------------- -- (2) Add the same non-clustered indexes to both of these tables: ------------------------------------------------------------------------------- -- Create the NC index on the heap Create Index Chargeheap_ncind On Chargeheap (charge_no) Go -- Create the NC index on the CL table Create Index Chargecl_ncind On Chargecl (charge_no) Go ------------------------------------------------------------------------------- -- (3) begin to query these tables and see what kind of access and I/O returns ------------------------------------------------------------------------------- -- Get ready for a bit of analysis: Set Statistics Io On -- Turn graphical showplan on (CTRL + k) -- First, a point query (also, see how a bookmark lookup looks in 2005) Select * From Chargeheap Where Charge_no= 12345Go Select * From Chargecl Where Charge_no= 12345 Go -- What if our query is less selective? -- 1000 is. 0625% of our data... (1,600,000 million rows) Select * From Chargeheap Where Charge_no <1, 1000 Go Select * From ChargeclWhere Charge_no <1, 1000 Go -- What if our query is less selective? -- 16000 is 1% of our data... (1,600,000 million rows) Select * From Chargeheap Where Charge_no <1, 16000 Go Select * From Chargecl Where Charge_no <1, 16000 Go ------------------------------------------------------------------------------- -- (4) What's the exact percentage where the bookmark lookup isn't worth it? ------------------------------------------------------------------------------- -- What happens here: Table scan or bookmark lookup? Select * From Chargeheap Where Charge_no <1, 4000 Go Select * From Chargecl Where Charge_no <1, 4000 Go -- What happens here: Table scan or bookmark lookup? Select * From Chargeheap Where Charge_no <1, 3000 Go Select * From Chargecl Where Charge_no <1, 3000 Go -- And-you can narrow it down by trying the middle ground: -- What happens here: Table scan or bookmark lookup? Select * From Chargeheap Where Charge_no <1, 3500 Go Select * From Chargecl Where Charge_no <1, 3500 Go -- And again: Select * From Chargeheap Where Charge_no <1, 3250 Go Select * From Chargecl Where Charge_no <1, 3250 Go -- And again: Select * From Chargeheap Where Charge_no <1, 3375 Go Select * From Chargecl Where Charge_no <1, 3375 Go -- Don't worry, I won't make you go through it all :) -- For the heap table (in this case), the cutoff is: 0.21% Select * From Chargeheap Where Charge_no <1, 3383 Go Select * From Chargeheap Where Charge_no <1, 3384 Go -- For the clustered table (in this case), the cut-off is: 0.21% Select *From Chargecl Where Charge_no <1, 3438 Select * From Chargecl Where Charge_no <1, 3439 Go
This example is the demo of Wu jiazhen at teched 2007.
Summary:
This blog simply uses several charts to introduce the index implementation methods: B + number, clustered index, non-clustered index, and bookmark lookup information.
References:
Table organization and index organization
Http://technet.microsoft.com/zh-cn/library/ms189051.aspx
Http://technet.microsoft.com/en-us/library/ms189051.aspx
How indexes work
Http://manuals.sybase.com/onlinebooks/group-asarc/asg1200e/aseperf/@Generic__BookTextView/3358
Bookmark Lookup
Http://blogs.msdn.com/craigfr/archive/2006/06/30/652639.aspx
Logical and physical operators reference
Http://msdn2.microsoft.com/en-us/library/ms191158.aspx