Original link: www.sqlservercentral.com/articles/Stairway+Series/72351/
Clustered Indexes: Stairway to SQL Server Indexes Level 3
By David Durant, 2013/01/25 (first published: 2011/06/22)
The Series
This article is part of the Stairway Series: Stairway to SQL Server Indexes.
Indexes are fundamental to database design, and they tell the developers who use a database a great deal about the designer's intentions. Unfortunately, indexes are too often added as an afterthought, once performance problems have appeared. Here at last is a series of simple articles that should bring any database professional up to speed with them quickly.
The preceding levels in this Stairway gave an overview of indexes in general and of nonclustered indexes in particular. They concluded with the following key concept about SQL Server indexes. When a request arrives at the database, whether it is a SELECT statement or an INSERT, UPDATE, or DELETE statement, SQL Server has only three possible ways to access the data of a table referenced in the statement:
Access only the nonclustered index and avoid accessing the table. This is possible only if the index contains all the data from the table that the query requests.
Use the search key(s) to access the index, and then use the selected bookmark(s) to access individual rows of the table.
Ignore the index and search the table for the requested rows.
This level starts with the third option in the list above: searching the table. This, in turn, leads us to a discussion of clustered indexes, a topic that was mentioned in Level 2 but not covered there.
The primary AdventureWorks database table we will use in this level is the SalesOrderDetail table. At 121,317 rows, it is large enough to illustrate some of the benefits of having a clustered index on a table. And, with two foreign keys, it is complex enough to illustrate some of the design decisions that you must make about a clustered index.
Sample Database
Although we discussed the sample database in Level 1, it bears repeating. Throughout this Stairway we use examples to illustrate concepts. These examples are based on the Microsoft AdventureWorks sample database. We focus on sales orders. Five tables give us a good mix of transactional and non-transactional data: Customer, SalesPerson, Product, SalesOrderHeader, and SalesOrderDetail. To keep things focused, we use a subset of the columns. Because AdventureWorks is normalized, salesperson information is broken down into three tables: SalesPerson, Employee, and Contact.
Throughout this Stairway we use the following two terms interchangeably for a single line of an order: line item and order detail. The former is the more common business term; the latter appears in the name of the AdventureWorks table.
The complete set of tables, and their relationships to each other, is shown in Figure 1.
Clustered Indexes
We begin by asking the question: how much work is needed to find a row (or rows) in a table if a nonclustered index is not used? Does searching the table for the requested rows mean scanning every row of an unordered table? Or can SQL Server keep the rows of the table permanently in sequence, so that it can reach them quickly by search key, just as it quickly reaches the entries of a nonclustered index by search key? The answer depends on whether you have instructed SQL Server to create a clustered index on the table.
Unlike a nonclustered index, which is a separate object that occupies its own space, a clustered index and its table are one and the same. By creating a clustered index, you instruct SQL Server to sort the rows of the table into index key sequence and to maintain that sequence during future data modifications. Upcoming levels will look at the internal data structures that are generated to accomplish this. For now, however, think of a clustered index as a sorted table. Given the index key value of a row, SQL Server can quickly access that row, and can then proceed sequentially through the table from that row.
For demonstration purposes, we create two copies of our sample table, SalesOrderDetail: one with no indexes and one with a clustered index. For the key columns of the index, we make the same choice that the designers of the AdventureWorks database made: SalesOrderID / SalesOrderDetailID. The code in Listing 1 makes the copies of the SalesOrderDetail table. We can rerun this code any time we want to start with a "clean slate".
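-- Drop the copies left over from a previous run, if any.
IF EXISTS (SELECT * FROM sys.tables WHERE object_id = OBJECT_ID('dbo.SalesOrderDetail_index'))
    DROP TABLE dbo.SalesOrderDetail_index;
GO
IF EXISTS (SELECT * FROM sys.tables WHERE object_id = OBJECT_ID('dbo.SalesOrderDetail_noindex'))
    DROP TABLE dbo.SalesOrderDetail_noindex;
GO
-- Copy the source table twice; SELECT ... INTO creates each copy as a heap.
SELECT * INTO dbo.SalesOrderDetail_index FROM Sales.SalesOrderDetail;
SELECT * INTO dbo.SalesOrderDetail_noindex FROM Sales.SalesOrderDetail;
GO
-- Turn the first copy into a clustered index; the second copy stays a heap.
CREATE CLUSTERED INDEX IX_SalesOrderDetail
    ON dbo.SalesOrderDetail_index (SalesOrderID, SalesOrderDetailID);
GO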
Listing 1: Create copies of the SalesOrderDetail table
So, suppose the SalesOrderDetail table looks like this before the clustered index is created:
After creating the clustered index shown above, the resulting table / clustered index looks like this:
Looking at the sample data above, you may notice that every SalesOrderDetailID value is unique. Do not be confused; SalesOrderDetailID is not the primary key of the table. The SalesOrderID / SalesOrderDetailID combination is the primary key of the table, and it is also the index key of the clustered index.
Understanding the Basics of Clustered Indexes
Each table can have at most one clustered index, because the rows of a table can be in only one sequence. You need to decide what sequence (if any) is best for each table and, if possible, create the clustered index before the table is filled with data. When making this decision, remember that sequencing means not only ordering but also grouping, as in grouping line items by sales order.
This is why the designers of the AdventureWorks database chose SalesOrderDetailID within SalesOrderID as the sequence for the SalesOrderDetail table; it is the natural sequence for line items.
For example, if a user requests one line item of an order, they will usually request all the line items of that order. One look at a typical sales order form tells us that a printed copy of an order always includes all of its line items. It is the nature of the sales order business to group line items by sales order. There may be occasional requests from the warehouse to view line items by product rather than by sales order, but most requests, such as those from a salesperson or a customer, from the program that prints invoices, or from a query that calculates the total value of each order, need all the line items of any given sales order.
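As a rough illustration of the "total value of each order" request mentioned above, the following sketch runs against the clustered copy created in Listing 1. It assumes that the LineTotal column of the AdventureWorks Sales.SalesOrderDetail table was carried over by the SELECT ... INTO; the column aliases are made up for this example.

```sql
-- Sketch only: assumes dbo.SalesOrderDetail_index from Listing 1 exists
-- and that it contains a LineTotal column copied from AdventureWorks.
SELECT  SalesOrderID,
        COUNT(*)       AS LineItemCount,  -- hypothetical alias
        SUM(LineTotal) AS OrderTotal      -- hypothetical alias
FROM    dbo.SalesOrderDetail_index
GROUP BY SalesOrderID
ORDER BY SalesOrderID;
```

Because the rows are already stored in SalesOrderID sequence, all the line items of one order sit next to each other, which is the kind of grouping this request relies on.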
User requirements alone, however, do not determine what the best clustered index would be. Future levels in this series will cover the internals of indexes, because certain internal aspects of indexes also influence your choice of clustered index columns.
Heaps
If there is no clustered index on a table, the table is called a heap. Every table is either a heap or a clustered index. So, although we often say that every index falls into one of two types, clustered or nonclustered, it is equally important to note that every table falls into one of two types: it is either a clustered index or a heap. Developers often say that a table "has" or "does not have" a clustered index, but it is more meaningful to say that the table "is" or "is not" a clustered index.
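One way to see this distinction for the two copies made in Listing 1 is to look them up in the sys.indexes catalog view, where a heap is listed with index_id 0 and a clustered index with index_id 1. The query below is a minimal sketch that assumes both copies exist in the current database.

```sql
-- Sketch only: assumes both copies from Listing 1 exist in the current database.
SELECT  OBJECT_NAME(i.object_id) AS table_name,
        i.index_id,               -- 0 = heap, 1 = clustered index
        i.type_desc,              -- 'HEAP' or 'CLUSTERED'
        i.name AS index_name      -- NULL for a heap
FROM    sys.indexes AS i
WHERE   i.object_id IN (OBJECT_ID('dbo.SalesOrderDetail_index'),
                        OBJECT_ID('dbo.SalesOrderDetail_noindex'))
  AND   i.index_id IN (0, 1);
```

Each table should return exactly one row here, since a table is always one or the other.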
When looking for rows, SQL Server has only one way to search a heap (leaving aside the use of nonclustered indexes): start at the first row of the table and proceed through the table until all rows have been read. There is no sequence, no search key, and no way to navigate quickly to specific rows.
Comparing a Clustered Index with a Heap
To evaluate the performance of a clustered index against that of a heap, Listing 1 makes two copies of the SalesOrderDetail table. One copy is the heap version; on the other, we create the same clustered index that exists on the original table (SalesOrderID, SalesOrderDetailID). Neither table has any nonclustered indexes.
We will run the same three queries against each version of the table: one that retrieves a single row, one that retrieves all rows for a single order, and one that retrieves all rows for a single product. We show the SQL and the results of each execution in the tables below.
Our first query retrieves a single row, and the execution details are shown in Table 1.
Our second query retrieves all the rows for a single sales order, and you can see the execution details in Table 2.
Our third query retrieves all the rows for a single product, and the results are shown in Table 3.
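The three comparisons can be sketched roughly as follows, using SET STATISTICS IO to report the logical reads of each statement. The literal values (43671, 120, and 755) are illustrative placeholders rather than values taken from the tables above; substitute any SalesOrderID, SalesOrderDetailID, and ProductID that exist in your copy of the data.

```sql
-- Sketch only: assumes both copies from Listing 1 exist.
-- The key values below are placeholders chosen for illustration.
SET STATISTICS IO ON;   -- report logical reads for each statement
GO
-- Query 1: a single row
SELECT * FROM dbo.SalesOrderDetail_index
 WHERE SalesOrderID = 43671 AND SalesOrderDetailID = 120;
SELECT * FROM dbo.SalesOrderDetail_noindex
 WHERE SalesOrderID = 43671 AND SalesOrderDetailID = 120;
GO
-- Query 2: all rows for a single sales order
SELECT * FROM dbo.SalesOrderDetail_index   WHERE SalesOrderID = 43671;
SELECT * FROM dbo.SalesOrderDetail_noindex WHERE SalesOrderID = 43671;
GO
-- Query 3: all rows for a single product (not part of the clustered key)
SELECT * FROM dbo.SalesOrderDetail_index   WHERE ProductID = 755;
SELECT * FROM dbo.SalesOrderDetail_noindex WHERE ProductID = 755;
GO
SET STATISTICS IO OFF;
GO
```

Comparing the logical reads reported for each pair gives a simple measure of how much work the clustered index saves, or fails to save, for that kind of request.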
Our first two queries benefit greatly from the presence of the clustered index, because both search on its leading key column; for the third, the two tables are roughly equal. Are there times when a clustered index is a detriment? The answer is yes, and it mostly concerns inserting, updating, and deleting rows. Like so many other aspects of indexing encountered at these early levels, it is a topic that is covered in more detail at a higher level.
In general, the retrieval benefits outweigh the maintenance costs, making a clustered index preferable to a heap. If you create a table in an Azure database, you have no choice; every table must be a clustered index.
Conclusion
A clustered index is a sorted table whose sequence is specified by you when you create the index and maintained by SQL Server. Any row in that table can be accessed quickly given its key value. Any set of rows that is contiguous in index key sequence can also be accessed quickly given the range of their keys.
There can be only one clustered index per table. The decision about which columns should be the clustered index key columns is the most important indexing decision that you make for any table.
In Level 4, we will shift our emphasis from the logical to the physical, introducing pages and extents, and examining the physical structure of indexes.