Original article address:
Stairway to SQL Server Indexes: Level 2, Deeper into Nonclustered Indexes
This article is part of the SQL Server Index advanced series Stairway to SQL Server Indexes.
The first level introduces non-clustered indexes in SQL Server. In addition, in the first learning example, we have demonstrated the potential benefits of indexing when a row of data is obtained from a table. At this level, we will continue to introduce non-clustered indexes to see their contributions in improving query performance.
Let's first introduce some theories, learn some internal information about indexes, help us explain the theories, and then execute some queries. These queries are executed in two cases: including and without indexes. When performance reports are enabled, we can see the impact of indexes.
We continue to use some tables in the AdventureWorks database, mainly in the Contact table. We will just use an index and use the FullName index at the upper level to prove our point of view. To ensure that we have good control over the index of the Contact table, we will make two copies, one for creating the FullName index and the other for not creating the index.
- IF EXISTS (
-
- SELECT *
-
- FROM sys.tables
-
- WHERE OBJECT_ID = OBJECT_ID('dbo.Contacts_index'))
- DROP TABLE dbo.Contacts_index;
- GO
- IF EXISTS (
-
- SELECT *
-
- FROM sys.tables
-
- WHERE OBJECT_ID = OBJECT_ID('dbo.Contacts_noindex'))
-
- DROP TABLE dbo.Contacts_noindex;
- GO
- SELECT * INTO dbo.Contacts_index
-
- FROM Person.Contact;
- SELECT * INTO dbo.Contacts_noindex
-
- FROM Person.Contact;
Non-clustered Index
Create a non-clustered index in the Contacts_index table
- CREATE INDEX FullName
-
- ON Contacts_index
-
- ( LastName, FirstName );
Keep in mind that the index key is stored in a non-clustered index sequence and the real data in the table is accessed through tags. You can regard a tag as a pointer. In the future, the tag format, usage, and details will be described.
In addition, the SQL Server non-clustered index entry also contains some internal headers and optional data values. These will be introduced in subsequent articles and are not the key content.
So far, we only need to know that the key enables SQL Server to find the appropriate index entry, and the Entry Label allows SQL Server to access the row data of the table.
Benefits of orderly indexing entries
The index entries are ordered, so SQL Server can quickly locate the entries. Scanning can start from the header, from the tail, or from the middle.
Therefore, for a query, request all Contact users whose LastName starts with S where lastname like's % '). SQL Server quickly locates the records starting with the first S, and then uses the index to access data rows with tags until the records starting with the first T.
If the selected columns are included in the index, the above query will be executed faster. If we execute
- SELECT FirstName, LastName
-
- FROM Contact
-
- WHERE LastName LIKE 'S%';
SQL Server quickly navigate to the S entry, and then through the index, ignore the tag, directly return data from the index entry until the first T entry. In relational database terms, it is called full-coverage index query.
Many SQL operations can benefit from indexes, including ORDER BY, GROUP BY, DISTINCT, UNION (not UNION ALL), JOIN... ON.
Remember the importance of the key order from left to right. The index we created is very useful for lastname = "ashton", but it does not even work for firstname = "ashton.
Test some simple queries
If you want to execute the following query, make sure that you have executed the preceding script, created the contact_index and contact_noindex tables, and created the LastName and FirstName indexes in the contact_index table.
Enable statistics
- SET STATISTICS io ON
-
- SET STATISTICS time ON
Since there are only 19972 rows of data in the contact table, it is difficult to get a meaningful statistical time. Most Queries show CPU time: 0 ms, so we can disable time statistics and only display io statistics. If you need a large table to calculate the real time information, you can use the script later in the article to build a contact table with millions of rows of data. The following tests all take the tables of 19972 rows as test objects.
Test a fully-covered Query
The first query is a query that overwrites the index. It obtains some columns of records whose lastname starts with S in the contact table. The following is the execution information.
SQL statement |
SELECT FirstName, LastName FROM dbo. Contacts -- execute with both Contacts_noindex and -- Contacts_index WHERE LastName LIKE's %' |
If no index is available |
(2130 row (s) affected) Table 'contacts _ noindex'. Scan count 1, logical reads 568. |
Indexing |
(2130 row (s) affected) Table 'contacts _ Index'. Scan count 1, logical reads 14. |
Index impact |
IO reduced from 568 to 14 |
Note |
Covering the query index is a good thing. If no index exists, a full table scan is performed. Row 2130 indicates that records starting with S account for 10% of the data. |
Test A non-fully covered Query
Let's modify the query to make the same query, but the obtained column contains some columns without index creation. The following is the execution result.
SQL statement |
SELECT * FROM dbo. Contacts -- execute with both Contacts_noindex and -- Contacts_index WHERE LastName LIKE's %' |
If no index is available |
(2130 row (s) affected) Table 'contacts _ noindex'. Scan count 1, logical reads 568. |
Indexing |
(2130 row (s) affected) Table 'contacts _ Index'. Scan count 1, logical read 568. |
Index impact |
IO is not affected |
Note |
The index is not used during the query. In this case, SQL Server feels that using index search requires more work than directly scanning indexes. |
Test A non-fully covered query, but more conditions are provided.
We can modify the query to make it the same, but only to reduce the scope of the query results and increase the benefit of using the index. The following is the execution result.
SQL statement |
SELECT * FROM dbo. Contacts -- execute with both Contacts_noindex and -- Contacts_index WHERE LastName LIKE 'ste %' |
If no index is available |
(107 row (s) affected) Table 'contacts _ noindex'. Scan count 1, logical reads 568. |
Indexing |
(107 row (s) affected) Table 'Contact _ Index'. Scan count 1, logical reads 111. |
Index impact |
IO is reduced from 568 to 111. |
Note |
SQL Server accesses 107 entries, all of which are within the continuous range of indexes. Each entry tag is used to obtain the corresponding row data. These rows are not consecutive in the table. These queries use indexes, but they are not as effective as the first overwriting query, especially in IO reading. You want to read 107 indexes and then get 107 data records, resulting in 107 reads. In the previous query, 2130 rows of data were requested and no indexes were used. In this request, 107 rows of data are indexed. Do you know the critical point of using indexes? At a later level, we will introduce this aspect. |
Test A fully covered Aggregate Query
The last example is an aggregate query that includes count calculation.
SQL statement |
SELECT LastName, FirstName, COUNT (*) as 'contacts' FROM dbo. Contacts -- execute with both Contacts_noindex and -- Contacts_index WHERE LastName LIKE 'ste %' Group by LastName, FirstName |
If no index is available |
(2130 row (s) affected) Table 'contacts _ noindex'. Scan count 1, logical reads 568. |
Indexing |
(104 row (s) affected) Table 'contacts _ Index'. Scan count 1, logical reads 4. |
Index impact |
IO reduced from 568 to 4 |
Note |
All the information to be queried is included in the index, which makes good use of the index. |
Test A non-fully covered Aggregate Query
Let's modify the query to make the same query, but the obtained column contains some columns without index creation. The following is the execution result.
SQL statement |
SELECT LastName, FirstName, MiddleName, COUNT (*) as 'contacts' FROM dbo. Contacts -- execute with both Contacts_noindex and -- Contacts_index WHERE LastName LIKE 'ste %' Group by LastName, FirstName, MiddleName |
If no index is available |
(2130 row (s) affected) Table 'contacts _ noindex'. Scan count 1, logical reads 568. |
Indexing |
(105 row (s) affected) Table 'contactlarge'. Scan count 1, logical reads 111. |
Index impact |
IO reduced from 568 to 111 |
Note |
. |
Conclusion
When the current position is located, we know that non-clustered indexes have the following benefits:
- Is a set of ordered entries.
- Each row in the table has an entry.
- Contains an index key and a tag.
- The user is responsible for creating.
- SQL Server.
- SQL Server is used to minimize query attempts to satisfy customers' requests.
Through some examples, we can see that the SQL Server can meet the query request through the index, some queries will completely ignore the index, and some will partially use the index. For this reason, we will update a conclusion in level 1.
When a request arrives at the database, SQL Server only has three data access methods:
Generally, the first is ideal, and the second is better than the third. In the previous introduction, we learned how to increase the possibility of using indexes and how to determine how to use non-clustered indexes more efficiently. But we need to know more details about the index.
Before we know the details, we also need to introduce other types of SQL Server indexes: clustered indexes.
Code download
- NonClustered. SQL
- Level 2-NonClustered. SQL
- Level2_MillionRowContactTable. SQL