In SQL Server, a nonclustered index can be thought of as a small table of its own, organized by the index key. Compared with the actual table it stores far fewer columns: typically just the indexed columns plus a row locator, which is the clustered index key or, for a heap, the RID. In other words, a nonclustered index contains only its key columns from the source table and a pointer back to the rows of the actual table.
I. INCLUDE in nonclustered indexes
Because a nonclustered index can be seen as a table in its own right, when it contains all the information a query needs, SQL Server no longer has to go to the base table; reading the nonclustered index alone returns the required data. An index built with INCLUDE can therefore act as a covering index, yet the included columns do not affect the size of the index key.
Let's take a look at the following table:
This table contains about 150,000 rows. The clustered index is on the ID column, and we first create a nonclustered index on the Name column.
CREATE NONCLUSTERED INDEX index_name ON person (Name)
Then execute the query:
SELECT Name, Age FROM person WHERE Name = 'Olin'
The execution plan is as follows:
The plan first searches the nonclustered index on Name, obtains the clustered index key from it, and then uses that key to fetch the rest of the row (here, Age) from the clustered index.
Now we'll remove the index and build another one.
DROP INDEX person.index_name  -- delete the nonclustered index index_name
-- create it once again, this time including the Age column
CREATE NONCLUSTERED INDEX index_name ON person (Name) INCLUDE (Age)
Now let's take a look at the execution plan for the query we just ran:
Because the Age column is now included in the nonclustered index index_name, all the required data can be obtained by searching the nonclustered index alone. The clustered index does not have to be visited at all, so this query is faster than before.
Note that an included column is not part of the index key: it cannot be used to seek into the index. It is simply an extra column stored in the leaf level of the index.
INCLUDE is best used in the following scenarios:
- You don't want to increase the size of the index key, but you still want a covering index;
- You want to add a column whose data type cannot be used as an index key column (text, ntext, and image cannot be included either);
- You have exceeded the maximum number of key columns for an index (but it is best to avoid this situation);
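The second scenario can be sketched as follows. This is only an illustration: the Remark column (assumed to be nvarchar(max)) and the index name index_name_remark are hypothetical and are not part of the table shown above.

```sql
-- Hypothetical: Remark is assumed to be an nvarchar(max) column on person.
-- nvarchar(max) is not allowed as an index key column, so this would fail:
-- CREATE NONCLUSTERED INDEX index_name_remark ON person (Name, Remark);

-- As an included column it is stored only at the leaf level of the index,
-- so the index key stays as narrow as Name alone.
CREATE NONCLUSTERED INDEX index_name_remark
    ON person (Name)
    INCLUDE (Remark);
```

A query that filters on Name and returns Remark could then be answered from this index without visiting the base table.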
II. Covering nonclustered indexes
Index covering means that an index is built so that a query can get all the data it needs from the index alone, without having to reach the base table.
If a query finds everything it needs in an index and does not have to reference the data table at all, that index is called a covering index for the query. Covering indexes are a useful technique for reducing the logical reads of a query.
After deleting the previously created index, we create a composite index to see a covering index in action.
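One way to observe the reduction in logical reads is to switch on I/O statistics around the query. The sketch below reuses the person table and the query from this article; running it once with and once without the covering index should make the difference visible in the Messages output, though the exact numbers depend on the data.

```sql
-- Report the number of logical reads per table for each statement.
SET STATISTICS IO ON;

SELECT Name, Age
FROM person
WHERE Name = 'Olin';

-- Switch the reporting off again.
SET STATISTICS IO OFF;
```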
CREATE NONCLUSTERED INDEX index_name ON person (Name, Age)
SELECT Name, Age FROM person WHERE Name = 'Olin'
Look at the execution plan:
As you can see, the result is obtained by searching the nonclustered index alone. Very fast.
What is the difference between this covering index, where Age is a key column, and the earlier INCLUDE version? Let's change the search condition to Age and compare.
Covering index:
INCLUDE:
Note that the INCLUDE version results in a clustered index scan, whereas the covering index with Age in its key still uses the nonclustered index to find the results.
Therefore, it can be concluded that the include column cannot be used as an index key.
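The comparison query itself is not reproduced above. A plausible form is sketched here; the literal 30 and the index name index_age are assumptions made only for illustration.

```sql
-- Assumed form of the comparison query; the literal 30 is illustrative.
SELECT Name, Age
FROM person
WHERE Age = 30;

-- Age can only be used for an index seek when it is the leading key column
-- of some index. A hypothetical index keyed on Age, with Name included so
-- that the query above is still covered:
CREATE NONCLUSTERED INDEX index_age
    ON person (Age)
    INCLUDE (Name);
```

With the composite key (Name, Age), Age is a key column but not the leading one, so the optimizer can generally scan that nonclustered index but cannot seek directly on Age.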
To take advantage of covering indexes, keep the SELECT list small so that the covering index stays narrow, and make sure that every column added with INCLUDE is actually needed.
Before you build many covering indexes, consider how SQL Server can automatically use index intersection to cover queries on the fly.
III. Intersection of nonclustered indexes
If a table has multiple indexes, SQL Server can use more than one of them to execute a single query. It selects a small subset of rows from each index and then intersects the subsets, so that only rows meeting all the criteria are returned.
Let's remove the previously created index and create new ones:
Because a nonclustered index is essentially a table, several nonclustered indexes can be joined to one another much as tables are. Such a join between nonclustered indexes can give the query optimizer the data it needs without accessing the base table.
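The index definitions for this section are not shown above. As a hypothetical sketch only (the index names and literal values are assumptions, not the author's originals), two narrow single-column indexes that the optimizer could intersect might look like this:

```sql
-- Hypothetical narrow indexes, one per search column.
CREATE NONCLUSTERED INDEX index_person_name ON person (Name);
CREATE NONCLUSTERED INDEX index_person_age  ON person (Age);

-- Each predicate can be answered from its own index. The optimizer may
-- seek both indexes and join the two row sets on the clustered key (ID),
-- which every nonclustered index on this table carries as its row locator.
SELECT ID
FROM person
WHERE Name = 'Olin' AND Age = 30;
```

Whether the optimizer actually picks an intersection is a cost-based decision, so the execution plan should be checked rather than assumed.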
To improve query performance, SQL Server can use multiple indexes on one table. Therefore, consider creating several narrow indexes instead of one index with a wide key. SQL Server can combine them when needed, and queries that need only one of them still benefit from its narrowness. When you create a covering index, decide whether its width is acceptable and whether INCLUDE columns can do the job. If not, identify the existing nonclustered index that already contains most of the columns needed to cover the query. If possible, rearrange the column order of the existing nonclustered indexes so that the optimizer can consider an index intersection between two of them.
Sometimes you might have to create a separate nonclustered index for one of the following reasons:
- Rearranging columns in an existing index is not allowed;
- Some of the columns required to cover the query cannot be included in an existing nonclustered index;
- The total number of columns in the two existing nonclustered indexes may be more than the number of columns required to cover the query;
In these cases, you can create a nonclustered index on the remaining columns. If the new index together with an existing index covers the query, the optimizer can use an index intersection. When you choose the columns and their order for the new index, also take the other queries into account so that as many of them as possible benefit.
IV. Joins between nonclustered indexes
An index join is a special case of index intersection: it applies the covering-index technique to an intersection. If no single index covers the query but several indexes together do, SQL Server can join those indexes to satisfy the query completely without going to the base table.
Put another way, when the intersected nonclustered indexes between them contain every column the query needs, a query that would otherwise have to visit the base table can be answered without touching it at all.
-- Create two nonclustered indexes, one on the Name column and one on the Insiteid column
CREATE NONCLUSTERED INDEX index_name ON person (Name) INCLUDE (Age)  -- still a single-key index, but it carries more than one column
CREATE NONCLUSTERED INDEX index_insiteid ON person (Insiteid) INCLUDE (Height)  -- ditto
SELECT Name, Age, Height, Insiteid FROM person WHERE Insiteid > 5155400 AND Name = 'simple'
-- Note: the two indexes joined together are just enough to cover the required columns, so the base table is not accessed
View results:
What is the difference between index intersection and an index join? As mentioned earlier, an index join is a special case of index intersection. With an index join, the intersected indexes already cover the query, so the bookmark lookup into the base table is skipped. With a plain index intersection, the columns returned by the intersection do not fully match the SELECT list, so a bookmark lookup is still needed to fetch the remaining data from the base table.
V. Filtered nonclustered indexes
A filtered index is a nonclustered index with a filter, which is essentially a WHERE clause. It lets you build a highly selective index over a subset of rows in one or more columns that would otherwise have poor selectivity.
For example, a column with a large number of NULL values might be stored as a sparse column to reduce the overhead of those NULLs. Adding a filtered index on this column gives you an index over only the rows that are not NULL.
Suppose the Name column of the person table used below has more than 50% NULL values, and we execute the query:
SELECT Name, Age FROM person WHERE Name IS NOT NULL
This results in a clustered index scan and does not use an index efficiently.
Now we create a nonclustered index with a filter; the INCLUDE clause is there to make it a covering index.
CREATE NONCLUSTERED INDEX index_name ON person (Name) INCLUDE (Age) WHERE Name IS NOT NULL  -- keep rows with NULL names out of the filtered index
In my database, the index with the filter performs about the same as one without it (unfortunately, the Name column contains almost no NULL values). When the IS NOT NULL condition excludes a large share of the rows, however, the benefit of filtering becomes apparent: the more rows are filtered out, the smaller the index and the cheaper it is to read.
Filtered indexes pay off in several ways:
- Reduce index size and improve query efficiency;
- Build smaller indexes to reduce storage overhead;
- Reduce the cost of index maintenance, thanks to the smaller size.
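To confirm that an index was really created as a filtered index, the catalog view sys.indexes can be queried. The sketch below assumes the person table from this article lives in the current database and default schema.

```sql
-- List the indexes on person together with their filter predicates.
-- has_filter is 1 for filtered indexes; filter_definition holds the
-- WHERE expression and is NULL for unfiltered indexes.
SELECT name, type_desc, has_filter, filter_definition
FROM sys.indexes
WHERE object_id = OBJECT_ID('person');
```

Keep in mind that the optimizer considers a filtered index only for queries whose predicate is compatible with the filter, such as the IS NOT NULL query above.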