I. Why create an index? (Advantages)
Creating indexes can greatly improve system performance.
First, a unique index guarantees that each row of data in a table is unique.
Second, an index can greatly speed up data retrieval, which is the main reason for creating one.
Third, an index can speed up joins between tables, which matters especially for enforcing referential integrity.
Fourth, when a query uses grouping or sorting clauses, an index can significantly reduce the time spent on grouping and sorting.
Fifth, indexes give the query optimizer more efficient access paths to choose from during query processing, which improves system performance.
II. Disadvantages of creating indexes
Some may ask: if indexes have so many advantages, why not create one on every column in the table? The idea is reasonable as far as it goes, but it is one-sided. Adding an index to every column is unwise, because indexes also have costs.
First, creating and maintaining indexes takes time, and that time grows with the volume of data.
Second, indexes occupy physical space. Each index takes up space in addition to the space used by the data table, and creating a clustered index requires even more space.
Third, whenever rows are inserted, deleted, or updated, the indexes must be maintained as well, which slows data modification.
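The maintenance and space costs above can be made concrete with a small simulation. This is only a sketch: the `IndexedTable` class, its column names, and the "index writes" counter are hypothetical, and a sorted Python list stands in for a real index structure.

```python
import bisect


class IndexedTable:
    """Toy table: rows in insertion order plus one sorted list per indexed column."""

    def __init__(self, indexed_columns):
        self.rows = []
        # Each index is a sorted list of (key value, row position) pairs.
        self.indexes = {col: [] for col in indexed_columns}
        self.index_writes = 0  # index entries written so far

    def insert(self, row):
        position = len(self.rows)
        self.rows.append(row)
        # Every index on the table must be updated for every inserted row.
        for col, entries in self.indexes.items():
            bisect.insort(entries, (row[col], position))
            self.index_writes += 1


def load(indexed_columns, n_rows=1000):
    table = IndexedTable(indexed_columns)
    for i in range(n_rows):
        table.insert({"id": i, "name": f"user{i % 50}", "city": f"city{i % 7}"})
    return table


no_index = load([])
one_index = load(["id"])
three_indexes = load(["id", "name", "city"])

for label, table in [("0 indexes", no_index), ("1 index", one_index), ("3 indexes", three_indexes)]:
    stored = len(table.rows) + sum(len(e) for e in table.indexes.values())
    print(f"{label}: index writes={table.index_writes}, stored entries={stored}")
```

For the same 1,000 inserts, the table with three indexes performs 3,000 extra index writes and stores 4,000 entries instead of 1,000, which is the trade-off the second and third points describe.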
III. Guidelines for creating indexes
Indexes are created on specific columns of a table. When designing indexes, consider carefully which columns should be indexed and which should not.
In general, create indexes on the following columns.
First, columns that are searched frequently, to speed up the search;
Second, columns that act as the primary key, to enforce uniqueness and to organize the arrangement of data in the table;
Third, columns that are frequently used in joins. These are mainly foreign keys, and indexing them speeds up the join;
Fourth, columns that are often searched by range. Because the index is sorted, the rows in a specified range are contiguous;
Fifth, columns that frequently need to be sorted. Because the index is already sorted, the query can use the index order and spend less time sorting;
Sixth, columns that frequently appear in WHERE clauses, to speed up evaluation of the conditions.
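The fourth and fifth guidelines rest on the index being sorted. A minimal sketch of that idea follows, assuming a hypothetical orders table and using a sorted Python list with binary search in place of a real index:

```python
import bisect

# Hypothetical table: (order_id, amount) rows stored in no particular order.
rows = [(1, 250), (2, 40), (3, 990), (4, 120), (5, 310), (6, 75), (7, 500)]

# The "index" on amount: (amount, row position) pairs kept in sorted order.
amount_index = sorted((amount, pos) for pos, (_, amount) in enumerate(rows))
keys = [amount for amount, _ in amount_index]


def range_search(low, high):
    """Return rows with low <= amount <= high, in amount order."""
    start = bisect.bisect_left(keys, low)    # first entry >= low
    stop = bisect.bisect_right(keys, high)   # one past the last entry <= high
    # The matching entries form one contiguous slice of the sorted index.
    return [rows[pos] for _, pos in amount_index[start:stop]]


result = range_search(100, 400)
print(result)  # [(4, 120), (1, 250), (5, 310)]
```

The result comes back already ordered by amount, so the same structure serves both the range search and a sort on that column.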
Similarly, some columns should not be indexed. In general, they have the following characteristics:
First, columns that are rarely used or referenced in queries. Because they are rarely used, an index on them does little to improve query speed. On the contrary, it slows maintenance and increases space requirements.
Second, columns with only a few distinct values, such as a gender column in a personnel table. Because there are so few values, the rows that match a query make up a large proportion of the table, so an index does not significantly speed up the search.
Third, columns defined with the text, image, or bit data types. The data in these columns is either very large (text, image) or has very few distinct values (bit).
Fourth, cases where modification performance matters far more than retrieval performance. The two are in tension: adding indexes improves retrieval but slows modification, and removing indexes improves modification but slows retrieval. When modification performance is much more important, do not create the index.
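The second point, about columns with few distinct values, is often expressed as selectivity. The sketch below uses a hypothetical 1,000-row personnel table; the `selectivity` helper is an illustration, not a function of any database product.

```python
def selectivity(values):
    """Distinct values divided by total rows: 1.0 means every row is unique."""
    return len(set(values)) / len(values)


# Hypothetical personnel table with 1,000 rows.
employee_ids = list(range(1000))
genders = ["F" if i % 2 == 0 else "M" for i in range(1000)]

id_selectivity = selectivity(employee_ids)
gender_selectivity = selectivity(genders)

# Fraction of the table returned by a single equality test on gender.
matching_fraction = sum(1 for g in genders if g == "F") / len(genders)

print(id_selectivity)      # 1.0
print(gender_selectivity)  # 0.002
print(matching_fraction)   # 0.5
```

A lookup on the gender column still has to visit half the table, so an index on it saves little compared with reading every row.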
IV. How to create an index
There are several ways to create an index. They fall into two groups: creating an index directly and creating one indirectly.
First, direct creation, for example with the CREATE INDEX statement or the index creation wizard.
Second, indirect creation. For example, an index is also created when a primary key constraint or a unique key constraint is defined on a table.
Although both methods create indexes, the indexes they produce differ in detail.
Using the CREATE INDEX statement or the index creation wizard is the most basic method. It is also the most flexible, because the index can be tailored to your needs. With this method you can set many options to optimize the index, such as the page fullness (fill factor), the sort order, and the handling of statistics. You can also specify the index type, whether it is unique, and whether it is composite. In other words, you can create a clustered or a non-clustered index, on a single column or on two or more columns.
You can also create indexes indirectly by defining primary key or unique key constraints. A primary key constraint is a logical mechanism for maintaining data integrity: it prevents two records in a table from having the same primary key value. When you create a primary key constraint, the system automatically creates a unique index, which is clustered by default. Logically, the primary key constraint is an important structure, but physically the structure that corresponds to it is that unique clustered index. In other words, in the physical implementation there is no primary key constraint, only a unique clustered index. Similarly, creating a unique key constraint also creates an index, by default a unique non-clustered index. When constraints are used to create an index, therefore, the type and features of the index are largely determined in advance, and there is little room for customization.
When you define a primary key or unique key constraint on a table that already has a standard index created with the CREATE INDEX statement, the constraint does not reuse that index. SQL Server builds a separate unique index for the constraint, and the earlier index remains until you drop it. If the existing index is clustered, the primary key index is created as non-clustered, because a table can have only one clustered index. For enforcing uniqueness, it is therefore the constraint's index that counts, not the one created earlier with CREATE INDEX.
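Direct and indirect creation can be tried side by side in a small script. The sketch below uses SQLite through Python's `sqlite3` module purely as a convenient stand-in; it is not SQL Server, it has no clustered/non-clustered option, and the `personnel` table and its columns are hypothetical. What it does share with the description above is that constraints create indexes on their own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE personnel (
        badge   TEXT PRIMARY KEY,   -- constraint: index created indirectly
        email   TEXT UNIQUE,        -- constraint: index created indirectly
        surname TEXT
    )
    """
)
# Direct creation with an explicit statement.
conn.execute("CREATE INDEX idx_personnel_surname ON personnel (surname)")

# PRAGMA index_list reports each index with its origin:
# 'c' = CREATE INDEX, 'u' = UNIQUE constraint, 'pk' = PRIMARY KEY constraint.
indexes = {
    row[1]: {"unique": bool(row[2]), "origin": row[3]}
    for row in conn.execute("PRAGMA index_list('personnel')")
}
for name, info in sorted(indexes.items()):
    print(name, info)
```

The two constraint-backed indexes get system-generated names and are always unique, while the directly created one carries the name and options chosen in the statement.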
V. Index features
Two important features of an index are whether it is unique and whether it is composite.
A unique index ensures that all values in the indexed column are unique, with no duplicates. When a table is created or modified with a primary key constraint or a unique key constraint, SQL Server automatically creates a unique index for it. If uniqueness must be guaranteed, create a primary key or unique key constraint rather than a bare unique index. When creating a unique index, keep these rules in mind. If the table already contains data, SQL Server checks the existing data for duplicates when the index is created. Whenever an insert statement adds data or an update statement modifies it, SQL Server checks for duplicates again; if a duplicate value would result, SQL Server cancels the statement and returns an error message. Make sure each row in the table has a unique value, so that every entity can be uniquely identified. Create a unique index only on a column that can guarantee entity integrity. For example, do not create a unique index on the name column of a personnel table, because different people can have the same name.
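The two duplicate checks described above (once at creation, then on every later insert) can be observed in a short script. As an assumption for illustration, SQLite via Python's `sqlite3` stands in for SQL Server, and the table, index names, and `try_sql` helper are hypothetical; the error text SQL Server returns is different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personnel (badge TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO personnel VALUES (?, ?)",
    [("A1", "Li Wei"), ("A2", "Li Wei"), ("A3", "Chen Jie")],
)
conn.commit()


def try_sql(statement, params=()):
    """Run a statement; return None on success or the error text on failure."""
    try:
        conn.execute(statement, params)
        return None
    except sqlite3.DatabaseError as exc:
        return str(exc)


# 'name' already holds a duplicate, so the check of existing data fails.
name_result = try_sql("CREATE UNIQUE INDEX ux_name ON personnel (name)")
# 'badge' values are distinct, so this index is created.
badge_result = try_sql("CREATE UNIQUE INDEX ux_badge ON personnel (badge)")
# A later insert that repeats a badge is cancelled by the unique index.
insert_result = try_sql("INSERT INTO personnel VALUES (?, ?)", ("A1", "Zhang Min"))

row_count = conn.execute("SELECT COUNT(*) FROM personnel").fetchone()[0]
print(name_result, badge_result, insert_result, row_count, sep="\n")
```

The table still holds three rows at the end: the rejected insert leaves no trace, and the name column stays without a unique index, matching the personnel example in the text.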
A composite index is an index created on two or more columns. When queries search on two or more columns together as a key, it is best to create a composite index on those columns. When creating a composite index, consider these rules. Up to 16 columns can be combined into a single composite index. The total length of the columns in a composite index cannot exceed 900 bytes, so the combined key cannot be too long. All columns in a composite index must come from the same table; a composite index cannot span tables. The order of the columns is very important and must be chosen carefully. In principle, the most selective column should be defined first. For example, an index on (col1, col2) is different from an index on (col2, col1), because the column order differs. For the query optimizer to use a composite index, the WHERE clause of the query must reference the first column of the index. A composite index is very useful when a table has several key columns: it can improve query performance and reduce the number of indexes created on the table.
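The leading-column rule can be checked against a query planner. The sketch below again uses SQLite through Python's `sqlite3` as a stand-in, so the plan wording is SQLite's and not SQL Server's; the `orders` table, the index name, and the `plan` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, order_date TEXT, note TEXT)")
conn.execute("CREATE INDEX idx_cust_date ON orders (customer_id, order_date)")


def plan(sql):
    """Return the plan detail text SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# WHERE references the first index column: the index can be used.
leading = plan("SELECT note FROM orders WHERE customer_id = 7")
# WHERE references both columns: the index can be used.
both = plan(
    "SELECT note FROM orders WHERE customer_id = 7 AND order_date = '2024-01-01'"
)
# WHERE references only the second column: the planner falls back to a scan.
trailing_only = plan("SELECT note FROM orders WHERE order_date = '2024-01-01'")

print(leading, both, trailing_only, sep="\n")
```

Only the first two plans mention `idx_cust_date`. Swapping the column order in the index definition would reverse which single-column query benefits, which is why (col1, col2) and (col2, col1) are different indexes.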
VI. Index types
Indexes can be divided into two types according to whether the order of the index matches the physical order of the data table. In a clustered index, the index order is the same as the physical order of the rows in the table. In a non-clustered index, the index order differs from the physical order of the rows.
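The distinction can be pictured with plain Python lists. This is a deliberately simplified model with hypothetical data: a clustered index is represented by storing the rows themselves in key order, and a non-clustered index by a separate sorted list of pointers back into the table.

```python
# Hypothetical rows: (employee_id, surname), in arrival order.
unsorted_rows = [(30, "Wu"), (10, "Zhao"), (20, "Lin"), (40, "Bai")]

# Clustered index on employee_id: the table rows themselves are kept
# in key order, so index order and physical order are the same.
clustered_table = sorted(unsorted_rows)

# Non-clustered index on surname: a separate sorted structure whose
# entries point back to row positions in the table.
surname_index = sorted(
    (surname, position) for position, (_, surname) in enumerate(clustered_table)
)

physical_order = [emp_id for emp_id, _ in clustered_table]
surname_order = [clustered_table[pos][0] for _, pos in surname_index]

print(physical_order)  # [10, 20, 30, 40]
print(surname_order)   # [40, 20, 30, 10]
```

Reading the table through the surname index visits the rows in an order that differs from how they are stored, while the employee_id order is simply the storage order.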
VII. Architecture of the clustered index
An index is organized as a tree. The bottom level of the tree is called the leaf level. The rest of the tree is called the non-leaf level, and the root of the tree belongs to the non-leaf level. A clustered index follows the same pattern: its leaf and non-leaf levels together form a tree, and the lowest level of the index is the leaf level. In a clustered index, the data pages of the table are the leaf level, and the index pages above the leaf level form the non-leaf level. In a clustered index, data values are always sorted in ascending order.
Create a clustered index on columns that are searched frequently or accessed in sequence. When creating a clustered index, consider these factors. Each table can have only one clustered index, because the rows of a table can have only one physical order, and the physical order of the rows is the same as the order of the rows in the index. Create the clustered index before any non-clustered index, because creating a clustered index changes the physical order of the rows in the table. The rows are then kept in key order, and that order is maintained automatically. Uniqueness of the key values is maintained either explicitly, with the UNIQUE keyword, or implicitly, with an internal unique identifier that the system uses and users cannot access. On average, a clustered index is about 5% of the size of the data table, although the actual size varies with the size of the indexed column. While an index is being built, SQL Server temporarily uses disk space in the current database. Creating a clustered index requires about 1.2 times the size of the table, so make sure enough space is available.
When the system accesses data in a table, it first determines whether the relevant column has an index and whether that index is useful for the data being retrieved. If the index exists and is useful, the system uses it to access the records in the table. The system navigates through the index, starting from the root of the index tree. On each index page, it compares the search value with each key value in turn to determine whether the search value is greater than or equal to the key value. This step repeats until it meets a key value greater than the search value, or until the search value is greater than or equal to every key value on the page.
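The descent from the root can be sketched in a few lines. The page layout here is an assumption made for illustration (nested Python lists, with each non-leaf entry holding the lowest key of its child page); it is not SQL Server's on-disk format, and the keys and rows are hypothetical.

```python
# A leaf page is a list of (key, row) pairs. A non-leaf page is a list of
# (lowest key in child, child page) entries.
leaf_a = [(1, "row 1"), (4, "row 4"), (7, "row 7")]
leaf_b = [(10, "row 10"), (13, "row 13"), (16, "row 16")]
leaf_c = [(20, "row 20"), (25, "row 25"), (31, "row 31")]
leaf_d = [(40, "row 40"), (46, "row 46"), (52, "row 52")]
mid_left = [(1, leaf_a), (10, leaf_b)]
mid_right = [(20, leaf_c), (40, leaf_d)]
root = [(1, mid_left), (20, mid_right)]


def is_leaf(page):
    # Non-leaf entries point at child pages (lists); leaf entries hold rows.
    return not isinstance(page[0][1], list)


def search(page, value, path=None):
    """Descend from the given page; return (row or None, keys followed)."""
    path = [] if path is None else path
    if is_leaf(page):
        for key, row in page:
            if key == value:
                return row, path
        return None, path
    chosen_key, chosen_child = page[0]
    for key, child in page:
        if value >= key:
            chosen_key, chosen_child = key, child  # keep moving right
        else:
            break  # first key greater than the search value: stop here
    path.append(chosen_key)
    return search(chosen_child, value, path)


print(search(root, 13))  # ('row 13', [1, 10])
print(search(root, 52))  # ('row 52', [20, 40])
print(search(root, 8))   # (None, [1, 1])
```

The first call stops on the root page because 20 is greater than 13. The second runs past every key on each page it visits, which is the other stopping condition described above.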
How does the system access table data?
Generally, the system can access data in a database in two ways: a table scan or an index search. In a table scan, the system places a pointer on the data page that holds the first rows of the table and then reads the data pages in order, one page after another from front to back, until every record in the table has been scanned. During the scan, each record that meets the query conditions is selected. At the end, all the selected records are returned. The second way is an index search. An index is a tree structure that stores key values together with pointers to the data pages containing the records with those key values. In an index search, the system follows the tree structure of the index and uses the key values and pointers to find the records that meet the query conditions. At the end, all the records found are returned.
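The difference in work between the two methods can be counted directly. The sketch below is a simplified model with a hypothetical 10,000-row table: a Python list plays the table, a sorted list of (key value, row position) pairs plays the index, and "examined" counts the rows or index entries read, not the binary-search comparisons.

```python
import bisect

# Hypothetical table of 10,000 rows: (id, department), stored in id order.
table = [(i, f"dept{i % 100}") for i in range(10_000)]


def table_scan(department):
    """Read every row from first to last, keeping the ones that match."""
    examined, matches = 0, []
    for row in table:
        examined += 1
        if row[1] == department:
            matches.append(row)
    return matches, examined


# Index on department: sorted (key value, row position) pairs.
dept_index = sorted((dept, pos) for pos, (_, dept) in enumerate(table))


def index_search(department):
    """Locate the first matching entry, then follow pointers to the rows."""
    i = bisect.bisect_left(dept_index, (department, -1))
    examined, matches = 0, []
    while i < len(dept_index):
        dept, pos = dept_index[i]
        examined += 1
        if dept != department:
            break  # past the last matching key
        matches.append(table[pos])
        i += 1
    return matches, examined


scan_matches, scan_examined = table_scan("dept42")
index_matches, index_examined = index_search("dept42")
print(len(scan_matches), scan_examined)    # 100 10000
print(len(index_matches), index_examined)  # 100 101
```

Both methods return the same 100 rows, but the scan reads all 10,000 rows while the index search reads 101 index entries.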
In SQL Server, when a query accesses data in the database, SQL Server first determines whether the table has an index. If no index exists, SQL Server uses a table scan. If indexes exist, the query processor uses the distribution statistics to generate an optimized execution plan for the query statement, and decides whether a table scan or an index gives the more efficient access.
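The same decision can be watched in a small engine. As an assumption for illustration, the sketch uses SQLite through Python's `sqlite3` rather than SQL Server, so the plan text and the `ANALYZE` command are SQLite's; the table, index, and `plan` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personnel (id INTEGER, surname TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO personnel VALUES (?, ?, ?)",
    [(i, f"surname{i}", f"city{i % 10}") for i in range(1000)],
)

query = "SELECT city FROM personnel WHERE surname = 'surname42'"


def plan(sql):
    """Return the plan detail text SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


before = plan(query)  # no index exists yet, so only a scan is possible
conn.execute("CREATE INDEX idx_surname ON personnel (surname)")
conn.execute("ANALYZE")  # gather distribution statistics for the planner
after = plan(query)

print(before)
print(after)
```

Before the index exists the plan is a scan of the whole table; afterwards the planner chooses a search through `idx_surname` for the same statement.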