MySQL indexes: an introduction to database indexes and their application
The most frequent operation on a database is querying data. Without an index, the database generally has to scan the entire table to answer a query. When a table holds a large amount of data, this scan takes a long time and wastes server resources. To speed up data retrieval, databases provide an index mechanism.
An analogy for indexes
To some extent, a database can be regarded as a book and an index as the book's table of contents. You locate information in the book through the table of contents, which is clearly easier and faster than searching a book that has none.
What is a database index in practice? (It consists of two parts)
An index is a separate, physical database structure. It consists of the values of one or more columns of a table, together with a list of logical pointers to the data pages that physically store the rows containing those values.
Role of the index in the table
A table is stored in two parts: one part holds the table's data pages and the other holds its index pages. Indexes are stored on the index pages.
How an index improves efficiency
Index pages are generally much smaller than data pages. To retrieve data, the system first searches the index pages, finds the pointer to the required data, and then reads the data directly from the data page through that pointer.
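The lookup path described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any database engine is implemented: a plain list stands in for the data pages, a sorted list of (key, position) pairs stands in for the index page, and the names data_pages, index_page, lookup, and full_scan are hypothetical.

```python
import bisect

# Toy "data pages": rows stored in arrival order, not sorted by name.
data_pages = [
    {"id": 3, "name": "Carol"},
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
]

# Toy "index page": sorted (key, pointer) pairs, where the pointer is the
# row's position in data_pages.
index_page = sorted((row["name"], pos) for pos, row in enumerate(data_pages))
index_keys = [key for key, _ in index_page]


def lookup(name):
    """Binary-search the index, then follow the pointer to the row."""
    i = bisect.bisect_left(index_keys, name)
    if i < len(index_keys) and index_keys[i] == name:
        _, pos = index_page[i]
        return data_pages[pos]
    return None  # key not present in the index


def full_scan(name):
    """What happens without an index: check every row in turn."""
    for row in data_pages:
        if row["name"] == name:
            return row
    return None
```

Both functions return the same row; the difference is that lookup inspects a small sorted structure first, while full_scan touches every row.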
Index categories
In SQL Server, indexes are divided into two types according to their storage structure: clustered indexes and non-clustered indexes.
A clustered index sorts the table's rows by the index column(s) and stores them on disk in that order; its leaf nodes hold the actual data. Because the rows themselves are kept in key order, lookups through the clustered index are very fast. However, because building a clustered index re-sorts all the data in the table, it needs a large amount of space, roughly 120% of the table's data size. The rows of a table can be stored on disk in only one order, so a table can have only one clustered index.
A non-clustered index has a structure that is completely separate from the table data, so the rows on the data pages do not have to be sorted by the index column(s). The leaf nodes of a non-clustered index store the index key values and row locators. The structure and content of a row locator depend on how the data is stored. If the table has a clustered index, the row locator is the clustered index key. If the table has no clustered index (a structure known as a heap), the row locator is a pointer to the row. A non-clustered index keeps its entries sorted by key value, and this order does not match the order of the rows on the data pages. Because a non-clustered index is stored on its own index pages, it requires additional storage space, and retrieval through it is less efficient than through a clustered index. However, a table can have only one clustered index, so when you need several indexes you must use non-clustered ones.
Summary: a clustered index is combined with the physical data and rearranges it, like looking up a character in a dictionary by pinyin. A non-clustered index is completely separate from the physical data and uses extra space to hold the sorted keys, like looking up a character in a dictionary by radical.
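The difference between the two structures can be illustrated with a small Python model. It is only a sketch under simplifying assumptions: sorted lists stand in for B-tree pages, the table and its columns are invented for the example, and the names clustered_rows, name_index, find_by_id, and find_by_name are hypothetical.

```python
import bisect

# Clustered index (toy model): the rows themselves are kept sorted by the
# clustering key, here the hypothetical column "id".
clustered_rows = [
    (1, "Zoe", "Paris"),
    (2, "Adam", "Oslo"),
    (3, "Mia", "Lima"),
]
clustered_keys = [row[0] for row in clustered_rows]

# Non-clustered index on "name" (toy model): a separate sorted structure whose
# entries hold the index key plus a row locator. Because the table has a
# clustered index, the locator is the clustering key, not a physical pointer.
name_index = sorted((row[1], row[0]) for row in clustered_rows)
name_keys = [name for name, _ in name_index]


def find_by_id(row_id):
    """One step: binary search directly in the sorted rows."""
    i = bisect.bisect_left(clustered_keys, row_id)
    if i < len(clustered_keys) and clustered_keys[i] == row_id:
        return clustered_rows[i]
    return None


def find_by_name(name):
    """Two steps: search the non-clustered index, then the clustered one."""
    i = bisect.bisect_left(name_keys, name)
    if i < len(name_keys) and name_keys[i] == name:
        _, row_id = name_index[i]
        return find_by_id(row_id)
    return None
```

The extra hop in find_by_name is the reason a lookup through a non-clustered index usually costs more than a lookup through the clustered index.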
Database index application
I. Index concept
An index is a way to speed up retrieval of table data. A database index is similar to the index of a book. In a book, the index lets readers find the information they want without reading the whole book. In a database, the index lets the database engine find data in a table without scanning the entire table.
II. Index features
1. Indexes speed up data retrieval.
2. Indexes slow down maintenance operations such as inserts, updates, and deletes.
3. Indexes are created on tables; they cannot be created on ordinary views.
4. Indexes can be created directly or indirectly.
5. Indexes can be used in optimizer hints.
6. When the query processor executes an SQL statement, it generally uses only one index per table at a time.
7. Others.
III. Advantages of indexes
1. Creating a unique index guarantees the uniqueness of each row in a table.
2. Data retrieval becomes much faster, which is the main reason for creating indexes.
3. Joins between tables become faster, which is especially helpful for enforcing referential integrity.
4. When data is retrieved with grouping and sorting clauses, indexes can significantly reduce the time spent grouping and sorting.
5. With indexes available, the query optimizer can choose better execution plans, which improves system performance.
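The first advantage in the list above, uniqueness enforced by a unique index, is easy to observe. The sketch below uses Python's built-in sqlite3 module only because it runs without a server; SQL Server and MySQL reject the duplicate in the same spirit but with their own error messages. The index name mycolumn_uindex is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, mycolumn TEXT)")
conn.execute("CREATE UNIQUE INDEX mycolumn_uindex ON mytable (mycolumn)")

conn.execute("INSERT INTO mytable VALUES (1, 'a')")
try:
    # Second row repeats the indexed value, so the unique index rejects it.
    conn.execute("INSERT INTO mytable VALUES (2, 'a')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
```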
IV. Disadvantages of indexes
1. Creating and maintaining indexes takes time, and that time grows with the volume of data.
2. Indexes occupy physical space. Besides the space used by the table itself, each index takes up a certain amount of physical space. A clustered index requires even more space.
3. When table data is inserted, deleted, or modified, the indexes must be maintained as well, which slows down data maintenance.
V. Index classification
1. Direct and indirect index creation
Direct creation: CREATE INDEX mycolumn_index ON mytable (mycolumn)
Indirect creation: defining a primary key constraint or a unique constraint creates an index implicitly.
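Indirect creation can be observed by declaring only constraints and then listing the table's indexes. The sketch below uses SQLite through Python's sqlite3 module as a stand-in; the table users and its columns are hypothetical, and the automatic index names shown by SQLite differ from what SQL Server or MySQL would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No CREATE INDEX statement is issued; only constraints are declared.
conn.execute(
    "CREATE TABLE users ("
    " user_code TEXT PRIMARY KEY,"
    " email TEXT UNIQUE,"
    " nickname TEXT)"
)

# PRAGMA index_list returns one row per index; column 1 is the index name.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('users')")]
print(index_names)
```

The list contains one automatically created index for the primary key and one for the unique constraint, even though no index was created explicitly.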
2. Normal and unique indexes
Normal index: CREATE INDEX mycolumn_index ON mytable (mycolumn)
Unique index: ensures that all values in the indexed column are unique; it can be used with both clustered and non-clustered indexes.
CREATE UNIQUE CLUSTERED INDEX mycolumn_cindex ON mytable (mycolumn)
3. Single-column and composite indexes
Single-column index: an index on one column, that is, a non-composite index.
Composite index: also called a multi-column index. The index creation statement names several columns, up to 16.
CREATE INDEX name_index ON username (firstname, lastname)
4. Clustered and non-clustered indexes
Clustered index: a physical index. The physical order of the rows in the base table matches the index order, so the data values are always kept sorted.
CREATE CLUSTERED INDEX mycolumn_cindex ON mytable (mycolumn) WITH ALLOW_DUP_ROW
(ALLOW_DUP_ROW permits a clustered index on a table that contains duplicate rows. It is an option from Sybase and early SQL Server releases and is not accepted by current SQL Server versions.)
Non-clustered index: CREATE NONCLUSTERED INDEX mycolumn_ncindex ON mytable (mycolumn)
VI. Use of indexes
1. When a column is rarely updated, frequently queried, and contains many duplicate values, a clustered index is recommended.
2. When several columns are frequently accessed together and each column contains duplicate values, consider creating a composite index.
3. The leading column of a composite index must be chosen carefully, otherwise the index will not be effective. If the leading column does not appear in the query condition, the composite index will not be used. The leading column should therefore be the most frequently used column.
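The leading-column rule in item 3 can be checked by asking the database for its query plan. The sketch below does this with SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; it is an illustration on an empty, hypothetical table, and SQL Server or MySQL expose plans through different commands and may make different choices once statistics are available. The helper plan and the extra columns id and age are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE username (id INTEGER PRIMARY KEY, firstname TEXT,"
    " lastname TEXT, age INTEGER)"
)
conn.execute("CREATE INDEX name_index ON username (firstname, lastname)")


def plan(sql):
    """Return SQLite's query-plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Leading column present in the condition: the index can be searched by key.
plan_leading = plan("SELECT * FROM username WHERE firstname = 'Ann'")

# Only the second column present: the index cannot be searched by key.
plan_trailing = plan("SELECT * FROM username WHERE lastname = 'Lee'")

print(plan_leading)
print(plan_trailing)
```

The first plan reports a search using name_index, while the second falls back to scanning the table.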
4. Before executing a multi-table operation, the query optimizer lists several possible join plans based on the join conditions and picks the one with the lowest system overhead. Join conditions should take into account which tables have indexes and which have many rows. The choice of inner and outer table can be guided by the formula: number of matching rows in the outer table × number of lookups performed in the inner table for each of them. The plan with the smallest product is the best.
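The formula in item 4 can be worked through with made-up numbers. Everything in the sketch below is hypothetical: the tables A and B, the row counts, and the per-probe costs are chosen only to show how the product is compared, and a real optimizer uses a far more detailed cost model.

```python
def join_cost(outer_matched_rows, inner_lookups_per_row):
    """Estimate from the text: outer matched rows x inner lookups per row."""
    return outer_matched_rows * inner_lookups_per_row


# Plan 1: A is the outer table (1,000 matching rows). B is probed through an
# index, assumed here to cost about 3 page reads per probe.
plan_a_outer = join_cost(1_000, 3)

# Plan 2: B is the outer table (50 matching rows). A has no usable index, so
# each probe is assumed to read 200 pages.
plan_b_outer = join_cost(50, 200)

# Pick the plan with the smallest product.
best = min(
    [("A outer", plan_a_outer), ("B outer", plan_b_outer)],
    key=lambda p: p[1],
)
```

With these assumed figures the plan with the larger outer table still wins, because the indexed inner table makes each probe cheap.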
5. Any operation applied to a column in the WHERE clause is computed row by row while the SQL statement runs, so the query has to scan the table and cannot use the index on that column. If the result can instead be determined when the query is compiled, the optimizer can use the index and avoid the table scan. For example, compare SELECT * FROM record WHERE SUBSTRING(card_no, 1, 6) = '000000' with SELECT * FROM record WHERE card_no LIKE '000000%'. Any operation on a column, including database functions and arithmetic expressions, causes a table scan, so when writing queries try to move the operation to the right-hand side of the comparison.
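The effect described in item 5 can be reproduced with SQLite through Python's sqlite3 module. This is a sketch under assumptions: the sample rows, the amount column, and the index name card_no_index are invented, SQLite's substr plays the role of SUBSTRING, and the prefix match is written as a range on the bare column because SQLite only optimizes LIKE under certain collation settings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE record (id INTEGER PRIMARY KEY, card_no TEXT, amount REAL)"
)
conn.execute("CREATE INDEX card_no_index ON record (card_no)")
conn.executemany(
    "INSERT INTO record (card_no, amount) VALUES (?, ?)",
    [("0000001234", 10.0), ("0000009999", 20.0), ("1111110000", 30.0)],
)


def plan(sql):
    """Return SQLite's query-plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Function applied to the column: every row must be evaluated.
wrapped = "SELECT * FROM record WHERE substr(card_no, 1, 6) = '000000'"

# Same prefix filter as a range on the bare column: the index can be searched.
bare = "SELECT * FROM record WHERE card_no >= '000000' AND card_no < '000001'"

plan_wrapped = plan(wrapped)
plan_bare = plan(bare)

# Both forms select the same rows; only the access path differs.
same_rows = sorted(conn.execute(wrapped).fetchall()) == sorted(
    conn.execute(bare).fetchall()
)
```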
6. IN in a WHERE condition is logically equivalent to OR, so the parser converts column IN ('0', '1') into column = '0' OR column = '1' before execution. One might expect each OR branch to be searched separately and the results combined, so that the index on the column can be used. In practice, however, the "OR strategy" is applied: the rows that satisfy each OR branch are extracted and stored in a worktable in the temporary database, a unique index is built to remove duplicate rows, and the result is computed from that temporary table. The index on the column is therefore not used, and the execution time also depends on the performance of tempdb. IN and OR clauses thus often use worktables and make the index ineffective. If the branches do not produce a large number of duplicate values, consider splitting the clause, and make sure each resulting clause uses an indexed column.
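One way to split such a clause, sketched here as an assumption about what item 6 intends, is to rewrite it as a UNION ALL of simple equality queries. The example below only demonstrates that the three forms return the same rows; it uses SQLite through Python's sqlite3 module with a hypothetical orders table, and whether a given engine and version actually needs the rewrite should be confirmed from its own execution plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, total REAL)"
)
conn.execute("CREATE INDEX status_index ON orders (status)")
conn.executemany(
    "INSERT INTO orders (status, total) VALUES (?, ?)",
    [("0", 5.0), ("1", 7.5), ("2", 9.0), ("0", 1.0)],
)

with_in = "SELECT id FROM orders WHERE status IN ('0', '1')"
with_or = "SELECT id FROM orders WHERE status = '0' OR status = '1'"
# Split form: each branch is a simple equality on the indexed column. The
# branches are mutually exclusive, so UNION ALL introduces no duplicates.
split = (
    "SELECT id FROM orders WHERE status = '0' "
    "UNION ALL "
    "SELECT id FROM orders WHERE status = '1'"
)

ids_in = sorted(r[0] for r in conn.execute(with_in))
ids_or = sorted(r[0] for r in conn.execute(with_or))
ids_split = sorted(r[0] for r in conn.execute(split))
```

If the branches could match the same row, UNION (without ALL) would be needed to remove duplicates, at the cost of an extra deduplication step.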
7. Make good use of stored procedures; they make SQL more flexible and efficient.