Do Software Developers Really Understand SQL Indexes? (Indexing Principles)

In the previous two articles I summarized some SQL database index problems. This article mainly analyzes the advantages and disadvantages of indexes and how to use them correctly.

Advantages of indexes: These are obvious. Well-chosen indexes can greatly speed up queries, sorting, and grouping.
Disadvantages of indexes: These are just as obvious:
1: Creating an index requires additional disk space. As a rough guide, the indexes can take up to about 1.2 times the size of the table.
2: When table data is modified (inserts, deletes, and updates), the indexes must be maintained as well, which adds system overhead.
3: A poorly designed index does not help the system and can even degrade its performance. For example, if we run a range query that matches a large share of the rows on a column that has only a non-clustered index, the index provides no benefit for that query, yet it still has to be maintained whenever the data changes, which slows down data modification.

Actual example: Take the student table from the previous two articles. We want the information of students with a score above 50, that is, select * from student where score > 50. The student table contains 100000 rows, and the scores are randomly generated, which guarantees both the data volume and the data distribution.

Case 1: The student table has indexes.
1: A clustered index exists, but it is not on score. (Only the situation where the clustered index is not on score is analyzed here.)
(1): score is not indexed. In this case SQL reads the data through the clustered index, which is probably known to everyone.
(2): score has a non-clustered index. Will the statement use that index? Not everyone answers this correctly. Because score is indexed and appears in the WHERE clause, you might expect the index to be used, but in practice it is not. The query is a range lookup that matches a large share of the rows, so seeking through the index and then looking up each matching row costs more than scanning all the data through the clustered index and filtering it. When does SQL prefer the index on score? When score is compared with a specific value (an equality predicate that matches comparatively few rows), the index on score is used for the search. You can confirm this in the SQL execution plan.

2: No clustered index exists.
(1): There is no index on score, but other fields have indexes. In this case, a table scan occurs.
(2): There is an index on score. Will the search use it? Because the table is large, one might assume that SQL avoids a table scan, since a scan reads every record. In practice, however, SQL also uses a table scan for this range query rather than the index on score. We can force SQL to use the index on score with an index hint, and then we can clearly see that the forced index query performs much worse than the table scan.
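The reasoning in both sub-cases can be sketched with a toy page-count model. This is not the SQL Server optimizer's actual costing; the rows-per-page figures and the number of index levels are assumed values chosen only to show the shape of the trade-off.

```python
import math


def scan_cost(total_rows, rows_per_page):
    """Pages read by a full scan (clustered index scan or heap table scan)."""
    return math.ceil(total_rows / rows_per_page)


def seek_with_lookup_cost(matching_rows, index_rows_per_page, upper_levels=2):
    """Pages read when a non-clustered index is used and every match needs a
    lookup into the base table (worst case: one page read per matching row)."""
    leaf_pages = math.ceil(matching_rows / index_rows_per_page)
    return upper_levels + leaf_pages + matching_rows


TOTAL_ROWS = 100_000         # from the article's student table
ROWS_PER_PAGE = 100          # assumption: table rows per data page
INDEX_ROWS_PER_PAGE = 500    # assumption: index entries per leaf page

scan = scan_cost(TOTAL_ROWS, ROWS_PER_PAGE)
# score > 50 on evenly distributed scores matches about half the table.
range_query = seek_with_lookup_cost(50_000, INDEX_ROWS_PER_PAGE)
# A hypothetical selective predicate that matches only 50 rows.
selective_query = seek_with_lookup_cost(50, INDEX_ROWS_PER_PAGE)

print("full scan pages:       ", scan)
print("range via index pages: ", range_query)
print("selective via index:   ", selective_query)
```

Under these assumptions the range query through the index touches far more pages than a full scan, while the selective query touches far fewer, which matches the behavior described above.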

Case 2: The student table has no index. This situation is not worth analyzing.


Which fields are not suitable for indexing?
1: Columns that are rarely used in queries. Such an index costs more than it gains.
2: Logical (boolean-like) fields, such as gender. Each value matches so many records that using the index is comparable to a table scan.
3: Fields with extremely large content, such as text. They greatly increase the space occupied by the index and slow down index updates.
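Point 2 can be made concrete by measuring selectivity, the ratio of distinct values to rows. The sketch below uses made-up column data (a two-valued gender column and a unique id column); the function names are illustrative only.

```python
def selectivity(values):
    """Ratio of distinct values to total rows; 1.0 means every value is unique."""
    values = list(values)
    if not values:
        raise ValueError("column is empty")
    return len(set(values)) / len(values)


def average_rows_per_value(values):
    """How many rows a single equality lookup returns on average."""
    values = list(values)
    if not values:
        raise ValueError("column is empty")
    return len(values) / len(set(values))


# Hypothetical columns for a 100000-row student table.
gender = ["M", "F"] * 50_000
student_id = list(range(100_000))

print("gender selectivity:", selectivity(gender))
print("rows per gender value:", average_rows_per_value(gender))
print("student_id selectivity:", selectivity(student_id))
```

A column where one value returns tens of thousands of rows gives an index almost nothing to narrow down, whereas a unique column narrows every lookup to a single row.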

We said that SQL consumes system resources when maintaining indexes. Which resources does it consume? What problems can occur? How can the indexes be optimized?

First, a data page holds at most 8 KB. When a page is full, further inserts or updates lead to a page split.
1. Clustered index: an inserted or updated row belongs on a specific page, which is determined by the clustered index key. If that page has no room, it splits;
2. Heap only: a new row can be inserted wherever there is space. An update may make a row larger. If the row no longer fits in the free space on its current page, it is moved to a new page, and a forwarding pointer is left in the original position to point to the moved row. If the forwarded row is moved again, the original pointer is updated to point to the new position;
3. Heap with non-clustered indexes: inserts and updates do not split the heap's data pages, but the pages of the non-clustered indexes can still split.
Summary: Whichever kind of index is involved, a page split leaves part of the data on the old page and places the rest on a new page, and the new page may be allocated wherever a page is available. If pages split frequently, the table accumulates a large amount of fragmentation, which directly reduces I/O efficiency.
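The effect of splits on page fullness can be illustrated with a small simulation. This is a simplified model and not SQL Server's storage engine: pages are plain sorted lists, the capacity of 50 rows is an assumed value, and an overflowing page is always split in half.

```python
import bisect
import random


def simulate_inserts(keys, page_capacity):
    """Insert keys one at a time into key-ordered pages.
    A page that exceeds its capacity is split into two halves."""
    if page_capacity < 2:
        raise ValueError("page_capacity must be at least 2")
    pages = [[]]
    splits = 0
    for key in keys:
        # Find the last page whose first key is <= key (page 0 otherwise).
        first_keys = [page[0] for page in pages[1:]]
        idx = bisect.bisect_right(first_keys, key)
        bisect.insort(pages[idx], key)
        if len(pages[idx]) > page_capacity:
            full = pages[idx]
            mid = len(full) // 2
            pages[idx:idx + 1] = [full[:mid], full[mid:]]
            splits += 1
    return pages, splits


def average_fullness(pages, page_capacity):
    """Fraction of the allocated page space that actually holds rows."""
    return sum(len(page) for page in pages) / (len(pages) * page_capacity)


PAGE_CAPACITY = 50                      # assumption: rows per page
rng = random.Random(42)
keys = rng.sample(range(1_000_000), 5_000)   # random insert order

pages, splits = simulate_inserts(keys, PAGE_CAPACITY)
fullness = average_fullness(pages, PAGE_CAPACITY)
print("pages:", len(pages), "splits:", splits)
print("average fullness:", round(fullness, 2))
```

With keys arriving in random order, the pages end up only partly full, so the same rows occupy more pages than a densely packed table would need. In a real database the new pages are also not physically adjacent, which this model does not capture.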

Q: Why is the maximum varchar length in the database 8000 rather than 10000?
A: Because a data page is at most 8 KB (8192 bytes), and a row has to fit on a single page together with the page header and row overhead.
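The page arithmetic behind this answer can be sketched as follows, using the figures SQL Server documents for its data pages: 8192 bytes per page, a 96-byte page header, a 2-byte slot entry per row, and a maximum in-row size of 8060 bytes. The 80-byte row is an assumed example.

```python
PAGE_SIZE = 8192       # bytes per data page
PAGE_HEADER = 96       # bytes reserved for the page header
SLOT_ENTRY = 2         # bytes per row in the slot array at the end of the page
MAX_ROW_SIZE = 8060    # largest row that can be stored in-row


def rows_per_page(row_size):
    """How many rows of the given size fit on one data page."""
    if not 0 < row_size <= MAX_ROW_SIZE:
        raise ValueError("row_size must be between 1 and 8060 bytes")
    return (PAGE_SIZE - PAGE_HEADER) // (row_size + SLOT_ENTRY)


print("80-byte rows per page:  ", rows_per_page(80))
print("8060-byte rows per page:", rows_per_page(8060))
```

A 10000-byte value could not be stored in-row at all, because it is larger than the whole page.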

Second, the fill factor is the remedy for the page splits described above.
When creating an index, you can specify a fill factor, which is the percentage of each leaf-level page that is filled with data. The remainder is left free so that later data growth has room and page splits are reduced. The value is a percentage from 0 to 100. A value of 100 fills the pages completely, which is appropriate only when the data does not change (for example, read-only tables). The smaller the value, the more free space each page keeps, which reduces page splits as the index grows but requires more disk space. The fill factor should not be set too low, because a lower fill factor increases the number of data pages and therefore hurts read performance. At our company the fill factor is generally set to 80.
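The disk-space side of this trade-off is simple arithmetic. The sketch below assumes 100 rows fit on a completely full leaf page; that figure and the function name are illustrative. A fill factor of 0 is treated the same as 100, which is how SQL Server interprets it.

```python
import math


def leaf_pages(row_count, rows_per_full_page, fill_factor):
    """Leaf pages needed right after an index is built with the given fill factor."""
    if not 0 <= fill_factor <= 100:
        raise ValueError("fill_factor must be between 0 and 100")
    if fill_factor == 0:
        fill_factor = 100  # 0 and 100 both mean "fill the pages completely"
    rows_per_page = max(1, rows_per_full_page * fill_factor // 100)
    return math.ceil(row_count / rows_per_page)


ROW_COUNT = 100_000          # the student table from the example
ROWS_PER_FULL_PAGE = 100     # assumption

for ff in (100, 80, 50):
    print("fill factor", ff, "->", leaf_pages(ROW_COUNT, ROWS_PER_FULL_PAGE, ff), "pages")
```

Going from 100 to 80 costs a quarter more pages to read in a scan, in exchange for 20 percent headroom on every page before the first split.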

Do indexes stay the same forever?
As the business and the data change, some indexes may stop being useful. For example:
1: Records used to be searched by user name, and now the business searches by the city where the user is located. The table's indexes need to be changed promptly to fit the new business, because the data and its usage patterns have changed significantly.
2: The indexes were poorly designed before the system went online. As the data grows, the drawbacks become more and more obvious, and the indexes need to be adjusted.
3: As the data grows, more and more pages split, and index performance declines.

In all of these situations, the indexes need to be rebuilt to solve the problem completely.

To summarize the principles of index usage:
1: Do not index tables with a small amount of data. For small tables, a table scan is cheap.
2: Do not create too many indexes. A table can have up to 249 non-clustered indexes in SQL Server 2005 (999 from SQL Server 2008 onward), but too many indexes take up more disk space, and maintaining them is especially costly when data is modified.
3: Use composite indexes sensibly. In some cases, consider creating a covering index that contains all the output columns.
4: Consider a clustered index for fields that are frequently used in range queries.
5: Avoid creating indexes on rarely used columns, logical columns, and large-content columns.

 

If something is wrong, you are welcome to correct it.

Author: Jiang min
Source: http://www.cnblogs.com/aspnet2008/

 
