Do software developers really know about SQL indexes (clustered indexes)

Source: Internet
Author: User

In fact, for non-professional database operators, such as software developers, some of the basic knowledge about database indexing are largely unknown, or do not know why. I think the main cause of this situation is the industry. Many companies have their own DBA teams and they will help you optimize SQL statements. Even if developers do not understand the optimization problems, therefore, developers will not spend too much effort on SQL Optimization in this regard, but if the company does not have such a DBA, it can only rely onProgramMember himself. Recently I suddenly remembered a chat with a friend. The question he asked me was a very common one: the difference between SQL clustered indexes and non-clustered indexes.

You may think that this problem is not very difficult. You may not be interested in it because you are too familiar with it. But can you really make it clear? In fact, if we want to explain the differences between the two, we can't just say anything about them in three or two sentences. I also thought that the problem was too general that day, so I just talked about the two differences:

First, clustered indexes can only have one table, but not clustered indexes can have more than one table. This is no different from the normal one.
Second, clustered index storage records exist physically consecutively, rather than clustered indexes are logically consecutive and physical storage is not consecutive. We all know this.

The above two points are all about in the big aspect. We will continue to discuss them later. For example, a student table student contains the student ID and Student name, student's city id, Student Score (total score ),
Q: How can I optimize queries by name?
A: Create an index on the name field.
Q: What types of indexes are created?
A: create a non-clustered index.
Q: What if I want to search by student's credit range? For example, if I want to search for a student whose score is between 60 and 90?
A: Create a clustered index on the credit field?
Q: Why?
A: Generally, range query is required. You can create a clustered index on this field.
Q: credit is repetitive. Can I create a clustered index on the credit field?
... Meditation, cannot it be created? The previous project seems to have done this.
A: Yes.
Q: What are the constraints of clustered indexes?
A: What is uniqueness?
Q: Since it is unique, can I create a clustered index on the credit field?
... Think twice, you can, but what are the index constraints?
A: Yes. I used it before.

I think I have some research on database indexing knowledge, but it may be because I haven't actually been in touch with SQL for two years. I can't think of a convincing explanation at the moment, can my friends see this to answer my questions?

In fact, we need to clarify the following issues:

First, the clustered index constraint is unique. Is it required that fields be unique?
Analysis: If you think it is a friend, it may be affected by the system's default settings. Generally, we specify the primary key of a table. If the table has no clustered index, at the same time, the non-clustered index is not forcibly specified when the primary key is created. SQL creates a clustered index on this field by default, and the primary key is unique, therefore, it is assumed that the fields used to create the clustered index must be unique.
Conclusion: clustered indexes can be created on any field you want to create in any column. Theoretically, they cannot be specified in actual conditions. Otherwise, they may be a nightmare in performance.

Second, why can a clustered index be created on any column? If this table has no primary key constraint, there may be duplicate rows of data?
From a rough look, this is really different from the constraints of clustered indexes, but the actual situation can really create clustered indexes, the reason is: if the unique attribute is not used to create a clustered index, the database engine automatically adds a four-byte uniqueifier column to the table. If necessary, the database engine automatically adds a uniqueifier value to the row to make each key unique. This column and column value are used internally and cannot be viewed or accessed by users.

Third: are clustered indexes necessarily better than non-clustered indexes?
If you want to query the credits and names of students whose credits are between 60 and 90, is it optimal to create a clustered index on the scores?
A: No. Since only two columns are output, we can create a joint non-clustered index on credits and student names. At this time, the index forms a covered index, that is, the content stored by the index is the final output data. This index provides better query performance than the clustered index with credits.

Fourth: What is used in the database to describe clustered indexes and non-clustered indexes?
Indexes are described in the form of Binary Trees. We can distinguish clustering from non-clustered indexes in this way: the leaf node of the clustered index is the final data node, the leaf segment without clustered indexes is still an index node, but it has a pointer to the final data.

Fifth: Why is it slower to create a non-clustered index table on the primary key than to create a non-clustered index table on the primary key during data insertion?
With the understanding of the fourth point above, we are sure to analyze this problem. To insert data rows into a table with a primary key, we need to ensure that the inserted data is not duplicated due to the uniqueness constraint of the primary key. Let's compare the search condition where the primary key is clustered index and non-clustered index: clustering index because the index leaf node is a data page, if you want to check the uniqueness of the primary key, you need to traverse all data nodes, different non-clustered indexes, because non-clustered indexes already contain the primary key value, you only need to traverse all the index pages to find the primary key uniqueness, this reduces Io consumption compared to traversing all data rows. This is why creating a non-clustered index on the primary key is faster than creating a clustered index on the primary key when inserting data.

Well, let's talk about this. I don't know if you really understand the SQL-focused index. I am also a newbie to the database (it's not new in terms of use time, haha). I'm not professional, what's wrong? I hope you can criticize and correct it. In the next article, I will analyze some database access indexes. In the case of graphs, I may be more clear.

Modified in this article:The B-tree mentioned in this Article refers to the Balance Tree.AlgorithmI don't know enough, so I hope you can correct me more. Thank you for your correction.

Author: Jiang min
Source: http://www.cnblogs.com/aspnet2008/

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.