I recently recalled a chat with a friend in which he asked me a very common question: what is the difference between a clustered index and a nonclustered index in SQL?
In fact, many people who are not professional database operators, software developers for example, largely lack basic knowledge of database indexing. Some have never paid attention to it, and others know the facts without knowing the reasons behind them. I think the main cause lies in the industry: many companies have their own DBA team that optimizes SQL for you, so it is not a big problem if developers do not understand optimization, and developers do not put much effort into learning it. But if the company has no DBA, the programmer has to do it himself.
You may think this question is not difficult, or feel you know it too well to be interested, but can you really explain it clearly? In fact, the difference between the two cannot be made clear in a sentence or two. That day I also felt the question was too broad, so I casually named two differences:
- A table can have only one clustered index, but it can have more than one nonclustered index. There is nothing to argue about here; most people know this.
- With a clustered index, the records themselves are stored in the order of the index key (commonly described as physically contiguous). A nonclustered index is only logically ordered: its entries are sorted by key, but the data rows they point to are not stored in that order. This is also well known.
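The first difference can be sketched in T-SQL. This is only an illustration under assumptions: the table `dbo.StudentDemo`, its columns, and the index names are hypothetical, and `DROP TABLE IF EXISTS` assumes SQL Server 2016 or later.

```sql
-- Hypothetical demo table; run this in a scratch database only.
DROP TABLE IF EXISTS dbo.StudentDemo;
CREATE TABLE dbo.StudentDemo (
    StudentId INT          NOT NULL,
    Name      NVARCHAR(50) NOT NULL,
    CityId    INT          NOT NULL,
    Credits   INT          NOT NULL
);

-- The single clustered index: it determines the order in which the rows are stored.
CREATE CLUSTERED INDEX CIX_StudentDemo_StudentId ON dbo.StudentDemo (StudentId);

-- Several nonclustered indexes can exist alongside it.
CREATE NONCLUSTERED INDEX IX_StudentDemo_Name   ON dbo.StudentDemo (Name);
CREATE NONCLUSTERED INDEX IX_StudentDemo_CityId ON dbo.StudentDemo (CityId);

-- Uncommenting the next statement fails, because the table already has a clustered index.
-- CREATE CLUSTERED INDEX CIX_StudentDemo_Credits ON dbo.StudentDemo (Credits);
```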
Those two points are the big picture. Let's dig further with a more concrete example: a student table, Student, containing the student ID, the student name, the ID of the student's city, and the student's total credits (score).
Q: If I want to search by name, how do I optimize it?
A: Create an index on the name field.
Q: What type of index?
A: A nonclustered index.
Q: Why?
A: A clustered index is generally worth considering for a field that needs range queries, and a search by name is not that kind of query.
Q: Credits can repeat. Can I create a clustered index on the credits field?
... (Thinking: it can't be created? But a previous project seems to have done exactly that.) A: It should be OK.
Q: What is the constraint on a clustered index?
A: Uniqueness?
Q: If it must be unique, can I still create a clustered index on the credits field?
... (Thinking again: it should be possible, but then how do I explain the index's constraint?) A: It should be possible; I have used it before.
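The first suggestion in the exchange above, a nonclustered index to support a search by name, might look like the following. It is a minimal sketch: `dbo.StudentDemo`, its columns, the index name, and the sample rows are all hypothetical, and `DROP TABLE IF EXISTS` assumes SQL Server 2016 or later.

```sql
-- Hypothetical demo table; run this in a scratch database only.
DROP TABLE IF EXISTS dbo.StudentDemo;
CREATE TABLE dbo.StudentDemo (
    StudentId INT          NOT NULL PRIMARY KEY,  -- becomes the clustered index by default
    Name      NVARCHAR(50) NOT NULL,
    CityId    INT          NOT NULL,
    Credits   INT          NOT NULL
);

-- Nonclustered index that supports lookups by name.
CREATE NONCLUSTERED INDEX IX_StudentDemo_Name ON dbo.StudentDemo (Name);

INSERT INTO dbo.StudentDemo (StudentId, Name, CityId, Credits)
VALUES (1, N'Student A', 10, 85),
       (2, N'Student B', 20, 72),
       (3, N'Student C', 10, 72);

-- An equality search on Name can use the index to find the matching entry,
-- then follow its row locator to fetch the remaining columns.
SELECT StudentId, Name, CityId, Credits
FROM dbo.StudentDemo
WHERE Name = N'Student B';
```

On a table this small the optimizer may simply scan; the index only pays off once the table holds enough rows.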
I thought I had done some research on database indexes, but perhaps because I had not worked with SQL for two years, I really could not come up with a convincing explanation. Can you, reading this, answer my friend's questions?
In fact, to settle the exchange above we need to understand the following questions:
First: Is uniqueness a constraint of a clustered index, that is, must the indexed field be unique?
Analysis: Anyone who thinks so has probably been influenced by the system's default behavior. When we specify a table's primary key, if the table has no clustered index yet and the primary key is not explicitly declared nonclustered, SQL Server creates a clustered index on that field by default. Since a primary key is unique, it is natural to assume that a field carrying a clustered index must be unique.
Conclusion: In theory, a clustered index can be created on any column you like. In practice the column cannot be chosen arbitrarily, or performance will be a nightmare.
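To see the default and how to override it, here is a small sketch. The table `dbo.StudentDemo`, the constraint name, and the index name are hypothetical; `DROP TABLE IF EXISTS` assumes SQL Server 2016 or later.

```sql
-- Hypothetical demo table; run this in a scratch database only.
DROP TABLE IF EXISTS dbo.StudentDemo;
CREATE TABLE dbo.StudentDemo (
    StudentId INT          NOT NULL,
    Name      NVARCHAR(50) NOT NULL,
    CityId    INT          NOT NULL,
    Credits   INT          NOT NULL,
    -- Without the NONCLUSTERED keyword, this primary key would take the clustered index by default.
    CONSTRAINT PK_StudentDemo PRIMARY KEY NONCLUSTERED (StudentId)
);

-- The clustered index is now free to go on a column that is not unique.
CREATE CLUSTERED INDEX CIX_StudentDemo_Credits ON dbo.StudentDemo (Credits);

-- List the table's indexes with their type and uniqueness.
SELECT name, type_desc, is_unique, is_primary_key
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.StudentDemo');
```

Whether Credits is a sensible clustering key is a separate question; the sketch only shows that the engine accepts it.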
Second: Why can a clustered index be created on any column, even when the table has no primary key constraint and duplicate rows may exist?
At first glance this seems to contradict the idea that a clustered index must be unique, but in practice the index can be created.
Analysis: If a clustered index is created without the UNIQUE attribute, the database engine automatically adds a four-byte uniqueifier column to the table. When necessary, the engine adds a uniqueifier value to a row so that each key is unique. This column and its values are for internal use and cannot be viewed or accessed by users.
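The visible effect can be sketched as follows: duplicate key values are accepted by a clustered index that was not declared UNIQUE. The table, index name, and sample rows are hypothetical, and `DROP TABLE IF EXISTS` assumes SQL Server 2016 or later. The uniqueifier itself stays hidden, so the sketch does not try to display it.

```sql
-- Hypothetical demo table; run this in a scratch database only.
DROP TABLE IF EXISTS dbo.StudentDemo;
CREATE TABLE dbo.StudentDemo (
    StudentId INT          NOT NULL,
    Name      NVARCHAR(50) NOT NULL,
    CityId    INT          NOT NULL,
    Credits   INT          NOT NULL
);

-- Clustered index without the UNIQUE attribute.
CREATE CLUSTERED INDEX CIX_StudentDemo_Credits ON dbo.StudentDemo (Credits);

-- All three rows share the same clustering key value and are still accepted;
-- the engine tells them apart internally.
INSERT INTO dbo.StudentDemo (StudentId, Name, CityId, Credits)
VALUES (1, N'Student A', 10, 80),
       (2, N'Student B', 20, 80),
       (3, N'Student C', 10, 80);

SELECT StudentId, Name, Credits
FROM dbo.StudentDemo
WHERE Credits = 80;

-- Had the index been declared with CREATE UNIQUE CLUSTERED INDEX,
-- the insert above would be rejected as a duplicate key.
```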
Third: Does a clustered index necessarily perform better than a nonclustered index?
If you want to query the credits and names of the students who have 60 to 90 credits, is creating a clustered index on credits the optimal choice?
Answer: No. Since only two columns are output, we can create a composite nonclustered index on credits and student name. That index is then a covering index: what it stores is exactly the data the query outputs, so the query performs better than it would with a clustered index on credits.
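A sketch of such a covering index follows. The table, index name, and sample rows are hypothetical, the range is assumed to be inclusive, and `DROP TABLE IF EXISTS` assumes SQL Server 2016 or later.

```sql
-- Hypothetical demo table; run this in a scratch database only.
DROP TABLE IF EXISTS dbo.StudentDemo;
CREATE TABLE dbo.StudentDemo (
    StudentId INT          NOT NULL PRIMARY KEY,
    Name      NVARCHAR(50) NOT NULL,
    CityId    INT          NOT NULL,
    Credits   INT          NOT NULL
);

-- Composite nonclustered index: Credits leads so the range predicate can seek on it,
-- and Name rides along so the query never has to visit the data rows.
CREATE NONCLUSTERED INDEX IX_StudentDemo_Credits_Name
    ON dbo.StudentDemo (Credits, Name);

INSERT INTO dbo.StudentDemo (StudentId, Name, CityId, Credits)
VALUES (1, N'Student A', 10, 55),
       (2, N'Student B', 20, 72),
       (3, N'Student C', 10, 90);

-- Both output columns and the filter column live in the index, so it covers the query.
SELECT Credits, Name
FROM dbo.StudentDemo
WHERE Credits BETWEEN 60 AND 90;
```

An alternative with the same covering effect is to key the index on Credits alone and add Name with an INCLUDE clause, which keeps Name out of the index key.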
Fourth: How do clustered and nonclustered indexes differ in the way the database represents them?
Indexes are represented as B-trees (balanced multiway trees, not binary trees), and the two kinds differ at the leaf level: a leaf node of a clustered index is the final data itself, while a leaf node of a nonclustered index is still an index node that holds a pointer (a row locator) to the final data.
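One way to look at the levels of each index is the `sys.dm_db_index_physical_stats` function. The query below is a sketch under assumptions: the table and index names are hypothetical, `DROP TABLE IF EXISTS` assumes SQL Server 2016 or later, and the caller is assumed to have permission to read this view.

```sql
-- Hypothetical demo table; run this in a scratch database only.
DROP TABLE IF EXISTS dbo.StudentDemo;
CREATE TABLE dbo.StudentDemo (
    StudentId INT          NOT NULL,
    Name      NVARCHAR(50) NOT NULL,
    CityId    INT          NOT NULL,
    Credits   INT          NOT NULL
);
CREATE CLUSTERED INDEX CIX_StudentDemo_StudentId ON dbo.StudentDemo (StudentId);
CREATE NONCLUSTERED INDEX IX_StudentDemo_Name ON dbo.StudentDemo (Name);

INSERT INTO dbo.StudentDemo (StudentId, Name, CityId, Credits)
VALUES (1, N'Student A', 10, 85),
       (2, N'Student B', 20, 72);

-- One row per index level; index_level 0 is the leaf level.
-- For the clustered index the leaf level is the data; for the nonclustered index
-- the leaf level holds index entries that point to the data.
SELECT i.name AS index_name,
       s.index_type_desc,
       s.index_depth,
       s.index_level,
       s.page_count,
       s.record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.StudentDemo'), NULL, NULL, 'DETAILED') AS s
JOIN sys.indexes AS i
  ON i.object_id = s.object_id
 AND i.index_id  = s.index_id
ORDER BY s.index_id, s.index_level DESC;
```

With only a couple of rows each index fits in very few pages; the upper levels only appear once the table grows.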
Fifth: Why is inserting data into a table whose primary key is a clustered index slower than into a table whose primary key is a nonclustered index?
With the fourth point understood, we can analyze this question. When a row is inserted into a table with a primary key, the uniqueness constraint means the engine must make sure the new key does not already exist. Compare how that works in the two cases. With a clustered primary key, the leaf level of the index is the data pages, so the uniqueness check ends on a full-width data page, and the new row must then be placed at its position in key order, which can force a page split when that page is full. With a nonclustered primary key, the index already contains the primary key values, so the check reads only compact index pages and never touches the data rows, which reduces IO. Note that in both cases the check is a seek down the index tree for the one key, not a traversal of every page. This is the reason inserts can be faster when the primary key is a nonclustered index than when it is a clustered index; how large the difference is depends on the key values and the workload.
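To compare the two layouts on your own data, a setup along these lines could be used. It is only a starting point: both tables and their constraint names are hypothetical, `DROP TABLE IF EXISTS` assumes SQL Server 2016 or later, and a single-row insert into empty tables will not show a meaningful difference.

```sql
-- Hypothetical demo tables; run this in a scratch database only.
DROP TABLE IF EXISTS dbo.StudentDemoClustered;
DROP TABLE IF EXISTS dbo.StudentDemoHeap;

-- Primary key stored as the clustered index: leaf level = data rows.
CREATE TABLE dbo.StudentDemoClustered (
    StudentId INT          NOT NULL
        CONSTRAINT PK_StudentDemoClustered PRIMARY KEY CLUSTERED,
    Name      NVARCHAR(50) NOT NULL
);

-- Primary key stored as a nonclustered index over a heap: leaf level = key + row locator.
CREATE TABLE dbo.StudentDemoHeap (
    StudentId INT          NOT NULL
        CONSTRAINT PK_StudentDemoHeap PRIMARY KEY NONCLUSTERED,
    Name      NVARCHAR(50) NOT NULL
);

-- Report the page reads for each statement in the Messages output.
SET STATISTICS IO ON;

INSERT INTO dbo.StudentDemoClustered (StudentId, Name) VALUES (42, N'Student A');
INSERT INTO dbo.StudentDemoHeap      (StudentId, Name) VALUES (42, N'Student A');

SET STATISTICS IO OFF;
```

Load both tables with a representative volume of rows, in a representative key order, before drawing any conclusion from the reported reads.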
OK, that's it. I don't know whether everyone now really grasps the key points of SQL indexes. I'm still a database novice myself (not in terms of years of use, haha) and not a professional, so if anything here is wrong, I hope you will point it out. Next I will analyze some cases of how the database accesses indexes, with diagrams, which may make things clearer.
Original link: http://www.cnblogs.com/mybluesky99/archive/2011/04/21/2023629.html
Can you tell the difference between a SQL clustered index and a nonclustered index?