An index is a database object built on one or more columns of a table. It stores the values of the indexed columns together with pointers to the physical storage location of the corresponding rows, which gives the data in the table a logical order. Indexes speed up queries and reduce system response time, and they can also speed up joins between tables.
However, an index does not help in every situation. Used in the wrong place, it backfires. So when you use indexes in a SQL Server database, you still need to follow certain rules. In my view, the main thing is to obey six iron laws.
Iron Law One: There is no free lunch; using an index has a price
The advantages of indexing are obvious, but few people pay attention to its cost. A database administrator who understands that cost well will not casually build indexes everywhere.
Add it up carefully and the cost of an index is considerable. Creating and maintaining indexes takes time and effort. During database design in particular, the database administrator has to investigate and coordinate to decide which fields in each table need an index. Whenever records in an indexed table are inserted, deleted, or updated, the database has to adjust the index. The database does this work automatically, but it consumes server resources, and the more data the table holds, the more resources it consumes. In addition, an index is a physical object in the database, so every index takes up a certain amount of physical space. Too many indexes not only occupy a lot of space but also drag down the performance of the entire database.
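The storage side of this cost can be made visible with a small experiment. The following is only a sketch: it uses SQLite through Python's built-in sqlite3 module as a stand-in for SQL Server, and the employee table, its columns, and the index name are hypothetical. It counts database pages before and after creating one index.

```python
import sqlite3


def pages_before_and_after_index(row_count=5000):
    """Return (pages_before, pages_after) measured around one CREATE INDEX."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical employee table, used only for illustration.
    conn.execute(
        "CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO employee (emp_no, name) VALUES (?, ?)",
        [(i, f"name{i}") for i in range(row_count)],
    )
    conn.commit()
    before = conn.execute("PRAGMA page_count").fetchone()[0]

    # The index is a separate physical structure, so it needs its own pages.
    conn.execute("CREATE INDEX idx_employee_name ON employee (name)")
    conn.commit()
    after = conn.execute("PRAGMA page_count").fetchone()[0]
    conn.close()
    return before, after


before, after = pages_before_and_after_index()
print(f"pages before index: {before}, pages after index: {after}")
```

The exact page counts depend on the engine and its page size, so only the direction of the change matters here: every additional index adds storage that also has to be maintained on each write.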
As you can see, database administrators pay a real price for using indexes to improve system performance. What the administrator has to work out is how to balance the two, in other words, to find the break-even point between what is invested and what is gained.
Iron Law Two: Do not index columns that are rarely used in queries or that contain many duplicate values
If queries never filter on a field, an index on that field is wasted. Suppose there is an employee information table that we query by employee number, employee name, or place of origin, but rarely by ID number, even though that number is unique. In this case an index on the ID number field will not make queries any faster. Instead, it adds maintenance time and takes up space. It is simply shooting yourself in the foot.
In addition, some fields in the employee information table above contain many duplicate values. The gender field holds essentially only "male" and "female", and the position field also has a limited set of values. An index on such fields will not noticeably speed up queries or reduce user response time. On the contrary, because the index takes up space, it lowers the overall performance of the database.
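One rough way to spot such columns is to compare the number of distinct values with the number of rows. The sketch below assumes SQLite (via Python's sqlite3 module) in place of SQL Server; the employee table and the column_selectivity helper are hypothetical names introduced for illustration.

```python
import sqlite3


def column_selectivity(conn, table, column):
    """Distinct values divided by total rows; near 0 means many duplicates."""
    # table and column are trusted identifiers here, never user input.
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM {table}"
    ).fetchone()
    return distinct / total if total else 0.0


conn = sqlite3.connect(":memory:")
# Hypothetical employee table with a unique number and a two-valued gender.
conn.execute("CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, gender TEXT)")
conn.executemany(
    "INSERT INTO employee (emp_no, gender) VALUES (?, ?)",
    [(i, "male" if i % 2 else "female") for i in range(1, 1001)],
)

gender_selectivity = column_selectivity(conn, "employee", "gender")
emp_no_selectivity = column_selectivity(conn, "employee", "emp_no")
print(f"gender: {gender_selectivity:.3f}, emp_no: {emp_no_selectivity:.3f}")
```

A ratio close to 1 (the employee number) suggests an index can narrow a search to very few rows, while a ratio close to 0 (gender) suggests each index entry still points at a large share of the table.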
So the second iron law of index management is: do not index a column that is rarely involved in queries or that has many duplicate values.
Iron Law Three: Columns that are queried by range are good candidates for an index
Information management systems frequently need to query transactions by range. In an ERP system, users often check the current month's sales orders and sales shipments, which means querying transactions by a date range. If the inventory figures turn out to be wrong, they also need the inventory movements for a period, such as the inventory transactions from May 1 to December 3. In all of these cases the query is driven by the date.
Columns that need to be queried quickly or frequently within a specified range should be indexed. Because an index is sorted, the values in a specified range are stored contiguously, so the query can take advantage of the index order to run faster and shorten the user's wait.
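A date-range lookup of this kind can be sketched as follows. This is an illustration under assumptions: SQLite (through Python's sqlite3 module) stands in for SQL Server, and the sales_order table, its columns, and the index name are hypothetical. The query plan is inspected to see whether the range condition is served by the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sales order table; dates are stored as ISO-8601 text.
conn.execute(
    "CREATE TABLE sales_order ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO sales_order (customer, order_date) VALUES (?, ?)",
    [
        (f"cust{i % 50}", f"2008-{(i % 12) + 1:02d}-{(i % 28) + 1:02d}")
        for i in range(5000)
    ],
)
conn.execute("CREATE INDEX idx_order_date ON sales_order (order_date)")

range_sql = (
    "SELECT order_id, customer FROM sales_order "
    "WHERE order_date BETWEEN ? AND ?"
)
may = ("2008-05-01", "2008-05-31")

# The last column of each EXPLAIN QUERY PLAN row is the readable detail.
plan = conn.execute("EXPLAIN QUERY PLAN " + range_sql, may).fetchall()
plan_text = " | ".join(row[-1] for row in plan)
may_orders = conn.execute(range_sql, may).fetchall()

print(plan_text)
print(len(may_orders), "orders in May")
```

In SQL Server the equivalent check would be done by looking at the execution plan for an index seek; the plan text printed here is specific to SQLite.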
However, if range queries on a column are possible but seldom actually run, it is better not to index it. For example, in the employee information table you might need to look up the employees hired in March 2008 in order to increase their benefits. But the table does not hold many records and queries like this are rare. Indexing this field does no great harm, but the index clearly gains less than it costs. For the database administrator, it is not worth the candle.
Furthermore, when you run a range query, it is best to use the TOP keyword to limit the number of rows a single query returns. For example, show only the first 500 records in order on the first request, then the next batch, and so on. Combining TOP with a range query can greatly improve query efficiency.
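The idea of capping a range query at 500 rows can be sketched like this. Note the assumptions: the runnable part uses SQLite through Python's sqlite3 module, where the row cap is written as LIMIT rather than TOP, and the T-SQL string is included only for comparison and is not executed. The inventory_txn table and its columns are hypothetical.

```python
import sqlite3

PAGE_SIZE = 500

# T-SQL shape for SQL Server, shown for reference only (not executed here).
TSQL_FIRST_PAGE = """
SELECT TOP (500) txn_id, txn_date
FROM inventory_txn
WHERE txn_date BETWEEN @from_date AND @to_date
ORDER BY txn_date, txn_id;
"""


def first_page(conn, from_date, to_date, page_size=PAGE_SIZE):
    """Return at most page_size rows of the date range, in date order."""
    return conn.execute(
        "SELECT txn_id, txn_date FROM inventory_txn "
        "WHERE txn_date BETWEEN ? AND ? "
        "ORDER BY txn_date, txn_id LIMIT ?",  # LIMIT is SQLite's row cap
        (from_date, to_date, page_size),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory_txn (txn_id INTEGER PRIMARY KEY, txn_date TEXT)"
)
conn.executemany(
    "INSERT INTO inventory_txn (txn_date) VALUES (?)",
    [(f"2008-{5 + i % 8:02d}-{1 + i % 28:02d}",) for i in range(2000)],
)
conn.execute("CREATE INDEX idx_txn_date ON inventory_txn (txn_date)")

page = first_page(conn, "2008-05-01", "2008-12-03")
print(len(page), "rows returned; first row:", page[0])
```

The ORDER BY matters in both dialects: without it, "the first 500 records" is not well defined.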
Iron Law Four: Make sure primary key and foreign key columns in a table are indexed
A column defined as the primary key must be indexed, because a primary key is what lets the database navigate quickly to a single row, and it is the index behind it that makes the lookup fast. In the employee information table, for example, we usually set the employee number as the primary key. This both speeds up queries and guarantees that employee numbers are unique, since a primary key requires every record to be unique. With the employee number indexed, looking up an employee's information by employee number is far more efficient than without an index.
In addition, there are two ways to make the values of a field unique. One is the primary key mentioned above. The other is a unique constraint, which uses the UNIQUE keyword to declare that the field's contents must be unique. Both methods automatically create a unique index on the specified column in the table. For queries there is no obvious difference between the two: the query optimizer does not care which way a unique index was created, and it uses both in the same way to query the data.
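The automatic creation of unique indexes can be observed directly. The sketch below is an assumption-laden stand-in: it uses SQLite through Python's sqlite3 module rather than SQL Server, and the employee table and its columns are hypothetical. It declares one primary key and one UNIQUE column, then lists the indexes the engine created on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: no CREATE INDEX statement is issued anywhere below.
conn.execute(
    """
    CREATE TABLE employee (
        emp_no      TEXT PRIMARY KEY,  -- primary key constraint
        national_id TEXT UNIQUE,       -- unique constraint
        name        TEXT
    )
    """
)

# PRAGMA index_list rows: (seq, name, unique, origin, partial).
# origin is 'pk' for a primary key index and 'u' for a UNIQUE constraint.
index_origins = {
    row[1]: row[3] for row in conn.execute("PRAGMA index_list(employee)")
}
print(index_origins)

conn.execute("INSERT INTO employee VALUES ('E001', 'ID-1', 'Li')")


def insert_is_rejected(emp_no, national_id):
    """Return True if the insert violates one of the unique indexes."""
    try:
        conn.execute(
            "INSERT INTO employee VALUES (?, ?, 'x')", (emp_no, national_id)
        )
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_pk_rejected = insert_is_rejected("E001", "ID-2")
duplicate_unique_rejected = insert_is_rejected("E002", "ID-1")
print(duplicate_pk_rejected, duplicate_unique_rejected)
```

In SQL Server the same information would come from the catalog views rather than a PRAGMA, but the principle shown is the one described above: both kinds of constraint are backed by a unique index.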
If a column in a table is defined as a foreign key, it is best to index that column as well, because the main job of a foreign key is to support joins between tables. An index on the foreign key speeds up those joins. For example, the employee information table has a field for the employee's position. Because positions change often, what is stored there is actually a position code, and the details of each position are recorded in a separate position information table. The position field in the employee table is therefore a foreign key. If you create an index on this field, you can significantly speed up joins between the two tables, and the more records there are, the more obvious the effect.
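The employee and position example can be sketched as two tables with an explicitly indexed foreign key column. As before, this assumes SQLite through Python's sqlite3 module instead of SQL Server, and every table, column, and index name is hypothetical. The query plan is printed for inspection only, since the planner is free to choose a different join order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE job_position (
        code  TEXT PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_no        INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        position_code TEXT REFERENCES job_position (code)  -- foreign key
    );
    -- The foreign key column gets its own index for joins.
    CREATE INDEX idx_employee_position ON employee (position_code);

    INSERT INTO job_position VALUES ('ENG', 'Engineer'), ('ACC', 'Accountant');
    INSERT INTO employee VALUES
        (1, 'Li', 'ENG'), (2, 'Wang', 'ACC'), (3, 'Zhao', 'ENG');
    """
)

join_sql = (
    "SELECT e.name, p.title "
    "FROM job_position AS p "
    "JOIN employee AS e ON e.position_code = p.code "
    "WHERE p.title = ? "
    "ORDER BY e.name"
)
engineers = conn.execute(join_sql, ("Engineer",)).fetchall()
employee_indexes = [
    row[1] for row in conn.execute("PRAGMA index_list(employee)")
]

print(engineers)
for row in conn.execute("EXPLAIN QUERY PLAN " + join_sql, ("Engineer",)):
    print(row[-1])  # plan detail; shape varies by engine and version
```

The index is created by hand here because declaring the foreign key does not create one in this sketch; SQL Server likewise does not index foreign key columns automatically.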
So if a table has a primary key or a foreign key, it is best to index those columns. The index reinforces the role of the primary key and foreign key and improves the performance of the database.
Iron Law Five: Do not index certain special data types
Some fields in a table are special, such as text fields (the text data type) and image fields (the image data type). If a field belongs to one of these data types, it is best not to index it. These fields share a few characteristics: their length is unpredictable, and a value may be very long, only a few characters, or an empty string. For example, application systems often use the text data type for remarks columns. Sometimes the remark is very long and sometimes there is no data at all. An index on this kind of field does little good. On the contrary, it adds to the burden on the system.
So be cautious about building indexes on these more unusual data types. In general they do not need to be indexed, but there are exceptions. For example, in an ERP system the product information table may have a product specification field whose content can run to 5,000 characters. Only a text-type data type can hold that much data. Moreover, users like to search product information by specification. If the field is not indexed in that case, queries will be slow. When this happens, the database administrator has to index the field and accept the cost in system resources (in SQL Server this typically means a full-text index, since text columns cannot serve as key columns of an ordinary index).
This also shows that although the rules above are called iron laws, whether to follow them in a given case is still something the database administrator must decide sensibly, based on the actual situation of the business.
Iron Law Six: Build indexes around the set of columns used in WHERE clauses
Users often add restricting conditions when they query information. When querying sales orders, they often use the customer and the order date together as criteria; when querying a product's inventory transactions, they use the product number together with the start and end dates of the transaction period.
For columns that are often used together in the WHERE clause like this, build the index on that set of columns. For data that needs to be retrieved quickly or frequently, the query can then use the order of the index on the columns it filters by, which shortens query time.
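For the inventory example (product number plus a transaction date range), one possible reading of this rule is a single index covering both columns. The sketch below makes that assumption explicit, and again uses SQLite through Python's sqlite3 module as a stand-in for SQL Server; the inventory_txn table, its columns, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory_txn ("
    "txn_id INTEGER PRIMARY KEY, product_no TEXT, txn_date TEXT, qty INTEGER)"
)
conn.executemany(
    "INSERT INTO inventory_txn (product_no, txn_date, qty) VALUES (?, ?, ?)",
    [
        (f"P{i % 20:03d}", f"2008-{1 + i % 12:02d}-{1 + i % 28:02d}", i % 7)
        for i in range(3000)
    ],
)
# Equality column first, range column second, matching the WHERE clause.
conn.execute(
    "CREATE INDEX idx_txn_product_date ON inventory_txn (product_no, txn_date)"
)

where_sql = (
    "SELECT product_no, txn_date, qty FROM inventory_txn "
    "WHERE product_no = ? AND txn_date BETWEEN ? AND ?"
)
args = ("P004", "2008-05-01", "2008-12-03")

plan = conn.execute("EXPLAIN QUERY PLAN " + where_sql, args).fetchall()
plan_text = " | ".join(row[-1] for row in plan)
matches = conn.execute(where_sql, args).fetchall()

print(plan_text)
print(len(matches), "matching transactions")
```

Putting the equality column (product_no) ahead of the range column (txn_date) is a design choice of this sketch: it lets one contiguous slice of the index satisfy both conditions.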
In short, an index is a double-edged sword: it can improve database performance, but it can also hurt it. As a database administrator, you need the ability to choose the right indexes at the right time, for the right business, and on the right fields. The six iron laws above are only some of the basic requirements for building indexes.