(Excerpt) _ Database Design entry classic: Building a fast-executed database model _ 8.4 efficient index for improving performance

Source: Internet
Author: User
8.4 Efficient indexes for improving performance
Most importantly, be more cautious about creating an index than about not creating one. Many databases become complicated through excessive use of indexes, and after enough time has passed nobody remembers why those indexes were created. That said, do not be afraid to create indexes. The right attitude is that the mere existence of an index does not prove that the index should exist.
8.4.1 Index types
Databases offer different types of indexes. The most common index type in relational databases is the B-tree index (a balanced tree, not a binary tree). Other index types are seen far less often and apply only to special cases. Pay attention to the requirements and consequences of special index types such as ISAM indexes, hash indexes, and bitmap indexes.
Tip:
Different database engines implement index structures and indexing methods in different ways. A few examples: some database engines allow an ISAM index to be created as a separate physical structure, while others use an ISAM algorithm to access a B-tree index. Some databases allow a hash index to be created as a separate physical structure, while others only allow a hashing algorithm to be applied to all fields of an entire table. Some database engines allow a B-tree index to be created both as an index and as the sort order of the whole table, so that the table and all its fields become the B-tree index (known as a clustered index or an index-organized table).
The index types above are specialized and not commonly used. When the source table is modified, these less common indexes often overflow, and overflow frequently erodes the performance they were meant to provide. Overflow is the situation in which an index structure built for performance no longer matches the table after its data has changed, so that part of the index can no longer be used efficiently. Most of these less common index types are suitable only for read-only environments, so think carefully before using them.
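The effect of overflow described above can be sketched with a toy model. This is only an illustration under simplifying assumptions, not how any particular database engine implements ISAM or hash indexes: the hypothetical `StaticIndex` class builds a sorted key array once, never reorganizes it, and pushes every later insert into an unsorted overflow area that must be searched linearly.

```python
class StaticIndex:
    """Toy model of a pre-built index (hypothetical, for illustration only)."""

    def __init__(self, keys):
        # The sorted structure is built once and never reorganized.
        self.sorted_keys = sorted(keys)
        # Keys added after the build cannot go into the sorted structure.
        self.overflow = []

    def insert(self, key):
        # New data lands in the overflow area instead of its proper position.
        self.overflow.append(key)

    def lookup(self, key):
        """Return (found, probes), where probes counts the keys examined."""
        probes = 0
        lo, hi = 0, len(self.sorted_keys)
        # Binary search over the pre-built part: cheap, logarithmic cost.
        while lo < hi:
            mid = (lo + hi) // 2
            probes += 1
            if self.sorted_keys[mid] < key:
                lo = mid + 1
            else:
                hi = mid
        if lo < len(self.sorted_keys) and self.sorted_keys[lo] == key:
            return True, probes
        # Fall back to scanning the overflow area: linear cost.
        for candidate in self.overflow:
            probes += 1
            if candidate == key:
                return True, probes
        return False, probes


index = StaticIndex(range(0, 2000, 2))   # 1000 even keys, built once
for odd_key in range(1, 1000, 2):        # 500 later inserts overflow
    index.insert(odd_key)

print("pre-built key:", index.lookup(998))
print("overflowed key:", index.lookup(999))
```

In this model a key that was present when the index was built is found in about ten probes, while a key that arrived later costs the failed binary search plus a walk through the overflow area. Real engines differ in detail, but the shape of the cost is the point.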
The database administrator should pay close attention to the indexes in the database. There is usually not enough time to check them continuously, but it is best to review every index before the application is released. Developers often create many indexes. Sometimes each developer working on the same application creates a private set of indexes, and the result is over-indexing. Too many indexes on a table can cause performance problems, because every command that changes the table (insert, update, or delete) must also update every index on that table. For example, inserting one record into a table with four indexes is really five changes to the database.
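The arithmetic in that last example can be checked with a small sketch. The `IndexedTable` class below is a hypothetical in-memory stand-in, not a real storage engine: it counts one write for the table itself and one for each index that has to be maintained.

```python
class IndexedTable:
    """Toy table that counts structure changes per insert (illustrative only)."""

    def __init__(self, indexed_fields):
        self.rows = []
        # One dictionary per indexed field: value -> list of row positions.
        self.indexes = {field: {} for field in indexed_fields}
        self.writes = 0

    def insert(self, row):
        # Change 1: the table itself.
        self.rows.append(row)
        self.writes += 1
        row_position = len(self.rows) - 1
        # One further change for every index on the table.
        for field, index in self.indexes.items():
            index.setdefault(row[field], []).append(row_position)
            self.writes += 1


# Field names here are made up for the example.
table = IndexedTable(["customer_id", "order_date", "status", "region"])
table.insert(
    {"customer_id": 7, "order_date": "2024-01-01", "status": "NEW", "region": "EU"}
)
print(table.writes)  # one table write plus four index writes
```

Dropping an unused index from the list removes one write from every subsequent insert, which is why reviewing indexes before release pays off.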
8.4.2 Indexes in practice
There are several approaches to indexing a table, depending on the characteristics of the table:
● No index --
The data in the table is stored as a heap (an unorganized heap structure). Having no index can sometimes suit both small and large tables. A small table occupies little physical space, so reading it directly, without visiting both a table and an index, improves overall access performance. For a large table, read performance depends entirely on what the application requires: if an index would never be used, there is no need to create it. Sometimes it is even worthwhile to drop some referential integrity keys and their indexes.
● Static data --
Small static data tables often perform better without indexes. Two caveats apply, however. First, removing a foreign key index can cause serious locking problems, which greatly reduce performance. Second, even for small static tables, indexing every table (especially with a unique primary key index) often helps highly complex multi-table joins.
● Dynamic data --
Dynamic data is data that changes all the time (such as transaction data). Indexes on this kind of data change frequently, are prone to overflow, and need frequent rebuilding. Exercise caution when indexing dynamic data. In any given database, the default index type is generally the one best suited to dynamic data, and that default is usually a B-tree index structure. Other index types, including pre-computed structures such as ISAM, hash, and bitmap indexes, overflow almost immediately when the data changes.
Tip:
Index overflow is bad for performance. When certain types of index overflow, some changes to the table data cannot be placed in their correct position within the original index structure. This is a consequence of how those indexes work. If overflow occurs while a large amount of data is being modified, finding the data requires a large amount of random disk I/O, which causes very serious performance problems.
● Read-only report index --
Unlike indexes on dynamic data, indexes on read-only data allow more flexibility, because the data rarely changes. In data warehouses, indexes are often designed specifically for read-only queries. Such an index is a pre-built structure that would overflow easily if the data changed, but it is very efficient for read-only I/O. These read-only index types (bitmap, clustered, and hash) are inefficient in highly dynamic environments.
● Unique and non-unique indexes --
A unique index allows each value to appear only once in the table. Exercise caution when creating a unique index, because every insert or update of a uniquely indexed field requires a check against the index to verify that the value is unique. A non-unique index allows multiple records to share the same value, as a foreign key index does. A unique index can improve performance and is generally used for primary keys. It improves performance because a lookup matches at most one entry, so only a small part of the index has to be searched to find a record. In theory, this reduces both I/O and traversal of the index structure.
● Single-field index and multi-field index --
Multi-field indexes are also called composite indexes. A single-field index is more efficient than a composite multi-field index. The reason is simple: the fewer fields there are, the less data has to be searched. Likewise, an index with few fields is much smaller than its parent table. The greater the size difference between a table and its index, the more the index reduces I/O. This is especially true for large tables.
● Index data type --
It is best to index integers. An integer is a number with no digits to the right of the decimal point. Other data types are more variable in both length and content. Even a fixed-length string is not as efficient as an integer. However, a short string, such as a code, can also be indexed. Codes are commonly used as short names, for example abbreviations for American states: NY represents New York and CA represents California. Numbers are still better, because there are only 10 possible digits, whereas a string can draw on 26 letters, plus the 10 digits (a string can contain both letters and numbers), plus all kinds of punctuation and other special characters.
● Sacrificing referential integrity for performance --
Sometimes this is a good solution, but in most cases it is not. Removing a foreign key index can cause serious locking problems. Referential integrity verifies the relationships between related tables by using primary keys and foreign keys. If a large amount of verification is required and the foreign keys in a table have no foreign key indexes, the entire child table is scanned frequently. The result is heavy contention for the whole table rather than just for an index.
● Tuning with secondary indexes --
Secondary indexes are sometimes called alternate indexes. They are the indexes created on tables in the database model that are not part of any integrity constraint. The need for secondary indexes often reflects a mismatch between the database model and the functionality that applications require. Too many secondary indexes suggest that the database model does not meet the application's needs. Reporting applications and data warehouse applications usually require more secondary indexes.
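The point in the list above about foreign keys without indexes can be observed directly. The following is a minimal sketch that assumes SQLite (through Python's standard `sqlite3` module) as a stand-in for whatever engine is in use; the `customer` and `orders` tables and the index name are hypothetical. It asks the query planner how it would look up child rows by foreign key, first without and then with a foreign key index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer (customer_id),
        amount      INTEGER
    );
    """
)


def plan_for_child_lookup(connection):
    """Return the planner's description of a lookup by foreign key."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = ?", (1,)
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_without_index = plan_for_child_lookup(conn)

# Add the foreign key index and ask again.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_with_index = plan_for_child_lookup(conn)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

The exact wording of the plan varies between SQLite versions, but the first plan reports a scan of the whole child table and the second names the index. Other engines expose the same information through their own EXPLAIN facilities.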
8.4.3 When not to use an index
Perhaps the most important question is: when should an index not be created? In some cases an index hurts performance and should not be used. Sometimes (in fact, quite often) ignoring the index and reading the entire table improves query performance. The following cases describe when to avoid indexes.
● If a table has few fields and a large percentage of its records is read every time, an index will not help the table, because creating the index does not reduce the number of full table scans. Note: removing a primary key or foreign key is not good practice.

● When reading a small static data table, scanning the table is faster than scanning the index and then following it to the table data. Underlying I/O in operating systems (UNIX, Windows, Linux, and others) reads in units of blocks or pages. Page sizes vary, but at the database level a page is usually 2 KB, 4 KB, 8 KB, 16 KB, or even 32 KB. The point is that if a table has few enough records to fit on a single physical page, it is pointless to read the index and then follow it to the table. Reading the index and then the table means reading two pages, whereas reading the table directly is a single I/O on one page.
● Tables created for reporting, or loaded by periodic data warehouse appends (batch updates), are usually already sorted in the required order, so an index adds little.

● In general, create indexes on only some of the fields in a table. A large composite index can be almost as large as the table itself, and the relative size of index and table matters a great deal. The larger the index is relative to the table, the more physical space has to be read; the smaller the size difference between them, the less the index helps. Fields that contain null values can aggravate this. If the table contains a large number of null values, reading the entire table can be faster than reading a large composite index. A composite index therefore does not have to include every field of the composite structure. An index on only some of those fields may contain duplicate values, but such a partial index is more efficient than no index, and sometimes more efficient than an index on the full set of fields.
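The page-count argument for small tables in the list above can be written out as rough arithmetic. This is a back-of-the-envelope sketch, not a cost model from any real optimizer: the function names, the rows-per-page figure, and the index depth are all assumptions chosen for illustration, and caching is ignored.

```python
import math


def pages_read_full_scan(row_count, rows_per_page):
    """Pages touched when the whole table is read directly."""
    return max(1, math.ceil(row_count / rows_per_page))


def pages_read_via_index(index_depth, rows_fetched):
    """Pages touched for an index lookup followed by table access.

    Assumes one page per index level visited, plus one table page per
    row fetched (worst case: every row sits on a different page).
    """
    return index_depth + rows_fetched


# Small static table: 50 rows fit on one page (assumed 100 rows per page).
small_scan = pages_read_full_scan(row_count=50, rows_per_page=100)
small_indexed = pages_read_via_index(index_depth=1, rows_fetched=1)
print("small table:", small_scan, "page by scan,", small_indexed, "pages by index")

# Large table: one million rows, a three-level index, one row wanted.
large_scan = pages_read_full_scan(row_count=1_000_000, rows_per_page=100)
large_indexed = pages_read_via_index(index_depth=3, rows_fetched=1)
print("large table:", large_scan, "pages by scan,", large_indexed, "pages by index")
```

Under these assumptions the index doubles the work for the single-page table and cuts it by several orders of magnitude for the large one, which is the trade-off this section describes.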
That covers indexing. Indexes are very important to database modeling and to overall database performance. In some cases, deciding not to use an index is just as important as using one. Just as having too few indexes can reduce performance, overusing indexes reduces performance as well.
The following describes how to use a view.
