SQL Application and Development: (9) Improving Efficiency with Indexes

Last Update:2015-05-24 Source: Internet

Author: User


In a database, an index is a special object: a structure that accelerates data retrieval. It lets the database quickly find the desired rows in a large amount of data, which eliminates the need to scan the entire table during a query. An index is built on table data and, unlike a view, occupies physical storage. With well-chosen indexes, queries return data quickly.

1. Introduction

An index is a way of presenting data. The order in which an index presents data is independent of the physical storage order of the data on disk. An index is created on one column or a combination of columns in a table. Creating an ordinary (non-clustered) index does not rearrange the physical location of records in the table; only a clustered index, described in section 2.2, does. When an index is used, data is returned to users in sorted order, and you can control that sort order in the CREATE INDEX statement. Generally, creating an index on the right columns significantly improves database performance, especially when the index is on the columns used to join tables.

An index works much like the table of contents of a book. If the data in a table is the content of the book, the index is its table of contents, which points to the content by page number. Similarly, an index stores the key values of a table together with pointers to the rows that contain them. Just as the page number is a direct path to the content of the book, the index is a direct path to the table data, so the data can be accessed more efficiently.

In a database diagram, you can create, edit, or delete each type of index on the Indexes/Keys property page of the selected table. The index is saved when the table it belongs to, or the database diagram containing that table, is saved.

1.1 Advantages

An index is an independent, physical database structure. Database users can use indexes to quickly access specific information in database tables.

Generally, you should create an index on a table only when you frequently query data in the indexed column. Indexes occupy disk space and slow down adding, deleting, and updating rows. In most cases, the retrieval advantage that indexes bring far outweighs these disadvantages. However, if the application updates data very frequently or disk space is limited, it is best to limit the number of indexes.

In general, an index is a storage structure created to speed up retrieval. An index is created for a table and consists of index pages that are separate from the data pages. Each row on an index page contains a logical pointer that can be used to locate data directly, which accelerates the retrieval of physical data.

Whether to index a column, and what kind of index to create, has a great impact on retrieval speed. A query on an indexed column can be answered almost immediately, while a query on a column without an index can take a long time. For a column without an index, the database has to search row by row, and the time this scan takes is directly proportional to the amount of data in the table.
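The difference between a row-by-row scan and an index lookup can be observed directly. The following is a minimal sketch, assuming SQLite through Python's standard sqlite3 module as a stand-in for the database; the table dealers, its columns, and the index name idx_dealers_city are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dealers (dealer_no INTEGER, city TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO dealers VALUES (?, ?, ?)",
    [(i, "Beijing" if i % 10 == 0 else "Shanghai", f"dealer {i}") for i in range(1000)],
)

def plan(sql):
    """Return the descriptive text of each step in SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT * FROM dealers WHERE city = 'Beijing'"

plan_without_index = plan(query)   # the whole table is scanned
conn.execute("CREATE INDEX idx_dealers_city ON dealers (city)")
plan_with_index = plan(query)      # the index is searched instead

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan differs between SQLite versions, and other database systems have their own tools for inspecting execution plans.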

Using indexes for queries has the following advantages:

Improved query speed: if a table has no index, the database is forced to search row by row in table order. Visiting every row to find those that meet the conditions takes a long time. If the location information stored in an index is used to find the rows instead, the query time is shortened;

Faster joins, ORDER BY, and GROUP BY: join, ORDER BY, and GROUP BY operations all require data retrieval. If a suitable index exists on the table, these operations can become much faster;

Support for the query optimizer: during query execution, the database automatically optimizes the query. Because the optimizer relies on indexes, once an index has been created the database can choose it as the access path to make the query faster;

Enforced row uniqueness: creating a unique index ensures that the indexed data in the table is not duplicated.

In short, indexes can improve query performance, but at a price. Indexed tables occupy more storage space in the database, and commands that insert, update, or delete data take longer because the indexes must be maintained. When designing and creating indexes, make sure the performance gain outweighs the extra cost in storage space and processing resources.

1.2 Precautions

You can create one or more indexes on the base table based on the needs of the environment to provide multiple access methods and accelerate data retrieval. Generally, the database administrator or table owner is responsible for creating and deleting indexes. The system automatically selects an appropriate index as the access path when accessing data. You do not have to select an index.

Indexing is a double-edged sword. Although it accelerates SELECT statements, it can slow down changes to the data, because the indexes must be updated dynamically when the modifying SQL statement runs. Whenever data in the table changes, every index affected by the change must be updated as well. If the change triggers re-indexing, a delay may occur while the index is rebuilt.

The following factors should be taken into account when creating an index:

I. If a large number of indexes are created on a table, the performance of INSERT, UPDATE, and DELETE statements suffers, because all the indexes must be adjusted when data in the table changes. On the other hand, queries that do not modify data (SELECT statements) can benefit from a large number of indexes, because the query analyzer has more indexes to choose from when determining the fastest way to access the data;

II. Tables are usually queried through the primary key, so the primary key column is the first candidate for an index. Columns that are frequently used in joins (including foreign keys) should also be considered;

III. Indexing a small table may not produce any improvement, because traversing the index to find the data can take the query analyzer longer than simply scanning the table.

To sum up, creating and using indexes carries a certain overhead. Maintaining indexes costs time and storage space whenever data is changed with INSERT, UPDATE, or DELETE. Therefore, there is no need to create an index on every column of a table.

2. Types

Depending on whether the index order is the same as the physical storage order of the data, indexes are divided into clustered indexes and non-clustered indexes. Both are built on the B-Tree structure and contain index pages and data pages. Index pages store index entries and pointers to the next level; data pages store the records.

Based on the composition of the index key, indexes can also be divided into unique indexes and composite indexes. A unique index ensures that the indexed column contains no duplicate values. A composite index is an index built on multiple columns of a table. Different SQL environments also offer other index types, for example bitmap indexes and function-based indexes in Oracle.

Once an index is created, the database manages and maintains it automatically. For example, when you insert, update, or delete a record in a table, the database modifies the index accordingly. SQL queries are written the same way whether or not the table has indexes. The index only provides a way to reach a specified record quickly: it affects how fast the table is accessed, not what a query returns.
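To illustrate that the query text does not change and that the database keeps the index up to date by itself, here is a small sketch using SQLite through Python's sqlite3 module. The table goods and the index idx_goods_category are hypothetical names, not taken from the examples in this article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE goods (goods_no INTEGER, category TEXT)")
conn.executemany(
    "INSERT INTO goods VALUES (?, ?)",
    [(1, "ring"), (2, "necklace"), (3, "ring")],
)

# The same query text is used before and after the index exists.
query = "SELECT goods_no FROM goods WHERE category = 'ring' ORDER BY goods_no"
rows_before = conn.execute(query).fetchall()

conn.execute("CREATE INDEX idx_goods_category ON goods (category)")

# Changes made after the index exists are reflected in it automatically;
# no statement addresses the index directly.
conn.execute("INSERT INTO goods VALUES (4, 'ring')")
conn.execute("DELETE FROM goods WHERE goods_no = 1")

rows_after = conn.execute(query).fetchall()
print(rows_before, rows_after)
```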

2.1 B-Tree Index

The top node of a B-Tree index is called the root node, the bottom nodes are called leaf nodes, and the nodes between the root and the leaves are called intermediate nodes. The pages at each level of the index are linked in a doubly linked list. The B-Tree starts from the root node and is kept balanced between left and right; the data can be divided into as many levels as needed. A B-Tree index can easily provide free space for updates.

[Figure: overall structure of a B-Tree]

Because the leaf nodes are in linear order by the index key values they contain, the pointer Pn in each leaf node can link the leaves in key order. This chain makes sequential processing efficient, while the rest of the B-Tree structure makes random access efficient.
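The two access paths can be sketched in a few lines of Python. This is a deliberately simplified model and not how any particular database implements its B-Tree: it has only a root and one leaf level, the leaves are linked in one direction only, and the names Leaf, build, find_leaf, and range_scan are made up for this illustration.

```python
from bisect import bisect_right

class Leaf:
    """A leaf page: sorted keys plus a link to the next leaf."""
    def __init__(self, keys):
        self.keys = keys
        self.next = None

def build(sorted_keys, page_size=3):
    """Build linked leaves and a root that holds the first key of each leaf."""
    leaves = [Leaf(sorted_keys[i:i + page_size])
              for i in range(0, len(sorted_keys), page_size)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    separators = [leaf.keys[0] for leaf in leaves]
    return separators, leaves

def find_leaf(separators, leaves, key):
    """Random access: go from the root to the one leaf that may hold key."""
    position = max(bisect_right(separators, key) - 1, 0)
    return leaves[position]

def range_scan(separators, leaves, low, high):
    """Sequential access: find the first leaf, then follow the leaf links."""
    leaf = find_leaf(separators, leaves, low)
    result = []
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result

separators, leaves = build([5, 8, 12, 17, 21, 30, 34, 41])
found = 17 in find_leaf(separators, leaves, 17).keys
in_range = range_scan(separators, leaves, 10, 31)
print(found, in_range)
```

A point lookup touches the root and one leaf, while the range scan descends once and then only walks along the linked leaves.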

2.2 Clustered Index

A clustered index sorts the data rows of the table by key value and stores the records in that order, so that the physical order of the table matches the index order. A table can have only one clustered index. With a non-clustered index, the physical order of the rows in the table and the logical order of the key values do not have to match.

The B-Tree of a clustered index is built from the bottom up. The data pages, which are the leaf nodes of the index, contain the records; an intermediate-level index page is generated from multiple data pages; and multiple intermediate-level index pages are then combined to generate the index page of the top-level root node.

[Figure: data structure of a clustered index]

When a clustered index is created, the database copies the table, sorts the data, and then deletes the original table. The database must therefore have enough free space to hold the copy of the data.

Note the following when creating a clustered index:

·Most tables should have a clustered index or use partitions to reduce contention for the last page of the table. In a highly transactional environment, blocking on the last page seriously affects system throughput.

·Under a clustered index, data is physically arranged on the data pages in key order, and duplicate values are stored together. For queries that contain range checks (BETWEEN, <, <=, >, >=) or use GROUP BY or ORDER BY, once the row with the first key value in the range is found, the rows with the following key values are physically adjacent and need no further searching. This avoids large-scale scans and improves query speed.

·When you create a clustered index on a table with frequent inserts, do not place the index on a monotonically increasing column (such as IDENTITY). Otherwise, blocking conflicts can occur frequently.

·Do not include frequently modified columns in the clustered index, because after a key value is modified the data row must be moved to a new location.

·Choose the clustered index based on the WHERE clauses and join operations of your queries.

Good candidate columns for a clustered index are: primary key columns that are used in WHERE clauses and receive randomly distributed inserts; columns accessed by range; columns used in GROUP BY or ORDER BY; columns that are not frequently modified; and columns used in join operations.
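Why range queries benefit from a clustered index can be sketched with a plain Python list standing in for the data pages. This is only a model of the idea, with hypothetical rows of the form (order_no, amount); the point is that once the first qualifying row is located, the rest of the range is one contiguous slice.

```python
from bisect import bisect_left, bisect_right

# Hypothetical rows (order_no, amount), stored physically in key order,
# which is the arrangement a clustered index on order_no would produce.
table = [(101, 50), (104, 20), (104, 75), (109, 10), (115, 90), (120, 35)]
keys = [row[0] for row in table]

def range_query(low, high):
    """Return the rows with low <= order_no <= high as one contiguous slice."""
    start = bisect_left(keys, low)    # position of the first qualifying row
    stop = bisect_right(keys, high)   # one past the last qualifying row
    return table[start:stop]

rows = range_query(104, 115)
print(rows)
```

Duplicate key values (the two rows with order_no 104) sit next to each other and are returned together without any additional search.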

2.3 Non-clustered Index

A non-clustered index is similar to the index at the back of a book. The data is stored in one place and the index in another, and the index holds pointers to the storage location of the data. The entries in the index are stored in the order of the index key values, while the rows in the table are stored in a different order (which can be determined by a clustered index). If the table has no clustered index, its rows are not guaranteed to be in any specific order.

Non-clustered indexes have the same B-Tree structure as clustered indexes, but differ from them in two major ways:

·Data rows are not sorted and stored in the order of non-clustered index keys.

·The leaf level of a non-clustered index does not consist of data pages. Instead, the leaf nodes contain index rows. Each index row contains a non-clustered key value and one or more row locators that point to the data rows with that key value (several rows if the index is not unique).

The structure of a non-clustered index is completely separate from the data rows. The lowest level of the index contains the non-clustered key values, and each key value entry has a pointer to the data row that contains it. The data rows themselves are not stored in the order of the non-clustered key.
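The separation between the index and the data rows can be modeled in a few lines. In this sketch (an illustration only, with hypothetical data) the table is a list in arrival order, and the leaf level of the index is a sorted list of (key value, row locator) pairs, where the row locator is simply the position of the row in the list.

```python
from bisect import bisect_left

# Data rows in arrival order, not sorted by city (hypothetical data).
heap = [
    ("D03", "Shanghai"),
    ("D01", "Beijing"),
    ("D07", "Guangzhou"),
    ("D02", "Beijing"),
]

# Leaf level of the non-clustered index: (key value, row locator) pairs,
# sorted by key. The row locator here is the position of the row in heap.
index = sorted((city, position) for position, (_, city) in enumerate(heap))

def lookup(city):
    """Find the entries for a city in the index, then fetch the rows they point to."""
    start = bisect_left(index, (city, -1))
    rows = []
    for key, locator in index[start:]:
        if key != city:
            break
        rows.append(heap[locator])
    return rows

beijing_rows = lookup("Beijing")
print(beijing_rows)
```

Because the index is not unique, the key "Beijing" has two row locators, and the rows they point to are not adjacent in the table.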

[Figure: data structure of a non-clustered index]

Consider the following questions when using non-clustered indexes:

·How much space the index needs;

·Whether the column to be indexed is stable;

·How the index key is chosen, and whether a scan would perform better;

·Whether the column contains many repeated values.

Non-clustered indexes are commonly used on columns that appear in aggregate functions and on columns frequently used in JOIN, ORDER BY, and GROUP BY clauses.

2.4 Unique and Composite Indexes

A key identifies a row uniquely, can define the relationship between two tables, and consists of the one or more columns on which an index is built. Based on the composition of the index key, indexes can be divided into three types: unique indexes, composite indexes, and covering indexes.

2.4.1 Unique Index

A unique index does not allow two rows to have the same index value. When a unique index spans multiple columns, it ensures that each combination of values in the indexed columns is unique. If the existing data already contains duplicate key values, most databases do not let you create the unique index on the table. When new data would duplicate a key value already in the table, the database also rejects that data. Defining a primary key for a table in a database diagram automatically creates a primary key index, which is a special type of unique index. The primary key index requires every value of the primary key to be unique, and it also gives queries that use it fast access to the data.

If more than one row contains NULL in a column, you cannot create a unique index on that column, because the NULL values are treated as duplicates. The same applies to a combination of columns: if more than one row contains NULL in all of those columns, the NULL values are considered duplicates, so you cannot create a unique index on the combination. This is how SQL Server behaves; some other database systems allow multiple NULL values in a unique index.

2.4.2 Composite Index and Covering Index

As the name implies, a composite index is an index that uses more than one column in a table to index data. It is created by connecting two or more column values.

A composite index can contain 2 to 16 columns. It is the best choice when two or more columns are searched as a unit, or when many queries reference only the columns in the index. All columns of a composite index must be in the same table, and the maximum size of a composite index key is 900 bytes. That is, the total length of the fixed-size columns that make up the composite index cannot exceed 900 bytes.

When an index contains all the information a query needs, it is called a covering index: its key values contain all the data required to satisfy the query.
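Whether an index covers a query depends on which columns the query touches. The sketch below uses SQLite through Python's sqlite3 module, whose query plan explicitly reports a covering index; the table dealers and the index idx_city_no are hypothetical names for illustration, and other systems report this differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dealers (dealer_no INTEGER, city TEXT, name TEXT)")
conn.execute("CREATE INDEX idx_city_no ON dealers (city, dealer_no)")

def plan(sql):
    """Return the descriptive text of each step in SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Only city and dealer_no are needed, and both are stored in the index,
# so the table rows never have to be read.
covered = plan("SELECT dealer_no FROM dealers WHERE city = 'Beijing'")

# name is not part of the index, so after the index search the table
# rows must be fetched as well.
not_covered = plan("SELECT name FROM dealers WHERE city = 'Beijing'")

print(covered)
print(not_covered)
```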

2.5 Other Index Types

Different database systems, and different versions of them, offer further types of indexes to meet the needs of different tables. The following is a simple example:

Bitmap Index

In a bitmap index, index information is not stored as a list of index column values. Instead, the bitmap index creates one bitmap for each distinct value of the indexed column. In each bitmap, one bit (0 or 1) corresponds to one record. If the bit is 1, the corresponding record contains the column value that the bitmap represents. A mapping function in the bitmap index translates between bit positions and the records they stand for.

In this way, a bitmap index provides the same function as a B-Tree index through a completely different internal mechanism. If the indexed column takes only a few fixed values, a bitmap index is very effective and saves storage space.
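The idea behind a bitmap index can be sketched in pure Python, using one integer per distinct value as the bitmap. This is an illustration of the principle only, not Oracle's implementation; the columns gender and region and the helper names are hypothetical.

```python
# Hypothetical columns with only a few distinct values each.
gender = ["F", "M", "F", "F", "M", "F"]
region = ["north", "north", "south", "north", "south", "south"]

def build_bitmap_index(column):
    """Map each distinct value to an integer whose bit i is 1 when row i has that value."""
    bitmaps = {}
    for row_number, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row_number)
    return bitmaps

def rows_of(bitmap):
    """Translate a bitmap back into the row numbers whose bit is set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

gender_index = build_bitmap_index(gender)
region_index = build_bitmap_index(region)

# WHERE gender = 'F' AND region = 'north' becomes a single bitwise AND.
matches = rows_of(gender_index["F"] & region_index["north"])
print(matches)
```

With only two or three distinct values per column, each index needs just a handful of bitmaps, which is where the storage saving comes from.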

3. Create

To create an index, you must know which indexes bring more benefits than overhead. The most basic command for creating an index is:

CREATE INDEX Index_name ON Table_name (Column_name1, Column_name2, ...)

As we have seen many times before, the syntax of the CREATE INDEX statement varies considerably between database systems. However, most database systems support the basic form above.

3.1 Composite Index

A composite index, also known as a compound index, is an index that contains more than one column. In SQL Server, up to 16 columns can be combined into one index.

For example, in the database "jewelry sales system", suppose you query the information of jewelry dealers in the city of Beijing. Because the city column contains many repeated values, it is not very selective, so a single-column index on it would not give good query performance. Each value in the jewelry dealer number column, however, is unique. To improve the selectivity of the index, you can create a composite index on the dealer's city combined with the dealer number. The code is as follows:

CREATE INDEX [City_Index] ON [Jewelry Dealer Information] ([Jewelry Dealer City], [Jewelry Dealer Number])
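A runnable variant of this example can be tried with SQLite through Python's sqlite3 module. The names dealer_info, city, dealer_no, and city_index are simplified stand-ins assumed here for the table, columns, and index above, and the three rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dealer_info (dealer_no INTEGER, city TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO dealer_info VALUES (?, ?, ?)",
    [(1, "Beijing", "A"), (2, "Beijing", "B"), (3, "Shanghai", "C")],
)

# Composite index: the less selective city column combined with the
# unique dealer number.
conn.execute("CREATE INDEX city_index ON dealer_info (city, dealer_no)")

query = "SELECT * FROM dealer_info WHERE city = ? AND dealer_no = ?"
detail = [row[-1] for row in
          conn.execute("EXPLAIN QUERY PLAN " + query, ("Beijing", 2))][0]
row = conn.execute(query, ("Beijing", 2)).fetchone()

print(detail)
print(row)
```

The plan shows that both columns of the composite index take part in the search, so the lookup narrows to a single entry instead of every Beijing row.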

3.2 Unique Index and Primary Key Index

A unique index ensures that the indexed column contains no duplicate values. When you add data to a table with a unique index, the database checks whether the new row duplicates an existing value in that column. If it does, the database reports an error and cancels the insert.

It makes sense to specify a unique index only when uniqueness is a property of the data itself. For example, when creating a database for a student records system, we do not want a unique index on the student name, because several students may have the same name. The correct approach is to assign each student an ID and create the unique index on the student ID field.

Creating a unique index is very simple: add the UNIQUE keyword to the CREATE INDEX statement.

For example, suppose the operator information table of the "Sales Management System" requires that operators with the same name must not share the same password, to keep the system secure. Create a unique index on the operator name and password columns:

CREATE UNIQUE INDEX [Operator Security_Index] ON [Operator Information] ([Operator Name], [Password])

Now add a new operator to the operator information table, using a name and password that duplicate data already in the table.

INSERT INTO [Operator Information] ([Operator ID], [Operator Name], [Password]) VALUES (1009, 'zhou huifang', '9698xie')

If you execute the preceding statement, you receive an error message saying that the INSERT cannot be carried out, because the data it adds violates the "Operator Security_Index" index. Change the password value in the INSERT command:

INSERT INTO [Operator Information] ([Operator ID], [Operator Name], [Password]) VALUES (1009, 'zhou huifang', '20140901')
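The same sequence can be reproduced with SQLite through Python's sqlite3 module. The identifiers operator_info and operator_security_index are simplified stand-ins assumed for this sketch, and the pre-existing row with operator ID 1001 is hypothetical, since the article does not show the table's current contents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE operator_info (operator_id INTEGER, operator_name TEXT, password TEXT)"
)
conn.execute(
    "CREATE UNIQUE INDEX operator_security_index "
    "ON operator_info (operator_name, password)"
)
# Hypothetical existing row with the same name and password.
conn.execute("INSERT INTO operator_info VALUES (1001, 'zhou huifang', '9698xie')")

try:
    # Same name and same password as an existing row: rejected.
    conn.execute("INSERT INTO operator_info VALUES (1009, 'zhou huifang', '9698xie')")
    duplicate_rejected = False
except sqlite3.IntegrityError as error:
    duplicate_rejected = True
    print("rejected:", error)

# Same name but a different password: accepted.
conn.execute("INSERT INTO operator_info VALUES (1009, 'zhou huifang', '20140901')")
row_count = conn.execute("SELECT COUNT(*) FROM operator_info").fetchone()[0]
print(duplicate_rejected, row_count)
```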

A primary key index is a special type of unique index. It is created automatically when a primary key is defined for a table. To create a primary key, add the PRIMARY KEY keyword to the column definition. The primary key index requires every value of the primary key to be unique, and it also gives queries that use it fast access to the data.
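That a primary key brings its own index can be checked without ever running CREATE INDEX. The sketch below assumes SQLite through Python's sqlite3 module and a hypothetical students table; the automatically generated index name is specific to SQLite, and other systems name and expose this index differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No CREATE INDEX statement is issued anywhere in this example.
conn.execute("CREATE TABLE students (student_id TEXT PRIMARY KEY, name TEXT)")

# Each row of the result starts with: sequence number, index name, unique flag.
indexes = conn.execute("PRAGMA index_list('students')").fetchall()
index_name = indexes[0][1]
is_unique = indexes[0][2]

conn.execute("INSERT INTO students VALUES ('S001', 'Li Lei')")
try:
    conn.execute("INSERT INTO students VALUES ('S001', 'Han Meimei')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(index_name, is_unique, duplicate_rejected)
```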

When you define a primary key or uniqueness constraint on a table that already has a standard index created with the CREATE INDEX statement, the index created by the constraint takes precedence over the standard index created earlier. In other words, an index created by a primary key or uniqueness constraint has a higher priority than one created with CREATE INDEX.

3.3 Clustered and Non-clustered Indexes

When a table uses a clustered index, rows with adjacent key values are stored in the same data blocks. A query therefore needs to read only a small number of blocks, which gives faster execution. Creating a clustered index changes the physical arrangement of the data in the table. For this kind of access, clustered indexes are usually faster than non-clustered indexes.

The columns used for a clustered index are usually the primary key. In SQL Server, CREATE INDEX creates a non-clustered index by default. Use the keyword CLUSTERED to specify a clustered index; you can also use the keyword NONCLUSTERED to state explicitly that the index is non-clustered.

For example, in the Database "Sales Management System", create a unique clustered index on the supplier name column of the supplier information table.

USE [Sales Management System]
GO
CREATE UNIQUE CLUSTERED INDEX [Supplier Name_Index] ON [Supplier Information] ([Supplier Name])
GO

When a clustered index is created on a table, the database sorts the rows by the index columns. The keyword ASC or DESC specifies ascending or descending order for an index column. The default is ASC, ascending order.

4. View and delete

In SQL Server, you can view and delete indexes in two ways: Enterprise Manager and SQL language.

You can use SQL Server to view indexes. Basic operations are not described here.

When an index is no longer needed, you can delete it from the database to recycle its current storage space. The recycled space can be used by any object in the database.

How an index is deleted depends on how it was created. If the index was created explicitly with the CREATE INDEX statement, you can delete it with the DROP INDEX statement. The syntax is:

DROP INDEX 'table.index | view.index' [,...n]

Here, table and view indicate the table or view the index belongs to; index is the name of the index to be deleted, which must follow the rules for identifiers; and [,...n] indicates that the preceding item can be repeated n times.

For example, the following SQL statement deletes the index "Commodity_Index" on the table "Commodity Information" in the database "dealer".

USE dealer
GO
DROP INDEX [Commodity Information].[Commodity_Index]

If you want to delete an index but cannot determine whether the index exists, you can also add a simple judgment statement to determine the existence of the index first.

For example, delete the index named "Customer Info Table_Index" on the table "Customer Info Table" in the database "dealer".

USE dealer
IF EXISTS (SELECT name FROM sysindexes WHERE name = 'Customer Info Table_Index')
    DROP INDEX [Customer Info Table].[Customer Info Table_Index]
GO
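The check-then-drop pattern can also be tried with SQLite through Python's sqlite3 module. Here the catalog table is sqlite_master rather than SQL Server's sysindexes, and customer_info and customer_info_index are simplified stand-in names assumed for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_info (customer_no INTEGER, name TEXT)")
conn.execute("CREATE INDEX customer_info_index ON customer_info (name)")

def index_exists(name):
    """Look the index up in SQLite's catalog table."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?", (name,)
    ).fetchone()
    return row is not None

# Drop the index only if the catalog says it is there.
existed_before = index_exists("customer_info_index")
if existed_before:
    conn.execute("DROP INDEX customer_info_index")
exists_after = index_exists("customer_info_index")

# SQLite also offers a built-in guard, so dropping again raises no error.
conn.execute("DROP INDEX IF EXISTS customer_info_index")
print(existed_before, exists_after)
```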

5. Learning Summary

Summarizing this chapter on indexes, I feel I have learned another tool, and what this tool does is exactly what we want: it improves efficiency. In project work especially, data sets are often too large to search and use without indexes. I enjoyed studying and summarizing this chapter, and the results were good. As mentioned earlier, practicing SQL is well worth the effort. I am glad to share this with you!

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
