MySQL detailed query optimization for massive databases

Source: Internet
Author: User
Tags: informix

Many programmers think that query optimization is the job of the DBMS (database management system) and has little to do with the SQL statements they write. This is wrong. A good query plan can often improve a program's performance dozens of times over. In this article, the query plan is the set of SQL statements submitted by the user, and the optimized plan is the set of statements produced after the optimizer has processed them.

The DBMS processes a query plan as follows. After lexical and syntax checking of the query statement, the statement is passed to the DBMS's query optimizer. The optimizer performs algebraic optimization and access path optimization, and the precompilation module then processes the statement and generates the optimized plan. The plan is submitted to the system for execution at an appropriate time, and the results are returned to the user. Higher versions of actual database products (such as Oracle and Sybase) use cost-based optimization: they estimate the cost of different candidate plans from information in the system dictionary tables and then choose the better one.

Although current database products keep getting better at query optimization, the SQL statement submitted by the user is the basis for the system's optimization. It is hard to imagine a poor query plan becoming efficient after system optimization, so the quality of the statements the user writes matters a great deal. The following sections focus on ways to improve the query plan.

Solving the problem

The following uses the relational database system Informix as an example to describe ways to improve a user's query plan.

1. Use indexes sensibly

An index is an important data structure in a database, and its fundamental purpose is to improve query efficiency. Most database products now use the ISAM index structure first proposed by IBM. Indexes should be used where they fit, according to the following principles:

Create indexes on columns that are frequently joined but not declared as foreign keys; for columns that are joined infrequently, the optimizer generates indexes automatically.

Create indexes on columns that are frequently sorted or grouped (that is, used in GROUP BY or ORDER BY operations).

Create indexes on columns that appear often in conditional expressions and have many distinct values; do not index columns with few distinct values. For example, the "gender" column of an employee table has only two distinct values, "male" and "female", so there is no need to index it. An index there would not improve query efficiency and could seriously slow down updates.

If several columns are to be sorted together, create a compound index on those columns.

Use system tools. For example, Informix provides the tbcheck tool, which can check a suspicious index. On some database servers, an index may become invalid or slow to read because of frequent operations. If a query that uses an index gradually slows down, try using tbcheck to check the integrity of the index and repair it if necessary. In addition, after a large amount of data in a table has been updated, dropping and rebuilding the index can improve query speed.
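The indexing principle for selective columns can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in, since an Informix server is not assumed to be available; the table, column, and index names are hypothetical, and SQLite's planner output differs from what Informix would report.

```python
import sqlite3

# Assumption: SQLite stands in for Informix; all names below are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER, name TEXT, gender TEXT, dept_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(i, f"emp{i}", "male" if i % 2 else "female", i % 50) for i in range(1000)],
)


def plan(sql):
    """Return the planner's description of how it would run `sql`."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT * FROM employee WHERE dept_id = 7"
plan_before = plan(query)

# dept_id has 50 distinct values, so an index on it is selective.
# gender has only 2 distinct values and is deliberately left unindexed.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept_id)")
plan_after = plan(query)

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index exists, the plan is a scan of the whole table; afterwards the planner searches through idx_employee_dept.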

2. Avoid or simplify sorting

Repeated sorting of large tables should be simplified or avoided. The optimizer skips the sort step when it can use an index to produce output in the required order automatically. The following factors prevent this:

The index does not include one or more of the columns to be sorted.

The order of the columns in the GROUP BY or ORDER BY clause differs from the order of the columns in the index.

The sorted columns come from different tables.

To avoid unnecessary sorting, build indexes correctly and merge database tables where reasonable (this may sometimes compromise normalization, but the gain in efficiency is worth it). If sorting is unavoidable, try to simplify it, for example by narrowing the range of columns being sorted.
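The effect of column order on sorting can be observed with a minimal sketch. It assumes SQLite (through Python's sqlite3 module) in place of Informix, and the customer table and its index are hypothetical. The helper looks for a separate ORDER BY sort step in SQLite's EXPLAIN QUERY PLAN output.

```python
import sqlite3

# Assumption: SQLite stands in for Informix; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_num INTEGER, lname TEXT, fname TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [(i, f"L{i % 20:02d}", f"F{i % 7}", "x") for i in range(200)],
)
# Compound index whose column order is (lname, fname).
conn.execute("CREATE INDEX idx_customer_name ON customer (lname, fname)")


def needs_sort(sql):
    """True if the plan contains a separate sort step for ORDER BY."""
    details = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
    return any("ORDER BY" in d for d in details)


# Same column order as the index: rows can be read already sorted.
matching = "SELECT lname, fname FROM customer ORDER BY lname, fname"
# Different column order from the index: an explicit sort is required.
reordered = "SELECT lname, fname FROM customer ORDER BY fname, lname"

print("matching order needs sort: ", needs_sort(matching))
print("reordered columns need sort:", needs_sort(reordered))
```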

3. Eliminate sequential access to rows of large tables

In nested queries, sequential access to a table can be fatal to query efficiency. For example, with a sequential access strategy, a query nested three levels deep that examines 1000 rows at each level examines 1 billion rows in total. The main way to avoid this is to index the join columns. For example, take two tables: a student table (student number, name, age, ...) and a course selection table (student number, course number, score). If the two tables are to be joined, create an index on the join field, the student number.
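The student and course selection example might look like the following sketch. It assumes SQLite via Python's sqlite3 module rather than Informix; the English table and column names (student_id, course_selection, and so on) are illustrative renderings of the fields named in the text. The last line reproduces the worst-case arithmetic for three nested sequential scans.

```python
import sqlite3

# Assumption: SQLite stands in for Informix; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER, name TEXT, age INTEGER)")
conn.execute(
    "CREATE TABLE course_selection (student_id INTEGER, course_id INTEGER, score INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ann", 20), (2, "Bob", 21), (3, "Cy", 22)],
)
conn.executemany(
    "INSERT INTO course_selection VALUES (?, ?, ?)",
    [(1, 101, 90), (1, 102, 85), (2, 101, 70)],
)

# Index the join column on both sides so that neither table has to be
# scanned sequentially once for every row of the other.
conn.execute("CREATE INDEX idx_student_id ON student (student_id)")
conn.execute("CREATE INDEX idx_selection_student ON course_selection (student_id)")

join_sql = """
    SELECT s.name, c.course_id, c.score
    FROM student AS s, course_selection AS c
    WHERE s.student_id = c.student_id
    ORDER BY s.name, c.course_id
"""
rows = conn.execute(join_sql).fetchall()
print(rows)

# Worst case from the text: three nested levels, each scanning 1000 rows.
sequential_rows = 1000 ** 3
print(sequential_rows)
```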

You can also use a union to avoid sequential access. Even when all the columns being checked are indexed, some forms of WHERE clause force the optimizer to use sequential access. The following query forces a sequential scan of the orders table:

SELECT * FROM orders WHERE (customer_num=104 AND order_num>1001) OR order_num=1008

Although indexes exist on customer_num and order_num, the optimizer still uses a sequential access path and scans the entire table for the statement above. Because the statement retrieves two separate sets of rows, it should be rewritten as:

SELECT * FROM orders WHERE customer_num=104 AND order_num>1001
UNION
SELECT * FROM orders WHERE order_num=1008

This enables the query to be processed using the index path.
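A quick way to gain confidence in such a rewrite is to check that both forms return the same rows. The sketch below does this with Python's sqlite3 module as a stand-in for Informix; the sample rows are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Informix; the sample data is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_num INTEGER, customer_num INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1000, 104), (1002, 104), (1005, 101), (1008, 110), (1010, 104)],
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_num)")
conn.execute("CREATE INDEX idx_orders_num ON orders (order_num)")

or_form = """
    SELECT * FROM orders
    WHERE (customer_num = 104 AND order_num > 1001) OR order_num = 1008
    ORDER BY order_num
"""
union_form = """
    SELECT * FROM orders WHERE customer_num = 104 AND order_num > 1001
    UNION
    SELECT * FROM orders WHERE order_num = 1008
    ORDER BY order_num
"""
or_rows = conn.execute(or_form).fetchall()
union_rows = conn.execute(union_form).fetchall()
print(or_rows)
print(union_rows)
```

One caveat: UNION removes duplicate rows, so the two forms are only interchangeable when the table cannot contain fully identical rows (for example, when order_num is unique).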

4. Avoid correlated subqueries

When a column name appears both in the main query and in a subquery in the WHERE clause, the subquery most likely has to be re-executed every time the column value in the main query changes. The more levels of nesting a query has, the less efficient it is, so subqueries should be avoided as much as possible. If a subquery is unavoidable, filter out as many rows as possible inside it.
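As an illustration, a correlated subquery can often be rewritten as a join. The sketch below compares the two forms on invented data, using Python's sqlite3 module in place of Informix; the tables and the amount column are hypothetical. Whether the join is actually faster depends on the engine and its optimizer.

```python
import sqlite3

# Assumption: SQLite stands in for Informix; schema and data are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_num INTEGER, name TEXT)")
conn.execute(
    "CREATE TABLE orders (order_num INTEGER, customer_num INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)", [(101, "Ann"), (102, "Bob"), (103, "Cy")]
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 101, 50), (2, 101, 150), (3, 102, 80), (4, 103, 200), (5, 103, 300)],
)

# Correlated form: c.customer_num from the main query is referenced inside
# the subquery, so the subquery is logically evaluated once per customer.
correlated = """
    SELECT c.customer_num, c.name
    FROM customer AS c
    WHERE EXISTS (SELECT 1 FROM orders AS o
                  WHERE o.customer_num = c.customer_num AND o.amount > 100)
    ORDER BY c.customer_num
"""
# Join form: the same question without a per-row subquery.
joined = """
    SELECT DISTINCT c.customer_num, c.name
    FROM customer AS c, orders AS o
    WHERE o.customer_num = c.customer_num AND o.amount > 100
    ORDER BY c.customer_num
"""
correlated_rows = conn.execute(correlated).fetchall()
joined_rows = conn.execute(joined).fetchall()
print(correlated_rows)
print(joined_rows)
```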

5. Avoid difficult regular expressions

The MATCHES and LIKE keywords support wildcard matching, which is technically a form of regular expression. This kind of matching is particularly time-consuming. For example: SELECT * FROM customer WHERE zipcode LIKE "98___"

Even if an index exists on the zipcode field, a sequential scan is used in this case. If the statement is changed to SELECT * FROM customer WHERE zipcode > "98000", the index is used when the query is executed, which obviously speeds it up considerably. Note that this range is not exactly equivalent to the pattern: it excludes "98000" itself and includes codes that start with "99", so if only the "98" prefix is wanted, use zipcode >= "98000" AND zipcode < "99000".

Also avoid non-leading substrings. For example, the statement SELECT * FROM customer WHERE zipcode[2,3] > "80" uses a non-leading substring in its WHERE clause, so it will not use the index.
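The difference between the wildcard pattern and the range conditions can be checked on sample data. This is a minimal sketch that assumes five-character zip codes and uses Python's sqlite3 module instead of Informix; SQLite expects single-quoted string literals, so the quoting differs from the statements above.

```python
import sqlite3

# Assumption: SQLite stands in for Informix; zip codes are 5 characters long.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_num INTEGER, zipcode TEXT)")
zips = ["97999", "98000", "98105", "98999", "99001"]
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)", list(enumerate(zips, start=1))
)
conn.execute("CREATE INDEX idx_customer_zip ON customer (zipcode)")


def zipcodes(where):
    """Return the sorted zip codes that satisfy the given WHERE condition."""
    sql = f"SELECT zipcode FROM customer WHERE {where} ORDER BY zipcode"
    return [row[0] for row in conn.execute(sql)]


pattern = zipcodes("zipcode LIKE '98___'")
open_range = zipcodes("zipcode > '98000'")
bounded_range = zipcodes("zipcode >= '98000' AND zipcode < '99000'")

print("LIKE '98___'      :", pattern)
print("> '98000'         :", open_range)
print(">= '98000', < '99000':", bounded_range)
```

Only the bounded range returns the same rows as the pattern; the open-ended range drops "98000" and picks up "99001".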

6. Speed up queries with temporary tables

Sorting a subset of a table and storing it in a temporary table can sometimes speed up queries. It helps avoid multiple sort operations and simplifies the optimizer's work in other ways. For example:

SELECT cust.name, rcvbles.balance, ...... other columns
FROM cust, rcvbles
WHERE cust.customer_id = rcvbles.customer_id
AND rcvbles.balance > 0
AND cust.postcode > "98000"
ORDER BY cust.name

If this query is to be executed more than once, all customers with an unpaid balance can be collected once into a temporary table, sorted by customer name:

SELECT cust.name, rcvbles.balance, ...... other columns
FROM cust, rcvbles
WHERE cust.customer_id = rcvbles.customer_id
AND rcvbles.balance > 0
ORDER BY cust.name
INTO TEMP cust_with_balance

Then query the temporary table as follows:

SELECT * FROM cust_with_balance
WHERE postcode > "98000"

The temporary table has fewer rows than the main table, and its physical order is already the required order, which reduces disk I/O, so the query workload can be cut substantially.

Note: Once the temporary table has been created, it does not reflect later changes to the main table. When data in the main table is modified frequently, take care not to lose data.
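Both the technique and the staleness caveat can be reproduced in a small sketch. It assumes SQLite through Python's sqlite3 module, where the equivalent of SELECT ... INTO TEMP is CREATE TEMP TABLE ... AS SELECT; the sample rows are invented, and only the columns needed here stand in for the elided "other columns".

```python
import sqlite3

# Assumption: SQLite stands in for Informix; the sample data is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cust (customer_id INTEGER, name TEXT, postcode TEXT)")
conn.execute("CREATE TABLE rcvbles (customer_id INTEGER, balance REAL)")
conn.executemany(
    "INSERT INTO cust VALUES (?, ?, ?)",
    [(1, "Adams", "98105"), (2, "Baker", "97201"),
     (3, "Clark", "98500"), (4, "Davis", "99001")],
)
conn.executemany(
    "INSERT INTO rcvbles VALUES (?, ?)",
    [(1, 120.0), (2, 40.0), (3, 0.0), (4, 75.5)],
)

# SQLite spelling of the SELECT ... INTO TEMP cust_with_balance statement.
conn.execute("""
    CREATE TEMP TABLE cust_with_balance AS
    SELECT cust.name AS name, cust.postcode AS postcode, rcvbles.balance AS balance
    FROM cust, rcvbles
    WHERE cust.customer_id = rcvbles.customer_id
      AND rcvbles.balance > 0
    ORDER BY cust.name
""")


def from_temp():
    """Query the temporary table instead of joining the main tables again."""
    return conn.execute(
        "SELECT name, balance FROM cust_with_balance "
        "WHERE postcode > '98000' ORDER BY name"
    ).fetchall()


before = from_temp()

# A later change to the main table is NOT visible in the temporary table.
conn.execute("UPDATE rcvbles SET balance = 0 WHERE customer_id = 1")
after = from_temp()

live = conn.execute("""
    SELECT cust.name, rcvbles.balance
    FROM cust, rcvbles
    WHERE cust.customer_id = rcvbles.customer_id
      AND rcvbles.balance > 0
      AND cust.postcode > '98000'
    ORDER BY cust.name
""").fetchall()

print("temp before update:", before)
print("temp after update: ", after)
print("main tables now:   ", live)
```

After the update, the temporary table still lists Adams with the old balance while the main tables no longer do, which is the situation the note warns about.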

From: http://www.111cn.net/database/110/5d6bc167016ed54ef3de149d89fee7c4.htm

Copyright notice: Reprinting is welcome; please include the original address when you do. Thank you for your cooperation, and happy learning!
