MySQL optimization (4)

Last Update:2018-12-05 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Document directory

7.2.11 how to avoid full table Scan
7.2.15 other optimization ideas

7.2.9 how to optimize MySQL ORDER BY

In some cases, MySQL can directly use indexes to meetORDER BYOrGROUP BYClause without additional sorting.

AlthoughORDER BYThe index is not exactly matched with the order of the index, but can still be used, as long as the unused index part and all the additionalORDER BYField inWhereClause is included. The following queries are all indexed.ORDER BYOrGROUP BYPart:

Select * FROM t1 orDER BY key_part1,key_part2,... ;Select * FROM t1 Where key_part1=constant orDER BY key_part2;Select * FROM t1 Where key_part1=constant GROUP BY key_part2;Select * FROM t1 orDER BY key_part1 DESC, key_part2 DESC;Select * FROM t1Where key_part1=1 orDER BY key_part1 DESC, key_part2 DESC;

In other cases, MySQL cannot use indexes to meet the requirements.ORDER BYAlthough it uses indexes to find records for matchingWhereClause. These situations are as follows:

Perform different index keysORDER BY:
```
Select * FROM t1 orDER BY key1, key2;
```

On the non-consecutive index keyORDER BY:

Select * FROM t1 Where key2=constant orDER BY key_part2;

Used at the same timeASCAndDESC:

Select * FROM t1 orDER BY key_part1 DESC, key_part2 ASC;

Index key used for searching recordsORDER BYIs not the same:
```
Select * FROM t1 Where key2=constant orDER BY key1;
```
Many tables are connected together, And the read records areORDER BYAll fields in are from the first very number table (that is, inEXPLAINThe connection type of the first table in the analysis result is notconst).
DifferentORDER BYAndGROUP BYExpression.
Records in Table indexes are not stored in order. For example,HASHAndHEAPThe table is like this.

RunEXPLAIN Select ... orDER BYTo see if MySQL uses an index in the query. IfExtraThe field value isUsing filesortMySQL cannot use the index. For details, see "7.2.1EXPLAINSyntax (Get Information AboutSelect)".

When the results must be sorted, MySQL 4.1 used the followingfilesortAlgorithm:

Reads records based on the index key or scans data tables. Those that cannot matchWhereAll clause records are skipped.
Each record in the buffer uses a pair to store two values (index key and record pointer ). The buffer size depends on system variables.sort_buffer_sizeDepends on the value.
When the buffer is slow, run qsort and store the results in temporary files. Save the stored block pointer (if all the 'to' values can be saved in the buffer, you do not need to create a temporary file ).
Execute the above operation until all the records are read.
Perform multiple merge operations to merge upMERGEBUFF(7) blocks in one region are stored in another temporary file. Repeat this operation until all the blocks in the first file are placed in the second file.
Repeat the preceding operations until the remaining number of blocks is smallerMERGEBUFF2(15 ).
In the last multi-merge, only the record pointer (the last part of the sort index key) is written to the result file.
Read records in sequence by reading the records pointer in the result file. To optimize this operation, MySQL puts the record pointer read into a large block and uses it to read records in sequence and store the records in the buffer. The buffer size is determined by system variables.read_rnd_buffer_sizeDepends on the value. The code for this step is in the source file'SQL/records. CC'.

One problem with this approximation algorithm is that the database reads two records: one is estimation.WhereThe second clause is the sorting time. Although the record is successfully read for the first time (for example, a full table scan), the second is random reading (the index key has been sorted, but the record does not exist ).

In MySQL 4.1 and later versions,filesortThe optimization algorithm is used to record not only the index key value and record location, but also the fields required in the query. This avoids the need to read records twice. ImprovedfilesortThe algorithm is roughly as follows:

As before, read matchingWhereClause records.
Each record records a corresponding; 'tuples 'information, including the index key value, record location, and all fields required in the query.
Sort the 'tuples 'based on the index key.
Read records in sequence, but read records from the sorted 'tuples 'list instead of from the data table.

Use the improvedfilesortCompared with the original algorithm, the 'tuples 'take longer space than the 'right'. They are rarely suitable for sorting buffering (the buffer size is determinedsort_buffer_size). Therefore, more I/O operations may be required, resulting in slower improved algorithms. To avoid slowing down, this optimization method is only used to sort the total size of additional fields in the 'tuples 'beyond the system variables.max_length_for_sort_data(The value of this variable is set too high, which means High Disk Load and low CPU load ).

To improveORDER BYThe speed depends on whether MySQL can use indexes instead of additional sorting processes. If you cannot use an index, follow these policies:

Addsort_buffer_size.
Addread_rnd_buffer_size.
ModifytmpdirTo point to a dedicated File System with a lot of free space. If you use MySQL 4.1 or update, this option allows multiple paths to use the loop format. Each path is separated by a colon (':') on UNIX, and a semicolon (';') on Windows, Netware, and OS/2 (';'). This feature can be used to evenly distribute loads to several directories. Note: These paths must be directories distributed on different physical disks, rather than different directories on the same physical disk.

By default, MySQLGROUP BY col1, col2, ...Sort the query andORDER BY col1, col2, ...The query is the same. If it explicitly contains a list with the same fieldORDER BYMySQL will not lose the speed when optimizing it, Because sorting will always happen. If a query containsGROUP BYTo avoid overhead of sorting results, you can useORDER BY NULLTo cancel sorting. For example:

Insert INTO fooSelect a, COUNT(*) FROM bar GROUP BY a orDER BY NULL;

7.2.10 how to optimize MySQL LIMIT

In some casesLIMIT row_countBut not usedHAVING
The query will be processed differently:

If you only useLIMITMySQL sometimes uses indexes, but more often it is a full table scan.
IfLIMIT row_countAndORDER BYMySQL isRow_countThe sorting will stop, rather than sorting the entire table.
WhenLIMIT row_countAndDISTINCTMySQL findsRow_countA unique record is no longer searched.
In some cases,GROUP BYYou can read the index key in sequence (or sort the index key) and calculate the accumulated information until the index key changes. In this case,LIMIT row_countIt does not calculate any non-essentialGROUP BYValue.
Once MySQL sends a full number of request records to the client, it suspends the query unlessSQL_CALC_FOUND_ROWS.
LIMIT 0 Always Returns an empty result set. This is useful for checking the type of the query or result field.
When the server uses a temporary table to process queriesLIMIT row_countIt can be used to calculate the amount of space required.

7.2.11 how to avoid full table Scan

If MySQL needs to perform a full table scan to process the queryEXPLAINIn the resulttypeThe field value isALL. MySQL performs a full table scan under the following conditions:

The data table is too small. A full table scan is much faster than a search for the index key. This is usually done when the total number of records in a table is less than 10 and the record length is relatively short.
No applicableONOrWhereThe index field of the clause.
By comparing index fields with constant values, MySQL has computed (based on the Index Tree) to a constant that covers a large part of the data table. Therefore, full table scanning should be faster. For details, see "7.2.4 how MySQL OptimizesWhere".
Other fields use an index key with a small base (many records match the index key value. In this case, MySQL considers that using index keys requires a large number of searches, which is not as fast as full table scan.

For small tables, full table scan is usually more appropriate. However, for large tables, try the following techniques to avoid the full table scan incorrectly selected by the optimizer:

RunANALYZE TABLE tbl_nameUpdates the index key distribution of the table to be scanned. For details, see "14.5.2.1ANALYZE TABLESyntax ".
UseFORCE INDEXTell MySQL that full table scan will waste more resources than using the given index. For details, see "14.1.7SelectSyntax ".
```
Select * FROM t1, t2 FORCE INDEX (index_for_column)    Where t1.col_name=t2.col_name;    
```
StartmysqldUse Parameters--max-seeks-for-key=1000Or executeSET max_seeks_for_key=1000To tell the optimization program that all indexes will not lead to more than 1000 index searches. See section 5.2.3 Server System Variables ".

7.2.12 AccelerationInsert

The time taken to insert a record is determined by the following factors, and the following figures roughly indicate the proportion of the impact:

Connection: (3)
Send query to server: (2)
Resolution query: (2)
Insert record: (1 x record size)
Insert index: (1 x number of indexes)
Close: (1)

The overhead of opening the data table during initialization is not considered here, because each query is performed only once.

If it is a B-tree index, as the number of indexes increases, the speed of inserting records decreases in the proportion of log N.

You can use the following methods to increase the insert speed:

You can useInsertMultiple statements are attached.VALUESValue. This approach is better than using a single valueInsertThe statement is much faster (in some cases, it is faster ). If you want to add a record to a non-empty data table, you can adjust the variablebulk_insert_buffer_sizeTo make it faster. For more information, see "5.2.3 Server System Variables ".
If you want to insert a large number of records from different clients, useInsert DELAYEDStatement can also increase the speed. For details, see "14.1.4InsertSyntax ".
PairMyISAMYou canSelectInsert a record when the statement is running, as long as the record is not deleted at this time.
To load a text file to a data table, useLOAD DATA INFILE. This is usually a large numberInsert20 times of the statement. For details, see "14.1.5LOAD DATA INFILESyntax ".
Through some additional work, it is possibleLOAD DATA INFILEIt runs faster when a data table has a large number of indexes. The procedure is as follows:
1. UseCreate TABLECreate a table.
2. RunFLUSH TABLESStatement ormysqladmin flush-tablesCommand.
3. Runmyisamchk --keys-used=0 -rq /path/to/db/tbl_nameCommand to delete all indexes of a data table.
4. RunLOAD DATA INFILE, Data is inserted into the table, because the table index does not need to be updated, so this will be very fast.
5. RunmyisampackMake the data table smaller. For more information, see "15.1.3.3 compressed table characteristics ".
6. Runmyisamchk -r -q /path/to/db/tbl_nameRe-create the index. The created index tree is saved in memory before being written to the disk, which saves the disk search speed. After reconstruction, the index tree distribution is very balanced.
7. RunFLUSH TABLESStatement ormysqladmin flush-tablesCommand.
Note,LOAD DATA INFILEWhen you insert data into an empty table, it is also optimized. The main difference is that: RunmyisamchkMore temporary memory will be allocated for index creation and executionLOAD DATA INFILEThe command allows the database server to allocate memory for re-indexing. Starting from MySQL 4.0, you can runAlter TABLE tbl_name DISABLE KEYSTo replacemyisamchk --keys-used=0 -rq /path/to/db/tbl_name, RunAlter TABLE tbl_name ENABLE KEYSReplacemyisamchk -r -q /path/to/db/tbl_name. This can be omitted.FLUSH TABLESStep.
Several statements can be executed together after the table is locked to accelerateInsertOperation:
```
LOCK TABLES a WRITE;    Insert INTO a VALUES (1,23),(2,34),(4,33);    Insert INTO a VALUES (8,26),(6,29);    UNLOCK TABLES;    
```
This improves performance by: until allInsertAfter the statement is complete, the index cache is refreshed to the disk at a time. Usually, there are several timesInsertStatement will incur the overhead of refreshing the index cache to the disk most times. If multiple values can be inserted in one statement at a time, the displayed lock table operation is unnecessary. For the transaction table, useBEGIN/COMMITReplaceLOCK TABLESTo speed up. The lock table also reduces the total time for multiple connection tests, even though the maximum waiting time for each independent connection to wait for the lock increases. For example:
```
Connection 1 does 1000 inserts    Connections 2, 3, and 4 do 1 insert    Connection 5 does 1000 inserts    
```
If no lock table exists, connection 2, 3, 4 will be completed before 1, 5. If the table is locked, the join 2, 3, 4 may be completed after 1, 5, but the total time may only need 40%. MySQLInsert,Update,DeleteOperations are very fast, but if there are more than five inserts or updates in a statement, it is best to lock them for better performance. If you want to insert multiple inserts at a time, it is best to add the values before and after each loop (about 1000 times ).LOCK TABLESAndUNLOCK TABLESSo that other processes can access the data table. The performance is still good.InsertAlways RatioLOAD DATA INFILEData insertion is slow because the implementation policies of the two are different.
ToMyISAMThe table is faster.LOAD DATA INFILEAndInsertSystem variables can be added.key_buffer_sizeFor more information, see "7.5.2 Tuning Server Parameters ".

7.2.13 AccelerationUpdate

UpdateStatement optimization andSelectBut it has additional write overhead. The write overhead depends on the number of records to be updated and the number of indexes. If the index does not change, no update is required.

Another way to increase the update speed is to delay the update and put many updates together. If the table is locked, it is much faster to perform multiple updates at the same time than to update the tables separately.

Note:MyISAMIf the dynamic record format is used in the table, the record may be split after it is updated to a longer value. If you do this frequently, do it once in a whileOPTIMIZE TABLEIt is very important. For details, see "14.5.2.5 optimize table Syntax ".

7.2.14 Acceleration Delete

The time for deleting a single record is almost proportional to the number of indexes. To delete records more quickly, you can increase the index key cache. For details, see "7.5.2 Tuning Server Parameters ".

To delete all records of a data table, useTRUNCATE TABLE tbl_nameInsteadDelete FROM tbl_name. For details, see "14.1.9 truncate Syntax ".

7.2.15 other optimization ideas

This section lists some other ideas for improving the query processing speed:

Use permanent connection to the database to avoid connection overhead. If you need to initialize many connections and cannot use permanent connections, you can modify the variables.thread_cache_sizeFor more information, see "7.5.2 Tuning Server Parameters ".
Always check whether the query uses the existing indexes in the table. In MySQL, you can useEXPLAINStatement. For details, see "7.2.1 EXPLAIN Syntax (Get Information About a Select )".
Do not update frequently.MyISAMTables are too complexSelectStatement to avoid contention for locks between reading and writing.
InMyISAMIf no records are being deleted, you can insert records while reading data in other queries. If this is very important, you should try to use the table when the table does not delete records. Another possible solution is to delete a large number of records and then executeOPTIMIZE TABLEStatement.
If you always need to followexpr1, expr2, ...UseAlter TABLE ... orDER BY expr1, expr2, ...Modify a table. After you use this method to expand and modify tables, you may obtain higher performance.
In some cases, the field type is''Hashed''Based on other fields. If this field is short and basically a unique value, it may be faster than using a large index on several fields. This additional field can be used very easily, as follows:
```
Select * FROM tbl_name Where hash_col=MD5(CONCAT(col1,col2))    AND col1='constant' AND col2='constant';
```
IfMyISAMTables are frequently modified in large quantities, so avoid modifying all variable-length fields as much as possible (VARCHAR, BLOB，TEXT). Although the table has only one variable-length field, it uses the dynamic record format. For details, see "15 MySQL Storage Engines and Table Types ".
Generally, when the data table record changes''Big''Then, it is not very useful to split the table into several different tables. The maximum performance of accessing a record is that the first byte of the record is found during disk search. After finding the record location, most of the current disks can quickly read records for most applications. SetMyISAMWhen a table is split into multiple unique columns, the fields in the dynamic format in the data table can be modified to a fixed size record, or the table needs to be scanned frequently, however, you do not need to read most of the fields. For details, see "15 MySQL Storage Engines and Table Types ".
If you need to frequently perform statistics on a table based on many fields, it is better to create a new table to store the statistics of these real-time updates. Updates like the following will be very fast:
```
Update tbl_name SET count_col=count_col+1 Where key_col=constant;
```
If you only need table-level locks (multiple reads/one write), useMyISAMThe storage engine is very important, suchMyISAMAndISAMTable. This also boasts good performance in many databases, because row-Level Lock management programs are basically useless in this case.
To collect statistics from a large log table, you can use the abstract table instead of scanning the entire log table. Maintain abstract table comparison''Real-time''. When things change (such as business decision-making), re-creating the abstract table is much faster than modifying the applications in the operation.
If possible, it is best to use a classification report.''Real-time''Or''Statistic''The data required for the report only needs to come from the summary table, and the information of the summary table is generated periodically from the real-time data.
It should be recognized that the field has a default value. You must specify the values when they are different from the default values. This eliminates the need for MySQL to increase the insert speed.
In some cases, data is assembled and stored inBLOBType fields are more convenient. Add additional commands to assemble and disassemble the application.BLOBValue in the field, but this can save a lot of storage overhead in some cases. This does not require data compliance.Record-and-fieldFormat table structure is very practical.
Generally, all redundant data (in the database principle, it is called the "third paradigm") should be stored "). However, it is cost-effective to copy some information or create a summary table for higher efficiency.
Stored Procedure orUDFs(User-Defined Functions) may have higher performance when executing some tasks. However, when the database does not support these features, there are other alternative methods to achieve the goal, even if they are a little slow.
You can obtain results from the query cache or response, and then put many insert and update operations together. If the database supports table locks (such as MySQL and Oracle), this ensures that the index cache only needs to be refreshed once after all update operations.
You can useInsert DELAYED. This will increase the speed because multiple records are simultaneously written to the disk.
When you wantSelectThe statement has a higher priority than the insert operation.Insert LOW_PRIORITY.
UseSelect HIGH_PRIORITYTo skip the queue, that is, even if other clients are writing dataSelectThe execution is complete.
InInsertMultiple record insertion formats are used in statements (supported by many databases ).
UseLOAD DATA INFILETo import a large amount of data.InsertFast.
UseAUTO_INCREMENTField to generate a unique value.
Regular executionOPTIMIZE TABLEPrevents dynamic record formatMyISAMTable fragments. For details, see "15.1.3MyISAMTable storage formats ".
UseHEAPTable, which may increase the speed. For details, see "15.1.3MyISAMTable storage formats ".
In normal Web server configuration, it is best to store image files as files, and only store the file index information in the database. The reason for this is that, in general, the web server always performs better on the File Cache than the database, so using file storage will make the system easier and faster.
For infrequently accessed data that is not very important, it can be stored in a memory table. For example, when the Web Client cannot save cookies, it is used to save the title of the last display.
Fields with the same values in different tables should be declared as of the same type. Before MySQL 3.23, table join operations are slow. Make the field name as simple as possible. For examplecustomerInnameTo replacecustomer_nameAs the field name. To make the field name portable in other database systems, it should be within 18 characters.
If you need real high speed, we recommend that you look at the differences between the underlying data storage interfaces supported by various database servers. For exampleMyISAMThe storage engine is 2-5 times faster than other SQL interfaces. This requires that the data must be on the same server as the application, and it is usually accessed by only one process (because the external file lock is indeed slow ). Using only one process can eliminate the introduction of the underlyingMyISAMThe problem caused by the Command (this is easy to get higher performance, if needed ). Because of the careful design of database interfaces, it is easy to support this optimization method.
If you use digital data, in many cases you want to access the database (using online connections) faster than using text files. Because the digital information is more compact than the text files stored in the database, only less disk searches are required for access. In addition, you can save code in the application because you do not need to parse text files to find the corresponding rows and fields.
Database Replication has performance benefits for some operations. The client can obtain data from multiple replication servers, which can allocate the load. To avoid slowing down the master server when backing up data, you can also store the backup on the slave server. For details, see "6 Replication in MySQL ".
DefinitionMyISAMOption added when tableDELAY_KEY_WRITE=1In this way, the new index will be updated faster, because the disk will be refreshed only when the data table is closed. However, the server may be killed when the data table is opened. You can use the Parameter--myisam-recoverTo ensure data security or run before database restart.myisamchkCommand (however, in this case, useDELAY_KEY_WRITEIn this case, nothing will be lost, because indexes can always be regenerated from data ).

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

MySQL optimization (4)

Contact Us

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support