Principles of the MySQL query optimizer


The MySQL query optimizer has several goals, but the primary one is to use indexes wherever possible, and to use the most restrictive index so that as many rows as possible are eliminated early.

Your own goal when you submit a SELECT statement is to find rows, not to exclude them. The optimizer works by exclusion because the faster it can rule out rows that do not match, the faster it finds the rows that do.

How to make better use of indexes:

1: Compare columns that have the same data type. When an indexed column takes part in a comparison, compare it with a column of the same type; comparisons between identical types perform better than comparisons between different types.

For example, INT is different from BIGINT. CHAR(10) is considered the same as CHAR(10) or VARCHAR(10), but different from CHAR(12) or VARCHAR(12). If the columns you compare have different types, you can use ALTER TABLE to change one of them so that the types match.

MySQL optimization covers three areas: table data type selection, SQL statement optimization, and system configuration and maintenance optimization.

1. Table data type selection

(1) Smaller is better. The first principle of choosing table data types is to use the shortest type that can correctly represent and store the data. This reduces the use of disk space, memory, and CPU cache.

(2) Avoid NULL. This is also the tip most often repeated in optimization posts online. The reason is that a nullable column needs extra bytes and makes indexes, index statistics, and value comparisons more complicated. Many people also overlook COUNT(column): it counts only the rows in which the column value is not NULL. For more information on NULL, refer to: http://www.phpben.com/?post=71
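The COUNT(column) behaviour is easy to overlook, so here is a minimal sketch of it. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained; the counting rule is standard SQL and is the same in both). The users table and its nickname column are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL here; COUNT(*) vs COUNT(column) follows standard SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, nickname TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO users (nickname) VALUES (?)",
    [("alice",), (None,), ("bob",), (None,)],  # two of the four rows have a NULL nickname
)

total_rows, non_null_nicknames = conn.execute(
    "SELECT COUNT(*), COUNT(nickname) FROM users"
).fetchone()

print(total_rows)          # COUNT(*) counts every row
print(non_null_nicknames)  # COUNT(nickname) skips rows where nickname IS NULL
```

If the column were declared NOT NULL, the two counts could never differ.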

(3) How to choose between CHAR and VARCHAR for strings? What most PHP developers know is that CHAR has a fixed size and VARCHAR stores data with a variable length. The differences between the two are:

Domain size
    CHAR: At most 255 characters (not bytes), regardless of the encoding. A longer value is truncated to the column length; in non-strict SQL mode this produces a warning rather than an error.
    VARCHAR: At most 65,535 bytes. The value is preceded by a length prefix: 2 bytes when the column can hold more than 255 bytes, otherwise 1 byte. The maximum number of characters depends on the encoding; with UTF8 (up to 3 bytes per character) it is at most 21,845, and slightly less once the length prefix is counted.

How trailing whitespace is handled
    CHAR: Trailing spaces are stripped from the value, and comparisons ignore them.
    VARCHAR: In versions up to 4.1, trailing spaces are removed when the value is stored; from version 5.0 on they are kept.

Storage space
    CHAR: Fixed. CHAR(10) always occupies the space of 10 characters, whether or not the string has 10 characters.
    VARCHAR: Saves space, but the row length can change on update. With VARCHAR(10), if a value is stored with 5 characters and later updated to 7, MyISAM may split the row into fragments and InnoDB may have to split the page, so the overhead increases.

When to use
    CHAR: Very short values, fixed-length values, or values of nearly equal length, for example an MD5 password hash in CHAR(32) or a nickname in CHAR(8).
    VARCHAR: When the maximum length is much greater than the average length and updates are infrequent.

Note: when the data is English text or similar, it is best to use an encoding with fewer bytes per character, such as Latin1.
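The size arithmetic behind the CHAR and VARCHAR comparison can be sketched in a few lines of Python. This is a rough model, not MySQL's own accounting: it assumes a table whose only column is a NOT NULL VARCHAR, so the whole 65,535-byte row limit minus a 2-byte length prefix is available, and the function names are made up for illustration.

```python
MAX_ROW_BYTES = 65_535  # MySQL row-size limit, shared by all columns of a table


def length_prefix_bytes(max_chars: int, bytes_per_char: int) -> int:
    """1 prefix byte if the column can never exceed 255 bytes, otherwise 2."""
    return 1 if max_chars * bytes_per_char <= 255 else 2


def max_varchar_chars(bytes_per_char: int) -> int:
    """Largest n for VARCHAR(n), assuming one NOT NULL VARCHAR column in the table."""
    return (MAX_ROW_BYTES - 2) // bytes_per_char


def varchar_storage_bytes(value: str, max_chars: int, encoding: str, bytes_per_char: int) -> int:
    """Bytes used by one stored value: length prefix plus the encoded characters."""
    return length_prefix_bytes(max_chars, bytes_per_char) + len(value.encode(encoding))


def char_storage_bytes(max_chars: int) -> int:
    """CHAR(n) in a single-byte character set always occupies n bytes."""
    return max_chars


print(max_varchar_chars(3))                               # 3-byte UTF8
print(max_varchar_chars(1))                               # latin1
print(varchar_storage_bytes("hello", 10, "latin-1", 1))   # VARCHAR(10), 5 characters stored
print(char_storage_bytes(10))                             # CHAR(10), any value
```

The last two lines show the trade-off from the comparison: the VARCHAR value costs its length plus the prefix, while the CHAR column costs its full declared width.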

(4) Integer types and the integer-first principle

TINYINT, SMALLINT, MEDIUMINT, INT, and BIGINT need 8, 16, 24, 32, and 64 bits of storage respectively.

Signed range for an n-bit type: -2^(n-1) to 2^(n-1)-1

Many programmers habitually use INT for every column when designing a table, without considering whether a smaller type would do.

The author's advice: if TINYINT is enough, never use SMALLINT.

A common misconception: INT(1) and INT(11) store exactly the same range of values. The only difference is the display width, that is, how many digits the MySQL client shows when the value is displayed.

INT(2) UNSIGNED: storing 2 displays 2
INT(2) UNSIGNED ZEROFILL: storing 2 displays 02

Integer-first principle: where an integer can replace another type, use the integer. For example, an IP address can be converted to an integer for storage, and a price such as '50.00 yuan' can be stored as 50.
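The range formula and the IP-to-integer idea can be checked with a short Python sketch. The helper names are hypothetical; the conversion uses the standard ipaddress module and yields the same number MySQL's INET_ATON() produces for an IPv4 address.

```python
import ipaddress

# Storage width in bits for each MySQL integer type, smallest first.
INTEGER_BITS = {"TINYINT": 8, "SMALLINT": 16, "MEDIUMINT": 24, "INT": 32, "BIGINT": 64}


def signed_range(bits: int) -> tuple:
    """Range of a signed n-bit integer: -2^(n-1) .. 2^(n-1)-1."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def unsigned_range(bits: int) -> tuple:
    """Range of an unsigned n-bit integer: 0 .. 2^n-1."""
    return 0, 2 ** bits - 1


def smallest_signed_type(lowest: int, highest: int) -> str:
    """Pick the narrowest signed type that can hold every value in [lowest, highest]."""
    for name, bits in INTEGER_BITS.items():
        low, high = signed_range(bits)
        if low <= lowest and highest <= high:
            return name
    raise ValueError("value range does not fit in BIGINT")


def ip_to_int(ip: str) -> int:
    """Convert a dotted IPv4 address to the integer you would store."""
    return int(ipaddress.IPv4Address(ip))


def int_to_ip(number: int) -> str:
    """Convert the stored integer back to dotted notation."""
    return str(ipaddress.IPv4Address(number))


print(signed_range(8))
print(smallest_signed_type(0, 100))
print(ip_to_int("192.168.1.1"))
print(int_to_ip(ip_to_int("192.168.1.1")))
```

An IPv4 address needs the full unsigned 32-bit range, so the column would have to be INT UNSIGNED rather than a signed INT.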

(5) Precision versus space. For the same range of values, floating-point types typically use less space than DECIMAL. A FLOAT column uses 4 bytes of storage. A DOUBLE needs 8 bytes and has higher precision and a larger range of values. In older MySQL versions (4.1 and earlier), DECIMAL values were converted to DOUBLE for calculations.
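The space and precision trade-off can be seen outside the database too. The following sketch uses Python's struct module to round-trip a value through 4-byte and 8-byte IEEE 754 formats, which is the representation behind FLOAT and DOUBLE, and the decimal module as a stand-in for an exact DECIMAL type. The sample price is made up.

```python
import struct
from decimal import Decimal

float_bytes = struct.calcsize("<f")   # size of a 4-byte single-precision value
double_bytes = struct.calcsize("<d")  # size of an 8-byte double-precision value


def as_float32(value: float) -> float:
    """Round-trip a Python float (a double) through 4-byte single precision."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


price = 50.1
stored_as_float = as_float32(price)  # what a 4-byte column would hand back
rounding_error = abs(stored_as_float - price)

# Binary floating point cannot represent 0.1 or 0.2 exactly; exact decimals can.
binary_sum_is_exact = (0.1 + 0.2 == 0.3)
decimal_sum_is_exact = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

print(float_bytes, double_bytes)
print(stored_as_float, rounding_error)
print(binary_sum_is_exact, decimal_sum_is_exact)
```

The smaller type saves space but returns a slightly different number, which is why amounts of money are usually kept in an exact type or as integers.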

2. SQL Statement Optimization

    mysql> create table one (
        id smallint(10) not null auto_increment primary key,
        username char(8) not null,
        password char(4) not null,
        `level` tinyint(1) default 0,
        last_login char(10) not null,  -- length is garbled in the source; 10 is assumed
        index (username, password, last_login)
    ) engine=innodb;

This is the test table, one. id is the primary key, there is a multi-column index on (username, password, last_login), and the table holds more than 10,000 rows.

(1) Leftmost prefix principle

Definition: for a multi-column index to be used, the columns that appear in the conditions of the WHERE clause must match the index columns starting from the leftmost one and without gaps. As soon as a column in the index order is missing, the index columns after the gap cannot be used.

Example: given the multi-column index (username, password, last_login), a WHERE clause that uses (username, password, last_login), (username, password), or (username) can use the index. The combinations (password, last_login), (password), and (last_login) do not start from username, so they cannot use the index. The combination (username, last_login) has a gap because password is missing, so at most the username part of the index can be used.

This is because a B+tree multi-column index stores its entries sorted by the columns in the order in which they were declared, and lookups have to follow that same order.
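The rule can be written down as a small function. This is a simplified model of the principle for equality conditions only, not the optimizer's actual logic, and the function name is hypothetical: it walks the index columns from the left and stops at the first one the WHERE clause does not constrain.

```python
from typing import List, Sequence, Set


def usable_index_prefix(index_columns: Sequence[str], equality_columns: Set[str]) -> List[str]:
    """Return the leading index columns that equality conditions can use.

    The walk stops at the first index column missing from the WHERE clause,
    because the B+tree is sorted by the columns in declaration order.
    """
    used = []
    for column in index_columns:
        if column not in equality_columns:
            break
        used.append(column)
    return used


index = ("username", "password", "last_login")

print(usable_index_prefix(index, {"username", "password", "last_login"}))
print(usable_index_prefix(index, {"username", "password"}))
print(usable_index_prefix(index, {"username", "last_login"}))  # gap at password
print(usable_index_prefix(index, {"password", "last_login"}))  # does not start at username
```

The order in which the conditions are written in the WHERE clause does not matter to this model, only which columns are present; that is why the second argument is a set.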

Test: the following tests are not precise; they only illustrate how to use an index correctly under the leftmost prefix principle. Some of the timings below show 0.00 sec, so no difference is visible, because the table has only 20,003 rows and the tests were not run on a physical machine, so many unpredictable factors are not accounted for. With large data volumes and high concurrency, following the leftmost prefix principle clearly improves performance.

PS: even when the leftmost prefix principle is followed, a WHERE clause that contains OR still scans the whole table.

(1.1) Queries that use the index correctly

• The WHERE clause uses (username)

    mysql> explain select * from one where username='abgvwfnt';
    +----+-------------+-------+------+---------------+----------+---------+-------+------+-------------+
    | id | select_type | table | type | possible_keys | key      | key_len | ref   | rows | Extra       |
    +----+-------------+-------+------+---------------+----------+---------+-------+------+-------------+
    |  1 | SIMPLE      | one   | ref  | username      | username | 24      | const |    5 | Using where |
    +----+-------------+-------+------+---------------+----------+---------+-------+------+-------------+
    1 row in set (0.00 sec)

• The WHERE clause uses (username, password)

    mysql> explain select * from one where username='abgvwfnt' and password='123456';
    +----+-------------+-------+------+---------------+----------+---------+-------------+------+-------------+
    | id | select_type | table | type | possible_keys | key      | key_len | ref         | rows | Extra       |
    +----+-------------+-------+------+---------------+----------+---------+-------------+------+-------------+
    |  1 | SIMPLE      | one   | ref  | username      | username | 43      | const,const |    1 | Using where |
    +----+-------------+-------+------+---------------+----------+---------+-------------+------+-------------+
    1 row in set (0.00 sec)

• The WHERE clause uses (username, password, last_login)

    mysql> explain select * from one where username='abgvwfnt' and password='123456' and last_login='1338251170';
    +----+-------------+-------+------+---------------+----------+---------+-------------------+------+-------------+
    | id | select_type | table | type | possible_keys | key      | key_len | ref               | rows | Extra       |
    +----+-------------+-------+------+---------------+----------+---------+-------------------+------+-------------+
    |  1 | SIMPLE      | one   | ref  | username      | username | 83      | const,const,const |    1 | Using where |
    +----+-------------+-------+------+---------------+----------+---------+-------------------+------+-------------+
    1 row in set (0.00 sec)

In all three plans type=ref, which means the multi-column index is used for the lookup. key_len is 24, 43, and 83, which shows that the index parts used are (username), (username, password), and (username, password, last_login) respectively. rows is 5, 1, and 1: very few rows are examined, because all three queries follow the leftmost prefix principle.

(2) Order by optimization

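(3) Isolate the column: an indexed column should stand alone on one side of the comparison, not inside an expression. For example, write id = 1 instead of id + 1 = 2.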
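The effect of wrapping an indexed column in an expression can be observed with a query plan. The sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN (an assumption made to keep the example self-contained; the plan wording is SQLite's, not MySQL's). The table is a cut-down, hypothetical version of the test table.

```python
import sqlite3

# SQLite stands in for MySQL; only the search-versus-scan distinction is the point.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE one (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO one (username) VALUES (?)",
    [("user%d" % i,) for i in range(100)],
)


def plan(sql: str) -> str:
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)


isolated_plan = plan("SELECT * FROM one WHERE id = 1")      # column stands alone
wrapped_plan = plan("SELECT * FROM one WHERE id + 1 = 2")   # column inside an expression

# Both forms return the same row; only the access path differs.
same_rows = (
    conn.execute("SELECT * FROM one WHERE id = 1").fetchall()
    == conn.execute("SELECT * FROM one WHERE id + 1 = 2").fetchall()
)

print(isolated_plan)
print(wrapped_plan)
print(same_rows)
```

The isolated form is reported as a primary-key search and the wrapped form as a table scan, even though the two conditions are logically equivalent.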

(4) OR, IN, and UNION ALL: you can try UNION ALL in place of OR

(4.1) A condition that uses OR may scan the whole table even if an index exists.

(4.2) IN is controversial. Many optimization guides online say to use IN as little as possible, but that is not the whole picture: if the values inside IN are constants, you can use IN freely.

(4.3) UNION ALL returns the combined result set directly, which avoids the overhead of removing duplicates. The advice is to "try" rewriting an OR condition with UNION ALL because the rewrite is not always faster; treat it as something to test.
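One thing to watch when replacing OR with UNION ALL is that the two are not automatically equivalent: a row that matches both branches comes back twice. The sketch below shows this with Python's sqlite3 module standing in for MySQL (an assumption for self-containment; UNION and UNION ALL semantics are standard SQL). The table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE one (id INTEGER PRIMARY KEY, username TEXT, level INTEGER)")
conn.executemany(
    "INSERT INTO one (username, level) VALUES (?, ?)",
    [("ann", 1), ("bob", 0), ("ann", 0), ("cat", 1)],  # row 1 matches both conditions below
)


def ids(sql: str) -> list:
    """Run a query and return the ids it yields, sorted for comparison."""
    return sorted(row[0] for row in conn.execute(sql))


with_or = ids("SELECT id FROM one WHERE username = 'ann' OR level = 1")

with_union_all = ids(
    "SELECT id FROM one WHERE username = 'ann' "
    "UNION ALL SELECT id FROM one WHERE level = 1"
)

with_union = ids(
    "SELECT id FROM one WHERE username = 'ann' "
    "UNION SELECT id FROM one WHERE level = 1"
)

# Making the branches mutually exclusive keeps UNION ALL equivalent to OR.
with_exclusive_union_all = ids(
    "SELECT id FROM one WHERE username = 'ann' "
    "UNION ALL SELECT id FROM one WHERE level = 1 AND username <> 'ann'"
)

print(with_or, with_union_all, with_union, with_exclusive_union_all)
```

UNION removes the duplicate at the cost of a de-duplication step; the mutually exclusive UNION ALL avoids both the duplicate and that step, provided the excluded column has no NULL values.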

(5) Index selectivity

Index selectivity is the ratio of the number of distinct index values (the cardinality) to the number of rows in the table: index selectivity = cardinality / number of rows. The cardinality can be viewed with "SHOW INDEX FROM table_name".

The advantage of high selectivity is that MySQL can filter out more rows when it looks for matches. A unique index has the best selectivity, with a value of 1.

For a non-unique index, or when the column to be indexed holds very long values, choose an index prefix. Here is a quick illustration:

    mysql> select count(distinct(username))/count(*) from one;
    +------------------------------------+
    | count(distinct(username))/count(*) |
    +------------------------------------+
    |                             0.2047 |
    +------------------------------------+
    1 row in set (0.09 sec)

The value of count(distinct(username))/count(*) is the index selectivity. Here it is about 0.2, which is too low.
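The same ratio can be computed for index prefixes to decide how many leading characters are worth indexing. The sketch below is a plain-Python illustration with made-up sample values and hypothetical function names; on a real table the equivalent would be a count(distinct ...) / count(*) query over a prefix of the column.

```python
from typing import Iterable


def selectivity(values: Iterable[str]) -> float:
    """Distinct values divided by total rows; 1.0 means every value is unique."""
    rows = list(values)
    if not rows:
        raise ValueError("selectivity is undefined for an empty column")
    return len(set(rows)) / len(rows)


def prefix_selectivity(values: Iterable[str], prefix_length: int) -> float:
    """Selectivity of an index built on only the first prefix_length characters."""
    return selectivity(value[:prefix_length] for value in values)


# Made-up sample of username values; one value occurs twice.
usernames = ["abgvwfnt", "abgvxyzq", "abcdefgh", "zzzzaaaa", "abgvwfnt"]

full = selectivity(usernames)
for length in (1, 4, 5):
    print(length, prefix_selectivity(usernames, length))
print("full", full)
```

In this sample a 5-character prefix already reaches the selectivity of the full column, so a longer prefix would only make the index bigger.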

  

(6) Duplicate or redundant indexes

Many PHP developers start out thinking that more indexes mean better performance, without noticing that some of them are redundant. For example, after creating indexes on (username), (username, password), and (username, password, last_login), the first index is clearly redundant, because either of the other two can serve the same queries.

The rule is to build the smallest set of indexes that meets the functional requirements. With the InnoDB engine, every data modification has to update the primary key index and the corresponding entries in each secondary index, which can lead to a lot of data movement, page splits, and fragmentation.
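Spotting this kind of redundancy is mechanical: an index is a candidate for removal when its columns are a leftmost prefix of another index. The following sketch applies that check to a list of index definitions. It is an illustration with hypothetical names, and it deliberately ignores unique constraints, which can make a shorter index necessary even when a longer one covers it.

```python
from typing import List, Sequence, Tuple

Index = Tuple[str, ...]


def is_leftmost_prefix(shorter: Index, longer: Index) -> bool:
    """True if shorter is a strict leftmost prefix of longer."""
    return len(shorter) < len(longer) and longer[: len(shorter)] == shorter


def redundant_indexes(indexes: Sequence[Index]) -> List[Index]:
    """Indexes whose columns are a leftmost prefix of some other index."""
    return [
        index
        for index in indexes
        if any(is_leftmost_prefix(index, other) for other in indexes)
    ]


indexes = [
    ("username",),
    ("username", "password"),
    ("username", "password", "last_login"),
    ("password",),  # not a leftmost prefix of anything, so it is kept
]

print(redundant_indexes(indexes))
```

On this list the check also flags (username, password), since the three-column index covers it in the same way.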

3. System configuration and maintenance optimization

(1) Some important variables

• key_buffer_size: the size of the index block buffer, used by the MyISAM storage engine. Within limits, a larger value gives better performance, but exceeding what the operating system can provide makes MySQL unstable. This is a very important parameter.

• sort_buffer_size: the size of the buffer used for sorting. If the data to be sorted exceeds this size, a temporary file is created. Note the difference from myisam_sort_buffer_size. This is a very important parameter.

• read_rnd_buffer_size: when rows are read in sorted order after a sort, they are read through this buffer to avoid disk seeks. Setting it to a larger value can greatly improve ORDER BY performance. However, the buffer is allocated per client, so you should not set the global variable to a large value. Instead, change the session variable only for clients that need to run large queries.

• join_buffer_size: the size of the buffer used for joins between tables.

• tmp_table_size: the maximum size of an in-memory temporary table.

• table_cache: the maximum number of tables MySQL may keep open; the open tables are cached in memory.

• delay_key_write: defers index updates for the MyISAM storage engine. When a record is updated, the data is written to disk first but the index is not; the index changes stay in memory and are written to disk when the table is closed.

For more parameters, see http://www.phpben.com/?post=70

(2) OPTIMIZE, ANALYZE, CHECK, and REPAIR maintenance operations

• OPTIMIZE: INSERT, UPDATE, and DELETE inevitably cause some data movement and page splits, which leave fragments behind. Over time the accumulated fragmentation affects performance, so the DBA needs to defragment the database periodically with the OPTIMIZE command.

For a MyISAM table: OPTIMIZE TABLE table_name

InnoDB tables do not support the OPTIMIZE operation directly; running it prints "Table does not support optimize, doing recreate + analyze instead". You can also rebuild the table with the command: ALTER TABLE one ENGINE=InnoDB; (written as TYPE=innodb in old MySQL versions).

• ANALYZE: analyzes and stores the key distribution of the table so that the system has accurate statistics, which affect how SQL execution plans are generated. Tables whose data has not changed much do not need regular analysis. However, if the data volume of a table has changed significantly and the actual execution plan differs from the one you expect, analyzing the table may help produce the expected plan.

ANALYZE TABLE table_name

• CHECK: checks a table or view for errors. It works for both MyISAM and InnoDB tables. For MyISAM tables, checking also updates the key statistics.

• REPAIR: OPTIMIZE needs enough free disk space; otherwise it may corrupt the table and make it unusable, in which case REPAIR is needed. Note that InnoDB does not support the REPAIR operation.

The operations above print output like the following; this example is from CHECK:

    +----------+-------+----------+----------+
    | Table    | Op    | Msg_type | Msg_text |
    +----------+-------+----------+----------+
    | test.one | check | status   | OK       |
    +----------+-------+----------+----------+

Op is the operation, which can be repair, check, or optimize. Msg_type is the type of message, and Msg_text is the message itself; here it shows that the table status is normal. If you run REPAIR on an InnoDB table, you will see a row such as: note | The storage engine for the table doesn't support repair

Note: these operations are best run when database traffic is at its lowest, because they involve table locks, scans, data movement, and so on. Otherwise some functions may stop working, or the database may even crash.

(3) Updating and maintaining table structure

• Changing the table structure. Using ALTER to change the structure of a table with tens of millions of rows is tricky. One way is to run the usual ALTER when concurrency and traffic are low. Another is to create a second table that is identical to the original except for the structural change, which gives you the corresponding .frm file. Then lock the tables with FLUSH TABLES WITH READ LOCK, overwrite the original table's .frm file with the new one, and finally run UNLOCK TABLES.

• Creating a new index. The usual method is not covered here; an alternative is:

1. Create table A without indexes and import the data, which produces the .MYD file.

2. Create table B with the indexes, which produces the .frm and .MYI files.

3. Lock reads and writes.

4. Rename table B's .frm and .MYI files to table A's name.

5. Unlock.

6. Build the index with REPAIR.

This method is also very effective for large tables. It is why many DBAs insist on importing the data first and building the indexes afterwards: it is more efficient.

• Check the MySQL server regularly

Check the database periodically with SHOW STATUS, SHOW PROCESSLIST, and similar commands. This is not covered in detail here: it would take a lot of space, and the author does not know the topic well.

Part IV: Illustrated MySQL query execution process

1. Query cache: check whether the SQL statement exactly matches a cached query, then check whether the user has permission. If either check fails, the statement goes on to the parser; if both succeed, the cached result is returned to the user.

2. Parsing: the parser performs lexical and syntax analysis, checks for errors such as unclosed quotation marks, and generates a parse tree.

3. Preprocessing: the preprocessor resolves semantics that the parser cannot, such as whether the tables and columns exist and whether aliases are ambiguous, and generates a new parse tree.

4. The optimizer applies a large number of optimizations.

5. The execution plan is generated.

6. The query execution engine calls the storage engine to fetch the corresponding data.

7. The result is returned.
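The first step of this process, the exact-match cache lookup, can be modelled in a few lines. The class below is a toy sketch, not MySQL's implementation: the class and function names are hypothetical, permission checks and invalidation on table changes are left out, and a single callable stands for steps 2 to 6. Note also that the query cache was removed in MySQL 8.0, so this applies to older versions only.

```python
from typing import Callable, Dict, List, Tuple

Rows = List[Tuple]


class QueryCacheSketch:
    """Toy model: results are cached under the exact text of the statement."""

    def __init__(self, execute: Callable[[str], Rows]) -> None:
        self._execute = execute          # stands for parse, preprocess, optimize, execute
        self._cache: Dict[str, Rows] = {}
        self.hits = 0
        self.misses = 0

    def query(self, sql: str) -> Rows:
        if sql in self._cache:           # step 1: exact match on the statement text
            self.hits += 1
            return self._cache[sql]
        self.misses += 1
        result = self._execute(sql)      # steps 2-6 run only on a cache miss
        self._cache[sql] = result
        return result                    # step 7


executed: List[str] = []


def fake_execute(sql: str) -> Rows:
    """Placeholder for the real pipeline; records that it was called."""
    executed.append(sql)
    return [("abgvwfnt",)]


cache = QueryCacheSketch(fake_execute)
cache.query("select username from one where id = 1")  # miss, runs the pipeline
cache.query("select username from one where id = 1")  # hit, identical text
cache.query("SELECT username FROM one WHERE id = 1")  # miss, the text differs in case

print(cache.hits, cache.misses, len(executed))
```

Because the lookup is on the literal text, statements that differ only in letter case or spacing are treated as different queries.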

Reprint address: http://ourmysql.com/archives/1171

http://blog.chinaunix.net/uid-25311424-id-3957863.html
