Overview MySQL statistics and mysql statistics
MySQL executes SQL statements through SQL parsing and query optimization. The parser splits the SQL statements into data structures and passes them to subsequent steps, the query optimizer finds the best solution for executing SQL queries and generates execution plans. The query optimizer determines how SQL is executed and depends on the database statistics. Next we will introduce innodb statistics in MySQL 5.7.
MySQL statistics are stored in two types: non-persistent and persistent statistics.
I. Non-persistent statistics
Non-persistent statistics are stored in the memory. If the database is restarted, statistics will be lost. There are two ways to set non-persistent statistics:
1. Global variables, INNODB_STATS_PERSISTENT = OFF |
2 CREATE/alter table parameters, STATS_PERSISTENT = 0 |
Non-persistent statistics are automatically updated in the following situations:
1. Execute ANALYZE TABLE |
2 In the case of innodb_stats_on_metadata = ON, execute show table status and show index to query TABLES and STATISTICS under INFORMATION_SCHEMA. |
3. Use mysql client to log on when the -- auto-rehash function is enabled |
4. The table is opened for the first time. |
5. 1/16 of the table data is modified from the last statistical update. |
The disadvantage of non-persistent statistics is obvious. If a large number of tables start to update statistics after the database is restarted, it will have a great impact on the instance. Therefore, persistent statistics are used currently.
Ii. Persistent statistics
Starting from 5.6.6, MySQL uses persistent statistics by default, that is, INNODB_STATS_PERSISTENT = ON. Persistent statistics are stored in the mysql. innodb_table_stats and mysql. innodb_index_stats tables.
Persistent statistics are automatically updated in the following situations:
1 INNODB_STATS_AUTO_RECALC = ON In this case, 10% of the data in the table is modified. |
2. Add new indexes |
Innodb_table_stats is the table statistics and innodb_index_stats is the index statistics. The meanings of each field are as follows:
Innodb_table_stats |
Database_name |
Database Name |
Table_name |
Table Name |
Last_update |
Last Update Time of statistics |
N_rows |
Number of rows in the table |
Clustered_index_size |
Number of clustered index pages |
Sum_of_other_index_sizes |
Number of pages with other indexes |
Innodb_index_stats |
Database_name |
Database Name |
Table_name |
Table Name |
Index_name |
Index name |
Last_update |
Last Update Time of statistics |
Stat_name |
Statistics name |
Stat_value |
Statistical Value |
Sample_size |
Sample Size |
Stat_description |
Type description |
To better understand innodb_index_stats, create a test table:
CREATE TABLE t1 ( a INT, b INT, c INT, d INT, e INT, f INT, PRIMARY KEY (a, b), KEY i1 (c, d), UNIQUE KEY i2uniq (e, f)) ENGINE=INNODB;
Write Data as follows:
To view the statistical information of table t1, pay attention to the stat_name and stat_value fields.
When tat_name = size: stat_value indicates the number of indexed pages
Stat_name = n_leaf_pages: stat_value indicates the number of leaf nodes.
When stat_name = n_diff_pfxNN: stat_value indicates the number of unique values in the index field. Here, we will describe the number of unique values:
1. n_diff_pfx01 indicates the number after distinct in the first column of the index. For example, if column a of PRIMARY has only one value of 1, so index_name = 'Primary' and stat_name = 'n' _ diff_pfx01, stat_value = 1.
2. n_diff_pfx02 indicates the number after distinct in the first two columns of the index, for example, the e and f columns of i2uniq have four values, so when index_name = 'i2uniq' and stat_name = 'n' _ diff_pfx02 ', stat_value = 4.
3. For non-unique indexes, a primary key index is added after the original column, for example, index_name = 'i1' and stat_name = 'n' _ diff_pfx03 '. In the original index column c, d. the distinct result of the primary key column a, (c, d, a) is 2.
After understanding the specific meanings of stat_name and stat_value, we can help you troubleshoot why the appropriate index is not used during SQL Execution. For example, the stat_value of an index n_diff_pfxNN is much smaller than the actual value, the query optimizer considers that the index selection is poor, which may lead to incorrect indexing.
3. Inaccurate statistics Processing
We checked the execution plan and found that the correct index was not used. If the statistics difference in innodb_index_stats is large, we can handle it in the following ways:
1. manually update the statistics. Note that the read lock will be applied during execution:
ANALYZETABLE TABLE_NAME;
2. If the updated statistical information is still inaccurate, you can add a data page for table sampling. You can modify the following two methods:
A) The global variable INNODB_STATS_PERSISTENT_SAMPLE_PAGES. The default value is 20;
B) a single table can specify the sampling of this table:
Alter table TABLE_NAME STATS_SAMPLE_PAGES = 40;
According to the test, the maximum value of STATS_SAMPLE_PAGES is 65535. If the value exceeds the maximum value, an error is returned.
Currently, MySQL does not provide the histogram function. In some cases (such as uneven data distribution), updating statistics may not necessarily obtain an accurate execution plan. You can only specify indexes using the index hint method. The histogram function will be added in the new version 8.0. Let's look forward to more and more powerful MySQL functions!