2 Index and Query Optimization
Index type
Ø normal index: This is the most basic index type and has no restrictions such as uniqueness.
Ø unique index: It is basically the same as a normal index, but all index column values are unique.
Ø primary key: The primary key is a unique index, but must be specified as "primary key ".
Ø full-text indexing: MYSQL supports full-text indexing and full-text retrieval from 3.23.23. In MYSQL, the full-text index type is FULLTEXT. Full-TEXT indexes can be created on VARCHAR or TEXT columns.
Most MySQL indexes (primary key, UNIQUE, INDEX, and FULLTEXT) are stored in the B tree. The R-tree is used for spatial column indexes, and the MEMORY table supports hash indexes.
Single-Column index and multi-column index (Composite Index)
An index can be a single-column index or multiple-column index. Using indexes on related columns is one of the best ways to improve the performance of SELECT operations.
Multi-column index:
MySQL can create indexes for multiple columns. An index can contain 15 columns. For some column types, You Can index the column's left prefix. The column order is very important.
Multi-column indexes can be considered as an array that contains the sorting of values created by connecting the values of the index column. In general, even if it is the most restrictive single-column index, its capacity is far lower than that of multiple-column indexes.
Leftmost prefix
Multiple-column indexes have a feature, that is, the Leftmost prefix (Leftmost Prefixing ). If one index of multiple columns is key (firstname lastname age), MySQL uses this index when the search condition is the combination and sequence of the following columns:
Firstname, lastname, age
Firstname, lastname
Firstname
That is, key (firstname lastname) and key (firstname) are also created ).
Indexes are mainly used for the following operations:
Quickly find the rows matching a WHERE clause.
Delete rows. When a join is executed, the row is retrieved from other tables.
Locate the MAX () or MIN () value for the specific indexed column key_col. Optimized by the pre-processor to check whether the WHERE key_part _ # = constant is used for all keyword elements before key_col in the index. In this case, MySQL performs a keyword search for each MIN () or MAX () expression and replaces it with a constant. If all expressions are replaced with constants, the query returns immediately. For example:
Select min (key2), MAX (key2) FROM tb WHERE key1 = 10;
If the prefixes at the leftmost of an available keyword are sorted or grouped (for example, order by key_part_1, key_part_2), sort or group a table. If DESC is followed by all keyword elements, the keywords are read in reverse order.
In some cases, you can optimize a query so that you can retrieve values without querying data rows. If a query uses only the numeric type of a table and the leftmost prefix of some keywords, you can retrieve the value from the index tree for faster speed.
SELECT key_part3 FROM tb WHERE key_part1 = 1
Sometimes MySQL does not use indexes, even if there are available indexes. One scenario is when the optimizer estimates that using indexes will require MySQL to access most rows in the table. (In this case, table scanning may be faster ). However, if this type of query uses LIMIT to search for only some rows, MySQL uses an index because it can locate several rows faster and return results. For example:
Reasonable index creationSuggestions:
(1) smaller data types are generally better: smaller data types usually require less space in the disk, memory, and CPU cache for faster processing.
(2) A simple data type is better: the processing overhead of integer data is smaller than that of characters, because the strings are more complex. In MySQL, the built-in Date and Time data types should be used instead of strings to store the time, and the IP addresses of integer data types should be used to store the time.
(3) Avoid NULL as much as possible: the column should be specified as not null unless you want to store NULL. In MySQL, it is difficult to query and optimize columns with null values, because they make the index and index statistics and comparison operations more complex. You should replace null values with 0, a special value, or an empty string.
This part is some trivial suggestions and notes for indexing and writing SQL statements.
1.This parameter is used when the result set has only one row of data.LIMIT 1
2.AvoidSELECT *Always specify the columns you need
The more data you read from a table, the slower the query. It increases the disk operation time, or when the database server and the WEB server are independent. You will experience a very long network latency, just because the data is not required to be transmitted between servers.
3.Use connection (JOIN) To replace subqueries.(Sub-Queries)
JOIN... it is more efficient because MySQL does not need to create a temporary table in the memory to perform the query in two steps.
4.UseENUM,CHARInsteadVARCHAR, Use reasonable field attribute length
5.Try to useNOT NULL
6.Tables with a fixed length are faster.
7.Split largeDELETEOrINSERTStatement
8.The smaller the queried column, the faster it is.
Where condition
In a query, the WHERE condition is also an important factor. It is very important to set the where condition as few as possible and as reasonable as possible.When there are multiple conditions, put the condition that will extract as little data as possible before, reduce the query time of the next where condition.
Some where conditions will invalidate the index:
The query conditions of the where clause include! =, MySQL will not be able to use indexes.
Ø when the where clause uses the Mysql function, the index will be invalid. For example, select * from tb where left (name, 4) = 'xxx'
Ø when LIKE is used for search matchingThe index is valid: select * from tbl1 where name like 'xxx % ', and the index is invalid when like' % xxx %'
Iii. Configuration Optimization
After installing MySQL, configure the file my. cnf in the/MySQL installation directory/share/mysql directory, the directory also contains multiple configuration files for reference, there are my-large.cnf, my-huge.cnf, my-medium.cnf, my-small.cnf, corresponding to the configurations of large and small database applications. The. ini file exists in the MySQL installation directory in the Windows environment.
The main variables that have a great impact on performance optimization are listed below, mainly including the connection request variables and buffer variables.
1.Connection Request variable:
1) max_connections
The maximum number of connections of MySQL, increasing the number of file descriptors required by mysqld. If the number of concurrent connection requests on the server is large, we recommend that you increase this value to increase the number of parallel connections. Of course, this is based on the support of the machine, because the more connections, when MySQL provides a connection buffer for each connection, the memory overhead will increase. Therefore, you must adjust the value appropriately and do not increase the value blindly.
If the value is Too small, the ERROR 1040: Too connections may often occur. You can use the 'conn % 'wildcard to check the number of connections in the current status to determine the value.
Show variables like 'max _ connections' max connections
Number of connections returned by show status like 'max _ used_connections'
As follows:
Mysql> show variables like 'max _ connections ';
+ ----------------------- + ------- +
| Variable_name | Value |
+ ----------------------- + ------- +
| Max_connections | 256 |
+ ----------------------- + ------- +
Mysql> show status like 'max % connections ';
+ ----------------------- + ------- +
| Variable_name | Value |
+ ---------------------------- + ------- +
| Max_used_connections | 256 |
+ ---------------------------- + ------- +
Max_used_connections/max_connections * 100% (Ideal Value: ≈ 85%)
If max_used_connections is the same as max_connections, the value of max_connections is too low or exceeds the server load limit. If the value is lower than 10%, the value is too large.
2) back_log
The number of connections that can be saved by MySQL. This takes effect when the main MySQL thread receives many connection requests in a short time. If the connection data of MySQL reaches max_connections, the new request will be stored in the stack to wait for a connection to release resources. The number of stacks is back_log, if the number of connections waiting exceeds back_log, connection resources are not granted.
The back_log value indicates how many requests can be stored in the stack in a short time before MySQL temporarily stops answering new requests. Only if you want to have many connections in a short period of time, you need to increase it. In other words, this value is the size of the listener queue for the incoming TCP/IP connection.
When you observe the process list of your host (mysql> show full processlist), a large number of 264084 | unauthenticated user | xxx. xxx. xxx. xxx | NULL | Connect | NULL | login | when a NULL process is to be connected, increase the value of back_log.
The default value is 50, and the value can be adjusted to 128. For Linux systems, the value range is an integer smaller than 512.
3) interactive_timeout
The number of seconds that an interactive connection is waiting for action before the server closes. An interactive customer is defined as a customer who uses the CLIENT_INTERACTIVE option for mysql_real_connect.
The default value is 28800, Which is 7200.
2.Buffer variable
Global Buffer:
4) key_buffer_size
Key_buffer_size specifies the size of the index buffer, which determines the index processing speed, especially the index reading speed. Check the status values Key_read_requests and Key_reads to check whether the key_buffer_size setting is reasonable. The ratio of key_reads/key_read_requests should be as low as possible, at least and (the above STATUS values can be obtained using show status like 'key _ read % ).
Key_buffer_size only applies to the MyISAM table. This value is used even if you do not use the MyISAM table, but the internal temporary disk table is a MyISAM table. You can use the check status value created_tmp_disk_tables to learn the details.
Example:
Mysql> show variables like 'key _ buffer_size ';
+ ------------------- + ------------ +
| Variable_name | Value |
+ --------------------- + ------------ +
| Key_buffer_size | 536870912 |
+ ------------ ---------- + ------------ +
Key_buffer_size is 512 MB. Let's look at the usage of key_buffer_size:
Mysql> show global status like 'key _ read % ';
+ ------------------------ + ------------- +
| Variable_name | Value |
+ ------------------------ + ------------- +
| Key_read_requests | 27813678764 |
| Key_reads | 6798830 |
+ ------------------------ + ------------- +
There are a total of 27813678764 index read requests, 6798830 of which are not found in the memory to directly read the index from the hard disk, calculate the probability that the index does not hit the cache:
Key_cache_miss_rate = Key_reads/Key_read_requests * 100%. It is better to set it to around 1/1000.
The default value is 8388600 (8 M). The host has 4 GB of memory and can be adjusted to 268435456 (256 MB ).
5) query_cache_size
Using the Query Buffer, MySQL stores the query results in the buffer. In the future, the results of the same SELECT statement (case sensitive) will be directly read from the buffer.
Check the STATUS value Qcache _ * to check whether the query_cache_size setting is reasonable (the preceding STATUS value can be obtained using show status like 'qcache % ). If the Qcache_lowmem_prunes value is very large, it indicates that the buffer is often insufficient. If the Qcache_hits value is also very large, it indicates that the query buffer is frequently used. In this case, you need to increase the buffer size; if the Qcache_hits value is small, it indicates that your query repetition rate is very low. In this case, the Query Buffer will affect the efficiency, so you can consider not to use the query buffer. In addition, adding SQL _NO_CACHE to the SELECT statement explicitly indicates that no Query Buffer is used.
Parameters related to query buffering include query_cache_type, query_cache_limit, and query_cache_min_res_unit.
Query_cache_type specifies whether to use the Query Buffer. It can be set to 0, 1, and 2. This variable is a SESSION-level variable.
Query_cache_limit specifies the buffer size that can be used by a single query. The default value is 1 MB.
Query_cache_min_res_unit is introduced after version 4.1. It specifies the minimum unit for allocating the buffer space. The default value is 4 K. Check the status value Qcache_free_blocks. If the value is very large, it indicates that there are many fragments in the buffer. This indicates that the query results are relatively small. In this case, reduce query_cache_min_res_unit.
Example:
Mysql> show global status like 'qcache % ';
+ ------------------------------- + ----------------- +
| Variable_name | Value |
+ ------------------------------- + ----------------- +
| Qcache_free_blocks | 1, 22756 |
| Qcache_free_memory | 76764704 |
| Qcache_hits | 213028692 |
| Qcache_inserts | 208894227 |
| Qcache_lowmem_prunes | 4010916 |
| Qcache_not_cached | 13385031 |
| Qcache_queries_in_cache | 43560 |
| Qcache_total_blocks | 1, 111212 |
+ ------------------------------- + ----------------- +
Mysql> show variables like 'query _ cache % ';
+ -------------------------------------- + -------------- +
| Variable_name | Value |
+ -------------------------------------- + ----------- +
| Query_cache_limit | 2097152 |
| Query_cache_min_res_unit | 4096 |
| Query_cache_size | 203423744 |
| Query_cache_type | ON |
| Query_cache_wlock_invalidate | OFF |
+ -------------------------------------- + --------------- +
Query cache fragmentation rate = Qcache_free_blocks/Qcache_total_blocks * 100%
If the query cache fragmentation rate exceeds 20%, you can use flush query cache to sort out the CACHE fragmentation, or try to reduce query_cache_min_res_unit, if your QUERY is a small amount of data.
Query cache utilization = (query_cache_size-Qcache_free_memory)/query_cache_size * 100%
If the query cache utilization is below 25%, the query_cache_size setting is too large and can be appropriately reduced. If the query cache utilization is above 80% and Qcache_lowmem_prunes> 50, the query_cache_size may be small, or too many fragments.
Query cache hit rate = (Qcache_hits-Qcache_inserts)/Qcache_hits * 100%
The cache fragmentation rate of the sample server is 20.46%, the query cache utilization is 62.26%, the query cache hit rate is 1.94%, And the hit rate is very low. It is possible that write operations are frequent and there may be some fragments.
Buffer for each connection
6) record_buffer_size
Each thread that performs an ordered scan allocates a buffer of this size to each table it scans. If you perform many sequential scans, you may want to increase the value.
The default value is 131072 (128 K), which can be changed to 16773120 (16 M)
7) read_rnd_buffer_size
Random read buffer size. When a row is read in any order (for example, in the sorting order), a random read cache is allocated. During sorting query, MySQL first scans the buffer to avoid disk search and increase the query speed. If you need to sort a large amount of data, you can increase the value accordingly. However, MySQL will issue this buffer space for each client connection. Therefore, set this value as much as possible to avoid excessive memory overhead.
Generally, it can be set to 16 Mb.
8) sort_buffer_size
Each thread that needs to be sorted allocates a buffer of this size. Add this value to accelerate the order by or group by operation.
The default value is 2097144 (2 M), which can be changed to 16777208 (16 M ).
9) join_buffer_size
The buffer size that can be used by the Joint query operation.
Record_buffer_size, Read_rnd_buffer_size, sort_buffer_size, and join_buffer_size are exclusive to each thread. That is to say, if 100 threads are connected, the occupied space is 16 MB * 100.
10) table_cache
The cache size of a table. When MySQL accesses a table, if there is space in the table buffer, the table is opened and put into it, so that the table content can be accessed more quickly.Check the status value of the peak timeOpen_tablesAndOpened_tables, You can decide whether to addTable_cache.If you find that open_tables is equal to table_cache and opened_tables is growing, you need to increase the value of table_cache (the preceding STATUS values can be obtained using show status like 'open % tables ). Note that you cannot blindly set table_cache to a large value. If it is set too high, the file descriptor may be insufficient, resulting in unstable performance or connection failure.
1 GB memory machine. The recommended value is 128-256. A server with around 4 GB memory can be set to 384 Mb or MB.
11) max_heap_table_size
Memory table size that can be created by the user. This value is used to calculate the maximum row value of the memory table. This variable supports dynamic changes, that is, set @ max_heap_table_size = #
This variable and tmp_table_size limit the size of the internal memory table. If the size of an internal heap (stacked) Table exceeds tmp_table_size, MySQL can automatically change the heap table in memory to a hard disk-based MyISAM table as needed.
12) tmp_table_size
You can set the tmp_table_size option to increase the size of a temporary table, for example, a temporary table generated BY the advanced group by operation. If you increase this value, MySQL will also increase the size of the heap table, which can improve the speed of join queries,We recommend that you optimize the query as much as possible. Make sure that the temporary tables generated during the query process are in the memory, so that the temporary tables are too large to generate hard disk-based MyISAM tables..
Mysql> show global status like 'created _ tmp % ';
+ -------------------------------- + --------- +
| Variable_name | Value |
+ ---------------------------------- + --------- +
| Created_tmp_disk_tables | 21197 |
| Created_tmp_files | 58 |
| Created_tmp_tables | 1, 1771587 |
+ -------------------------------- + ----------- +
Created_tmp_tables increases each time a temporary table is created. If the size of a temporary table exceeds tmp_table_size, a temporary table is created on the disk. Created_tmp_disk_tables increases, and Created_tmp_files indicates the number of temporary files created by the MySQL service, ideal Configuration:
Created_tmp_disk_tables/Created_tmp_tables * 100% <= 25%. For example, the above server Created_tmp_disk_tables/Created_tmp_tables * 100% = 1.20% should be quite good.
The default value is 16 Mb. It can be adjusted to-. The thread is exclusive. The memory is too large and may be insufficient for I/O congestion.
13) thread_cache_size
The number of threads that can be reused. If yes, the new thread is obtained from the cache. If there is space when the connection is disconnected, the customer's thread is placed in the cache. If there are many new threads, this variable value can be used to improve performance.
By comparing variables in Connections and Threads_created states, you can see the role of this variable.
The default value is 110, and the optimization value is 80.
14) thread_concurrency
We recommend that you set it to 2 times the number of server CPU cores. For example, for a dual-core CPU, the value of thread_concurrency should be 4; for two dual-core CPUs, the value of thread_concurrency should be 8. The default value is 8.
15) wait_timeout
Specify the maximum connection time of a request. For servers with around 4 GB of memory, you can set it to 5-10.
3.Configure several InnoDB Variables
Innodb_buffer_pool_size
For InnoDB tables, innodb_buffer_pool_size serves the same purpose as key_buffer_size for MyISAM tables. InnoDB uses this parameter to specify the memory size to buffer data and indexes. For a separate MySQL database server, you can set this value to 80% of the physical memory.
According to the MySQL manual, for 2 GB memory machines, the recommended value is 1G (50% ).
Innodb_flush_log_at_trx_commit
It mainly controls the time when innodb writes data in the log buffer to the log file and flush the disk. The values are 0, 1, and 2. 0 indicates that when a transaction is committed, no log write operation is performed, but the data in log buffer is written to the log file every second and flushed the disk once. 1, the commit of every second or every transaction will cause the log file write and flush disk operations, ensuring the ACID of the transaction; set to 2, each transaction commit will cause the write action of the log file, however, the flush disk operation is completed every second.
In actual tests, it is found that this value has a great impact on the speed of data insertion. If it is set to 2, it takes only 2 seconds to insert 10000 records, and if it is set to 0, it only takes 1 second, it takes 229 seconds to set it to 1. Therefore, we recommend that you merge the insert operation into a transaction as much as possible in the MySQL manual, which can greatly increase the speed.
According to the MySQL manual, this value can be set to 0 or 2 if the risk of losing the most recent transaction is allowed.
Innodb_log_buffer_size
Log cache size, generally 1-8 M, the default value is 1 M. For large transactions, you can increase the cache size.
It can be set to 4 M or 8 M.
Innodb_additional_mem_pool_size
This parameter specifies the memory pool size that InnoDB uses to store data dictionaries and other internal data structures. The default value is 1 MB. Generally, it doesn't need to be too large. It only needs to be enough. It should be related to the complexity of the table structure. If not, MySQL will write a warning message in the error log.
According to the MySQL manual, for 2 GB memory machines, the recommended value is 20 MB, which can be increased as appropriate.
Innodb_thread_concurrency = 8
It is recommended to set it to 2 * (NumCPUs + NumDisks). The default value is generally 8.
Author: asyty