When the number of records in a single
MySQL table is too large, the performance of adding, deleting, modifying and checking will drop sharply, so we will provide some optimization references in this article. You can refer to the following steps to optimize:
One, single table optimization
Unless the single-table data will continue to rise in the future, do not consider splitting at the beginning. Splitting will bring various complexity of logic, deployment, and operation and maintenance. Generally, tables with integer values below tens of millions, and tables dominated by strings below 5 million are not too problematic. In fact, in many cases, the performance of
MySQL single tables still has a lot of room for optimization. It can even support data volume of tens of millions or more.
1.1 Field
Try to use TINYINT, SMALLINT, MEDIUM_INT as integer types instead of INT. If it is non-negative, add UNSIGNED;
The length of VARCHAR only allocates the space really needed;
Use enumeration or integer instead of string type;
Try to use TIMESTAMP instead of DATETIME;
Do not have too many fields in a single table, it is recommended to be within 20;
Avoid using NULL fields, it is difficult to query optimization and occupy additional index space;
Use integer to store IP.
1.2 Index
Indexes are not as many as possible. They should be created in a targeted manner according to the query. Consider creating indexes on the columns involved in the WHERE and ORDER BY commands. You can check whether indexes are used or full table scans according to EXPLAIN;
Try to avoid the NULL value judgment of the field in the WHERE clause, otherwise it will cause the engine to give up using the index and perform a full table scan;
Fields with sparse value distribution are not suitable for indexing, such as "gender" fields with only two or three values;
Only prefix index is built for character fields;
Character fields are best not to be the primary key;
No foreign keys are required, and are bound by program guarantees;
Try not to use UNIQUE and be bound by program guarantees;
When using multi-column indexes, the order of ideas and query conditions are consistent, and unnecessary single-column indexes are deleted.
1.3 Query SQL
You can find out the slower SQL by opening the slow query log;
Do not perform column calculations: SELECT id WHERE age + 1 = 10, any operation on the column will cause a table scan, which includes database tutorial functions, calculation expressions, etc., when querying, move the operation to the right of the equal sign as much as possible;
The sql statement is as simple as possible: one sql can only be operated on one cpu; the large statement splits the small statement to reduce the lock time; one large sql can block the entire library;
Without SELECT *;
OR is rewritten as IN: the efficiency of OR is n level, the efficiency of IN is log(n) level, and the number of IN is recommended to be controlled within 200;
Without functions and triggers, implemented in the application;
Avoid %xxx query;
Use JOIN less;
Use the same type for comparison, such as the ratio of '123' and '123', and the ratio of 123 and 123;
Try to avoid using the != or <> operator in the WHERE clause, otherwise the engine will give up using the index and perform a full table scan;
For continuous values, use BETWEEN without IN: SELECT id FROM t WHERE num BETWEEN 1 AND 5;
Do not take the entire table for the list data, use LIMIT to page, and the number of pages should not be too large.
1.4 Engine
Currently, two engines, MyISAM and InnoDB, are widely used:
MyISAM
MyISAM engine is the default engine of MySQL 5.1 and earlier versions. Its characteristics are:
Row locks are not supported. All tables that need to be read are locked when reading, and exclusive locks are added to the table when writing;
Does not support affairs;
Does not support foreign keys;
Does not support safe recovery after a crash;
While the table has read queries, it supports inserting new records into the table;
Support the first 500 character index of BLOB and TEXT, support full-text index;
Supports delayed index update, which greatly improves write performance;
For tables that will not be modified, compressed tables are supported, which greatly reduces disk space usage.
InnoDB
InnoDB became the default index after MySQL 5.5. Its characteristics are:
Support row lock, use MVCC to support high concurrency;
Support affairs;
Support foreign keys;
Support safe recovery after crash;
Full-text indexing is not supported.
ps: It is said that InnoDB already supports full-text indexing in MySQL 5.6.4
Generally speaking, MyISAM is suitable for SELECT-intensive tables, while InnoDB is suitable for INSERT and UPDATE-intensive tables.
1.5 System tuning parameters
You can use the following tools for benchmarking:
sysbench: A modular, cross-platform and multi-threaded performance testing tool.
https://github.com/akopytov/sysbench
iibench-mysql: Java-based MySQL / Percona / MariaDB index insert performance test tool.
https://github.com/tmcallaghan/iibench-mysql
tpcc-mysql: TPC-C test tool developed by Percona.
https://github.com/Percona-Lab/tpcc-mysql
There are many specific tuning parameters. For details, please refer to official documents. Here are some more important parameters:
back_log: The back_log value can indicate how many requests can be stored on the stack in a short period of time before MySQL temporarily stops answering new requests. In other words, if the MySQL connection data reaches max_connections, the new request will be stored in the stack to wait for a connection to release resources. The number of the stack is back_log. If the number of waiting connections exceeds back_log, it will not be Grant connection resources. Can be increased from the default 50 to 500.
wait_timeout: database connection idle time, idle connection will occupy memory resources. It can be reduced from the default 8 hours to half an hour.
max_user_connection: The maximum number of connections, the default is 0 without an upper limit, it is best to set a reasonable upper limit.
thread_concurrency: The number of concurrent threads, set to twice the number of CPU cores.
skip_name_resolve: Prohibit DNS resolution for external connections and eliminate DNS resolution time, but all remote hosts need to use IP access.
key_buffer_size: The cache size of the index block. An increase will increase the index processing speed and has the greatest impact on the performance of MyISAM tables. For the memory of about 4G, it can be set to 256M or 384M. By querying show status like'key_read%', it is best to ensure that key_reads / key_read_requests are below 0.1%.
innodb_buffer_pool_size: Cache data blocks and index blocks, which have the greatest impact on the performance of InnoDB tables. By querying show status like'Innodb_buffer_pool_read%', ensure (Innodb_buffer_pool_read_requests – Innodb_buffer_pool_reads) / Innodb_buffer_pool_read_requests the higher the better.
innodb_additional_mem_pool_size: InnoDB storage engine is used to store data dictionary information and the memory space size of some internal data structures. When there are many database objects, adjust the size of this parameter appropriately to ensure that all data can be stored in memory to improve access efficiency. When it is too small, MySQL will record Warning information in the error log of the database, and then you need to adjust the size of this parameter.
innodb_log_buffer_size: The buffer used by the transaction log of the InnoDB storage engine. Generally speaking, it is not recommended to exceed 32MB.
query_cache_size: Cache ResultSet in MySQL, which is the result set of a SQL statement execution, so it can only be used for select statements. When there is any change in the data of a table, the cached data in the Query Cache of all select statements that reference the table will become invalid. Therefore, when our data changes very frequently, using Query Cache may outweigh the gains. Adjust according to the hit rate (Qcache_hits/(Qcache_hits+Qcache_inserts)*100)). Generally, it is not recommended to be too large. 256MB may be almost the same. Large configuration static data can be adjusted appropriately. You can use the command show status like'Qcache_%' to view the current system Query catch usage size.
read_buffer_size: MySQL reads the buffer size. A request for a sequential scan of the table will allocate a read buffer, and MySQL will allocate a memory buffer for it. If the sequential scan requests for the table are very frequent, you can improve its performance by increasing the value of this variable and the size of the memory buffer.
sort_buffer_size: The buffer size used by MySQL to perform sorting. If you want to increase the speed of ORDER BY, first see if you can make MySQL use indexes instead of additional sorting stages. If not, you can try to increase the size of the sort_buffer_size variable.
read_rnd_buffer_size: MySQL random read buffer size. When rows are read in any order (for example, in sort order), a random read buffer will be allocated. When performing a sorting query, MySQL will scan the buffer first to avoid disk search and increase the query speed. If you need to sort a large amount of data, you can increase the value appropriately. However, MySQL will allocate this buffer space for each client connection, so this value should be set appropriately to avoid excessive memory overhead.
record_buffer: Each thread that performs a sequential scan allocates a buffer of this size for each table it scans. If you do a lot of sequential scans, you may want to increase this value.
thread_cache_size: Save threads that are not currently associated with a connection but are ready to serve new connections later, and can quickly respond to thread requests for connections without creating new ones.
table_cache: similar to thread_cache _size, but used to cache table files, has little effect on InnoDB, and is mainly used for MyISAM.
1.6 Upgrade hardware
Scale up, not much to say, according to whether MySQL is CPU-intensive or I/O-intensive, the performance of MySQL can be significantly improved by increasing CPU and memory, and using SSD.