MySQL database optimization from the O&M perspective (Li Zhenliang)

A mature database architecture does not start out with high availability, high scalability, and similar features. It is improved step by step as the number of users grows. This blog post discusses the problems a MySQL database runs into over its development cycle and the optimization solutions for them. Leaving the front-end application aside, the cycle can be roughly divided into the following five stages:

1. Database table design

After a project is approved, the development department builds it according to the product department's requirements, and part of the development engineers' work is designing the table structure. This matters a great deal for the database: a poor design directly affects access speed and user experience. Many factors are involved, such as slow queries, inefficient query statements, improper indexing, and database blocking (deadlocks). A team with test engineers will run stress tests and find bugs. In teams without test engineers, most developers do not think much at the early stage about whether the database design is reasonable; they focus on implementing the features and delivering as soon as possible. Once the project has a certain amount of traffic, the hidden problems surface, and by then they are not easy to fix.

2. Database deployment

This is where the O&M engineer comes in. The project's initial traffic is not large, so a single server is enough to handle a QPS (queries per second) of around 1500. For high availability, you can use MySQL master-slave replication plus Keepalived to build a dual-machine hot standby. Common cluster software includes Keepalived and Heartbeat.

Dual-machine hot standby blog: http://lizhenliang.blog.bitsCN.com/7876557/1362313

3. Database performance optimization

If MySQL is deployed on an ordinary x86 server, it can in theory handle about 2000 QPS without any optimization, and perhaps around 2500 QPS after optimization. Without optimization, once traffic reaches about 1500 concurrent connections, database performance degrades even though hardware resources are still plentiful, so it is time to look at the software. How can the database's performance be pushed to its maximum? On the one hand, you can run multiple MySQL instances on one server to make full use of it. On the other hand, you can tune the database itself: the default operating system and database configurations are usually conservative and limit the database to some extent, and adjusting them lets it handle as many connections as possible.

The specific optimization has the following three levels:

3.1 Database configuration optimization

MySQL has two commonly used storage engines. One is MyISAM, which does not support transactions, offers fast reads, and uses table-level locks. The other is InnoDB, which supports transactions (ACID), is designed for maximum performance when processing large data volumes, and uses row-level locks.

Table lock: low overhead, large lock granularity, high probability of lock contention, and low concurrency.
Row lock: high overhead, small lock granularity, low probability of lock contention, and high concurrency.
Why do table locks and row locks exist? Mainly to protect data integrity. For example, while one user is operating on a table, other users who want to operate on the same table must wait until the first user has finished. Table locks and row locks both serve this purpose; they differ in whether the whole table or only the affected rows are locked. Without them, multiple users operating on the same table at the same time could produce conflicting or corrupted data.

Given the above, the InnoDB storage engine is the best choice, and it is also the default storage engine from MySQL 5.5 onward. Each storage engine has many associated parameters. The main ones that affect database performance are listed below.

Default values of common parameters:

max_connections = 151    # maximum number of simultaneous connections; we recommend setting it to about 80% of the upper connection limit
sort_buffer_size = 2M    # sort buffer size; only takes effect for ORDER BY and GROUP BY; can be increased to 16M
query_cache_limit = 1M   # query cache limit; only results smaller than 1M are cached, so that a large result does not overwrite the cache pool
query_cache_size = 16M   # query cache size, used to cache SELECT results; the next identical SELECT is answered directly from the cache; this value can be increased by a suitable multiple
open_files_limit = 1024  # limit on the number of open files; if the value shown by show global status like 'open_files' is equal to or greater than open_files_limit, programs will fail to connect to the database or will hang

Default values of MyISAM parameters:

key_buffer_size = 16M    # index cache size; generally set to 30-40% of physical memory
read_buffer_size = 128K  # read buffer size; we recommend 16M or 32M

Default values of InnoDB parameters:

innodb_buffer_pool_size = 128M      # index and data buffer size; generally set to 60%-70% of physical memory
innodb_buffer_pool_instances = 1    # number of buffer pool instances; we recommend 4 or 8
innodb_flush_log_at_trx_commit = 1  # key parameter:
                                    #   0: the log is written and flushed to disk roughly once per second; if the database fails, about 1 second of transactions is lost
                                    #   1: the log is written and flushed to disk at every transaction commit; I/O overhead is high, and every commit waits for the log write, so throughput is low
                                    #   2: the log is only written to the operating system cache at commit and flushed to disk roughly once per second; efficient, but transaction data is lost if the server itself fails
                                    # we recommend 2 where data safety requirements are not strict; performance is high and the effect of the change is obvious
innodb_file_per_table = OFF         # the default is the shared tablespace, whose ibdata file keeps growing and hurts I/O performance; we recommend enabling independent tablespaces, so that each table's indexes and data live in its own tablespace and a single table can be moved between databases
innodb_log_buffer_size = 8M         # log buffer size; since the log is flushed at most about once per second, more than 16M is generally not needed
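The rules of thumb above can be turned into a small helper. The following is only a minimal sketch: the function names, the fixed choice of 4 buffer pool instances, and the 16M log buffer are illustrative assumptions taken from the recommended ranges above, not values MySQL computes itself.

```python
def suggest_innodb_settings(physical_memory_mb, buffer_pool_fraction=0.6,
                            strict_durability=False):
    """Apply the rules of thumb from the text to a server's memory size.

    physical_memory_mb: total RAM of the server in MB.
    buffer_pool_fraction: share of RAM for the buffer pool (text: 60%-70%).
    strict_durability: True keeps innodb_flush_log_at_trx_commit at 1.
    """
    if physical_memory_mb <= 0:
        raise ValueError("physical_memory_mb must be positive")
    if not 0.0 < buffer_pool_fraction < 1.0:
        raise ValueError("buffer_pool_fraction must be between 0 and 1")
    buffer_pool_mb = int(physical_memory_mb * buffer_pool_fraction)
    return {
        "innodb_buffer_pool_size": f"{buffer_pool_mb}M",
        # Assumption: pick the lower of the two recommended values (4 or 8).
        "innodb_buffer_pool_instances": 4,
        "innodb_flush_log_at_trx_commit": 1 if strict_durability else 2,
        "innodb_file_per_table": "ON",
        # Assumption: use the upper bound mentioned in the text.
        "innodb_log_buffer_size": "16M",
    }


def render_mycnf(settings):
    """Render the settings as lines for the [mysqld] section of my.cnf."""
    lines = ["[mysqld]"]
    lines += [f"{name} = {value}" for name, value in settings.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    # Example: a server with 16 GB of RAM.
    print(render_mycnf(suggest_innodb_settings(16 * 1024)))
```

The output is only a starting point; the right values depend on the workload and should be checked against the running server.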

3.2 System kernel optimization

Most MySQL instances are deployed on Linux, so some operating system parameters also affect MySQL performance. The following Linux kernel settings are suitable optimizations.

net.ipv4.tcp_fin_timeout = 30         # timeout for connections waiting in FIN_WAIT_2 (often described as the TIME_WAIT timeout); the default is 60s
net.ipv4.tcp_tw_reuse = 1             # 1 enables reuse, allowing TIME_WAIT sockets to be reused for new TCP connections; 0 disables it
net.ipv4.tcp_tw_recycle = 1           # 1 enables fast recycling of TIME_WAIT sockets; 0 disables it (this option was removed in Linux 4.12 and can break clients behind NAT)
net.ipv4.tcp_max_tw_buckets = 4096    # maximum number of TIME_WAIT sockets the system keeps; beyond this number, TIME_WAIT sockets are cleared and a warning is printed
net.ipv4.tcp_max_syn_backlog = 4096   # maximum length of the SYN queue; a longer queue can hold more connections waiting to be established

On Linux, if the number of file handles opened by a process exceeds the system default of 1024, the error "too many open files" is reported, so the file handle limit needs to be raised.

# vi /etc/security/limits.conf   # add the following lines; * means all users, and a specific user can be named instead; takes effect after a reboot
* soft nofile 65535
* hard nofile 65535
# ulimit -SHn 65535              # takes effect immediately for the current shell
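To confirm that the new limit actually applies to a process, the limit can be read back from inside that process. The sketch below uses Python's standard `resource` module, which is available on Unix-like systems only; the helper names and the default threshold of 65535 (taken from the configuration above) are illustrative assumptions.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def is_nofile_sufficient(required=65535):
    """Check whether the soft open-file limit is at least `required`."""
    soft, _hard = nofile_limits()
    # An unlimited value is reported as resource.RLIM_INFINITY.
    return soft == resource.RLIM_INFINITY or soft >= required


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft={soft} hard={hard} sufficient={is_nofile_sufficient()}")
```

Running it from the same shell or service account that starts MySQL shows whether the limits.conf change has been picked up.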

3.3 Hardware configuration

Add physical memory to improve file system performance. The Linux kernel sets aside part of memory as cache (system cache and data cache) to hold hot data, and relies on the file system's delayed-write mechanism: data is synchronized to disk only when certain conditions are met, for example when the cache reaches a certain percentage or the sync command is run. In other words, the more physical memory there is, the larger the cache and the more data it holds. The trade-off is that some cached data is lost if the server fails.

Replace SAS hard disks with SSDs, and change the RAID level to RAID 1+0, which has better read/write performance (IOPS) than RAID 1 and RAID 5. After all, the pressure on a database comes mainly from disk I/O.

4. Database architecture scaling

As business volume grows, a single database server can no longer meet demand, and it is time to add machines and build a cluster. The main ideas are to spread the load of the single database across several servers, get past the disk I/O bottleneck, keep hot data in a cache, and reduce how often the disks are accessed.

4.1 Master-slave replication and read/write splitting

Because most database operations in production are reads, a one-master, multiple-slave architecture is deployed. The master handles writes and is paired with a dual-machine hot standby. Several slaves sit behind a load balancer and handle reads. Mainstream load balancers include LVS, HAProxy, and Nginx.

How is read/write splitting implemented? Most companies implement it at the code level, which is efficient. The other way is a proxy program, which is less common in companies; typical proxies are MySQL Proxy and Amoeba. A cluster like this greatly increases the concurrency the database can sustain and removes the single-server performance bottleneck. If one slave can handle 2000 QPS, five slaves can handle about 10,000 QPS, and scaling the database out horizontally is easy.
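As an illustration of read/write splitting at the code level, here is a minimal sketch of a router that sends reads to the slaves in round-robin order and everything else to the master. The class name, the host names, and the rule of classifying a statement by its first keyword are assumptions made for this example; a real application would also have to route reads inside transactions and statements such as SELECT ... FOR UPDATE to the master.

```python
import itertools

# Statements whose first keyword marks them as read-only in this sketch.
READ_KEYWORDS = {"SELECT", "SHOW"}


class ReadWriteRouter:
    """Choose a server for each SQL statement: slaves for reads, master for writes."""

    def __init__(self, master, slaves):
        if not slaves:
            raise ValueError("at least one slave is required")
        self.master = master
        self._slaves = itertools.cycle(slaves)  # simple round-robin balancing

    def route(self, sql):
        words = sql.split()
        if words and words[0].upper() in READ_KEYWORDS:
            return next(self._slaves)
        return self.master


if __name__ == "__main__":
    # Hypothetical host names.
    router = ReadWriteRouter("db-master", ["db-slave-1", "db-slave-2"])
    for statement in ("SELECT * FROM users", "UPDATE users SET name = 'a'",
                      "SELECT 1"):
        print(statement, "->", router.route(statement))
```

The application then opens its connection to whichever host `route` returns, so adding a slave only means extending the list passed to the router.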

Sometimes, under a heavy write load, a single master cannot keep up with writes. A dual-master setup can lead to inconsistent data, because different users of the application may write to the two databases at the same time, and simultaneous updates cause conflicts or inconsistencies between them. Within a single database, MySQL relies on the storage engine's table locks and row locks to protect data integrity. How is this solved with multiple masters? There is a master-master replication management tool written in Perl called MySQL-MMM (Master-Master Replication Manager for MySQL). Its biggest advantage is that only one database accepts writes at any given time, which effectively guarantees data consistency.

Master-slave replication blog: http://lizhenliang.blog.bitsCN.com/7876557/1290431

Read/write splitting blog: http://lizhenliang.blog.bitsCN.com/7876557/1305083

MySQL-MMM blog: http://lizhenliang.blog.bitsCN.com/7876557/1354576

4.2 Add cache

Add a cache layer in front of the database and keep hot data in memory. If the requested data is already in the cache, the result is returned without going to the database, which improves read performance. Caches are either local or distributed. A local cache keeps data in the memory or files of the local server. A distributed cache can hold massive amounts of data and scales well; the mainstream distributed cache systems are memcached and redis. memcached is stable, keeps data in memory, is very fast, and sustains a high QPS. If the data must be persistent, use redis, whose performance is no lower than memcached's.
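The read path described above, check the cache first and fall back to the database only on a miss, can be sketched as follows. This is a minimal illustration, not a client for memcached or redis: a plain dictionary stands in for the cache server, and `load_from_db` is a hypothetical callable that represents the real database query.

```python
class CacheAside:
    """Serve reads from a cache and query the database only on a miss."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # hypothetical database lookup
        self._cache = {}                   # stand-in for memcached/redis
        self.db_reads = 0                  # counts how often the database is hit

    def get(self, key):
        if key in self._cache:
            return self._cache[key]        # cache hit: no database access
        self.db_reads += 1
        value = self._load_from_db(key)    # cache miss: read from the database
        self._cache[key] = value           # keep the value for later requests
        return value

    def invalidate(self, key):
        """Drop a cached entry, for example after the row has been updated."""
        self._cache.pop(key, None)


if __name__ == "__main__":
    fake_db = {"user:1": "alice"}
    store = CacheAside(fake_db.get)
    store.get("user:1")
    store.get("user:1")
    print("database reads:", store.db_reads)
```

Invalidating an entry whenever the underlying row changes keeps the cache from serving stale data; a real cache would also set an expiry time on each entry.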

4.3 Database sharding

Database sharding means splitting related tables into different databases according to the service they belong to, for example separate web, bbs, and blog databases. If business volume is large, each split-out database can also be given its own master-slave architecture, which further relieves the pressure on any single database.

4.4 Table sharding

As data volume grows, a single table may hold millions of rows, and queries and inserts take too long. How can the pressure on a single table be relieved? Consider splitting it into several smaller tables to reduce the load on each and improve processing efficiency. This is called table sharding.

Table sharding is fairly troublesome: the SQL statements in the application code have to be changed, and the additional tables have to be created by hand. Alternatively, the MERGE storage engine can be used, which is much simpler. With MERGE, the application operates on a summary table that stores no data itself; it only holds the relationship to the underlying tables and the way data is updated. The summary table distributes the load across the small tables according to each query, which improves concurrency and disk I/O performance.

Table sharding is either vertical or horizontal:

Vertical split: split a table with many fields into several tables to solve the problem of overly wide tables. Rarely used fields can go into a separate table, large fields can be split out into their own table, and closely related fields can be kept together in one table.

Horizontal split: split the original table into several tables with the same structure to solve the problem of too much data in a single table.
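For a horizontal split done in application code, the program has to decide which small table a row belongs to. One common approach, shown here only as a sketch, is to take the row's numeric ID modulo the number of tables. The function names, the `orders` table, and the `_N` suffix convention are assumptions for illustration.

```python
def shard_table_name(base_table, row_id, shard_count=4):
    """Return the name of the small table that holds the row with `row_id`."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return f"{base_table}_{row_id % shard_count}"


def all_shard_tables(base_table, shard_count=4):
    """List every small table, e.g. to create them or to query all of them."""
    return [f"{base_table}_{index}" for index in range(shard_count)]


if __name__ == "__main__":
    # Hypothetical table name.
    print(shard_table_name("orders", 10))
    print(all_shard_tables("orders"))
```

Because every table has the same structure, the application only needs to substitute the returned name into its SQL. Changing `shard_count` later moves rows between tables, so it is worth choosing the count with future growth in mind.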

4.5 Partitioning

Partitioning divides a table's data into multiple blocks according to fields in the table structure (by range, list, hash, and so on). The blocks can sit on one disk or on different disks. After partitioning there still appears to be a single table, but its data is spread over multiple locations. Multiple disks can then serve different requests at the same time, which improves disk I/O read/write performance, and it is fairly simple to implement.
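To make the range and hash rules concrete, the sketch below mimics how a row's partition is chosen. It only models the selection logic and is not how MySQL is invoked: the function names and the year boundaries are assumptions. The range rule follows the VALUES LESS THAN convention, where each partition has an exclusive upper bound.

```python
import bisect


def range_partition(value, upper_bounds):
    """Return the index of the RANGE partition that holds `value`.

    `upper_bounds` is a sorted list of exclusive upper limits, one per
    partition, in the spirit of VALUES LESS THAN (...).
    """
    index = bisect.bisect_right(upper_bounds, value)
    if index == len(upper_bounds):
        raise ValueError(f"no partition accepts value {value!r}")
    return index


def hash_partition(value, partition_count):
    """Return the index of the HASH partition for a non-negative integer key."""
    if partition_count <= 0:
        raise ValueError("partition_count must be positive")
    if value < 0:
        raise ValueError("this sketch only handles non-negative keys")
    return value % partition_count


if __name__ == "__main__":
    # Hypothetical partitions: before 2015, 2015, and 2016.
    bounds = [2015, 2016, 2017]
    print(range_partition(2014, bounds), range_partition(2015, bounds))
    print(hash_partition(10, 4))
```

Range partitioning suits data that is queried by interval, such as dates, while hash partitioning spreads rows evenly when there is no natural ordering.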

Note: adding a cache, database sharding, table sharding, and partitioning are mainly implemented by developers.

5. Database maintenance

Database maintenance is the main job of O&M engineers or DBAs. It includes performance monitoring, performance analysis, performance tuning, and database backup and recovery.

5.1 Key Performance Indicators

QPS (Queries Per Second): the number of queries the database can process per second

TPS (Transactions Per Second): the number of transactions processed per second

The running state can be viewed with show status, which returns more than 300 status entries. A few of them are enough to calculate QPS and TPS:

Uptime: number of seconds the server has been running

Questions: number of queries sent to the database

Com_select: number of SELECT statements actually executed against the database

Com_insert: number of inserts

Com_delete: number of deletes

Com_update: number of updates

Com_commit: number of commits

Com_rollback: number of rollbacks

QPS can then be calculated from Questions:

 mysql> show global status like 'Questions';
 mysql> show global status like 'Uptime';

QPS = Questions/Uptime

Calculate TPS based on Com_commit and Com_rollback:

 mysql> show global status like 'Com_commit';
 mysql> show global status like 'Com_rollback';
 mysql> show global status like 'Uptime';

TPS = (Com_commit + Com_rollback)/Uptime

Another method is to calculate QPS from Com_select, Com_insert, Com_delete, and Com_update:

 mysql> show global status where Variable_name in('com_select','com_insert','com_delete','com_update');

Run the statement again after 1 second and take the difference for each variable (second value minus first). The sum of the differences is the QPS.

TPS calculation method:

 mysql> show global status where Variable_name in('com_insert','com_delete','com_update');

TPS does not include query operations, so only the three values for insert, delete, and update are summed.

Testing the two methods shows that Questions is more accurate when the database has many MyISAM tables, and the Com_* calculation is more accurate when it has many InnoDB tables.
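Both calculation methods above come down to simple arithmetic on the status counters. The sketch below assumes the counters have already been read from show global status and converted to integers in a dictionary; the function names and the sample numbers are made up for illustration, and no database connection is involved.

```python
QPS_COUNTERS = ("Com_select", "Com_insert", "Com_delete", "Com_update")
TPS_COUNTERS = ("Com_insert", "Com_delete", "Com_update")


def average_qps(status):
    """Average QPS since startup: Questions / Uptime."""
    return status["Questions"] / status["Uptime"]


def average_tps(status):
    """Average TPS since startup: (Com_commit + Com_rollback) / Uptime."""
    return (status["Com_commit"] + status["Com_rollback"]) / status["Uptime"]


def interval_rate(first, second, counters, seconds=1.0):
    """Rate over an interval: summed counter differences divided by its length."""
    return sum(second[name] - first[name] for name in counters) / seconds


if __name__ == "__main__":
    # Made-up snapshots taken one second apart.
    before = {"Com_select": 100, "Com_insert": 10, "Com_delete": 5, "Com_update": 20}
    after = {"Com_select": 160, "Com_insert": 15, "Com_delete": 6, "Com_update": 24}
    print("QPS:", interval_rate(before, after, QPS_COUNTERS))
    print("TPS:", interval_rate(before, after, TPS_COUNTERS))
```

The first two functions give an average over the whole uptime, while `interval_rate` reflects the current load, which is usually the more useful figure for monitoring.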

5.2 Enabling the slow query log

Enable MySQL's slow query log to find out which SQL statements are slow. Variables set with set take effect immediately but are lost when the service restarts; add the parameters to my.cnf to make them permanent.

 mysql> set global slow_query_log = on;   # enable the slow query log
 mysql> set global slow_query_log_file = '/var/log/mysql/mysql-slow.log';   # location of the slow query log file
 mysql> set global log_queries_not_using_indexes = on;   # also log queries that do not use an index
 mysql> set global long_query_time = 1;   # only log queries that take longer than 1 second

The slow query log can be analyzed with the mysqldumpslow tool that ships with MySQL.

# mysqldumpslow -t 3 /var/log/mysql/mysql-slow.log   # show the three slowest queries

Percona's pt-query-digest tool offers more comprehensive log analysis and can process the slow log, the binlog, and the general log.

Analyze the slow query log: pt-query-digest /var/log/mysql/mysql-slow.log

Analyze the binlog: mysqlbinlog mysql-bin.000001 > mysql-bin.000001.sql

pt-query-digest --type=binlog mysql-bin.000001.sql

Analyze the general log: pt-query-digest --type=genlog localhost.log

5.3 Database Backup

Backing up the database is the most basic and also the most important task; without backups the consequences can be severe. Because databases are large, often hundreds of GB, backups take a long time, so an efficient backup strategy is needed. For databases with large data volumes, incremental backups are the usual choice. Common backup tools are mysqldump, mysqlhotcopy, and xtrabackup. mysqldump is better suited to small databases: it makes logical backups, so backup and restore take a long time. mysqlhotcopy and xtrabackup make physical backups, which are fast to take and restore, and they copy hot without affecting database service. We recommend xtrabackup, which supports incremental backups.

Blog on using the xtrabackup backup tool: http://lizhenliang.blog.bitsCN.com/7876557/1612800

5.4 Database repair

Sometimes the MySQL server loses power or shuts down abnormally, leaving tables damaged and their data unreadable. In that case the tools that ship with MySQL, myisamchk and mysqlcheck, can be used for repair.

myisamchk: repairs MyISAM tables only, and the database must be stopped first.

Common parameters:

-f --force: force the repair and overwrite old temporary files; not normally used

-r --recover: recovery mode

-q --quick: quick recovery

-a --analyze: analyze the table

-o --safe-recover: old recovery mode; try it if -r cannot repair the table

-F --fast: only check tables that were not closed properly

Quick repair of the weibo database:

# cd /var/lib/mysql/weibo

# myisamchk -r -q *.MYI

mysqlcheck: works with both MyISAM and InnoDB tables and does not require stopping the database. To repair a single table, add the table name after the database name, separated by a space.

Common parameters:

-A --all-databases: check all databases

-r --repair: repair tables

-c --check: check tables (the default)

-a --analyze: analyze tables

-o --optimize: optimize tables

-q --quick: the fastest way to check or repair tables

-F --fast: only check tables that were not closed properly

Quick repair of the weibo database:

# mysqlcheck -r -q -uroot -p123 weibo

5.5 Viewing CPU and I/O performance

# View CPU performance (for example with mpstat)

# The -P parameter selects which CPUs to display: ALL shows every CPU, or a number can be given to show only that CPU.

# View I/O performance (for example with iostat)

# The -m parameter displays values in MB; the default unit is KB.

# %util: a value at 100% means the I/O is very busy.

# await: the time a request waits in the queue, which directly affects read time.

I/O limit: IOPS (r/s + w/s), usually about 1200 for RAID 0/10. (IOPS is the number of read/write (I/O) operations per second.)

I/O bandwidth: in sequential read/write mode, the theoretical value of the SAS hard disk is about 300 MB/s, and that of the SSD hard disk is about 600 MB/s.


The above are the main optimization approaches I have gathered over three years of working with MySQL. My abilities are limited and the list is not comprehensive, but these approaches can basically meet the database needs of small and medium-sized enterprises.

Because of limits rooted in the original design of relational databases, some BAT companies that put massive amounts of data into relational databases can no longer get good performance for large-scale queries and analysis. That is why NoSQL has taken off: non-relational databases handle large data volumes with high performance and make up for some shortcomings of relational databases. Most companies have gradually moved part of their business data into NoSQL stores such as MongoDB and HBase. For storage they use distributed file systems such as HDFS and GFS, and for massive data computation and analysis they use Hadoop, Spark, and Storm. These are cutting-edge technologies relevant to O&M and are also the main things to learn on the storage side. Let's keep at it together! If any readers have better optimization solutions, please share them.

This article is from the blog "Li Zhenliang's Technical Blog ".
