MySQL database optimization from an operations and maintenance perspective

Last Update:2015-06-02 Source: Internet

Author: User

A mature database architecture is not designed with high availability, high scalability, and similar features from the start. Those are added gradually as the user base grows and the infrastructure is improved. The life cycle of a database can be broadly divided into the following five phases:

1. Database table design

Once a project is approved, the development team builds it according to the product team's requirements, and part of a developer's work is designing the table structure. This matters a great deal for the database: a poor design directly affects access speed and user experience. Many factors contribute, such as slow queries, inefficient SQL statements, missing or improper indexes, and database blocking (deadlocks). A team with test engineers will run stress tests and find such bugs. In a team without them, most developers give little thought to database design early on and focus on implementing and delivering features as quickly as possible. Once the project has a certain amount of traffic, the hidden problems surface, and by then they are not easy to fix.

2. Database deployment

This is where the operations engineer comes in. Initial traffic will not be large, so a single-server deployment is enough to handle around 1500 QPS (queries per second). For high availability, you can use MySQL master-slave replication with Keepalived for dual-machine hot standby. Common cluster software includes Keepalived and Heartbeat.

Dual-Machine Hot Standby Blog: http://lizhenliang.blog.51cto.com/7876557/1362313

3. Database performance optimization

If MySQL is deployed on an ordinary x86 server without any optimization, it can in theory handle about 1500 QPS; after optimization this may rise to about 2000 QPS. Without optimization, once traffic exceeds roughly 1000 concurrent connections, database processing slows down even though hardware resources are still plentiful, so it is time to look at the software. How do you get the most out of the database? On the one hand, you can run multiple MySQL instances on one server to make full use of it. On the other hand, you can tune the database itself: the default configurations of the operating system and the database are usually conservative and limit what the database can do, so they can be adjusted appropriately to handle as many connections as possible.
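To see where a server stands relative to figures like these, QPS can be estimated from two samples of the `Questions` counter reported by `SHOW GLOBAL STATUS`. The following is a minimal Python sketch, assuming the two counter values have already been fetched by some client; the function name and the sample numbers are illustrative and not from the original article.

```python
def estimate_qps(questions_start, questions_end, interval_seconds):
    """Average queries per second between two samples of the Questions counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if questions_end < questions_start:
        raise ValueError("counter went backwards (was the server restarted?)")
    return (questions_end - questions_start) / interval_seconds


# Hypothetical samples taken 10 seconds apart.
print(estimate_qps(1_000_000, 1_015_000, 10))  # 1500.0
```

Sampling over a longer interval smooths out short bursts, at the cost of hiding them.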

There are three levels of specific optimizations:

3.1 Database Configuration optimization

MySQL has two commonly used storage engines. One is MyISAM, which does not support transactions, offers fast reads, and uses table-level locks. The other is InnoDB, which supports transactions (ACID), is designed to maximize performance when processing large volumes of data, and uses row-level locks.

Table lock: low overhead, coarse lock granularity, high probability of lock contention, and relatively low concurrency.

Row lock: high overhead, fine lock granularity, low probability of lock contention, and relatively high concurrency, although deadlocks can occur.

Why do table locks and row locks exist? Mainly to guarantee data integrity. For example, while one user is operating on a table, other users who want to operate on the same table must wait until the first user has finished. That is what table locks and row locks are for. Otherwise, multiple users could modify a table at the same time, which would certainly produce conflicting or abnormal data.

Based on the above, the InnoDB storage engine is the best choice, and it is also the default storage engine in MySQL 5.5 and later. Each storage engine has many associated parameters; the following lists those that chiefly affect database performance.
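The difference in granularity can be illustrated outside MySQL. The following Python sketch is only an analogy built on in-process locks, not a description of how MyISAM or InnoDB implement locking; the class and function names are made up for illustration.

```python
import threading
from collections import defaultdict


class TableLock:
    """One lock for the whole table: any writer blocks every other writer."""

    def __init__(self):
        self._lock = threading.Lock()

    def try_lock(self, row_id):
        # row_id is ignored: the whole table is locked whatever the row.
        return self._lock.acquire(blocking=False)

    def unlock(self, row_id):
        self._lock.release()


class RowLock:
    """One lock per row: writers only block each other on the same row."""

    def __init__(self):
        self._locks = defaultdict(threading.Lock)

    def try_lock(self, row_id):
        return self._locks[row_id].acquire(blocking=False)

    def unlock(self, row_id):
        self._locks[row_id].release()


def second_writer_can_proceed(lock, first_row, second_row):
    """Hold first_row, then report whether another writer could lock second_row."""
    if not lock.try_lock(first_row):
        raise RuntimeError("lock was unexpectedly already held")
    try:
        acquired = lock.try_lock(second_row)
        if acquired:
            lock.unlock(second_row)
        return acquired
    finally:
        lock.unlock(first_row)


print(second_writer_can_proceed(TableLock(), 1, 2))  # False
print(second_writer_can_proceed(RowLock(), 1, 2))    # True
```

With the table lock, the second writer has to wait even though it touches a different row. With row locks it only waits when both writers want the same row.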

Default values of common parameters:

max_connections = 151    # maximum number of simultaneous connections; it is recommended that the peak number of connections in use be about 80% of this limit

sort_buffer_size = 2M    # buffer size for sorting query results; only applies to ORDER BY and GROUP BY; can be increased to 16M

query_cache_limit = 1M    # query cache limit; only results of 1M or less are cached, so that large results do not overwrite the cache pool

query_cache_size = 16M    # query cache size, used to cache SELECT results; the next identical SELECT is answered directly from the cache; this value can be increased appropriately

open_files_limit = 1024    # limit on the number of open files; if the value shown by show global status like 'Open_files' is equal to or greater than open_files_limit, programs cannot connect to the database or will hang
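One way to apply the 80% guideline for `max_connections` is to compare it with the `Max_used_connections` status variable. The Python sketch below assumes the guideline means "observed peak usage should be about 80% of the limit"; the function names are hypothetical and the values would come from `SHOW GLOBAL STATUS` and `SHOW VARIABLES`.

```python
def usage_percent(max_used_connections, max_connections):
    """Peak connection usage as a percentage of the configured limit."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return 100.0 * max_used_connections / max_connections


def suggest_max_connections(max_used_connections, target_percent=80):
    """Smallest limit at which the observed peak is at most target_percent of it."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-max_used_connections * 100 // target_percent)


# Hypothetical peak of 120 connections observed on the server.
print(suggest_max_connections(120))  # 150
```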

Default values of MyISAM parameters:

key_buffer_size = 16M    # index buffer size; typically set to 30-40% of physical memory

read_buffer_size = 128K    # read buffer size; 16M or 32M is recommended

Default values of InnoDB parameters:

innodb_buffer_pool_size = 128M    # index and data buffer size; typically set to 60%-70% of physical memory

innodb_buffer_pool_instances = 1    # number of buffer pool instances; 4 or 8 is recommended

innodb_flush_log_at_trx_commit = 1    # key parameter. 0: the log is written and flushed to disk roughly once per second, so a database failure loses about 1 second of transactions. 1: the log is written and flushed to disk at every transaction commit; I/O overhead is high because statements wait for the log write, so efficiency is low. 2: the log is only written to the operating system buffer at commit and flushed to disk once per second; efficiency is very high, but transaction data is lost if the server itself fails. If data safety requirements are not very strict, 2 is recommended: performance is high and the effect of the change is obvious.

innodb_file_per_table = OFF    # the default is a shared tablespace; the shared ibdata file keeps growing, which affects I/O performance to some degree. Enabling independent tablespaces is recommended: each table's indexes and data live in its own tablespace, which makes it possible to move a single table between databases.

innodb_log_buffer_size = 8M    # log buffer size; because the log is flushed at most once per second, more than 16M is generally not needed
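The percentage guidelines for `innodb_buffer_pool_size` (60%-70% of physical memory) and `key_buffer_size` (30-40%) are easy to turn into concrete numbers. A small Python sketch, assuming a server dedicated to MySQL and memory given in megabytes; the function name is hypothetical.

```python
def suggest_memory_settings(total_ram_mb, engine="innodb"):
    """Return (parameter, low_mb, high_mb) from the percentage guidelines."""
    guidelines = {
        "innodb": ("innodb_buffer_pool_size", 60, 70),
        "myisam": ("key_buffer_size", 30, 40),
    }
    if engine not in guidelines:
        raise ValueError("engine must be 'innodb' or 'myisam'")
    name, low_pct, high_pct = guidelines[engine]
    return name, total_ram_mb * low_pct // 100, total_ram_mb * high_pct // 100


# Hypothetical server with 16 GB of RAM.
print(suggest_memory_settings(16384))
# ('innodb_buffer_pool_size', 9830, 11468)
```

These are starting points only; a server that also runs other services needs a smaller share.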

3.2 System Configuration Optimization

MySQL is mostly deployed on Linux, so some operating system parameters also affect MySQL performance. The following are suitable optimizations for the Linux kernel.

net.ipv4.tcp_fin_timeout = 30    # how long an orphaned connection stays in FIN_WAIT_2 before it is dropped; the default is 60s

net.ipv4.tcp_tw_reuse = 1    # 1 enables reuse, allowing TIME_WAIT sockets to be reused for new TCP connections; 0 disables it

net.ipv4.tcp_tw_recycle = 1    # 1 enables fast recycling of TIME_WAIT sockets; 0 disables it

net.ipv4.tcp_max_tw_buckets = 4096    # maximum number of TIME_WAIT sockets the system keeps; beyond this number the system clears some TIME_WAIT sockets and prints a warning

net.ipv4.tcp_max_syn_backlog = 4096    # maximum length of the SYN queue; a longer queue can hold more connections waiting to be established
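Before changing these values it helps to see what the running kernel currently uses. On Linux each sysctl name maps to a file under `/proc/sys`, with the dots replaced by path separators. The Python sketch below reads the current values and returns `None` for parameters that a given kernel does not expose; the helper names are hypothetical.

```python
from pathlib import Path

# The kernel parameters discussed above.
PARAMS = [
    "net.ipv4.tcp_fin_timeout",
    "net.ipv4.tcp_tw_reuse",
    "net.ipv4.tcp_tw_recycle",
    "net.ipv4.tcp_max_tw_buckets",
    "net.ipv4.tcp_max_syn_backlog",
]


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under /proc/sys."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a string, or None if it cannot be read."""
    try:
        return sysctl_path(name, root).read_text().strip()
    except OSError:
        return None


for param in PARAMS:
    print(param, "=", read_sysctl(param))
```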

On Linux, if a process opens more file handles than the system default of 1024, a "Too many open files" error is reported, so the open file handle limit should be raised.

# vi /etc/security/limits.conf    # add the following lines; * means all users (a specific user can be given instead); takes effect after a system restart

* soft nofile 65535

* hard nofile 65535

# ulimit -SHn 65535    # takes effect immediately
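Whether the new limit has actually reached a process can be checked from inside that process. A minimal Python sketch using the standard `resource` module (Unix only); the helper names and the 65535 target are taken from the example above and are otherwise arbitrary.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def needs_raising(soft_limit, wanted=65535):
    """True if the soft limit is finite and below the wanted value."""
    if soft_limit == resource.RLIM_INFINITY:
        return False
    return soft_limit < wanted


soft, hard = open_file_limits()
print("soft:", soft, "hard:", hard, "needs raising:", needs_raising(soft))
```

Note that this reports the limits of the Python process itself. A running mysqld keeps the limits it was started with.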

3.3 Hardware configuration

Increase physical memory. To improve file system performance, the Linux kernel allocates buffers (the system cache and file cache) from memory to hold hot data, so the more physical memory there is, the larger these buffers are and the more data can be cached.

Use SSDs instead of SAS hard drives, and set the RAID level to RAID 1+0, which gives better read and write performance (IOPS) than RAID 1 and RAID 5. After all, the pressure on the database comes mainly from disk I/O.

4. Database architecture scaling

As business volume keeps growing, a single database server can no longer meet business needs, so it is time to add machines and build a cluster. The main ideas are to spread the load of a single database, break through the disk I/O bottleneck, keep hot data in a cache, and reduce the frequency of disk I/O.

4.1 Master-slave replication and read/write splitting

In production, most database operations are reads, so a one-master, multiple-slave architecture is deployed. The master handles writes and is set up with dual-machine hot standby, while multiple slaves are load-balanced and handle reads. Mainstream load balancers include LVS, HAProxy, and Nginx. How is read/write splitting achieved? Most companies implement it at the code level, which is efficient. The other way is through a proxy, which is less widely used; common proxies include MySQL Proxy and Amoeba. This kind of cluster architecture greatly increases the database's capacity for high concurrency and removes the single-server performance bottleneck. If one slave can handle 2000 QPS, then five can handle 10,000 QPS, so scaling the database out is easy.

Master-slave replication blog: http://lizhenliang.blog.51cto.com/7876557/1290431

Read-Write Separation blog: http://lizhenliang.blog.51cto.com/7876557/1305083
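Read/write splitting at the code level can be as simple as choosing a server by statement type. The Python sketch below is one possible shape, not the approach of any particular framework: the class name is made up, and the server names stand in for real connection objects.

```python
import itertools


class ReadWriteRouter:
    """Send reads to slaves in round-robin order and everything else to the master."""

    READ_KEYWORDS = {"SELECT", "SHOW"}

    def __init__(self, master, slaves):
        if not slaves:
            raise ValueError("at least one slave is required")
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def route(self, sql):
        """Return the server that should execute this statement."""
        words = sql.split()
        if not words:
            raise ValueError("empty SQL statement")
        if words[0].upper() in self.READ_KEYWORDS:
            return next(self._slaves)
        return self.master


router = ReadWriteRouter("master", ["slave1", "slave2"])
print(router.route("SELECT * FROM users"))        # slave1
print(router.route("UPDATE users SET name='a'"))  # master
```

A real implementation has more cases to handle: statements such as `SELECT ... FOR UPDATE`, reads inside a transaction, and reads that must see a write made a moment ago (slaves can lag behind the master) should all go to the master.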

4.2 Adding a cache

Adding a cache system in front of the database improves read performance. Caches can be local or distributed. A local cache stores data in the local server's memory or in files and is fast. A distributed cache can hold large amounts of data and is easy to scale; the mainstream distributed cache systems are memcached and redis. memcached has stable performance, keeps its data in memory, is fast, and can reach about 80,000 QPS. Hot data can be cached in memory, and if the requested data is found in the cache, the result is returned directly without going to the database.
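The lookup order described here (cache first, database only on a miss) is often called cache-aside. A minimal Python sketch follows, in which a plain dictionary stands in for memcached or redis and `load_from_db` is a hypothetical function that queries the database.

```python
class CacheAside:
    """Look in the cache first and fall back to the database on a miss."""

    def __init__(self, load_from_db):
        self._cache = {}              # stand-in for memcached or redis
        self._load_from_db = load_from_db
        self.db_reads = 0             # counts how often the database was hit

    def get(self, key):
        if key in self._cache:
            return self._cache[key]   # cache hit: no database access
        self.db_reads += 1
        value = self._load_from_db(key)
        self._cache[key] = value      # populate the cache for the next request
        return value

    def invalidate(self, key):
        """Drop a cached entry, for example after the row was updated."""
        self._cache.pop(key, None)


users = {1: "alice", 2: "bob"}        # pretend database table
cache = CacheAside(users.get)
cache.get(1)
cache.get(1)
print(cache.db_reads)  # 1
```

Real deployments also set an expiry time on cached entries so that stale data does not live forever.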

4.3 Database splitting

Database splitting places related tables into different databases according to business function, for example separate web, bbs, and blog databases. If business volume is large, each of the split databases can also be given a master-slave architecture to further avoid putting too much pressure on a single database.

4.4 Table splitting

As the data volume grows, a table in the database may hold millions of rows, making queries and inserts take too long. How can the pressure on a single table be relieved? Consider splitting the table into several smaller tables to reduce the load on any one table and improve processing efficiency. This is called table splitting.

Table splitting is somewhat troublesome: the SQL statements in the code must be modified and the other tables created manually. Alternatively, the MERGE storage engine can be used to implement table splitting, which is considerably simpler. After splitting, the program operates on a master table that holds no data itself, only the relationships among the sub-tables and the rules for updating data. The master table spreads the load across different sub-tables depending on the query, which improves concurrency and disk I/O performance.

Table splitting comes in two forms, vertical and horizontal:

Vertical split: split a table with many columns into multiple tables, which solves the problem of overly wide tables. You can put infrequently used columns in a separate table, put large columns in a separate table, or put closely related columns together in one table.

Horizontal split: split the original table into multiple tables with identical structure, which solves the problem of a single table holding too much data.
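When a horizontal split is done in application code, the program has to work out which sub-table a row belongs to. The Python sketch below uses the row id modulo the number of tables, which is one common choice and only an assumption here; the table naming scheme and function names are hypothetical.

```python
def shard_table(base_name, row_id, shard_count):
    """Name of the sub-table that holds row_id, e.g. user_2."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return f"{base_name}_{row_id % shard_count}"


def group_by_table(base_name, row_ids, shard_count):
    """Group ids by sub-table so each table is queried only once."""
    groups = {}
    for row_id in row_ids:
        groups.setdefault(shard_table(base_name, row_id, shard_count), []).append(row_id)
    return groups


print(shard_table("user", 10, 4))               # user_2
print(group_by_table("user", [1, 2, 5, 6], 4))  # {'user_1': [1, 5], 'user_2': [2, 6]}
```

Changing the number of sub-tables later means moving rows, so the count is worth choosing with growth in mind.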

4.5 Partitioning

Partitioning divides a table's data into multiple chunks, which can reside on one disk or on different disks. After partitioning, the table still appears to be a single table, but its data is spread across multiple locations, so that multiple disks can serve different requests at the same time, which improves disk I/O read and write performance. It is also relatively simple to implement.
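How a row ends up in a particular chunk depends on the partitioning scheme. As an illustration, the Python sketch below mimics range partitioning, where each partition holds the values below an exclusive upper bound (the idea behind `VALUES LESS THAN` in MySQL). The boundaries and the function name are made up for the example.

```python
import bisect


def range_partition(value, upper_bounds):
    """Index of the first partition whose exclusive upper bound is above value."""
    index = bisect.bisect_right(upper_bounds, value)
    if index == len(upper_bounds):
        raise ValueError("value is beyond the last partition boundary")
    return index


# Hypothetical partitions by year: p0 < 2014, p1 < 2015, p2 < 2016.
bounds = [2014, 2015, 2016]
print(range_partition(2013, bounds))  # 0
print(range_partition(2014, bounds))  # 1
```

A query that filters on the partitioning column only needs to touch the matching partitions, which is where much of the benefit comes from.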

5. Database Maintenance

Database maintenance is the primary work of operations engineers or DBAs, including performance monitoring, performance analysis, performance tuning, database backup and recovery, and more.

5.1 Enabling the slow query log

Enable the MySQL slow query log to find out which SQL statements are slow. Variables set with SET are lost when the service restarts; to make them permanent, add the parameters to my.cnf.

mysql> set global slow_query_log=on;    # enable the slow query log

mysql> set global slow_query_log_file='/var/log/mysql/mysql-slow.log';    # location of the slow query log file

mysql> set global log_queries_not_using_indexes=on;    # log queries that do not use an index

mysql> set global long_query_time=1;    # only log queries that take longer than 1s

To analyze the slow query log, you can use mysqldumpslow, a tool that ships with MySQL. Its analysis is relatively simple.

# mysqldumpslow -t 3 /var/log/mysql/mysql-slow.log    # show the three slowest queries

You can also use Percona's pt-query-digest tool, which offers comprehensive log analysis and can analyze the slow log, the binlog, and the general log.

Analyze the slow query log: pt-query-digest /var/log/mysql/mysql-slow.log

Analyze the binlog: mysqlbinlog mysql-bin.000001 > mysql-bin.000001.sql

pt-query-digest --type=binlog mysql-bin.000001.sql

Analyze the general log: pt-query-digest --type=genlog localhost.log
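For a quick look without either tool, the slow query log can also be read with a few lines of Python. The sketch below assumes the classic text layout in which each entry has a `# Query_time:` header line followed by an optional `SET timestamp=` line and the statement itself; it is a rough illustration, not a replacement for mysqldumpslow or pt-query-digest, and the sample entries are invented.

```python
import re

# Header line that carries the execution time of a slow-log entry.
QUERY_TIME = re.compile(r"^# Query_time: (\d+(?:\.\d+)?)")


def slowest_queries(log_text, top=3):
    """Return up to `top` (seconds, sql) pairs, slowest first."""
    entries = []
    seconds = None   # Query_time of the entry being read, if any
    sql_lines = []   # statement lines collected for that entry

    def flush():
        if seconds is not None and sql_lines:
            entries.append((seconds, " ".join(sql_lines)))

    for line in log_text.splitlines():
        match = QUERY_TIME.match(line)
        if match:
            flush()
            seconds, sql_lines = float(match.group(1)), []
        elif line.startswith("#"):
            # Any other header line ends the previous entry's statement.
            flush()
            seconds, sql_lines = None, []
        elif seconds is not None and line.strip():
            if not line.startswith("SET timestamp="):
                sql_lines.append(line.strip())
    flush()
    return sorted(entries, key=lambda entry: entry[0], reverse=True)[:top]


sample = """# Time: 150602 10:00:00
# User@Host: root[root] @ localhost []
# Query_time: 2.000123  Lock_time: 0.000045 Rows_sent: 1  Rows_examined: 100000
SET timestamp=1433239200;
SELECT * FROM posts WHERE title LIKE '%a%';
# Time: 150602 10:00:05
# User@Host: root[root] @ localhost []
# Query_time: 5.5  Lock_time: 0.0 Rows_sent: 0  Rows_examined: 5
SET timestamp=1433239205;
UPDATE posts SET views = views + 1;
"""
for seconds, sql in slowest_queries(sample):
    print(seconds, sql)
```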

5.2 Database Backup

Backing up the database is the most basic task and also the most important one; without backups the consequences can be very serious. However, because databases are large, often hundreds of gigabytes, backups are very time-consuming, so an efficient backup strategy should be chosen. For databases with large data volumes, incremental backups are generally used. Common backup tools include mysqldump, mysqlhotcopy, and xtrabackup. mysqldump is better suited to small databases: because it makes a logical backup, backup and recovery take longer. mysqlhotcopy and xtrabackup make physical backups, which are fast to take and restore and are hot copies that do not affect the database service. xtrabackup is recommended, and it supports incremental backups. If you are interested, see the earlier blog post: http://lizhenliang.blog.51cto.com/7876557/1612800

5.3 Database Repair

Sometimes the MySQL server loses power suddenly or shuts down abnormally, leaving tables corrupted so that their data cannot be read. In that case you can repair them with two tools that ship with MySQL: myisamchk and mysqlcheck.

myisamchk: can only repair MyISAM tables and requires the database to be stopped.

Common options:

-f, --force: force the repair, overwriting old temporary files; generally not used

-r, --recover: recovery mode

-q, --quick: fast recovery

-a, --analyze: analyze the table

-o, --safe-recover: old recovery mode; try this option if -r cannot repair the table

-F, --fast: only check tables that were not closed properly

Quick repair of the weibo database:

# cd /var/lib/mysql/weibo

# myisamchk -r -q *.MYI

mysqlcheck: works with both MyISAM and InnoDB tables and does not require the database to be stopped. To repair a single table, add the table name after the database name, separated by a space.

Common options:

-A, --all-databases: check all databases

-r, --repair: repair tables

-c, --check: check tables; this is the default

-a, --analyze: analyze tables

-o, --optimize: optimize tables

-q, --quick: fastest way to check or repair tables

-F, --fast: only check tables that were not closed properly

Quick repair of the weibo database:

# mysqlcheck -r -q -uroot -p123 weibo

These are the main optimization approaches I have summarized from three years of working with MySQL. My ability is limited and some points are not comprehensive, but these basics can meet the database needs of small and medium-sized businesses. Because of the design limitations of relational databases, some BAT companies (Baidu, Alibaba, Tencent) that put massive amounts of data into relational databases have been unable to get good performance from large-scale data queries and analysis. This is why NoSQL has taken off: non-relational databases handle large data volumes with high performance and make up for some of the shortcomings of relational databases, and most companies have gradually moved part of their business data into NoSQL stores such as MongoDB and HBase. For data storage, distributed file systems such as HDFS and GFS are used. Massive data analysis uses Hadoop, Spark, Storm, and so on. These are cutting-edge technologies related to operations and the main things to learn in the storage field, so let's keep working at it together. If any readers have a better solution, you are welcome to share it.

This article is from the "Penguin" blog. Please keep this source attribution: http://lizhenliang.blog.51cto.com/7876557/1657465
