Master These MySQL Optimization Techniques: Half the Effort, Double the Results!

Source: Internet
Author: User
Tags: memcached, server, memory

A mature database architecture is not designed with high availability, high scalability, and other such features from day one; rather, the infrastructure is improved gradually as the user base grows. This article discusses the problems a MySQL database runs into over a project's development cycle and the corresponding optimization plans. Leaving the front-end application aside, the process can be roughly divided into the following five phases:

Phase One: Database Table Design

After a project is approved, the development department builds it according to the product department's requirements.
The development engineers design the table structure at the very start of the project. For the database, table structure design is critical: an inappropriate design will directly slow down users' access to the site and hurt the user experience. Many factors contribute to this, such as slow queries (inefficient SQL statements), inappropriate indexing, and database blocking (locks). Of course, the testing department will also test the product and find bugs.
Because development engineers focus on implementing and delivering features as quickly as possible, they initially give little thought to whether the database design is reasonable. Once the project goes live and attracts a certain amount of traffic, the hidden problems are exposed, and by then they are not so easy to fix!

Phase Two: Database Deployment

With the project going live, it is time for the operations engineer to step onto the stage.
A project typically receives few visits at first, and at this stage a single web + database deployment is sufficient to cope with a QPS (queries per second) of around 1,000. To address the single point of failure, high availability should be implemented, using MySQL master-slave replication plus Keepalived for dual-machine hot standby. Mainstream HA software: Keepalived (recommended), Heartbeat.
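As an illustration of the Keepalived side of this setup, a minimal configuration for the master node might look like the sketch below; the interface name, router ID, password, and VIP are all placeholders, not values from the source:

```
vrrp_instance VI_1 {
    state MASTER              # use BACKUP on the standby node
    interface eth0            # network interface carrying the VIP (assumed name)
    virtual_router_id 51      # must match on both nodes
    priority 100              # give the standby a lower priority, e.g. 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.1.100         # floating VIP that applications connect to
    }
}
```

Applications connect to the VIP; if the master fails, Keepalived moves the VIP to the standby MySQL server.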

Phase Three: Database Performance Optimization

If MySQL is deployed on an ordinary x86 server, it can in theory handle around 1,500 QPS without any optimization; after optimization this may rise to around 2,000 QPS. Otherwise, once traffic reaches about 1,500 concurrent connections, database processing may slow down even while hardware resources are still relatively plentiful, and it is time to consider performance optimization. So how do you squeeze maximum performance out of the database? Mainly by working on the hardware configuration, the database configuration, and the architecture, specifically divided into the following:

3.1 Hardware Configuration

If budget permits, replace the SAS mechanical hard drives with SSDs and adjust the RAID level to RAID 1+0, which has better read/write performance than RAID 1 or RAID 5; after all, database pressure comes mainly from disk I/O.
The Linux kernel carves buffers (system cache and data cache) out of physical memory to hold hot data, which is synchronized to disk through the file system's deferred-write mechanism, for example when a buffer reaches a certain fill percentage or when the sync command is issued. In other words, the larger the physical memory, the larger the buffers that can be allocated and the more data that can be cached. Of course, a server failure will lose any cached data not yet flushed to disk. It is recommended to provision at least 50% more physical memory than strictly needed.

3.2 Database Configuration Optimization

MySQL's two most widely used storage engines: one is MyISAM, which does not support transactions and offers fast read performance with table-level locks. The other is InnoDB, which supports transaction processing (the ACID properties), is designed for handling large volumes of data, and uses row-level locks.
Table lock: low overhead, coarse lock granularity, higher probability of lock conflicts, relatively low concurrency.
Row lock: higher overhead, fine lock granularity, lower probability of deadlock, relatively high concurrency.
Why do table locks and row locks exist at all? Primarily to ensure data integrity. For example, if one user is operating on a table and another user wants to operate on the same table, the second must wait until the first has finished; this is exactly what table locks and row locks enforce. Otherwise, multiple users could modify the same table simultaneously, which would certainly produce conflicting or corrupted data.
Given all this, the InnoDB storage engine is the best choice, and it is also the default storage engine in MySQL 5.5+. Each storage engine has many associated runtime parameters; the following lists the parameters that can most affect database performance.
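To make the row-lock behavior concrete, here is a hypothetical two-session sketch (the `accounts` table and its columns are made up for illustration) showing how InnoDB serializes access to a single row while leaving other rows free:

```sql
-- Session 1: lock one row inside a transaction
START TRANSACTION;
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;   -- row 1 is now locked

-- Session 2: this statement blocks until session 1 commits or rolls back
UPDATE accounts SET balance = balance - 10 WHERE id = 1;

-- Session 2: a different row is not blocked (row-level granularity)
UPDATE accounts SET balance = balance + 10 WHERE id = 2;

-- Session 1:
COMMIT;   -- session 2's blocked UPDATE can now proceed
```

With MyISAM's table-level locks, session 2 would have been blocked on both statements, which is why InnoDB sustains higher concurrency.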
Default values of common (global) parameters:

max_connections = 151
# Maximum number of simultaneous connections; it is recommended to keep actual usage at about 80% of this upper limit
sort_buffer_size = 2M
# Buffer size used for query sorting; only affects ORDER BY and GROUP BY; recommended to increase to 16M
open_files_limit = 1024
# Open-file limit; if the value shown by "show global status like 'open_files'" reaches or exceeds open_files_limit, programs will fail to connect to the database or will hang

Default values of MyISAM parameters:

key_buffer_size = 16M
# Index cache size; generally set to 30-40% of physical memory
read_buffer_size = 128K
# Read-operation buffer size; recommended to set to 16M or 32M
query_cache_type = ON
# Enable the query cache
query_cache_limit = 1M
# Query cache limit; only result sets under 1M are cached, to keep large results from overwriting the cache pool
query_cache_size = 16M
# Query cache size, used to cache SELECT results; the next identical SELECT returns its result directly from the cache pool; this value can be doubled as appropriate

Default values of InnoDB parameters:

innodb_buffer_pool_size = 128M
# Index and data buffer size; recommended to set to about 70% of physical memory
innodb_buffer_pool_instances = 1
# Number of buffer pool instances; 4 or 8 recommended
innodb_flush_log_at_trx_commit = 1
# Key parameter. 0: write to the log and sync to disk roughly once per second; a database crash loses about one second of transaction data. 1: write to the log and sync to disk after every SQL statement; I/O overhead is high and efficiency low, since each statement waits for the log write. 2: write the log only to the system cache, then sync to disk once per second; very efficient, and transaction data is lost only on a server (OS) failure. If data-safety requirements are not strict, 2 is recommended: performance is high and the effect of the change is obvious.
innodb_file_per_table = OFF
# Whether to use independent tablespaces; the default is ON in 5.7+. With a shared tablespace, the ibdata file keeps growing, which hurts I/O performance to some degree. Enabling independent tablespaces is recommended: each table's indexes and data live in their own tablespace, so a single table can be moved between databases.
innodb_log_buffer_size = 8M
# Log buffer size; since the log is flushed at least once per second, it generally does not need to exceed 16M

3.3 System Kernel Parameter Optimization

Most MySQL deployments run on Linux, so operating-system parameters also affect MySQL performance. Below are appropriate optimizations of Linux kernel parameters:

net.ipv4.tcp_fin_timeout = 30
# TIME_WAIT timeout; the default is 60s
net.ipv4.tcp_tw_reuse = 1
# 1 enables reuse, allowing TIME_WAIT sockets to be reused for new TCP connections; 0 disables it
net.ipv4.tcp_tw_recycle = 1
# 1 enables fast recycling of TIME_WAIT sockets; 0 disables it
net.ipv4.tcp_max_tw_buckets = 4096
# Maximum number of TIME_WAIT sockets the system keeps; beyond this number, the system randomly clears some TIME_WAIT sockets and prints a warning
net.ipv4.tcp_max_syn_backlog = 4096
# Maximum length of the SYN queue; a larger queue holds more pending connections

On Linux, if a process opens more file handles than the system default of 1024, a "too many files open" error is raised, so the open-file-handle limit needs to be adjusted. To make the change permanent across reboots:
# vi /etc/security/limits.conf

Phase Four: Database Architecture Extension

As business volume grows, the performance of a single database server can no longer meet requirements, and it is time to consider adding servers to extend the architecture. The main ideas are to decompose the load of the single database, break through the disk I/O performance ceiling, store hot data in a cache, and reduce the frequency of disk I/O access.

4.1 Adding a Cache

Add a cache system for the database and cache hot data in memory; if the requested data is already in the cache, MySQL is not queried at all, which reduces database load. Cache implementations come in local and distributed forms: a local cache stores data in the local server's memory or files, while a distributed cache can store massive amounts of data and scales well. Mainstream distributed cache systems: memcached and Redis. memcached's performance is stable; data is cached in memory, access is fast, and its theoretical QPS can reach about 80,000. If data needs to be persisted, choose Redis, whose performance is no lower than memcached's.
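The working process described above is the classic cache-aside pattern, which can be sketched as follows; a plain dict stands in for memcached/Redis, and `query_mysql` is a made-up stand-in for a real database call:

```python
# Cache-aside sketch: a dict stands in for memcached/Redis.
cache = {}

def query_mysql(user_id):
    """Stand-in for a real (expensive) database query."""
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    key = f"user:{user_id}"
    if key in cache:               # cache hit: MySQL is not touched
        return cache[key]
    row = query_mysql(user_id)     # cache miss: query the database...
    cache[key] = row               # ...and populate the cache for next time
    return row

first = get_user(42)   # miss -> goes to the database
second = get_user(42)  # hit  -> served from memory
```

Real deployments add an expiry time and invalidate the key on writes, so stale data does not outlive updates.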

4.2 Master-Slave Replication and Read/Write Separation

In production environments, business systems usually read more than they write, so you can deploy a one-master, multiple-slave architecture: the master database handles write operations (with dual-machine hot standby), while several slave databases behind a load balancer handle read operations. Mainstream load balancers: LVS, HAProxy, Nginx.
How is read/write separation achieved? Most organizations implement it at the code level, which is highly efficient. Another way is through a proxy program; fewer enterprises use this approach, since it adds middleware overhead. Mainstream middleware proxies include MyCAT and Atlas.
In this master-slave replication topology, the load is distributed across servers, greatly improving the database's concurrent capacity. If one server can handle 1,500 QPS, then three can handle 4,500 QPS, and it is easy to scale horizontally.
Sometimes, when facing a large number of write operations, a single master's write performance falls short of the business requirement. You can then configure bidirectional replication (dual master), but there is a problem to watch for: if both master servers accept reads and writes, you may run into data inconsistency, because the application operates on both databases at once, and concurrent updates can make the two databases' data conflict or diverge. You can make each table's auto-increment ID field unique via auto_increment_increment and auto_increment_offset, or have the writing side generate unique random IDs.
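The auto_increment_increment / auto_increment_offset scheme can be illustrated with a small simulation; this is plain Python mimicking the ID sequence each MySQL master would generate, not a real connection:

```python
def id_sequence(offset, increment, count):
    """IDs a master generates with auto_increment_offset=offset
    and auto_increment_increment=increment."""
    return [offset + i * increment for i in range(count)]

# Dual-master setup: both masters use increment=2, with offsets 1 and 2.
master1 = id_sequence(offset=1, increment=2, count=5)  # 1, 3, 5, 7, 9
master2 = id_sequence(offset=2, increment=2, count=5)  # 2, 4, 6, 8, 10

# The two masters accept inserts independently yet never generate
# the same ID, so replicated rows cannot collide on the primary key.
assert set(master1).isdisjoint(master2)
```

The same idea generalizes to N masters by setting increment=N with offsets 1..N.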
There is also an alternative: MySQL-MMM (Master-Master Replication Manager for MySQL), a master-master replication management tool developed in Perl. Its biggest advantage is that only one node accepts write operations at any given time, which effectively ensures data consistency. However, it adds maintenance cost and needs careful consideration.

4.3 Splitting Databases

Database splitting separates related tables into different databases by business, for example web, BBS, and blog databases. If business volume is large, each split database can in turn use the master-slave replication architecture, further avoiding excessive pressure on any single database.

4.4 Splitting Tables

As data volume grows, a single table in the database can accumulate millions of rows, making queries and inserts take too long. How do you relieve the pressure on a single table? You should consider splitting it into multiple smaller tables to reduce single-table pressure and improve processing efficiency; this is called table splitting.
Table splitting is somewhat troublesome: you must modify the SQL statements in your code and also create the other tables by hand. Alternatively, you can use the MERGE storage engine to implement it, which is much simpler. After splitting, the program operates on a master table; this master table stores no data itself, only the relationships to the sub-tables and the way data is updated. The master table distributes queries onto the different small tables according to the query conditions, thereby improving concurrency and disk I/O performance.
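As an illustration of the MERGE approach (table and column names are hypothetical; MERGE requires identically structured MyISAM sub-tables):

```sql
-- Two identically structured MyISAM sub-tables
CREATE TABLE t1 (id INT NOT NULL, msg VARCHAR(100)) ENGINE=MyISAM;
CREATE TABLE t2 (id INT NOT NULL, msg VARCHAR(100)) ENGINE=MyISAM;

-- The master table holds no data of its own; it maps queries onto the
-- union of the sub-tables, and INSERT_METHOD says where inserts land.
CREATE TABLE t_all (id INT NOT NULL, msg VARCHAR(100))
  ENGINE=MERGE UNION=(t1, t2) INSERT_METHOD=LAST;
```

The application then reads and writes `t_all` while the data physically lives in `t1` and `t2`.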
Table splitting comes in two forms, vertical splitting and horizontal splitting:
Vertical splitting: split a table with many fields into multiple tables, solving the table-width problem. You can move infrequently used fields into a separate table, move large fields into a separate table, or group closely related fields into one table.
Horizontal splitting: split a table into multiple tables with identical structure, solving the problem of a single table holding too much data.
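A common way to route rows after a horizontal split is to take the primary key modulo the number of sub-tables. A minimal sketch (the `user_N` naming scheme is an assumption, not from the source):

```python
NUM_SHARDS = 4  # number of sub-tables: user_0 .. user_3

def table_for(user_id, base="user"):
    """Pick the sub-table that holds a row, by key modulo shard count."""
    return f"{base}_{user_id % NUM_SHARDS}"

# Rows with different keys spread evenly across the sub-tables:
assert table_for(10) == "user_2"
assert table_for(11) == "user_3"
assert table_for(12) == "user_0"
```

The application (or middleware such as MyCAT/Atlas) applies this routing before issuing the actual SQL, so each small table sees only a fraction of the load.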

4.5 Partitioning

Partitioning spreads one table's data into multiple blocks according to fields in the table structure (by range, list, hash, and so on). These blocks can sit on one disk or on different disks. After partitioning, it still appears to be one table on the surface, but the data is distributed across multiple locations, so multiple disks can serve different requests at the same time, improving disk I/O read/write performance.
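A hypothetical RANGE-partition definition (table and partition names are made up; note that MySQL requires the partitioning column to be part of every unique key on the table):

```sql
CREATE TABLE orders (
    id INT NOT NULL,
    order_date DATE NOT NULL
)
PARTITION BY RANGE (YEAR(order_date)) (
    PARTITION p2016 VALUES LESS THAN (2017),
    PARTITION p2017 VALUES LESS THAN (2018),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```

Queries filtering on `order_date` touch only the relevant partition (partition pruning), which is where the I/O benefit comes from.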
Note: adding caches, splitting databases, splitting tables, and partitioning are implemented mainly by application developers or DBAs.

Phase Five: Database Maintenance

Database maintenance is the work of database engineers or operations engineers; the major tasks include system monitoring, performance analysis, performance tuning, and database backup and recovery.

5.1 Key Performance Status Indicators

Terminology: QPS (Queries per Second) and TPS (Transactions per Second).
Through show status you can view more than 300 status records; several of them help us calculate QPS and TPS, as follows:

Uptime: number of seconds the server has been running
Questions: number of queries already sent to the database
Com_select: number of SELECTs, i.e. actual read operations on the database
Com_insert: number of INSERTs
Com_delete: number of DELETEs
Com_update: number of UPDATEs
Com_commit: number of transaction commits
Com_rollback: number of rollbacks

From these, the calculation methods follow. QPS based on Questions:

mysql> show global status like 'Questions';
mysql> show global status like 'Uptime';
QPS = Questions / Uptime

The TPS is calculated based on Com_commit and Com_rollback:

mysql> show global status like 'Com_commit';
mysql> show global status like 'Com_rollback';
mysql> show global status like 'Uptime';
TPS = (Com_commit + Com_rollback) / Uptime

Another way to calculate:

Calculate QPS based on Com_select, Com_insert, Com_delete, and Com_update:
mysql> show global status where Variable_name in ('com_select','com_insert','com_delete','com_update');
Wait one second and run the statement again to get the interval difference; subtracting each variable's first value from its second value gives the QPS.
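The interval-difference method can be sketched in Python; the two status samples below are hard-coded for illustration, whereas in practice they would come from two SHOW GLOBAL STATUS calls taken one second apart:

```python
# Two samples of the Com_* counters, taken (hypothetically) 1 second apart.
sample1 = {"com_select": 1000, "com_insert": 200, "com_delete": 50, "com_update": 150}
sample2 = {"com_select": 1900, "com_insert": 350, "com_delete": 70, "com_update": 280}

interval = 1  # seconds between the two samples

# QPS: sum of all per-counter differences divided by the interval.
qps = sum(sample2[k] - sample1[k] for k in sample1) / interval

# TPS excludes reads: only insert/delete/update are counted.
tps = sum(sample2[k] - sample1[k]
          for k in ("com_insert", "com_delete", "com_update")) / interval

print(qps, tps)
```

With the sample numbers above, qps works out to 1200 and tps to 300.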

TPS calculation method:

mysql> show global status where Variable_name in ('com_insert','com_delete','com_update');
When calculating TPS, query operations are not counted; just compute the insert, delete, and update values.

Netizens who tested these two calculation methods concluded: when the database consists mostly of MyISAM tables, the Questions-based calculation is more accurate; when it consists mostly of InnoDB tables, the Com_*-based calculation is more accurate.

5.2 Enabling the Slow Query Log

MySQL can enable the slow query log to analyze which SQL statements are slow; it supports being enabled dynamically:

mysql> set global slow_query_log=on;  # enable the slow query log
mysql> set global slow_query_log_file='/var/log/mysql/mysql-slow.log';  # specify the slow query log file location
mysql> set global log_queries_not_using_indexes=on;  # log queries that do not use an index
mysql> set global long_query_time=1;  # log only slow queries taking longer than 1s

To analyze the slow query log, you can use MySQL's bundled mysqldumpslow tool; its analysis output is fairly simple.
mysqldumpslow -t 3 /var/log/mysql/mysql-slow.log  # view the top three slowest queries

You can also use Percona's pt-query-digest tool, whose log analysis is comprehensive; it can analyze the slow log, binlog, and general log.
Analyze the slow query log: pt-query-digest /var/log/mysql/mysql-slow.log
Analyze the binlog:
mysqlbinlog mysql-bin.000001 > mysql-bin.000001.sql
pt-query-digest --type=binlog mysql-bin.000001.sql
Analyze the general log: pt-query-digest --type=genlog localhost.log

5.3 Database Backup

Backing up the database is the most basic and also the most important task; otherwise the consequences are severe, as you know! For a high-frequency backup strategy, choosing a stable and fast tool matters. For databases under 2GB, the official logical backup tool mysqldump is recommended. Above 2GB, Percona's physical backup tool XtraBackup is recommended; otherwise backups crawl along like a snail. Both tools support hot backup of the InnoDB storage engine without affecting business read/write operations.

5.4 Database Repair

Sometimes the MySQL server suddenly loses power or shuts down abnormally, corrupting tables so that their data cannot be read. In that case, you can use the two repair tools that ship with MySQL: myisamchk and mysqlcheck. The former can only repair MyISAM tables and requires the database to be stopped; the latter handles both MyISAM and InnoDB and can repair tables online.
Note: It is a good idea to back up the database before repair.

Common myisamchk parameters:
  -f --force    force repair, overwriting old temporary files; generally not used
  -r --recover  recovery mode
  -q --quick    quick recovery
  -a --analyze  analyze the table
  -o --safe-recover  old recovery mode; if -r cannot repair the table, try this option
  -F --fast     check only tables that were not closed properly
Example: myisamchk -r -q *.MYI
Common mysqlcheck parameters:
  -A --all-databases  check all databases
  -r --repair    repair tables
  -c --check     check tables (the default)
  -a --analyze   analyze tables
  -o --optimize  optimize tables
  -q --quick     fastest check or repair
  -F --fast      check only tables that were not closed properly
5.5 MySQL Server Performance Analysis


Focus on the following metrics (as reported by tools such as top and iostat -x):
id: CPU idle time percentage; if it averages below 60%, the CPU is already fairly busy.
wa: percentage of CPU time spent waiting for disk I/O; a value consistently above 5 indicates heavy disk read/write volume.

kB_read/s, kB_wrtn/s: kilobytes read/written per second; evaluate these mainly against the disk's maximum per-second read/write speed.

r/s, w/s: read and write requests per second, which can be understood as IOPS (I/O operations per second), one of the main metrics for measuring disk performance.
await: average I/O response time in milliseconds; a value consistently above 5 indicates that the disk is responding slowly and is being pushed beyond its own performance.
util: disk utilization percentage; an average below 60% is normal, though the disk is already busy.

Summary

Because of the design limitations of relational databases, they fall short in very large scale data processing. Hence NoSQL (non-relational) databases have taken off; with their distributed, high-performance, and highly reliable characteristics, they make up for certain innate deficiencies of relational databases and are well suited to storing unstructured data. Mainstream NoSQL databases include MongoDB, HBase, and Cassandra.

Database-level optimization alone has limited visible effect; the main thing is to choose the right database for the business scenario!

More articles by the author: http://blog.51cto.com/lizhenliang


