Transferred from: http://lizhenliang.blog.51cto.com/7876557/1657465
A mature database architecture is not designed for high availability and high scalability from day one; the infrastructure is improved step by step as the user base grows. This post walks through the problems a MySQL database runs into over a project's development cycle and the corresponding optimizations. Leaving the front-end application aside, the work falls roughly into the following five stages:
1. Database table design
After a project is approved, the development team builds it according to the product team's requirements, and part of the development engineers' job is table structure design. This matters a great deal for the database: a poorly designed schema directly hurts access speed and user experience, through slow queries, inefficient SQL statements, missing indexes, database jams (deadlocks), and so on. A team with test engineers will of course run stress tests and find such bugs. On a team without test engineers, most developers do not think much about schema design early on; they rush to implement features and go live. Once the project gains real traffic, the hidden problems surface, and fixing them at that point is no longer easy.
2. Database deployment
Now the operations engineer enters the picture. Initial traffic will not be large, so a single-server deployment is enough to handle roughly 1500 QPS (queries per second). For high availability, MySQL master-slave replication plus keepalived can provide dual-machine hot standby; common cluster software includes keepalived and Heartbeat.
Dual-Machine Hot Standby Blog: http://lizhenliang.blog.51cto.com/7876557/1362313
3. Database performance optimization
If MySQL is deployed on an ordinary x86 server, it can typically handle about 2000 QPS without any tuning, and perhaps about 2500 QPS after tuning. Otherwise, once traffic reaches about 1500 concurrent connections, query processing slows down even though hardware resources are still plentiful, so it is time to look at the software side. How do you squeeze maximum performance out of the database? On one hand, run multiple MySQL instances to make full use of the server; on the other, tune the database itself. The default operating system and database configurations are fairly conservative and cap what the database can do; adjusting them appropriately lets the database handle as many connections as possible.
There are three levels of specific optimizations:
3.1 Database Configuration optimization
MySQL commonly offers two storage engines. MyISAM does not support transactions, has fast read performance, and uses table-level locks. InnoDB supports transactions (ACID), uses row-level locks, and is designed to maximize performance when processing high volumes of data.
Table lock: low overhead, coarse lock granularity, high probability of lock conflicts, relatively low concurrency.
Row lock: higher overhead, fine lock granularity, low probability of lock conflicts, relatively high concurrency.
Why do table locks and row locks exist? Mainly to guarantee data integrity. For example, while one user is operating on a table, other users who want to operate on it must wait until the first user finishes; that is exactly what table locks and row locks enforce. Otherwise several users could modify the same table at the same time, which would certainly produce conflicting or corrupt data.
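The role a lock plays in protecting concurrent updates can be illustrated outside MySQL with a toy Python sketch. This is only an analogy, not MySQL's actual lock implementation: two threads act as two "users", and a `threading.Lock` serializes their writes to shared data so no update is lost.

```python
import threading

counter = 0
lock = threading.Lock()  # plays the role of a row/table lock: one writer at a time

def writer(times):
    global counter
    for _ in range(times):
        with lock:        # "lock the row", update it, then release
            counter += 1

# four concurrent "users" updating the same shared value
threads = [threading.Thread(target=writer, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000: with the lock, no update was lost
```

Without the lock, interleaved read-modify-write cycles could silently discard some increments, which is exactly the kind of conflicting data the engine's locks prevent.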
Given the above, InnoDB is the best choice, and it is the default storage engine from MySQL 5.5 onward. Each storage engine has many related parameters; the following lists the ones that most affect database performance.
Defaults for parameters common to both engines:
max_connections = 151    # maximum simultaneous connections; recommended to be about 80% of the upper connection limit
sort_buffer_size = 2M    # buffer size for query sorts; only affects ORDER BY and GROUP BY; can be raised to 16M
query_cache_limit = 1M   # only results under 1M are cached, so oversized result sets do not flush the cache pool
query_cache_size = 16M   # query cache size, used to cache SELECT results; an identical later SELECT is answered straight from the cache; can be multiplied as appropriate
open_files_limit = 1024  # open-file limit; if the value shown by show global status like 'open_files' reaches or exceeds open_files_limit, programs will fail to connect to the database or hang
MyISAM parameter defaults:
key_buffer_size = 16M    # index buffer size; generally 30-40% of physical memory
read_buffer_size = 128K  # read buffer size; 16M or 32M recommended
InnoDB parameter defaults:
innodb_buffer_pool_size = 128M      # index and data buffer size; generally 60-70% of physical memory
innodb_buffer_pool_instances = 1    # number of buffer pool instances; 4 or 8 recommended
innodb_flush_log_at_trx_commit = 1  # key parameter. 0: write the log and sync to disk roughly once per second; a crash can lose about 1s of transactions. 1: write the log and sync to disk after every SQL statement; high I/O overhead, since each statement waits for the log flush. 2: write the log only to the system buffer on commit and sync to disk once per second; very efficient, but a server failure loses that second's transactions. If data-safety requirements are not strict, 2 is recommended; the performance improvement is obvious.
innodb_file_per_table = OFF         # shared tablespace by default; the shared ibdata file keeps growing, which hurts I/O. Enabling file-per-table keeps each table's data and indexes in their own tablespace and makes it possible to move a single table between databases.
innodb_log_buffer_size = 8M         # log buffer size; since the log is flushed every second, it generally needs no more than 16M
3.2 System Kernel Optimization
Most MySQL deployments run on Linux, so operating-system parameters also affect MySQL performance. Below is an appropriate set of Linux kernel optimizations.
net.ipv4.tcp_fin_timeout = 30       # TIME_WAIT timeout; the default is 60s
net.ipv4.tcp_tw_reuse = 1           # 1 enables reuse, allowing TIME_WAIT sockets to be reused for new TCP connections; 0 disables it
net.ipv4.tcp_tw_recycle = 1         # 1 enables fast recycling of TIME_WAIT sockets; 0 disables it
net.ipv4.tcp_max_tw_buckets = 4096  # maximum number of TIME_WAIT sockets the system keeps; above this, the system randomly clears some and prints a warning
net.ipv4.tcp_max_syn_backlog = 4096 # maximum SYN queue length; a longer queue accommodates more waiting connections
On a Linux system, if a process opens more file handles than the system default of 1024, a "Too many open files" error is raised, so the open-file handle limit should be raised.
# vi /etc/security/limits.conf  # add the lines below; * means all users (a specific user can be named instead); takes effect after a reboot
* soft nofile 65535
* hard nofile 65535
# ulimit -SHn 65535             # takes effect immediately
3.3 Hardware configuration
Add physical memory to improve file system performance. The Linux kernel allocates buffers (system cache and data cache) out of memory to hold hot data, and the file system's deferred-write mechanism syncs them to disk when, for example, a buffer fills to a certain percentage or the sync command is run. In other words, the more physical memory, the larger the buffers and the more data cached. Of course, a server failure can lose some of that cached data.
Replace SAS disks with SSDs and adjust the RAID level to RAID 1+0, which gives better read/write performance (IOPS) than RAID 1 or RAID 5; after all, database pressure comes mainly from disk I/O.
4. Database architecture extension
As business volume keeps growing, a single database server can no longer meet the requirements; it is time to add machines and build a cluster. The main ideas are to spread the load of the single database, break through disk I/O limits, store hot data in a cache, and reduce the frequency of disk I/O access.
4.1 Master-slave replication and read/write separation
Because most database operations in a production environment are reads, deploy a one-master, many-slave architecture: the master database handles writes (with dual-machine hot standby), and multiple slaves sit behind a load balancer to handle reads. Mainstream load balancers include LVS, HAProxy, and Nginx.
How is read/write separation implemented? Most companies implement it at the code level, which is efficient. The alternative is a proxy, which fewer companies use; common proxies include MySQL Proxy and Amoeba. A cluster architecture like this greatly increases the database's capacity for concurrency and solves the single-server performance bottleneck. If one slave can handle 2000 QPS, then five can handle 10,000 QPS, and the database scales out easily.
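Code-level read/write separation usually boils down to routing each statement by type: writes go to the master, reads are spread across the slaves. A minimal sketch of that idea (the server addresses and round-robin policy here are hypothetical, not from the original post):

```python
import itertools

MASTER = "master:3306"                                   # hypothetical address
SLAVES = ["slave1:3306", "slave2:3306", "slave3:3306"]   # hypothetical addresses
_rr = itertools.cycle(SLAVES)  # round-robin load balancing across the slaves

def route(sql: str) -> str:
    """Return the server a statement should be sent to."""
    verb = sql.lstrip().split(None, 1)[0].upper()
    if verb in ("SELECT", "SHOW", "EXPLAIN"):
        return next(_rr)   # read: any slave can answer
    return MASTER          # write (INSERT/UPDATE/DELETE/DDL): master only

print(route("UPDATE users SET name='x' WHERE id=1"))  # master:3306
print(route("SELECT * FROM users WHERE id=1"))        # one of the slaves
```

A real implementation would also pin reads-after-writes to the master to hide replication lag, but the routing decision itself is this simple.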
Sometimes, a single master's write performance falls short when facing a large volume of writes. Going dual-master, however, risks inconsistent data: requests from different users may land on either master, and simultaneous updates can make the two databases conflict or diverge. Within a single instance, MySQL's storage-engine locking (table and row locks) guarantees data integrity; how is this solved across multiple masters? There is a master-master replication management tool developed in Perl called MySQL-MMM (Master-Master replication manager for MySQL). Its greatest advantage is that only one database accepts writes at any moment, which effectively guarantees data consistency.
Master-slave Copy blog: http://lizhenliang.blog.51cto.com/7876557/1290431
Read-Write Separation blog: http://lizhenliang.blog.51cto.com/7876557/1305083
Mysql-mmm Blog: http://lizhenliang.blog.51cto.com/7876557/1354576
4.2 Adding a cache
Add a caching layer in front of the database to hold hot data in memory. If the cache holds the requested data, the result is returned without going back to the database, improving read performance. Caches come in local and distributed flavors: a local cache keeps data in the local server's memory or files, while a distributed cache can hold large volumes of data and scales well. Mainstream distributed caches are memcached and redis. memcached offers stable performance, caches data in memory, is very fast, and reaches a QPS of around 80,000. If data persistence is needed, choose redis; its performance is no lower than memcached's.
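The flow described above is the cache-aside pattern: check the cache first, fall back to the database on a miss, then populate the cache for subsequent reads. A sketch with a plain dict standing in for memcached/redis and a stub function standing in for the real SELECT (both stand-ins are assumptions for illustration):

```python
cache = {}  # stands in for memcached/redis

def query_db(user_id):
    # stands in for a real SELECT against MySQL
    return {"id": user_id, "name": f"user{user_id}"}

def get_user(user_id):
    key = f"user:{user_id}"
    if key in cache:           # cache hit: no database round trip
        return cache[key]
    row = query_db(user_id)    # cache miss: go back to the database
    cache[key] = row           # populate the cache for later reads
    return row

get_user(42)                   # miss: hits the "database" and fills the cache
get_user(42)                   # hit: served straight from memory
```

In production the dict would be a memcached/redis client with an expiry on each key, and writes to the database would invalidate or refresh the corresponding cache entry.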
4.3 Sub-library
A sub-library setup distributes related tables into different databases by business, for example web, bbs, and blog libraries. If business volume is large, the split libraries can themselves be given master-slave architectures to further avoid excessive pressure on any single library.
4.4 Sub-table
As data volume grows, a table in the database can accumulate millions of rows, making queries and inserts take too long. How is the pressure on a single table relieved? By splitting it into multiple smaller tables, which eases single-table pressure and improves processing efficiency; this is called sub-table.
Sub-table is troublesome to implement: the SQL statements in the application code must be modified and the extra tables created by hand. The MERGE storage engine can also be used to implement it, which is considerably simpler. After the split, the program operates on one overall table; this total table stores no data, only the relationships to the sub-tables and the way updates are distributed. The total table spreads the pressure of different queries across the different small tables, improving concurrency and disk I/O performance.
Sub-tables are split in one of two ways, vertically or horizontally:
Vertical split: split a table with many columns into several tables, solving the table-width problem. Rarely used columns, large columns, or closely related columns can each be placed in their own table.
Horizontal split: split a table into several tables with identical structure, solving the problem of a single table holding too much data.
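A horizontal split needs a routing rule so the application knows which sub-table holds a given row; a common choice is the primary key modulo the number of sub-tables. A minimal sketch (the `orders_N` table names and shard count are hypothetical):

```python
N_TABLES = 4  # orders_0 .. orders_3, all with identical structure

def shard_table(order_id: int) -> str:
    """Map a primary key to the sub-table that stores it."""
    return f"orders_{order_id % N_TABLES}"

def build_select(order_id: int) -> str:
    """Rewrite a query to target the right sub-table."""
    return f"SELECT * FROM {shard_table(order_id)} WHERE order_id = {order_id}"

print(shard_table(10))   # orders_2
print(build_select(7))   # SELECT * FROM orders_3 WHERE order_id = 7
```

Because the mapping is deterministic, every read and write for the same key always lands on the same small table.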
4.5 Partitioning
Partitioning spreads one table's data into multiple blocks according to fields in the table structure (by range, list, hash, and so on). The blocks can live on one disk or on different disks. After partitioning it still looks like a single table on the surface, but the data is spread across multiple locations, so multiple disks can serve different requests at the same time. This improves disk I/O performance and is relatively simple to implement.
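Range partitioning can be pictured as the server doing a range lookup on the partitioning column to decide which block a row lands in. A toy Python illustration of that lookup, mirroring `PARTITION BY RANGE` semantics where each partition holds values less than its bound (the bounds and partition names here are hypothetical):

```python
import bisect

# Upper bounds of each partition, as in PARTITION BY RANGE (year):
# p0: year < 2019, p1: year < 2021, p2: year < 2023, p3: everything else
BOUNDS = [2019, 2021, 2023]
NAMES = ["p0", "p1", "p2", "p3"]

def partition_for(year: int) -> str:
    """Find the partition whose range contains the given year."""
    # bisect_right sends a value equal to a bound to the next partition,
    # matching VALUES LESS THAN semantics (the bound itself is excluded)
    return NAMES[bisect.bisect_right(BOUNDS, year)]

print(partition_for(2018))  # p0
print(partition_for(2022))  # p2
print(partition_for(2030))  # p3
```

The same lookup is what lets the server prune partitions at query time: a `WHERE year = 2022` only has to touch the block `p2` lives on.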
Note: adding caches, sub-libraries, sub-tables, and partitions is implemented mainly by the application developers.
5. Database Maintenance
Database maintenance is the primary work of operations engineers or DBAs, including performance monitoring, performance analysis, performance tuning, database backup and recovery, and more.
5.1 Performance Status key indicators
QPS, Queries Per Second: the number of queries the database can process per second
TPS, Transactions Per Second: the number of transactions processed per second
SHOW STATUS returns more than 300 status records, several of which help us calculate QPS and TPS, as follows:
Uptime: how long the server has been running, in seconds
Questions: the number of queries sent to the database
Com_select: the number of SELECTs, i.e. queries actually executed against the database
Com_insert: the number of inserts
Com_delete: the number of deletes
Com_update: the number of updates
Com_commit: the number of commits
Com_rollback: the number of rollbacks
From these, the calculation follows. QPS based on Questions:
mysql> show global status like 'Questions';
mysql> show global status like 'Uptime';
QPS = Questions / Uptime
The TPS is calculated based on Com_commit and Com_rollback:
mysql> show global status like 'Com_commit';
mysql> show global status like 'Com_rollback';
mysql> show global status like 'Uptime';
TPS = (Com_commit + Com_rollback) / Uptime
Another approach: calculate QPS from Com_select, Com_insert, Com_delete, and Com_update:
mysql> show global status where Variable_name in ('com_select', 'com_insert', 'com_delete', 'com_update');
Execute the statement again one second later and take the difference: each variable's second value minus its first, summed, is the QPS.
TPS calculation:
mysql> show global status where Variable_name in ('com_insert', 'com_delete', 'com_update');
For TPS, query operations are not counted; taking the difference of the insert, delete, and update counters is enough.
Netizens who tested both methods report that when the database is mostly MyISAM tables, the Questions-based calculation is more accurate, and when it is mostly InnoDB tables, the Com_*-based calculation is more accurate.
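The counter-difference method above is easy to script: take two snapshots of the Com_* counters, subtract, and divide by the interval. A sketch using hard-coded sample values in place of real SHOW GLOBAL STATUS output (the numbers are invented for illustration):

```python
def qps_from_snapshots(first, second, interval_s):
    """first/second: dicts of Com_* counters sampled interval_s seconds apart."""
    return sum(second[k] - first[k] for k in first) / interval_s

# sample values standing in for two SHOW GLOBAL STATUS samples taken 1s apart
t0 = {"Com_select": 10000, "Com_insert": 2000, "Com_delete": 300, "Com_update": 1500}
t1 = {"Com_select": 11200, "Com_insert": 2150, "Com_delete": 310, "Com_update": 1640}
print(qps_from_snapshots(t0, t1, 1))  # 1500.0

# TPS: the same difference, but only over the write counters
writes = ("Com_insert", "Com_delete", "Com_update")
tps = sum(t1[k] - t0[k] for k in writes) / 1
print(tps)  # 300.0
```

In practice the two snapshots would come from two SHOW GLOBAL STATUS calls (or a monitoring agent) rather than literals.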
5.2 Turn on slow query log
Turn on MySQL's slow query log to analyze which SQL statements are slow. Variables set with SET are lost when the service restarts; add the parameters to my.cnf to make them permanent.
mysql> set global slow_query_log = on;                                    # enable the slow query log
mysql> set global slow_query_log_file = '/var/log/mysql/mysql-slow.log';  # slow query log file location
mysql> set global log_queries_not_using_indexes = on;                     # also log queries that use no index
mysql> set global long_query_time = 1;                                    # log only queries that take longer than 1s
To analyze the slow query log, MySQL's own mysqldumpslow tool can be used; its analysis is relatively simple.
# mysqldumpslow -t 3 /var/log/mysql/mysql-slow.log  # show the three slowest queries
Percona's pt-query-digest tool can also be used; its log analysis is more comprehensive and it can analyze the slow log, the binlog, and the general log.
Analyze the slow query log: pt-query-digest /var/log/mysql/mysql-slow.log
Analyze the binlog: mysqlbinlog mysql-bin.000001 > mysql-bin.000001.sql
pt-query-digest --type=binlog mysql-bin.000001.sql
Analyze the general log: pt-query-digest --type=genlog localhost.log
5.3 Database Backup
Backing up the database is the most basic and most important task; otherwise the consequences are severe, as you know! Because databases are often large, hundreds of GB, backups are time-consuming, so an efficient backup strategy should be chosen; for large data volumes, incremental backups are the usual approach. Common backup tools include mysqldump, mysqlhotcopy, and xtrabackup. mysqldump suits small databases: since it is a logical backup, backup and recovery take longer. mysqlhotcopy and xtrabackup are physical backups with fast backup and recovery; they take hot copies without affecting database service. xtrabackup is recommended, as it supports incremental backups.
Xtrabackup Backup tool using blog post: http://lizhenliang.blog.51cto.com/7876557/1612800
5.4 Database Repair
Sometimes the MySQL server loses power or shuts down abnormally, corrupting tables so their data cannot be read. Two tools shipped with MySQL can repair them: myisamchk and mysqlcheck.
myisamchk: can repair only MyISAM tables, and the database must be stopped
Common parameters:
-f --force: force repair, overwriting old temporary files; generally not used
-r --recover: recovery mode
-q --quick: quick recovery
-a --analyze: analyze the table
-o --safe-recover: old recovery mode; if -r cannot repair, try this option
-F --fast: check only tables that were not closed properly
Quickly repair the weibo database:
# cd /var/lib/mysql/weibo
# myisamchk -r -q *.MYI
mysqlcheck: works on both MyISAM and InnoDB tables and does not require stopping the database. To repair a single table, append the table name after the database name, separated by a space.
Common parameters:
-A --all-databases: check all databases
-r --repair: repair tables
-c --check: check tables (the default option)
-a --analyze: analyze tables
-o --optimize: optimize tables
-q --quick: fastest check or repair
-F --fast: check only tables that were not closed properly
Quickly repair the weibo database:
# mysqlcheck -r -q -uroot -p123 weibo
5.5 Methods for viewing CPU and I/O performance
# view CPU performance
# the -P parameter is the number of CPUs to display: ALL for all, or only the first few
# view I/O performance
# the -m parameter displays values in MB; the default is KB
# %util: when this reaches 100%, the I/O subsystem is saturated
# await: the time a request waits in the queue, which directly affects read time
I/O limit: IOPS (r/s + w/s); RAID 0/10 is generally around 1200. (IOPS: read/write I/O operations per second)
I/O bandwidth: in sequential read/write mode, the theoretical value for SAS disks is around 300MB/s, and for SSDs around 600MB/s.
These are some of the main optimizations I have summed up over three years of working with MySQL. My abilities are limited and some points are not comprehensive, but they are enough to meet the database needs of small and mid-size businesses.
Because of the design limits of relational databases, some large companies (the BATs) that poured massive data into them could not achieve good performance for large-volume data query and analysis. Hence the rise of NoSQL: non-relational databases with large data capacity and high performance, which make up for relational databases' shortcomings in certain areas. Most companies have gradually moved parts of their business data into NoSQL stores such as MongoDB and HBase, backed by distributed file systems such as HDFS and GFS, with massive data analysis handled by Hadoop, Spark, Storm, and the like. These are cutting-edge operations technologies and the main things to learn on the storage side; fellow engineers, let's press on together! If any readers have better solutions, feel free to share.