A mature database architecture is not designed with high availability, high scalability, and so on from day one; rather, the infrastructure is improved gradually as the user base grows. This blog post discusses the problems a MySQL database faces during a project's growth and the corresponding optimization plans, leaving the front-end application aside. It is roughly divided into the following five stages:
1. Database table design
After a project is approved, the development department builds it according to the Product department's requirements, and part of the development engineers' work is designing the table structure. This matters a great deal for a database: a poorly designed schema directly affects access speed and the user experience. There are many contributing factors, such as slow queries, inefficient queries, missing indexes, and database blocking (deadlocks). Of course, a team with test engineers will run stress tests and find bugs. On a team without test engineers, most development engineers will not think about whether the database design is reasonable early on, but will instead deliver features as fast as possible; once the project gets a certain amount of traffic, the hidden problems are exposed, and by then they are not so easy to fix.
2. Database deployment
Now the operations engineer enters the picture. Initial traffic to the project will not be very large, so a single-instance deployment is sufficient to handle a QPS (queries per second) of around 1,500. With high availability in mind, you can use MySQL master-slave replication plus keepalived for dual-machine hot standby; common cluster software includes keepalived and Heartbeat.
Dual-machine hot standby blog post: http://lizhenliang.blog.51cto.com/7876557/1362313
3. Database performance optimization
If MySQL is deployed on an ordinary x86 server without any tuning, it can in theory handle about 2,000 QPS; after tuning, that may rise to about 2,500 QPS. Otherwise, once traffic reaches around 1,500 concurrent connections, database processing slows down even though hardware resources are still plentiful, and it is time to consider the software side. So how do you maximize database performance? On one hand, you can run multiple MySQL instances on one server to maximize utilization; on the other hand, you can optimize the database itself. The default configurations of the operating system and the database are often conservative and limit the database to some extent; these settings can be adjusted appropriately so the database handles as many connections as possible.
The specific optimization happens at the following three levels:
3.1 Database Configuration optimization
MySQL commonly uses two storage engines. One is MyISAM, which does not support transactions, has fast read performance, and uses table-level locks. The other is InnoDB, which supports transactions (ACID), uses row-level locks, and is designed to maximize performance when processing large volumes of data.
Table lock: low overhead, coarse locking granularity, the highest probability of lock contention, and the lowest concurrency.
Row lock: higher overhead, fine locking granularity, a lower probability of lock contention, and the highest concurrency (though deadlocks become possible).
Why do table locks and row locks exist? Mainly to ensure data integrity. For example, while one user is modifying a table, other users who want to modify it must wait until the first user finishes; that is exactly what table locks and row locks enforce. Otherwise, with multiple users writing to a table at the same time, the data would inevitably end up conflicting or corrupted.
Given the above, the InnoDB storage engine is the best choice, and it is the default storage engine from MySQL 5.5 onward. Each storage engine has a number of associated parameters; the main parameters affecting database performance are listed below.
Common parameters and their default values:
max_connections = 151
# The maximum number of simultaneous connections. It is recommended to keep actual connections at around 80% of this upper bound.
sort_buffer_size = 2M
# Buffer size for query sorting, used only by ORDER BY and GROUP BY; this value can be increased to 16M.
query_cache_limit = 1M
# Query cache limit: only results under 1M are cached, so that large result sets do not sweep everything else out of the cache pool.
query_cache_size = 16M
# Query cache size, used to cache SELECT results; the next identical SELECT returns its result directly from the cache pool. This value can be multiplied appropriately.
open_files_limit = 1024
# Open file limit. If the value shown by SHOW GLOBAL STATUS LIKE 'open_files' is equal to or greater than open_files_limit, programs will be unable to connect to the database, or the database will hang.
MyISAM parameters and their default values:
key_buffer_size = 16M
# Index buffer size; generally set to 30-40% of physical memory.
read_buffer_size = 128K
# Read-operation buffer size; 16M or 32M is recommended.
InnoDB parameters and their default values:
innodb_buffer_pool_size = 128M
# Index and data buffer size; generally set to 60-70% of physical memory.
innodb_buffer_pool_instances = 1
# Number of buffer pool instances; 4 or 8 is recommended.
innodb_flush_log_at_trx_commit = 1
# A key parameter. 0 means the log is written and synced to disk roughly once per second, so a database crash loses up to 1 second of transaction data. 1 means the log is written and synced to disk after every SQL statement; the I/O overhead is high and every statement waits on the log write, so it is inefficient. 2 means the log is written only to the system buffer on commit and synced to disk once per second; this is efficient, but a server failure loses the buffered transaction data. If data-safety requirements are not very strict, 2 is recommended: performance is high and the effect of the change is obvious.
innodb_file_per_table = OFF
# The default is a shared tablespace, whose ibdata file keeps growing, which hurts I/O performance to some degree. It is recommended to enable file-per-table mode, in which each table's index and data live in their own tablespace; this also makes it possible to move a single table between databases.
innodb_log_buffer_size = 8M
# Log buffer size. Since the log is flushed at least once per second, it generally does not need to exceed 16M.
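Pulling the tuned values above into one place, a my.cnf fragment might look like the following. This is only a sketch for a dedicated MySQL server with roughly 16 GB of RAM; every size here is an assumption and should be adjusted to your own hardware and workload:

```ini
[mysqld]
max_connections                = 1024
sort_buffer_size               = 16M
query_cache_limit              = 1M
query_cache_size               = 64M
open_files_limit               = 65535

innodb_buffer_pool_size        = 10G   # ~60-70% of physical memory
innodb_buffer_pool_instances   = 8
innodb_flush_log_at_trx_commit = 2     # only if losing ~1s of transactions is acceptable
innodb_file_per_table          = 1
innodb_log_buffer_size         = 8M
```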
3.2 System Kernel Optimization
Most MySQL deployments run on Linux, so some operating system parameters also affect MySQL performance. The following are common optimizations of the Linux kernel:
net.ipv4.tcp_fin_timeout = 30
# TIME_WAIT timeout; the default is 60s.
net.ipv4.tcp_tw_reuse = 1
# 1 enables reuse, allowing TIME_WAIT sockets to be reused for new TCP connections; 0 disables it.
net.ipv4.tcp_tw_recycle = 1
# 1 enables fast recycling of TIME_WAIT sockets; 0 disables it.
net.ipv4.tcp_max_tw_buckets = 4096
# The maximum number of TIME_WAIT sockets the system keeps; beyond this number the system randomly clears some TIME_WAIT sockets and prints a warning.
net.ipv4.tcp_max_syn_backlog = 4096
# The maximum length of the SYN queue; increase it to accommodate more waiting connections.
On a Linux system, if a process opens more file handles than the system default of 1024, a "too many open files" error is raised, so the open-file-handle limit should be adjusted.
# vi /etc/security/limits.conf  # add the following lines; * stands for all users (a specific user can be named instead); takes effect after a reboot
* soft nofile 65535
* hard nofile 65535
# ulimit -SHn 65535  # takes effect immediately
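As a quick sanity check after changing limits.conf, a short Python sketch (the helper name is mine, not from the post) can compare the current process's open-file limit with the 65535 target:

```python
import resource

def check_nofile_limit(recommended=65535):
    """Return (soft, hard, ok): the current open-file limits and
    whether the soft limit already meets the recommended value."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard, soft >= recommended

if __name__ == "__main__":
    soft, hard, ok = check_nofile_limit()
    print(f"soft={soft} hard={hard} ok={ok}")
```

Remember that limits.conf applies per login session, so a daemonized mysqld may also need the limit set in its init script or systemd unit.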
3.3 Hardware configuration
Add physical memory and improve file system performance. The Linux kernel allocates buffers (system cache and data cache) out of memory to hold hot data, and the file system's deferred-write mechanism syncs them to disk when conditions are met, for example when a buffer fills to a certain percentage or when the sync command runs. In other words, the larger the physical memory, the larger the buffers and the more data can be cached. Of course, a server failure can lose a certain amount of cached data.
Replace SAS hard drives with SSDs, and set the RAID level to RAID 1+0, which has better read/write performance (IOPS) than RAID 1 and RAID 5; after all, database pressure comes primarily from disk I/O.
4. Database architecture extension
As business volume keeps growing, a single database server can no longer meet business needs, so it is time to add machines and build a cluster. The main ideas are to spread the load of the single database, break through the disk I/O bottleneck, keep hot data in a cache, and reduce the frequency of disk I/O.
4.1 Master-slave replication and read/write separation
Because most database operations in a production environment are reads, we deploy a one-master, multiple-slave architecture: the master database handles writes (with a hot-standby backup), and multiple slave databases behind a load balancer handle reads. Mainstream load balancers include LVS, HAProxy, and Nginx.
How is read/write separation achieved? Most companies implement it at the code level, which is efficient. Another way is through a proxy program; this is less common in practice, and typical proxies include MySQL Proxy and Amoeba. With such a database cluster architecture, the database's capacity for high concurrency rises greatly and the single-server performance bottleneck is solved. If one slave database can handle 2,000 QPS, then five can handle 10,000 QPS, and the database scales out easily.
Sometimes, for applications with heavy write traffic, a single master's write performance cannot meet business requirements. If you set up dual masters, you run into data inconsistency: different users of the application may write to either of the two databases, and concurrent updates cause the two databases' data to conflict or diverge. Within a single database, MySQL's storage-engine locking (table locks and row locks) guarantees data integrity, but how do you solve this across multiple masters? There is a master-slave replication management tool developed in Perl called MySQL-MMM (Master-Master Replication Manager for MySQL). Its greatest advantage is that it allows writes on only one database at any given time, which effectively guarantees data consistency.
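At the code level, read/write separation often amounts to choosing a connection by statement type. Below is a minimal routing sketch in Python; the host names and the round-robin policy are illustrative assumptions, not part of the original setup:

```python
import itertools

WRITE_HOST = "master.db.local"   # hypothetical master address
READ_HOSTS = ["slave1.db.local", "slave2.db.local", "slave3.db.local"]

_read_cycle = itertools.cycle(READ_HOSTS)  # naive round-robin over the slaves

def choose_host(sql: str) -> str:
    """Send writes to the master and reads to a slave, round-robin."""
    verb = sql.lstrip().split(None, 1)[0].upper()
    if verb in ("INSERT", "UPDATE", "DELETE", "REPLACE"):
        return WRITE_HOST
    return next(_read_cycle)
```

A real implementation would also route SELECT ... FOR UPDATE and any statement inside an open transaction to the master, to avoid reading stale replica data.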
Master-slave replication blog post: http://lizhenliang.blog.51cto.com/7876557/1290431
Read/write separation blog post: http://lizhenliang.blog.51cto.com/7876557/1305083
MySQL-MMM blog post: http://lizhenliang.blog.51cto.com/7876557/1354576
4.2 Adding a cache
Add a caching system in front of the database to keep hot data in memory; if a request's data is in the cache, it is returned directly without touching the database, improving read performance. Caches come in local and distributed forms: a local cache keeps data in the local server's memory or files, while a distributed cache can hold massive amounts of data and scales well. The mainstream distributed caches are memcached and Redis. memcached's performance is stable; data is cached in memory, access is very fast, and QPS can reach about 80,000. If you need data persistence, choose Redis; its performance is no lower than memcached's.
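The read path described above is the classic cache-aside pattern. A minimal Python sketch follows, where a plain dict stands in for memcached/Redis and `load_from_db` is a hypothetical loader, not a real API:

```python
cache = {}  # stand-in for memcached/Redis

def load_from_db(key):
    # hypothetical database read; replace with a real query
    return f"row-for-{key}"

def get(key):
    """Cache-aside read: a hit returns from the cache,
    a miss falls through to the database and populates the cache."""
    if key in cache:
        return cache[key]
    value = load_from_db(key)   # miss: go to the database
    cache[key] = value          # store for the next reader
    return value
```

In production you would also set an expiry time on each entry and invalidate or update the cache on writes, so stale data does not linger.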
4.3 Splitting databases by business
Divide related tables into different databases according to business area, such as web, bbs, and blog databases. If the business volume is large, the split databases can additionally be given a master-slave architecture, further avoiding excessive pressure on any single database.
4.4 Splitting tables
As the amount of data grows, a single table in the database may come to hold millions of rows, making queries and inserts take too long. How can this single-table pressure be relieved? You should consider splitting the table into several smaller tables, reducing the pressure on any one table and improving processing efficiency; this is known as table splitting.
Table splitting is fairly cumbersome: the SQL statements in the program code must be modified and the new tables created by hand. It can also be implemented with the MERGE storage engine, which is considerably simpler. After splitting, the program operates on a parent table that stores no data itself, only the relationships to the sub-tables and the rules for routing updates; the parent table distributes the pressure across the different small tables according to the query, improving concurrency and disk I/O performance.
Table splitting divides into vertical splits and horizontal splits:
Vertical split: split a table with many fields into multiple tables, solving the table-width problem. You can put rarely used fields in a separate table, put large fields in a separate table, or group closely related fields into one table.
Horizontal split: split a table into multiple tables with identical structure, solving the problem of a single table holding too much data.
4.5 Partitioning
Partitioning spreads one table's data into blocks according to fields in the table structure (by range, list, hash, and so on). The blocks can sit on one disk or on different disks. After partitioning, it still looks like a single table on the surface, but the data is hashed across multiple locations, so multiple hard drives can serve different requests at the same time, improving disk I/O read/write performance. Partitioning is relatively simple to implement.
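Horizontal splitting needs a routing rule that maps a row's key to the sub-table holding it. Here is a minimal modulo-hash sketch in Python; the `user_NN` naming scheme and the table count are assumptions for illustration only:

```python
N_TABLES = 16  # number of sub-tables, e.g. user_00 .. user_15

def route_table(user_id: int, base: str = "user") -> str:
    """Map a user_id to its sub-table name via modulo hashing."""
    return f"{base}_{user_id % N_TABLES:02d}"

def build_query(user_id: int) -> str:
    """Build a SELECT against the sub-table that owns this user_id."""
    return f"SELECT * FROM {route_table(user_id)} WHERE user_id = %s"
```

One design note: modulo routing is simple but makes resharding painful (changing N_TABLES moves most rows), which is why larger systems often use consistent hashing or a lookup table instead.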
Note: adding a cache, splitting databases, splitting tables, and partitioning are implemented primarily by the developers.
5. Database Maintenance
Database maintenance is the main work of the operations engineer or DBA, including performance monitoring, performance analysis, performance tuning, database backup and recovery, and so on.
5.1 Key performance status indicators
QPS, Queries Per Second: the number of queries the database can handle per second.
TPS, Transactions Per Second: the number of transactions handled per second.
Use SHOW STATUS to view the running state; it returns over 300 status records, several of which help us calculate QPS and TPS, as follows:
Uptime: the real time the server has been running, in seconds
Questions: the number of queries sent to the database
Com_select: the number of SELECTs actually executed against the database
Com_insert: the number of inserts
Com_delete: the number of deletes
Com_update: the number of updates
Com_commit: the number of commits
Com_rollback: the number of rollbacks
So one calculation method is to derive QPS from Questions:
mysql> show global status like 'Questions';
mysql> show global status like 'Uptime';
QPS = Questions / Uptime
TPS is derived from Com_commit and Com_rollback:
mysql> show global status like 'Com_commit';
mysql> show global status like 'Com_rollback';
mysql> show global status like 'Uptime';
TPS = (Com_commit + Com_rollback) / Uptime
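The two formulas above are simple enough to sketch directly; the counter values below are made up purely for illustration:

```python
def qps(questions: int, uptime: int) -> float:
    """QPS = Questions / Uptime (a cumulative average since server start)."""
    return questions / uptime

def tps(com_commit: int, com_rollback: int, uptime: int) -> float:
    """TPS = (Com_commit + Com_rollback) / Uptime."""
    return (com_commit + com_rollback) / uptime

print(qps(8_640_000, 86_400))       # 100.0 queries/second
print(tps(430_000, 2_000, 86_400))  # 5.0 transactions/second
```

Note that dividing by total Uptime gives a long-run average, which can hide peaks; the interval method described next measures the current rate.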
Another calculation method: derive QPS from Com_select, Com_insert, Com_delete, and Com_update:
mysql> show global status where Variable_name in ('Com_select','Com_insert','Com_delete','Com_update');
Wait one second and execute the statement again to get the interval difference; each variable's second value minus its first value is the per-second rate, which is the QPS.
TPS calculation method:
mysql> show global status where Variable_name in ('Com_insert','Com_delete','Com_update');
For TPS, query operations are not counted; only the insert, delete, and update values are needed.
Testing both methods shows that when the database holds mostly MyISAM tables, the Questions-based calculation is more accurate, and when it holds mostly InnoDB tables, the Com_*-based calculation is more accurate.
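The interval method (sample the Com_* counters, wait a second, sample again, and subtract) can be sketched as below; `sample_fn` is a hypothetical callable standing in for running SHOW GLOBAL STATUS and parsing the counter rows:

```python
import time

def interval_rates(sample_fn, interval=1.0):
    """Sample cumulative counters twice, `interval` seconds apart,
    and return the per-second delta for each counter."""
    first = sample_fn()
    time.sleep(interval)
    second = sample_fn()
    return {k: (second[k] - first[k]) / interval for k in first}
```

For example, `interval_rates` applied to a sampler that returns {'Com_select': ..., 'Com_insert': ...} yields the current per-second SELECT and INSERT rates.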
5.2 Enabling the slow query log
Enable MySQL's slow query log to analyze which SQL statements are slow. Variables changed with SET revert when the service restarts; add the parameters to my.cnf to make them permanent.
mysql> set global slow_query_log=ON;  # enable the slow query feature
mysql> set global slow_query_log_file='/var/log/mysql/mysql-slow.log';  # slow query log file location
mysql> set global log_queries_not_using_indexes=ON;  # log queries that do not use an index
mysql> set global long_query_time=1;  # log only queries that take longer than 1s
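Because SET GLOBAL settings are lost on restart, the equivalent my.cnf entries make them permanent; the log path below is just an example and should match your own layout:

```ini
[mysqld]
slow_query_log                = 1
slow_query_log_file           = /var/log/mysql/mysql-slow.log
log_queries_not_using_indexes = 1
long_query_time               = 1
```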
To analyze the slow query log, you can use MySQL's own mysqldumpslow tool; its analysis is relatively simple.
# mysqldumpslow -t 3 /var/log/mysql/mysql-slow.log  # show the three slowest queries
You can also use Percona's pt-query-digest tool, whose log analysis is more comprehensive; it can analyze the slow log, the binlog, and the general log.
Analyze the slow query log: pt-query-digest /var/log/mysql/mysql-slow.log
Analyze the binlog (decode it to SQL first): mysqlbinlog mysql-bin.000001 > mysql-bin.000001.sql
Analyze the general log: pt-query-digest --type=genlog localhost.log
5.3 Database Backups
Backing up the database is the most basic and also the most important task; otherwise the consequences are severe, as you know! Because databases are large, often hundreds of gigabytes, backups are time-consuming, so choose a cost-effective backup strategy: for databases with large data volumes, incremental backups are generally used. Common backup tools include mysqldump, mysqlhotcopy, and XtraBackup. mysqldump suits small databases better; because it is a logical backup, backup and recovery take longer. mysqlhotcopy and XtraBackup are physical backups with fast backup and recovery, and they copy hot data without affecting database service. XtraBackup is recommended, as it supports incremental backup.
XtraBackup backup utility blog post: http://lizhenliang.blog.51cto.com/7876557/1612800
5.4 Database Repair
Sometimes the MySQL server suddenly loses power or is shut down abnormally, leaving tables corrupted and their data unreadable. You can then use the two tools bundled with MySQL to repair them: myisamchk and mysqlcheck.
myisamchk: repairs MyISAM tables only, and requires the database to be stopped
-f --force         force the repair, overwriting old temporary files; generally not used
-r --recover       recovery mode
-q --quick         quick recovery
-a --analyze       analyze the table
-o --safe-recover  old recovery mode; if -r cannot repair the table, try this parameter
-F --fast          check only tables that were not closed properly
Quickly repair the weibo database:
# myisamchk -r -q *.MYI
mysqlcheck: works on both MyISAM and InnoDB tables and does not require stopping the database; to repair a single table, append the table name after the database name, separated by a space
-A --all-databases  check all databases
-r --repair         repair the table
-c --check          check the table (the default option)
-a --analyze        analyze the table
-o --optimize       optimize the table
-q --quick          the fastest check or repair
-F --fast           check only tables that were not closed properly
Quickly repair the weibo database:
# mysqlcheck -r -q -uroot -p weibo
5.5 Additionally, methods for viewing CPU and I/O performance
View CPU performance with mpstat:
# mpstat -P ALL 1  # -P selects which CPUs to display: ALL for all of them, or just the first N
View I/O performance with iostat:
# iostat -m -x 1  # -m displays values in MB; the default unit is KB
# %util: when it reaches 100%, I/O is saturated
# await: how long requests wait in the queue, which directly affects read latency
I/O limit: IOPS (r/s + w/s); RAID 0/10 generally reaches around 1,200 IOPS. (IOPS: read/write I/O operations per second)
I/O bandwidth: in sequential read/write mode, the theoretical value of a SAS hard disk is about 300 MB/s, and of an SSD about 600 MB/s.
The above are some of the main optimization schemes I have summarized over three years of using MySQL. My abilities are limited, so some points are not comprehensive, but they basically meet the database needs of small and medium-sized enterprises.
Due to limitations in the original design of relational databases, some BAT-scale companies have found that massive-data query and analysis on relational databases no longer achieves good performance. So NoSQL has caught fire: non-relational databases with large data capacity and high performance make up for relational databases' shortcomings in certain areas, and most companies have gradually moved some of their business data into NoSQL stores such as MongoDB and HBase. For data storage they use distributed file systems such as HDFS and GFS, and for computing and analyzing massive data they use Hadoop, Spark, Storm, and the like. These are cutting-edge technologies for operations work and major objects of study in the storage field; fellow engineers, let's keep at it together! If any readers have better optimization schemes, you are welcome to share them.
This article comes from the "Li Zhenliang Technology Blog" blog