I have already talked about page optimization as part of website optimization. Now I want to talk about optimizing the back-end database.
At present, a major bottleneck for web applications is database throughput. But how can we determine that the database really is the system's bottleneck? Optimizing it is only worthwhile once we have established that the database is the bottleneck of the whole system. After all, there are plenty of other requirements waiting for us.
How can we determine that the database is a bottleneck?
1. If the program is well designed and has a database operation logic layer, you can read the time spent on each request from the statistics collected at this layer. If the average time is no longer tolerable, the database is already a bottleneck.
2. Run the top command on the database server to check the resource usage of the MySQL server and the server's load average. If the load average is high and MySQL is using 100% of the CPU, the MySQL server is very busy.
3. Run the iostat command on the database server to check disk I/O. If many operations are blocked waiting on I/O, database operations are too frequent and the disk cannot keep up.
4. We recommend enabling MySQL's slow query log. You can then grep the log for select to count the slow queries. If the count is large, those queries need tuning.
5. If one day inserts into the database start failing, check whether the table has grown too large. On a 32-bit operating system, the maximum size of a single table is 2^32 bytes (4 GB).
In any case, we recommend adding a logic layer for database operations that records the time before and after each operation, reports the statistics, and lets a monitoring program alert the relevant owners. That way we find out early that the database is becoming a bottleneck and can optimize it in advance.
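A minimal sketch of such a logic layer in Python might look like the following. The names `TimedDB`, `slow_threshold`, and `on_slow` are hypothetical, and sqlite3 stands in for MySQL only so that the sketch runs on its own.

```python
import sqlite3
import time


class TimedDB:
    """Hypothetical database logic layer that times every operation."""

    def __init__(self, conn, slow_threshold=0.5, on_slow=None):
        self.conn = conn
        self.slow_threshold = slow_threshold  # seconds; assumed alert limit
        self.on_slow = on_slow                # callback that notifies the owner
        self.timings = []                     # (sql, seconds) per operation

    def execute(self, sql, params=()):
        # Record the time before and after the database operation.
        start = time.perf_counter()
        try:
            return self.conn.execute(sql, params).fetchall()
        finally:
            elapsed = time.perf_counter() - start
            self.timings.append((sql, elapsed))
            if self.on_slow is not None and elapsed > self.slow_threshold:
                self.on_slow(sql, elapsed)

    def average_time(self):
        """Average seconds per operation, the figure to watch over time."""
        if not self.timings:
            return 0.0
        return sum(t for _, t in self.timings) / len(self.timings)


# Usage: an in-memory sqlite3 database as a stand-in for a MySQL connection.
db = TimedDB(sqlite3.connect(":memory:"), slow_threshold=0.5, on_slow=print)
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
rows = db.execute("SELECT name FROM users WHERE id = ?", (1,))
```

In a real deployment the `on_slow` callback would feed whatever monitoring system is in use, and the average would be reported per time window rather than over the whole process lifetime.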
Knowing that the database is a bottleneck, how can we optimize it?
1. The first thing to check is whether the database has grown too large. If a table is too large, its index file is also large, and every update becomes more expensive. In that case, consider splitting the data across multiple databases and tables (sharding). Sharding partitions the records according to some rule. On the one hand, it provides load balancing: requests are spread evenly and each shard carries only its own share. On the other hand, it lets us store and operate on more data. However, retrofitting sharding onto a program written for a single database carries some risk, so the need for sharding should be considered when the program is first designed. It is best to isolate the database operation layer so that it is easy to scale and change.
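As an illustration of "partitioning according to some rule", here is a small Python sketch of modulo-based routing. The shard count and the `forum_db_N` / `users_N` naming scheme are assumptions made up for this example, not something prescribed by MySQL.

```python
SHARD_COUNT = 4  # assumed number of shards, fixed here for illustration


def shard_for(user_id, shard_count=SHARD_COUNT):
    """Return the (database, table) pair that holds a given user id.

    The rule is a simple modulo on the id, so consecutive ids are
    spread evenly across the shards.
    """
    index = user_id % shard_count
    return f"forum_db_{index}", f"users_{index}"


# Usage: the database operation layer asks the router before each query.
database, table = shard_for(10)
query = f"SELECT name FROM {database}.{table} WHERE id = %s"
```

A modulo rule is easy to reason about, but changing the shard count later means moving existing rows, which is one reason the sharding rule is worth settling at design time.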
2. If the table is not large but queries are slow, check the SQL statements and use MySQL's EXPLAIN statement to see whether an index is used. If not, create an index on the relevant columns and rerun EXPLAIN until you find a suitable index. When creating indexes, keep the following in mind:
1) The fewer indexes, the better. Every update also has to update the indexes, so too many indexes reduce write speed.
2) Put narrower columns on the left of the key. This raises the cardinality of each node in the index and improves index read/write performance.
3) Try to avoid filesort, temporary tables, and table scans. On large tables, a full table scan causes a large amount of disk I/O and makes operations very slow.
4) For large tables, try not to create indexes on string columns. String matching is time-consuming and carries a high performance cost. If you must, we recommend hashing the string column and indexing the resulting integer value.
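The hashing advice in point 4) can be sketched as follows. This is only an illustration: the `pages` table, the `url_crc` column, and the helper names are hypothetical, CRC32 is one possible choice of hash, and sqlite3 is used as a stand-in so the example runs without a MySQL server.

```python
import sqlite3
import zlib


def string_hash(value):
    """Unsigned 32-bit CRC32 of the string, small enough for an integer column."""
    return zlib.crc32(value.encode("utf-8"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (id INTEGER PRIMARY KEY, url TEXT, url_crc INTEGER)"
)
# The index is built on the integer hash, not on the string column.
conn.execute("CREATE INDEX idx_pages_url_crc ON pages (url_crc)")


def insert_page(url):
    conn.execute(
        "INSERT INTO pages (url, url_crc) VALUES (?, ?)", (url, string_hash(url))
    )


def find_page(url):
    # Different strings can share a hash, so the string is compared as well;
    # the integer condition narrows the search through the index first.
    return conn.execute(
        "SELECT id, url FROM pages WHERE url_crc = ? AND url = ?",
        (string_hash(url), url),
    ).fetchone()


insert_page("http://example.com/a")
insert_page("http://example.com/b")
```

The application has to keep the hash column in sync whenever the string changes, which is the price paid for the smaller index.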
3. If updates are slow and reads do not need to reflect the latest data immediately, you can use MySQL master/slave replication (hot standby) to spread the read/write load. After all, writes are far less frequent than reads. So we send write operations to the MySQL master and perform read operations on the hot standby servers. Because several hot standby MySQL instances can be set up, reads can be distributed across them to balance the load and improve read performance.
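One way to picture this read/write split is a small router in the database operation layer. The sketch below is an assumption-laden illustration: `ReadWriteRouter` is a hypothetical class, the connections are represented by plain strings, and statements are classified only by whether they start with SELECT.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: writes go to the master, reads rotate over slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the hot standby servers; fall back to the
        # master when no slave is configured.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def connection_for(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._slaves is not None:
            return next(self._slaves)
        return self.master


# Usage: strings stand in for real MySQL connections.
router = ReadWriteRouter("master", ["slave1", "slave2"])
target_for_write = router.connection_for("UPDATE users SET name = 'bob' WHERE id = 1")
target_for_read = router.connection_for("SELECT name FROM users WHERE id = 1")
```

Because replication is not instantaneous, a read that must see a write made a moment ago would still need to go to the master; this sketch does not handle that case.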
4. Caching is fundamental to every back-end program. Because 80% of requests touch 20% of the data, a small amount of memory that caches that 20% is enough to meet the system's needs.
1) Increase the key cache, thread cache, and query cache as much as possible in the MySQL settings.
2) Add a general-purpose cache such as memcached at the application layer.
3) For tables that hold little data but are accessed frequently, use the in-memory HEAP table provided by MySQL to get extremely high read and write speeds.
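The application-layer cache in point 2) is usually applied as "check the cache first, fall back to the database, then store the result". Here is a rough Python sketch of that pattern. `CachedUserStore` and the `user:<id>` key format are hypothetical, and a plain dict stands in for a memcached client so the example is self-contained.

```python
class CachedUserStore:
    """Hypothetical cache layer sitting in front of the database layer."""

    def __init__(self, load_from_db, cache=None):
        self.load_from_db = load_from_db  # function that queries the database
        self.cache = {} if cache is None else cache  # dict as memcached stand-in
        self.db_reads = 0                 # counts how often the database is hit

    def get(self, user_id):
        key = f"user:{user_id}"
        if key in self.cache:
            return self.cache[key]        # cache hit: no database access
        self.db_reads += 1
        value = self.load_from_db(user_id)
        self.cache[key] = value           # populate the cache for next time
        return value

    def invalidate(self, user_id):
        """Drop the cached copy after the row is updated in the database."""
        self.cache.pop(f"user:{user_id}", None)


# Usage: a dict plays the role of the users table.
fake_table = {1: "alice", 2: "bob"}
store = CachedUserStore(fake_table.get)
first = store.get(1)   # goes to the "database"
second = store.get(1)  # served from the cache
```

With a real memcached client an expiry time would normally be set on each entry as well, so that stale data eventually disappears even if an invalidation is missed.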
5. Database Design Optimization
Traditional database design emphasizes normal forms in order to avoid data redundancy, which leads to dirty data. In real applications, however, we should apply the rules of third normal form according to the situation. Consider data that needs to appear in many places, for example users, topics, and replies in a forum. If user data is kept separate from topics and replies, every query for a topic or reply requires a join between the user table and the topic or reply table. With a small data volume the join performs acceptably, but once more than three or four tables are involved and the data exceeds 100,000 rows, joins become slow. The normal forms should still be followed, but we need to design our tables according to the situation and reach a compromise between performance and good design.
Other notes:
1. We recommend layering your database operations. There should be at least two layers: a logic layer for database operations and a cache layer in front of the database. If you plan for this from the very beginning, it is easy to split and deploy databases and to extend to database/table sharding later.
2. Add MySQL monitoring: watch the MySQL slow query log and the number of MySQL requests.
3. Select the MySQL storage engine as needed.
MyISAM has high read and write speed. However, because it locks the whole table, fast reads and writes cannot happen at the same time.
InnoDB supports transactions and provides row-level locking. However, to support transactions its tablespace is larger, and in versions before MySQL 5.6 it does not support full-text indexes.
HEAP keeps the table in memory, which suits small tables that are accessed very frequently, such as user information. Reads and writes are fast, but the data is not persistent. You need to write your own tools to persist it.
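4. Some commands for checking MySQL server status
SHOW SLAVE STATUS: shows the state of master-slave replication.
SHOW [FULL] PROCESSLIST: lists the threads currently running on the server.
SHOW STATUS: displays server status counters, including counts of the various requests handled by the MySQL server.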