When it comes to scalability, customer requirements keep growing, and 20, 50, or even more than 100 items can appear on the feature list. In general, though, they can be condensed into five categories, and the scalability problem can be addressed in five ways:
1. Optimize Your Queries
Query optimization gives you the biggest payoff for the least effort: well-tuned queries let the database meet business needs without being overwhelmed by heavy traffic and load. This is why we so often see customers run into trouble; as website traffic grows, the scalability challenge becomes more and more serious. There is no need to optimize queries on rarely visited pages in some corner of the site, since those pages see little real-world traffic. The common practice is to tune the application based on observed usage, and the results are usually very good.
To optimize queries, enable the slow query log and keep an eye on it. Analyze the log with mk-query-digest, a powerful tool from the Maatkit toolkit. Make sure the log_queries_not_using_indexes option is set. Once you find a query that consumes heavy resources, optimize it: use EXPLAIN and a profiler to observe index usage, create any missing indexes, and understand how rows are joined and sorted.
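As a minimal sketch of the steps above, the following enables the slow query log at runtime and then inspects a suspect query; the table and column names are hypothetical, and the thresholds are illustrative assumptions, not recommendations:

```sql
-- Sketch: turn on slow query logging (names/values are illustrative).
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 1;                  -- log queries slower than 1 second
SET GLOBAL log_queries_not_using_indexes = 1;    -- also log unindexed queries

-- Inspect how a suspect query uses indexes (hypothetical table):
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- If EXPLAIN shows a full table scan, create the missing index:
CREATE INDEX idx_orders_customer ON orders (customer_id);
```

On the command line, `mk-query-digest /path/to/slow.log` then summarizes the logged queries by total time so the worst offenders surface first.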
2. Use Master-Master Replication
Master-master replication in active-passive mode, or circular replication, brings not only high availability but also high scalability, because you can immediately direct part of your application's load to a read-only slave. Many web applications follow the 80/20 rule: 80% of activity is reads (SELECTs), with the rest going to INSERTs and UPDATEs. It is feasible to configure, or re-architect, your application to send read traffic to the slave. This kind of horizontal scalability can be extended further: attach more read-only slaves as needed.
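A minimal configuration sketch for the replication pair described above follows; the server IDs and log file names are assumptions for illustration, and each block belongs in the my.cnf of a different server:

```ini
# On the writable master:
[mysqld]
server-id = 1
log-bin   = mysql-bin        # binary log shipped to the slave

# On the read-only slave that absorbs SELECT traffic:
[mysqld]
server-id = 2
relay-log = mysql-relay-bin
read_only = 1                # reject writes from ordinary users
```

The slave is then pointed at the master with CHANGE MASTER TO and started with START SLAVE; the application's read connections go to the slave, while writes stay on the master.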
3. Tune Your Memory Settings
This sounds basic and obvious, but it is often overlooked. At a minimum, check these settings:
• innodb_buffer_pool_size
• key_buffer_size (MyISAM index cache)
• query_cache_size (be careful on large SMP machines)
• thread_cache_size & table_cache
• innodb_log_file_size & innodb_log_buffer_size
• sort_buffer_size, join_buffer_size, read_buffer_size, read_rnd_buffer_size
• tmp_table_size & max_heap_table_size
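Pulling the list above into one place, a my.cnf fragment might look like the following; every value here is a placeholder assumption and must be sized against your workload and available RAM, not copied verbatim:

```ini
[mysqld]
innodb_buffer_pool_size = 4G      # main InnoDB data and index cache
key_buffer_size         = 256M    # MyISAM index cache
query_cache_size        = 64M     # keep modest on large SMP machines
thread_cache_size       = 16
table_cache             = 1024
innodb_log_file_size    = 256M
innodb_log_buffer_size  = 8M
sort_buffer_size        = 2M      # per-connection buffers: keep these small
join_buffer_size        = 2M
read_buffer_size        = 1M
read_rnd_buffer_size    = 1M
tmp_table_size          = 64M
max_heap_table_size     = 64M
```

Note that the per-connection buffers (sort, join, read) are allocated per session, so large values multiply across every open connection.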
4. RAID Your Disks
What sits underneath your database? Many people don't know; find out. Are you using RAID 5? That is a huge obstacle to performance: INSERTs and UPDATEs on RAID 5 are slow, and if you lose a disk, the array is nearly unusable while it rebuilds. If RAID 5 is too slow, what should replace it? Use RAID 10, which mirrors and stripes to make full use of all the disks in your server or chassis. Even if your memory can hold the entire database, you will still perform many disk operations. Why? Sorting, ORDER BY, GROUP BY, and join operations can spill to disk, and transaction logs must be written; all of these are disk I/O.
5. Adjust Key Parameters
Finally, a few extra parameters can also improve performance:
innodb_flush_log_at_trx_commit = 2
This can greatly improve INSERT and UPDATE speed, at the cost of flushing the InnoDB log buffer to disk more lazily (roughly once per second, so a crash can lose up to about a second of committed transactions). Do your own research, but in most cases it is well worth recommending.
innodb_file_per_table
Like Oracle, InnoDB was designed to store data in tablespaces. By default, everything goes into a single shared tablespace, which can become a performance and maintenance bottleneck. This setting tells InnoDB to create a separate tablespace and data file for each table, much as MyISAM does.
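Both parameters above go under the [mysqld] section of my.cnf; this sketch assumes a server restart is acceptable, and note that innodb_file_per_table only affects tables created after it is enabled:

```ini
[mysqld]
innodb_flush_log_at_trx_commit = 2   # flush the log to disk roughly once per second
innodb_file_per_table                # one .ibd data file per table, like MyISAM
```

Existing tables remain in the shared tablespace until they are rebuilt (for example with ALTER TABLE), so enabling the option on a live system is safe but not retroactive.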