MySQL is a widely used database, but web developers run into scaling and performance problems under large-scale access. This is one reason why NoSQL databases have emerged and flourished in recent years. DBAs who care about MySQL scalability will certainly want to see how a site like Twitter uses and optimizes MySQL.
Jeremy Cole and Davi Arnaut are members of Twitter's DBA and database development teams. They said that Twitter uses MySQL as persistent storage for most of its data, including published Tweets, interest graphs, timelines, and user data. Because of Twitter's data volume and access load, they had to modify and optimize the MySQL source code to suit such workloads. To give back to the community, they decided to open-source their contributions to MySQL under the BSD license. The project is hosted on GitHub. Major modifications include:
Add more status variables, particularly for the InnoDB engine, so that system load and running status can be monitored more effectively.
Optimize memory allocation on non-uniform memory access (NUMA) systems. The InnoDB buffer pool is fully allocated at startup, and if memory is insufficient the server fails fast with an error. This keeps performance stable even when the server is under memory pressure.
Reduce unnecessary work through improved server-side query timeouts. The server can proactively cancel queries that run longer than a timeout specified with millisecond granularity.
Export and restore the InnoDB buffer pool in a lightweight manner. This makes it possible to perform rolling restarts at minimal cost.
Optimize MySQL for SSD-based machines, including page-flushing behavior and fewer write operations, to extend SSD lifespan.
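To make the monitoring item above concrete, here is a minimal Python sketch of how status counters are typically turned into a health metric. It assumes the rows come from a statement such as SHOW GLOBAL STATUS and uses two counters that exist in stock MySQL, Innodb_buffer_pool_read_requests and Innodb_buffer_pool_reads. The article does not name the extra variables Twitter added, so none are shown, and the helper names and sample values below are hypothetical, not measurements.

```python
from typing import Dict, Mapping


def buffer_pool_hit_ratio(status: Mapping[str, str]) -> float:
    """Fraction of logical reads served from the buffer pool without disk I/O."""
    requests = int(status["Innodb_buffer_pool_read_requests"])
    disk_reads = int(status["Innodb_buffer_pool_reads"])
    if requests == 0:
        return 1.0  # no reads yet, so nothing has missed the pool
    return 1.0 - disk_reads / requests


def counter_rates(before: Mapping[str, str],
                  after: Mapping[str, str],
                  seconds: float) -> Dict[str, float]:
    """Per-second rate of change for numeric counters present in both samples."""
    rates: Dict[str, float] = {}
    for name, old in before.items():
        if name not in after:
            continue
        try:
            rates[name] = (int(after[name]) - int(old)) / seconds
        except ValueError:
            continue  # skip non-numeric status values such as "ON"/"OFF"
    return rates


# Hypothetical samples, shaped like (Variable_name, Value) rows from the server.
sample_t0 = {
    "Innodb_buffer_pool_read_requests": "1000",
    "Innodb_buffer_pool_reads": "250",
}
sample_t1 = {
    "Innodb_buffer_pool_read_requests": "3000",
    "Innodb_buffer_pool_reads": "300",
}

if __name__ == "__main__":
    print(buffer_pool_hit_ratio(sample_t0))
    print(counter_rates(sample_t0, sample_t1, seconds=10.0))
```

Sampling the counters twice and dividing by the interval, as counter_rates does, is the usual way to turn cumulative counters into load figures; finer-grained status variables simply give such a tool more to work with.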
In addition, on April 12 Twitter will present Gizzard, its sharding and replication framework for MySQL, in detail. If you have any questions, you can submit an issue to them on GitHub.