MySQL: Sharding Technology for Open-Source Databases
From Shard to Sharding
The word "shard" means "fragment" in English. As a database term, it seems to have first appeared in massively multiplayer online role-playing games (MMORPGs). "Sharding" can be loosely translated as "splitting into shards".
Sharding is not a new technology but a relatively simple software concept. As you may know, table partitioning only arrived in MySQL after version 5 (in MySQL 5.1). Before that, many potential MySQL users worried about MySQL's scalability, and whether partitioning is available is a key indicator (though certainly not the only one) of a database's scalability. Database scalability is an eternal topic. MySQL advocates are often asked: how can application data be split across individual databases to achieve the effect of partitioning? The answer is sharding.
Sharding is not a feature tied to a particular database product but an abstraction layered on top of specific technical details. It is a horizontal scaling (scale-out) solution whose main purpose is to break through the I/O limits of a single-node database server and solve database scalability problems.
Database Scalability
Database scalability is a big topic. Commercial databases currently have their own scalability solutions, which were relatively mature in the past. However, the rapid growth of the Internet has inevitably changed some computing models, and many mainstream commercial systems have begun to show shortcomings as a result. Oracle RAC, for example, uses a shared-storage mechanism, so for I/O-intensive applications the storage easily becomes the bottleneck. Such a mechanism means that later expansion can only be scale up, and the hardware costs, the demands on developers, and the maintenance costs are all relatively high.
Sharding is essentially a scalability solution for open-source databases; one rarely hears of sharding for commercial databases. The current industry trend is to embrace scale out and gradually move away from scale up.
Sharding application scenarios
Every technology does its job best in the right setting, and sharding is no exception. Online games, instant messaging (IM), and blog service providers (BSPs) are good fits for sharding. What they have in common is that the abstracted data objects have very few relationships with one another. Take IM: if each user is abstracted into a data object, that object can be stored independently anywhere, and the data objects share nothing. Similarly, the content of a blog service provider's site is essentially user-generated content (UGC), which can be split by user into different storage sets in a way that is transparent to users.
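To make the per-user, share-nothing layout concrete, here is a minimal Python sketch of routing each user to a shard. The shard count, the use of an MD5-based hash, and the names `shard_for`, `save_user`, and `load_user` are all assumptions made for illustration; each shard is modelled as an in-memory dict, where a real deployment would use a separate database server.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index using a stable hash."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Each shard is a plain dict here; in practice it would be its own database.
shards = [dict() for _ in range(NUM_SHARDS)]


def save_user(user_id: str, profile: dict) -> None:
    """Store a user's data on the single shard that owns that user."""
    shards[shard_for(user_id)][user_id] = profile


def load_user(user_id: str):
    """Read a user's data back; only one shard is ever consulted."""
    return shards[shard_for(user_id)].get(user_id)


save_user("alice", {"status": "online"})
save_user("bob", {"status": "away"})
```

Because every read and write for a user touches exactly one shard, the application never needs to know where the other users live, which is what keeps the split transparent.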
"Share nothing" is a concept borrowed from database clustering. Some kinds of data are not share-nothing at this granularity. Consider history tables such as transaction records: if a record contains both seller and buyer information, then as buyers and sellers go on to trade with other users over time, the records for a given buyer or seller inevitably end up spread across different shard databases. Querying by seller then incurs a much higher overhead.
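The cost of querying by seller can be sketched as follows. This is a toy model under stated assumptions: trade records are placed on the buyer's shard, shards are plain Python lists, and the names `record_trade`, `trades_by_buyer`, and `trades_by_seller` are hypothetical.

```python
NUM_SHARDS = 3  # assumed shard count, for illustration only


def shard_for(user_id: int) -> int:
    """Place a user on a shard by simple modulo (an assumed scheme)."""
    return user_id % NUM_SHARDS


# Each shard holds a list of trade records; records live on the buyer's shard.
shards = [[] for _ in range(NUM_SHARDS)]


def record_trade(buyer_id: int, seller_id: int, amount: float) -> None:
    trade = {"buyer": buyer_id, "seller": seller_id, "amount": amount}
    shards[shard_for(buyer_id)].append(trade)


def trades_by_buyer(buyer_id: int) -> list:
    """A buyer lookup touches exactly one shard."""
    shard = shards[shard_for(buyer_id)]
    return [t for t in shard if t["buyer"] == buyer_id]


def trades_by_seller(seller_id: int):
    """A seller lookup must visit every shard and merge the results."""
    found, visited = [], 0
    for shard in shards:
        visited += 1
        found.extend(t for t in shard if t["seller"] == seller_id)
    return found, visited


# One seller trades with three buyers who live on three different shards.
record_trade(1, 100, 25.0)
record_trade(2, 100, 40.0)
record_trade(3, 100, 15.0)
```

Sharding by seller instead would simply move the problem to buyer lookups, which is why this kind of record is not share-nothing at the user level.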
Sharding is not a silver bullet for database scaling, and some scenarios do not suit it. Transaction-heavy applications, for example, become very complicated: for transactions that span different databases, integrity is hard to guarantee, and the gain is not worth the cost. The form sharding takes is therefore not set in stone.
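The integrity problem with cross-database transactions can be illustrated with a small Python sketch. It assumes two independent in-memory SQLite databases standing in for two shards, plus a hypothetical `account` table and `naive_transfer` helper; it is not how any particular sharded system handles transfers.

```python
import sqlite3


def make_shard(balances: dict) -> sqlite3.Connection:
    """Create an independent in-memory database standing in for one shard."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (user TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", list(balances.items()))
    conn.commit()
    return conn


def balance(conn: sqlite3.Connection, user: str):
    row = conn.execute("SELECT balance FROM account WHERE user = ?", (user,)).fetchone()
    return row[0] if row else None


def naive_transfer(src, dst, sender, receiver, amount):
    """Two separate local transactions; nothing ties them together."""
    src.execute("UPDATE account SET balance = balance - ? WHERE user = ?", (amount, sender))
    src.commit()  # the debit is now permanent on the source shard
    cur = dst.execute("UPDATE account SET balance = balance + ? WHERE user = ?", (amount, receiver))
    if cur.rowcount == 0:
        dst.rollback()  # this only undoes work on the destination shard
        raise LookupError(f"{receiver} not found on destination shard")
    dst.commit()


shard_a = make_shard({"alice": 100})
shard_b = make_shard({"bob": 50})

try:
    # "carol" does not exist on shard_b, so the credit step fails.
    naive_transfer(shard_a, shard_b, "alice", "carol", 30)
except LookupError:
    pass

# The debit survived but the credit never happened: money has vanished.
total = balance(shard_a, "alice") + balance(shard_b, "bob")
```

Protocols such as two-phase commit exist to coordinate this kind of update, but they add complexity and latency, which is the trade-off the paragraph above describes.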
Sharding and database partitioning
Sharding sometimes looks much like horizontal partitioning, and in many places on the Internet "horizontal partitioning" is used to mean sharding, but I personally think the two differ. Sharding does derive from partitioning, but database partitioning is essentially handled at the data-object level, such as partitions of tables and indexes, where each sub-dataset can have its own physical storage attributes, and it operates within a single database. Sharding, by contrast, can span databases or even physical machines. (See the comparison table.)