The basic idea of sharding (splitting databases and tables)
The basic idea of sharding is to split one database into several parts and place them on different databases (servers), relieving the performance pressure on a single database. Loosely speaking: if a database holds massive data because it has many tables, each with lots of data, vertical partitioning is appropriate, i.e. moving closely related tables (for example, those belonging to the same module) out onto a separate server. If there are not many tables but each table holds a great deal of data, it is time for horizontal partitioning: split each table's rows across multiple databases (servers) according to some rule, such as hashing by ID. In reality the two situations are usually mixed, and you have to choose according to the actual circumstances; you may well combine vertical and horizontal partitioning, turning the original database into something like a matrix of databases (servers) that can be expanded almost without limit.
One point needs to be spelled out: when vertical and horizontal partitioning are applied at the same time, the partitioning strategy changes subtly. When only vertical partitioning is considered, the tables that end up together can keep arbitrary associations with each other, so tables can be grouped by "functional module". Once horizontal partitioning is introduced, however, the relationships between tables are severely restricted: usually only one primary table (the table whose ID is hashed) and its several secondary tables may keep an association. In other words, when partitioning vertically and horizontally at the same time, the vertical split can no longer follow "functional modules"; it needs a much finer granularity, and this granularity coincides with (one could say is exactly) the concept of an "aggregate" in domain-driven design: the primary table of each shard is the aggregate root of an aggregate. Splitting this way, you will find the database carved up very finely (many shards, but few tables per shard). To avoid managing too many data sources and to make full use of each database server, you can place two or more shards that are close in business terms and have similar data growth rates (primary tables on the same order of magnitude) into the same data source. Each shard remains independent, has its own primary table, and hashes on its own primary table ID; the only constraint is that their hash moduli (i.e. node counts) must be the same.

Common sharding middleware

Easy-to-use components:
- Dangdang sharding-jdbc
- Mogujie TSharding

Heavyweight middleware:
- sharding
- TDDL Smart Client (Taobao)
- Atlas (Qihoo 360)
- alibaba.cobar (developed by Alibaba)
- MyCAT (developed on top of Ali's open-source Cobar)
- Oceanus (58.com database middleware)
- OneProxy (developed by Alipay chief architect Lou Fangxin)
- Vitess (database middleware developed by Google)

Issues sharding must address

1. Transactional issues
There are two candidate solutions to the transaction problem: distributed transactions, and transactions jointly controlled by the application and the database. A simple comparison of the two:
Option one: use distributed transactions. Advantages: managed by the databases themselves, simple and effective. Disadvantages: high performance cost, especially as the number of shards grows.
Option two: the application and the database control the transaction together. Principle: split one distributed transaction spanning multiple databases into many small transactions, each confined to a single database, and let the application orchestrate those small transactions. Advantages: better performance. Disadvantages: the application must be designed flexibly around transaction control; if you rely on Spring's transaction management, adapting it can be a challenge.
2. Cross-node join problem
Once the data is partitioned, cross-node joins become unavoidable, although good design and partitioning can reduce how often they occur. The common practice is to implement the query in two steps: the first query's result set yields the IDs of the associated data, and a second request fetches the associated records by those IDs.
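A sketch of this two-query pattern in plain JDBC; the table and column names (orders on one node, products on another) are illustrative, not from the original text:

```java
import java.sql.*;
import java.util.*;

// Hypothetical two-step "join": orders and products live on different nodes,
// so we first collect the product IDs, then fetch the products by ID.
public class TwoStepJoin {
    static List<Long> productIdsForUser(Connection orderDb, long userId) throws SQLException {
        List<Long> ids = new ArrayList<>();
        try (PreparedStatement ps = orderDb.prepareStatement(
                "SELECT product_id FROM orders WHERE user_id = ?")) {
            ps.setLong(1, userId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) ids.add(rs.getLong(1));
            }
        }
        return ids;
    }

    static Map<Long, String> productsById(Connection productDb, List<Long> ids) throws SQLException {
        Map<Long, String> result = new HashMap<>();
        if (ids.isEmpty()) return result;
        // Build "?, ?, ?" placeholders for the IN clause.
        String placeholders = String.join(", ", Collections.nCopies(ids.size(), "?"));
        try (PreparedStatement ps = productDb.prepareStatement(
                "SELECT id, name FROM products WHERE id IN (" + placeholders + ")")) {
            for (int i = 0; i < ids.size(); i++) ps.setLong(i + 1, ids.get(i));
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) result.put(rs.getLong(1), rs.getString(2));
            }
        }
        return result;
    }
}
```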
3. Cross-node count, order by, group by and aggregate function problems
These form one class of problem, because they all need to be computed over the full data set, and most proxies do not merge the partial results automatically. Solution: as with cross-node joins, obtain the results on each node and merge them on the application side. Unlike joins, the per-node queries can run in parallel, so this is often faster than a single huge table; but if the result set is large, the application's memory consumption becomes an issue.
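As a concrete illustration, a sketch of the application-side merge for a cross-shard count, assuming one javax.sql.DataSource per shard and an illustrative orders table; the per-shard queries run in parallel as described:

```java
import java.sql.*;
import java.util.*;
import java.util.concurrent.*;
import javax.sql.DataSource;

// Hypothetical merge of a COUNT(*) across shards: each node is queried
// in parallel and the partial counts are summed on the application side.
public class CrossShardCount {
    static long totalOrders(List<DataSource> shards) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(shards.size());
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (DataSource shard : shards) {
                partials.add(pool.submit(() -> {
                    try (Connection c = shard.getConnection();
                         Statement s = c.createStatement();
                         ResultSet rs = s.executeQuery("SELECT COUNT(*) FROM orders")) {
                        rs.next();
                        return rs.getLong(1);
                    }
                }));
            }
            long total = 0;
            for (Future<Long> f : partials) total += f.get();
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```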
4. Data migration, capacity planning and expansion
A scheme from the Taobao integrated business platform team exploits the forward compatibility of taking remainders against multiples of 2 (for example, a number that leaves remainder 1 mod 4 also leaves remainder 1 mod 2) to allocate data. This avoids row-level migration, but table-level migration is still required, and both the expansion scale and the number of tables are limited. Generally speaking, none of these plans is ideal; each has shortcomings of one kind or another, which in itself reflects how hard it is to expand a sharded system.
5. Transactions
Distributed transactions
Reference: [Research on distributed transactions, two-phase commit, one-phase commit, Best Efforts 1PC mode and transaction compensation mechanisms](http://blog.csdn.net/bluishglc/article/details/7612811)
Advantages: based on two-phase commit, it guarantees the atomicity of cross-database operations and is the strictest transaction implementation available in a distributed system. It is also simple to implement with little effort: because most application servers, and some standalone distributed transaction coordinators, encapsulate a great deal of the work, the difficulty and workload of introducing distributed transactions into a project are basically negligible.
Disadvantage: it is the mortal enemy of horizontal scalability. A transaction based on two-phase commit must coordinate multiple nodes at commit time, which pushes back the point at which the transaction completes and objectively prolongs its execution time. That raises the probability of conflicts and deadlocks on shared resources, and the trend worsens as the number of database nodes grows, making 2PC a shackle on scaling out at the database level. This is the main reason many sharding systems do not use distributed transactions.
Best Efforts 1PC transactions
Refer to the implementation in spring-data-neo4j. Given its performance advantage and relatively simple implementation, Best Efforts 1PC is the mode adopted by most sharding frameworks and projects.
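For illustration only, a minimal Best Efforts 1PC sketch in plain JDBC; the caller supplies one open connection per shard, and the interface and class names are assumptions, not any framework's API:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Best Efforts 1PC sketch: do all the work on every connection first, then
// commit them one after another. A failure before the first commit rolls
// everything back; a failure between commits leaves the already-committed
// shards as-is and must be repaired afterwards (e.g. by compensation).
public class BestEfforts1PC {
    interface Work { void run(Connection c) throws SQLException; }

    static void execute(List<Connection> connections, Work work) throws SQLException {
        List<Connection> started = new ArrayList<>();
        try {
            for (Connection c : connections) {
                c.setAutoCommit(false);
                started.add(c);
                work.run(c);
            }
            // The "best efforts" phase: commit each shard in turn.
            for (Connection c : started) c.commit();
        } catch (SQLException e) {
            for (Connection c : started) {
                try { c.rollback(); } catch (SQLException ignored) { }
            }
            throw e;
        }
    }
}
```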
Transaction compensation (idempotent operations)
For systems with high performance requirements but looser consistency requirements, real-time consistency is often not critical; it is enough to reach eventual consistency within an agreed time window, and that makes transaction compensation a feasible scheme. The mechanism was first proposed for "long-lived transactions", but it is also a useful reference for keeping a distributed system consistent. Unlike rolling a transaction back immediately when execution fails, compensation is an after-the-fact remedial measure: it only expects the results to become consistent within a permitted period. Its implementation is tightly coupled to the system's business and has no standard recipe; common approaches include data reconciliation, comparison against logs, and periodic synchronization against an authoritative data source.
6. ID problem
Once the database is split across multiple physical nodes, we can no longer rely on the database's own primary key generation. For one thing, an ID generated by one partition is not guaranteed to be globally unique; for another, the application needs to know the ID before inserting the data, in order to route the SQL.
Some common primary key generation strategies
UUID
Using a UUID as the primary key is the simplest scheme, but the drawbacks are obvious. A UUID is very long, so besides consuming a great deal of storage, the main problem lies with the index: both building the index and querying through it perform poorly.
Maintain a sequence table in the database
The idea of this scheme is also very simple: create a sequence table in the database, with a structure similar to:
```sql
CREATE TABLE `sequence` (
  `table_name` varchar(64) NOT NULL,  -- column length chosen for illustration; the original omitted it
  `next_id` bigint NOT NULL,
  PRIMARY KEY (`table_name`)
) ENGINE=InnoDB;
```
Whenever you need an ID for a new record in some table, take that table's next_id from the sequence table and add 1 to it in the database, ready for the next use. This scheme is simple too, but its downsides are just as obvious: every insert has to touch this table, which easily becomes a system-wide performance bottleneck, and it is a single point of failure; if the sequence table's database goes down, the whole application stops working. Some propose master-slave replication, but that only addresses the single point, not the access pressure of a read/write ratio near 1:1.
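A sketch of that fetch-and-increment step in plain JDBC, using SELECT ... FOR UPDATE so that concurrent callers do not receive the same ID; the names follow the table definition above:

```java
import java.sql.*;

// Hypothetical ID allocation against the sequence table above: lock the row,
// read next_id, advance it, and hand the old value to the caller.
public class SequenceIdGenerator {
    static long nextId(Connection conn, String tableName) throws SQLException {
        conn.setAutoCommit(false);
        try {
            long id;
            try (PreparedStatement select = conn.prepareStatement(
                    "SELECT next_id FROM sequence WHERE table_name = ? FOR UPDATE")) {
                select.setString(1, tableName);
                try (ResultSet rs = select.executeQuery()) {
                    if (!rs.next()) throw new SQLException("no sequence row for " + tableName);
                    id = rs.getLong(1);
                }
            }
            try (PreparedStatement update = conn.prepareStatement(
                    "UPDATE sequence SET next_id = next_id + 1 WHERE table_name = ?")) {
                update.setString(1, tableName);
                update.executeUpdate();
            }
            conn.commit();
            return id;
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }
}
```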
Twitter's distributed auto-increment ID algorithm: Snowflake
Distributed systems often need to generate globally unique IDs, and Twitter's Snowflake meets this need with a very simple implementation. Stripped of configuration, the core is: a 41-bit millisecond timestamp, a 10-bit machine ID, and a 12-bit sequence within the millisecond.
0 - 0000000000 0000000000 0000000000 0000000000 0 - 00000 - 00000 - 000000000000
In the layout above, the first bit is unused (it is effectively the sign bit of the long), the next 41 bits are the millisecond timestamp, then come 5 bits of datacenter ID and 5 bits of machine ID (which in practice may also be used to identify a process or thread), and finally 12 bits of sequence within the current millisecond: exactly 64 bits in total, one long value.
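A minimal Snowflake-style generator in Java following this layout; the epoch constant is Twitter's published one but is otherwise arbitrary, and the clock-rollback handling is deliberately simplified:

```java
// 41-bit timestamp | 5-bit datacenter | 5-bit worker | 12-bit sequence.
public class Snowflake {
    private static final long EPOCH = 1288834974657L; // illustrative custom epoch
    private final long datacenterId; // 0..31
    private final long workerId;     // 0..31
    private long sequence = 0L;
    private long lastTimestamp = -1L;

    public Snowflake(long datacenterId, long workerId) {
        this.datacenterId = datacenterId;
        this.workerId = workerId;
    }

    public synchronized long nextId() {
        long ts = System.currentTimeMillis();
        if (ts < lastTimestamp) throw new IllegalStateException("clock moved backwards");
        if (ts == lastTimestamp) {
            sequence = (sequence + 1) & 0xFFF; // 12-bit sequence within one millisecond
            if (sequence == 0) {               // sequence exhausted: wait for the next millisecond
                while ((ts = System.currentTimeMillis()) <= lastTimestamp) { }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = ts;
        return ((ts - EPOCH) << 22) | (datacenterId << 17) | (workerId << 12) | sequence;
    }
}
```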
The advantages: the IDs are time-ordered overall; the whole distributed system produces no ID collisions (the datacenter and machine IDs keep generators apart); and it is efficient: in tests, Snowflake can produce around 260,000 IDs per second, which fully satisfies most needs.
7. Cross-shard sorting and pagination
Generally, pagination needs to sort by a specified field. When the sort field is the sharding field, the sharding rule makes it easy to locate the target shard; when the sort field is not the sharding field, the situation becomes more complicated. For the final result to be accurate, each shard node must sort and return its own data, after which the result sets returned by the different shards are summarized, re-sorted, and finally returned to the user.
In the simplest scenario, fetching the first page of data, this does not seem to hurt performance much. But fetching, say, the 10th page is considerably more complex.
Some readers may wonder why this cannot work the way the first page does (sort out each shard's first 10 rows, then merge and sort). It is not hard to see why: the data on each shard node may be distributed arbitrarily, so for the sort to be accurate, the first n pages of every shard must be returned, merged, and re-sorted globally. Clearly such an operation is resource-intensive, and the further the user pages, the worse the system performs.
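A sketch of the merge just described, assuming each shard has already returned its own first pageNo x pageSize rows in sorted order; it makes the cost visible, since every shard contributes that many rows no matter which page is wanted:

```java
import java.util.*;

// Hypothetical global pagination: every shard must supply its first
// pageNo * pageSize rows; only after merging can we safely slice out the page.
public class ShardedPagination {
    static <T extends Comparable<T>> List<T> page(List<List<T>> sortedPerShard,
                                                  int pageNo, int pageSize) {
        // Merge the per-shard result sets (each already sorted ascending)
        // via a priority queue, yielding a globally sorted stream.
        PriorityQueue<T> heap = new PriorityQueue<>();
        for (List<T> shard : sortedPerShard) heap.addAll(shard);
        // Skip the rows belonging to earlier pages, then collect one page.
        int skip = (pageNo - 1) * pageSize;
        List<T> result = new ArrayList<>(pageSize);
        while (!heap.isEmpty() && result.size() < pageSize) {
            T row = heap.poll();
            if (skip > 0) { skip--; continue; }
            result.add(row);
        }
        return result;
    }
}
```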
So how is pagination solved in a sharded setting? There are a few approaches:
If pagination is offered to end users in a front-end application, restrict users to the first n pages. This limit is usually reasonable in business terms: paging much deeper is rarely meaningful (and if a user insists, ask them to narrow the query).
If a background batch job is fetching data in batches, increase the page size, say to 5,000 records per fetch, which effectively reduces the number of pages (offline access should generally go to a replica anyway, to avoid impacting the primary).
Sharded designs usually also include a big-data platform that aggregates the records of all shards, and some pagination queries can be served from that platform instead.
8. Sharding strategy
Once the sharding dimension is determined, how are records mapped to the individual libraries?
There are generally two ways. By value range: for example, records with user IDs 1-9999 go to the first library, 10000-20000 to the second, and so on. By modulo: for example, compute user ID mod n; records with remainder 0 go to the first library, remainder 1 to the second, and so on.
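The two rules as illustrative routing functions; RANGE_SIZE and dbCount are assumptions for the sketch, not values prescribed by the text:

```java
// Two illustrative routing functions for a cluster of user databases:
// one by ID range, one by modulo.
public class ShardRouter {
    private static final int RANGE_SIZE = 10000; // users per library, range scheme
    private final int dbCount;                   // modulo factor, mod scheme

    public ShardRouter(int dbCount) { this.dbCount = dbCount; }

    // Range: user IDs 0..9999 -> db 0, 10000..19999 -> db 1, ...
    int routeByRange(long userId) {
        return (int) (userId / RANGE_SIZE);
    }

    // Modulo: remainder 0 -> db 0, remainder 1 -> db 1, ...
    int routeByMod(long userId) {
        return (int) (userId % dbCount);
    }
}
```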
Comparison of advantages and disadvantages:
| Criterion | Range-based sharding | Modulo-based sharding |
|---|---|---|
| Number of libraries | Small at first; grows on demand with users/business | Determined upfront by the modulo factor; generally larger |
| Access performance | Few libraries early on, so whole-cluster queries are cheap; single-library query performance slightly worse | Many libraries early on, so whole-cluster queries consume more resources; single-library query performance slightly better |
| Adjusting the library count | Easier: usually just add a library for new users; splitting an old library affects only that library | Harder: changing the modulo factor causes data migration across all libraries |
| Data hotspots | New and old users differ in activity, so hotspots appear | New and old users are spread evenly across libraries; no hotspots |
In practice, modulo sharding is the more common choice because it keeps things simple. And to ease data migration, later splits usually grow by multiples: for instance, 4 libraries initially, split to 8, then to 16. That way, for any given library, half of its data moves to the new library and the rest stays put; by contrast, adding one library at a time would reshuffle data across all libraries on every expansion.
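A tiny demonstration of why doubling moves only half the data: under mod 8, a record from library (id mod 4) either stays in the same slot or moves to slot +4. The program below is purely illustrative:

```java
// Which records move when the modulo factor doubles from 4 to 8?
public class DoublingSplit {
    public static void main(String[] args) {
        for (long id = 0; id < 8; id++) {
            int oldDb = (int) (id % 4);
            int newDb = (int) (id % 8);
            // newDb equals oldDb (record stays) or oldDb + 4 (record moves)
            System.out.printf("id=%d: db %d -> db %d (%s)%n",
                    id, oldDb, newDb, newDb == oldDb ? "stays" : "moves");
        }
    }
}
```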
One more note: with modulo sharding, records are usually spread fairly evenly across libraries, but some databases contain "super IDs" whose records vastly outnumber all others. In advertising, for example, a single large advertiser's ads may account for a large share of the total. If you shard by advertiser ID, some libraries will hold far more records than others, and such super IDs may need a dedicated library of their own.
9. Number of libraries
The number of libraries depends on how many records a single library can handle. As a rule of thumb, a single MySQL library beyond roughly 50 million records, or a single Oracle library beyond roughly 100 million, puts the database under heavy pressure (actual capacity also depends on the number of columns, access patterns, and record length).
Given that premise, if the number of libraries is too small, storage is not spread out enough and the database's performance pressure is not relieved. If the number is too large, each library holds fewer records and single-library access performs well, but a query spanning many libraries forces the application to touch them all: done concurrently, this consumes precious thread resources; done serially, the execution time balloons.
Finally, the number of libraries also directly drives hardware cost: each library generally runs on its own physical machine, so more libraries mean more machines. The concrete number therefore needs a holistic assessment; a common recommendation is to start with 4-8 libraries.
10. Transparent routing
In a sense, sharding means the DB schema has changed, and that inevitably affects the application; yet the change has nothing to do with the business. So try to keep sharding transparent to application code, and handle the sharding logic in the data access layer (DAL) as far as possible. Doing this completely is very hard; as for which parts the DAL should own and which the application should, here are some suggestions:
For single-library access, e.g. a query whose conditions specify the user ID, the SQL only needs to touch one particular library. The DAL layer should route it to that library automatically; then, when the libraries split again and the modulo factor changes, application code remains unaffected.
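A minimal sketch of what "routing in the DAL" can look like; the class and method names are illustrative, not any particular middleware's API:

```java
import java.util.List;
import javax.sql.DataSource;

// Routing hidden inside the data access layer: the application asks for
// "the datasource for this user" and never sees the modulo factor, so
// doubling the number of libraries only changes the list supplied here.
public class RoutingDataSourceHolder {
    private final List<DataSource> dataSources;

    public RoutingDataSourceHolder(List<DataSource> dataSources) {
        this.dataSources = dataSources;
    }

    public DataSource forUser(long userId) {
        return dataSources.get((int) (userId % dataSources.size()));
    }
}
```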
For simple multi-library queries, the DAL should be responsible for merging the records returned by each library, again staying transparent to the layers above.
11. Use a framework, or build your own?
There is a fair amount of sharding middleware on the market: the proxy-based MySQL Proxy and Amoeba; Hibernate Shards, based on the Hibernate framework; Dangdang's sharding-jdbc, based on JDBC; Mogujie's TSharding, a MyBatis-style Maven plugin; and Cobar Client, which rewrites Spring's ibatis template class. Each has its strengths and weak spots, and architects can choose after in-depth research against the actual situation of the project. On the whole, though, I am cautious about adopting a framework. On one hand, most frameworks lack validation by successful cases, so their maturity and stability are questionable. On the other hand, figuring out which open-source framework behind a successful commercial product (such as Ali's or Taobao's) actually suits your project requires the architect's own deep research and analysis. The final choice must of course weigh the project's characteristics, the team's situation, the technical threshold, learning costs, and other such considerations.
Author: jackcooper
Link: http://www.jianshu.com/p/32b3e91aa22c
Source: Jianshu
Copyright belongs to the author. For commercial reuse, please contact the author for authorization; for non-commercial reuse, please credit the source.