In the previous article, I introduced several forms of splitting databases and tables (sharding), and focused on the problems caused by vertical splitting and their solutions. In this article, we will continue with techniques for horizontal sharding of databases and tables.

The origin of sharding technology
A relational database can easily become a system performance bottleneck: single-node storage capacity, connection count, and processing power are all limited, and the "stateful" nature of a database means it cannot be scaled out as easily as web and application servers. Under the Internet industry's test of massive data and high concurrent access, engineers proposed the technique of splitting databases and tables (in some places also known as sharding or partitioning). Meanwhile, popular distributed middleware (such as MongoDB and Elasticsearch) has friendly built-in support for sharding, based on similar principles and ideas.

Distributed globally unique IDs
In many small and medium projects, we often simply use the database's auto-increment feature to generate primary key IDs, which is quite convenient. In a sharded environment, however, the data is distributed across different shards and the IDs can no longer be generated directly by the database's auto-increment mechanism; otherwise, primary keys of the same table on different shards would collide. Below is a brief introduction to several commonly used ID generation algorithms:
- Twitter's Snowflake (the "snowflake algorithm")
- UUID/GUID (generally supported by both applications and databases)
- MongoDB ObjectId (similar to a UUID)
- Ticket Server (a database-based approach; this is what Flickr uses)
Among these, Twitter's Snowflake algorithm is the one I have used most in distributed system projects in recent years, and I have not encountered duplication or concurrency problems with it. The algorithm generates a 64-bit unique ID (in the standard layout: one unused sign bit, a 41-bit timestamp, a 10-bit machine ID, and a 12-bit sequence counter). I will not elaborate further here; interested readers can consult the relevant references.
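As an illustration, here is a minimal Java sketch of a Snowflake-style generator following that bit layout. The class name, the custom epoch, and the clock handling are my own simplifications, not Twitter's original implementation:

```java
// A minimal Snowflake-style ID generator (illustrative sketch):
// 41-bit timestamp + 10-bit worker ID + 12-bit per-millisecond sequence.
public class SnowflakeIdGenerator {
    private static final long EPOCH = 1480166465631L; // custom epoch (assumption)
    private static final long WORKER_ID_BITS = 10L;
    private static final long SEQUENCE_BITS = 12L;
    private static final long MAX_WORKER_ID = ~(-1L << WORKER_ID_BITS); // 1023
    private static final long SEQUENCE_MASK = ~(-1L << SEQUENCE_BITS);  // 4095

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    public SnowflakeIdGenerator(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId out of range");
        }
        this.workerId = workerId;
    }

    public synchronized long nextId() {
        long ts = System.currentTimeMillis();
        if (ts < lastTimestamp) {
            throw new IllegalStateException("clock moved backwards");
        }
        if (ts == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) { // sequence exhausted in this millisecond: wait
                while ((ts = System.currentTimeMillis()) <= lastTimestamp) { }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = ts;
        return ((ts - EPOCH) << (WORKER_ID_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```

Because the high bits come from the timestamp, IDs generated this way are roughly time-ordered, e.g. `long id = new SnowflakeIdGenerator(1).nextId();`.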
Common sharding rules and strategies
How to choose a shard key
Before sharding, we first need to determine the sharding field (also known as the "shard key"). Many common examples and scenarios split by ID or by a time field, but this is not absolute. My suggestion is to start from the actual business: statistically analyze the SQL statements the system executes, and choose the most frequently used or most important field of the table to be sharded as the shard key.

Common sharding rules
The common sharding strategies are random sharding and continuous sharding, as shown in the following figure:
When range lookups on the shard key are needed, continuous sharding can quickly locate the target shard for efficient queries and, in most cases, effectively avoids cross-shard queries. If the cluster needs to be expanded later, you only need to add nodes; there is no need to migrate data on the other shards. However, continuous sharding can also suffer from data hotspots: as in the figure where data is sharded by a time field, some nodes may face heavier query pressure, and these hot-data nodes become the bottleneck of the whole cluster, while other nodes may hold historical data that is rarely queried.
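As a simple sketch of continuous (range) sharding, assuming orders are routed by the year of their creation timestamp (the table and shard names are hypothetical):

```java
import java.time.LocalDate;

// Illustrative range-based (continuous) routing: orders go to a shard by year,
// so a range scan over one year touches a single shard.
public class RangeShardRouter {
    public String shardFor(LocalDate createdAt) {
        // e.g. orders created in 2016 -> "orders_2016" on a dedicated node
        return "orders_" + createdAt.getYear();
    }
}
```

Expanding for a new year just means adding a node for the next year's shard; no existing data moves, which is the scalability advantage noted above.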
Random sharding is not actually random; it also follows certain rules. Usually we partition by hashing the shard key and taking the modulus, so it is sometimes called discrete sharding. Randomly sharded data is distributed relatively evenly, and hotspot and concurrent-access bottlenecks are less likely to appear. However, expanding the cluster later requires migrating the old data. Using a consistent hashing algorithm can largely avoid this problem, which is why many middleware sharding clusters adopt consistent hashing. Discrete sharding also easily runs into the complexities of cross-shard queries.
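Below is a minimal, illustrative consistent-hash ring in Java (the virtual-node count and the hash choice are my own assumptions), showing why adding a shard only remaps a fraction of the keys rather than reshuffling everything as plain "hash % N" would:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.TreeMap;

// A minimal consistent-hash ring (illustrative sketch). Each physical shard
// is placed on the ring as many virtual nodes so keys spread evenly; adding
// a shard only remaps keys falling between its virtual nodes and their
// predecessors on the ring.
public class ConsistentHashRing {
    private static final int VIRTUAL_NODES = 160; // per shard (assumption)
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public void addShard(String shard) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(shard + "#" + i), shard);
        }
    }

    public String shardFor(String key) {
        // first virtual node clockwise from the key's position (with wrap-around)
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
        return e != null ? e.getValue() : ring.firstEntry().getValue();
    }

    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xFF);
            return h;
        } catch (java.security.NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```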
Data migration, capacity planning, and scalability

Few projects consider sharding at an early stage; it is generally prepared in advance when the business's rapid growth is about to hit performance and storage bottlenecks. Migrating historical data is therefore usually unavoidable. The general practice is to read the historical data through a program and then write it to the individual shard nodes according to the specified sharding rules.
In addition, we need to plan capacity based on the current data volume and QPS, weigh cost factors, and estimate how many shards will be needed (it is generally recommended that a single table on a single shard not exceed 10 million rows). For example, a table expected to grow to 200 million rows would need on the order of 20 shards by this rule of thumb.
If random sharding is used, later expansion needs to be considered and is relatively difficult. If range sharding is used, you only need to add nodes to expand.

Cross-shard technical issues

Sorting and pagination across shards
In general, pagination needs to be sorted by a specified field. When the sort field is the shard key, we can easily locate the target shard through the sharding rules; when the sort field is not the shard key, the situation becomes more complicated. To guarantee the accuracy of the final result, we need to sort the data within each shard node and return it, then merge and re-sort the result sets returned by the different shards, and finally return them to the user, as shown in the following illustration:
The simplest scenario described above (fetching the first page of data) does not seem to hurt performance much. However, if you want to fetch the 10th page of data, the situation becomes much more complex, as shown in the following figure:
Some readers may not understand why this cannot be done as simply as fetching the first page (taking the first 10 rows of each shard, then merging and sorting). It is not hard to see why: because the data on each shard node may be arbitrarily distributed, to sort accurately you must take the first N pages of data from all shards, merge them, and then do the overall sort. Obviously such an operation is resource-consuming, and the further the user pages, the worse the system performance becomes.
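Here is a hedged Java sketch of this merge; the ShardDao interface and its topN method are hypothetical stand-ins for real per-shard queries:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative cross-shard pagination: to serve page "page" (1-based) of size
// "pageSize", each shard must return its own first page * pageSize rows; the
// aggregator then merges, re-sorts, and slices the requested page.
public class CrossShardPager {
    public interface ShardDao<T> {
        // conceptually: SELECT ... ORDER BY sortField LIMIT n
        List<T> topN(Comparator<T> order, int n);
    }

    public <T> List<T> fetchPage(List<ShardDao<T>> shards, Comparator<T> order,
                                 int page, int pageSize) {
        int needed = page * pageSize; // every shard contributes its top N rows
        List<T> merged = new ArrayList<>();
        for (ShardDao<T> shard : shards) {
            merged.addAll(shard.topN(order, needed));
        }
        merged.sort(order); // global re-sort of all candidate rows
        int from = (page - 1) * pageSize;
        if (from >= merged.size()) return List.of();
        return merged.subList(from, Math.min(from + pageSize, merged.size()));
    }
}
```

Note how `needed` grows linearly with the page number, which is exactly why deep pagination across shards degrades.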
Cross-shard function processing

When performing statistics and calculations with functions such as MAX, MIN, SUM, and COUNT, you need to apply the corresponding function to each shard's data first, then post-process the intermediate result sets a second time, and finally return the merged result, as shown in the following illustration:
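To make the two-pass processing concrete, here is a small illustrative sketch of merging per-shard partial aggregates; note that a global average must be derived from the merged SUM and COUNT, rather than by averaging the per-shard averages:

```java
import java.util.List;

// Illustrative two-pass aggregation across shards: COUNT and SUM merge by
// addition, MAX/MIN by comparison; a global AVG is merged SUM / merged COUNT.
public class CrossShardAggregates {
    public record Partial(long count, long sum, long max, long min) {}

    public static Partial merge(List<Partial> partials) {
        long count = 0, sum = 0, max = Long.MIN_VALUE, min = Long.MAX_VALUE;
        for (Partial p : partials) {
            count += p.count();
            sum += p.sum();
            max = Math.max(max, p.max());
            min = Math.min(min, p.min());
        }
        return new Partial(count, sum, max, min);
    }

    public static double globalAvg(List<Partial> partials) {
        Partial m = merge(partials);
        return m.count() == 0 ? 0.0 : (double) m.sum() / m.count();
    }
}
```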
Cross-shard joins
Joins are the most common feature of relational databases, but in a sharded cluster, joins become very complex. Cross-shard join queries should be avoided as much as possible (this scenario is more complex than the cross-shard pagination above and has a greater impact on performance). There are usually several ways to avoid them:

Global tables
The concept of a global table was mentioned in the earlier article on vertical splitting. The basic idea is to place certain data-dictionary-like tables that may be involved in join queries on every shard, thereby avoiding cross-shard joins.

ER sharding
In a relational database, tables often have associations with one another. If we can determine the association relationships up front and place mutually associated table records on the same shard, we can nicely avoid cross-shard joins. For one-to-many relationships, we usually choose to split by the side with more data, as shown in the following illustration:
This way, the orders table on Data Node1 can be joined directly with the order-detail table in a local join query, and the same holds on Data Node2. This approach, based on ER sharding, can effectively avoid cross-shard join problems in most business scenarios.
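A brief illustrative sketch of the routing idea, reusing the hash ring above (the table and method names are hypothetical):

```java
import java.util.List;

// Illustrative ER-sharding routing: the child table follows the parent's
// shard key, so an order and its detail rows always land on the same shard
// and the join stays local.
public class ErShardingRouter {
    private final ConsistentHashRing ring = new ConsistentHashRing();

    public ErShardingRouter(List<String> shards) {
        shards.forEach(ring::addShard);
    }

    public String shardForOrder(long orderId) {
        return ring.shardFor(Long.toString(orderId));
    }

    public String shardForOrderDetail(long orderId, long detailId) {
        // deliberately ignore detailId: details are co-located with their
        // order, so "orders JOIN order_detail" never crosses shards
        return shardForOrder(orderId);
    }
}
```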
In-memory computation

With the rise of in-memory computing with Spark, many cross-data-source computation problems seem solvable in theory: you can pull the data into a Spark cluster, compute in memory, and return the result.

Cross-shard transaction issues
Cross-shard transactions require distributed transactions, and understanding distributed transactions requires understanding the "XA interface" and "two-phase commit". It is worth mentioning that XA support in MySQL 5.5.x and 5.6.x is problematic and can lead to master-slave data inconsistency; this was not fixed until 5.7.x. Java applications can use the Atomikos framework to implement XA transactions (JTA in Java EE).
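As a hedged illustration, here is a minimal sketch of an XA transaction spanning two shards with Atomikos and JTA. The resource names, URLs, credentials, and the XADataSource class name are assumptions of mine; check the Atomikos and MySQL Connector/J documentation for your versions:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.Properties;
import javax.transaction.UserTransaction;
import com.atomikos.icatch.jta.UserTransactionImp;
import com.atomikos.jdbc.AtomikosDataSourceBean;

// Sketch of a cross-shard transfer under a single XA (two-phase commit)
// transaction coordinated by Atomikos.
public class XaTransferExample {
    static AtomikosDataSourceBean shardDataSource(String name, String url) {
        AtomikosDataSourceBean ds = new AtomikosDataSourceBean();
        ds.setUniqueResourceName(name);
        ds.setXaDataSourceClassName("com.mysql.cj.jdbc.MysqlXADataSource");
        Properties p = new Properties();
        p.setProperty("URL", url);       // illustrative credentials
        p.setProperty("user", "app");
        p.setProperty("password", "secret");
        ds.setXaProperties(p);
        return ds;
    }

    public static void main(String[] args) throws Exception {
        AtomikosDataSourceBean shard0 = shardDataSource("shard0", "jdbc:mysql://host0/db");
        AtomikosDataSourceBean shard1 = shardDataSource("shard1", "jdbc:mysql://host1/db");

        UserTransaction utx = new UserTransactionImp();
        utx.begin();
        try (Connection c0 = shard0.getConnection();
             Connection c1 = shard1.getConnection();
             PreparedStatement debit = c0.prepareStatement(
                     "UPDATE account SET balance = balance - ? WHERE id = ?");
             PreparedStatement credit = c1.prepareStatement(
                     "UPDATE account SET balance = balance + ? WHERE id = ?")) {
            debit.setLong(1, 100); debit.setLong(2, 1); debit.executeUpdate();
            credit.setLong(1, 100); credit.setLong(2, 2); credit.executeUpdate();
            utx.commit(); // two-phase commit across both shards
        } catch (Exception e) {
            utx.rollback();
            throw e;
        }
    }
}
```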
Interested readers can also refer to the article "Distributed Transaction Consistency Solutions" at http://www.infoq.com/cn/articles/solution-of-distributed-system-transaction-consistency

Does our system really need sharding?
Having read this far, some readers can't help but wonder: does our system actually need to shard its databases and tables?
In fact, there is no clear-cut standard; it depends heavily on the actual business situation and on experience. In my personal experience, a single MySQL table generally handles 10 million rows without problems (provided the application and the database are reasonably well designed and optimized). Of course, beyond the current data volume and performance, as architects we also need to make a reasonable assessment and plan for the business growth over the next six months to a year in terms of database server QPS, connections, capacity, and so on, and do the appropriate preparatory work in advance. If a single machine cannot keep up and it is hard to optimize in other ways, then sharding should be considered. In that case, you can first move primary key generation out of the database (for example, away from auto-increment IDs) to prepare for sharding and the subsequent data migration.
Many people feel that sharding should be done sooner rather than later, fearing that as the company's business grows faster, the system becomes more complex and refactoring and expansion become harder. This sounds somewhat reasonable, but my view is the opposite: for relational databases, I believe in "don't shard if you can avoid it", unless the system truly needs it, because sharding a database is neither low-cost nor free.
Here I recommend a more reliable transitional technique: table partitioning. Mainstream relational databases basically all support it. Different partitions are logically still one table, but are physically stored separately; this improves query performance to a certain extent and is transparent to the application, requiring no code changes. I was once responsible for optimizing a system whose main business table had about 80 million rows; considering cost, we used table partitioning at the time, the effect was quite noticeable, and the system ran very stably.
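For reference, a hedged example of MySQL range partitioning by year, applied via JDBC. The table and column names are assumptions of mine; note that MySQL requires the partitioning column to be part of every unique key, including the primary key:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Illustrative MySQL range partitioning: one partition per year. Queries
// filtering on created_at can prune partitions transparently.
public class PartitionExample {
    public static void main(String[] args) throws Exception {
        String ddl =
            "ALTER TABLE orders PARTITION BY RANGE (YEAR(created_at)) (" +
            "  PARTITION p2014 VALUES LESS THAN (2015)," +
            "  PARTITION p2015 VALUES LESS THAN (2016)," +
            "  PARTITION p2016 VALUES LESS THAN (2017)," +
            "  PARTITION pmax  VALUES LESS THAN MAXVALUE)";
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://localhost/db", "app", "secret");
             Statement st = conn.createStatement()) {
            st.execute(ddl);
        }
    }
}
```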
Summary

Finally, many readers want to know whether the community currently offers any open-source, free sharding solutions; after all, standing on the shoulders of giants saves a lot of effort. There are currently two main types of solutions:
Application-level DDAL (Distributed Data Access Layer)
Typical examples are Taobao's semi-open-source TDDL and Dangdang's open-source Sharding-JDBC. A distributed data access layer requires no additional hardware investment, and technically strong large companies usually choose to build their own or do secondary development on open-source frameworks. Its intrusion into the application is generally large, which increases technical cost and complexity. It typically supports only specific programming-language platforms (mostly the Java platform), or only specific databases and specific data access frameworks (generally the MySQL database plus JDBC, MyBatis, Hibernate, and similar frameworks).
Database middleware

More typical examples are Mycat (based on Alibaba's open-source Cobar with many optimizations and improvements; a rising star that supports many new features), kingshard implemented in Go, and the older Atlas (open-sourced by 360). This kind of middleware is used heavily in Internet companies. In addition, the official Fabric component of the MySQL 5.x Enterprise Edition reportedly also supports sharding, although few companies use it domestically.
Middleware of this kind can also be called a "transparent gateway"; the famous MySQL Proxy (provided officially by MySQL, implementing only read-write splitting) is probably the pioneer in this field. Such middleware generally implements the network communication protocol of a specific database, simulating a real database service and shielding the real servers at the back end, and the application usually connects directly to the middleware. When SQL operations are executed, the middleware parses and routes the SQL statements according to predefined sharding rules, then post-processes the result sets and returns the final result. Introducing database middleware has a lower technical cost and almost no intrusion into the application, and it can satisfy most business needs. However, it adds extra hardware investment and operational costs, and the middleware itself has performance bottlenecks and single points of failure, so its own high availability and scalability must be guaranteed.
In short, whether you use a distributed data access layer or database middleware, it will bring a certain cost and complexity, along with some performance impact. Readers therefore need to weigh the options carefully according to their actual situation and business development.

About the author
Ding Lang, technical architect. Focused on high-concurrency and high-availability architecture design, with in-depth research and rich hands-on experience in service decomposition, database sharding, and performance tuning. Keen on technical research and sharing.
-- This article originally appeared at: http://blog.csdn.net/dinglang_2009/article/details/53195871