Reference Links:
http://blog.csdn.net/winy_lm/article/details/50708493
I. Sub-table
Scenario: For large Internet applications, the number of rows in a database single table may reach tens or even billion, and the database faces very high concurrent access. MySQL architecture using Master-slave replication mode,
Only the reading of the database can be extended, and the writes to the database are concentrated on master, and the slave of a single master mount cannot be infinitely large, and the number of slave is limited by master capability and load.
Therefore, the throughput capability of the database needs to be extended further to meet the needs of high concurrent access and massive data storage.
The first thing we need to do is to reduce the number of records in a single table to reduce the time required for data queries and increase the throughput of the database, which is called the sub table.
Prior to the table, you first need to select the appropriate table strategy, so that the data can be more evenly divided into more than one table, and does not affect the normal query.
For an Internet enterprise, most of the data is associated with the user, so the user ID is the most commonly used table field. Because most queries need to take a user ID, this will not affect the query, but also can make the data more balanced
distributed to various tables (of course, some scenarios may also appear hot and cold data distribution is uneven), the following figure:
Suppose that there is a table that records the order form of the user's purchase information, which is split into 256 tables because of the number of record bars.
The split records are stored according to the user_id%256, and the foreground application finds the table that corresponds to the order store according to the corresponding user_id%256.
As a result, user_id becomes a required query condition, otherwise the data cannot be accessed because it cannot locate the table in which the data is stored.
Note: After splitting the number of tables is generally 2 of the N-time side, is the above split into 256 tables of origin.
Suppose the order table structure is as follows:
[SQL] view plain copy CREATE Table Order_ (order_id bigint primary key auto_increment, user_id bigint (m), User_nick varchar (a), auction_id bigint, auction_title bigint (a), price bigint, Auction_cat varchar (seller_id bigint), Seller_nick varchar (50))
Then after the table, assuming user_id = 257, and auction_id = 100, you need to query the corresponding order information according to AUCTION_ID, the corresponding SQL statement is as follows:
[SQL] view plain Copy select * from order_1 where user_id=257 and auction_id = 100;
Among them, order_1 is calculated according to 257%256, which represents the first order table after the table is divided.
Two. Sub-Library
Scenario: The table can solve the problem of the decrease of query efficiency caused by the large amount of single table data, but it can't bring the quality improvement to the concurrent processing ability of the database. Face high concurrent read and write access when the database master
When the server cannot host write-operation pressure, no matter how the slave server is extended, it is meaningless at this time.
Therefore, we must change a way of thinking, to split the database, thereby improving the database write ability, this is called the Sub-Library!
Like the sub-table strategy, a library can route data access by using a keyword modulo, as shown in the following illustration:
Or the previous order form, assuming that the value of the user_id field is 258 and the original library is divided into 256 libraries, then the application's access request to the database is routed to the second library (258%256 = 2).
three. Sub-Library table
Scenario: Sometimes the database may face both the pressure of high concurrent access and the storage of massive data, so it is necessary to use both the table and the sub-database strategy to extend the system's
Concurrent processing capabilities, as well as improving the query performance of a single table, this is called the Sub-Library table.
The strategy for the split-table is more complex than the previous one-only or the table-only strategy, and a routing strategy for the sub-table is as follows:
1. Intermediate variable = user_id% (number of tables per library)
2. Library = take integer (intermediate variable/number of tables per library)
3. Table = middle variable% number of tables per library
Also uses user_id as the route field, first uses the user_id to take a model to the number of library tables, obtains an intermediate variable, then uses the intermediate variable divided by the number of each library table, takes the whole, then obtains
The corresponding library, and the intermediate variables modulo the number of each library table, which is the corresponding table.
The detailed process for the sub-table strategy is as follows:
Assuming that the original single library order is split into 256 libraries, each containing 1024 tables, the route is calculated as follows for user_id=262145 access, as mentioned earlier in the routing strategy:
1. Intermediate variable = 262145 (256 * 1024) = 1
2. Library = rounding (1/1024) = 0
3. Table = 1% 1024 = 1
This means that queries and modifications to user_id=262145 order records will be routed to the 1th order_1 table in the No. 0 library for execution ...