MySQL database and table sharding: application scenarios and design methods

Source: Internet
Author: User
Tags database sharding


Many readers have asked me, in the forum and in the comments, when MySQL needs database or table sharding and which design approach works best. In response to these questions, I have put together the typical application scenarios for MySQL database and table sharding, along with a design method for each.

I. Table sharding

Scenario: in large Internet applications, a single database table may hold tens of millions or even hundreds of millions of records, and the database faces extremely high concurrent access. A MySQL architecture based on master-slave replication can only scale the read capacity of the database; write operations are still concentrated on the master. Nor can an unlimited number of slaves be attached to a single master: the number of slaves is limited by the master's capacity and load.

Therefore, database throughput must be scaled further to meet the demands of high-concurrency access and massive data storage.

For a single table with extremely frequent access and a massive data volume, the first step is to reduce the number of records in that table, which shortens query time and improves database throughput. This is known as table sharding.

Before sharding a table, you must first choose an appropriate sharding policy so that data is distributed evenly across the tables without affecting normal queries.

For Internet companies, most data is associated with users, so the user ID is the most common sharding field. Because most queries already include a user ID, sharding on it does not affect those queries, and it spreads the data fairly evenly across the tables (although some scenarios still see an uneven split between hot and cold data). For example:

Assume that there is an order table that records users' purchases. Because it contains too many records, it is split into 256 tables.

Each record is stored in the table selected by user_id % 256, and the front-end application computes the same user_id % 256 to find the table that holds a given user's orders.

In this way, user_id becomes a required query condition. Otherwise, the data cannot be accessed because the data storage table cannot be located.
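The routing rule described above can be sketched in a few lines of Python. This is only an illustration, not the original author's code: the function name `order_table_for` is hypothetical, and the `order_<index>` naming scheme is assumed from the `order_1` table name used in this article's query example.

```python
TABLE_COUNT = 256  # number of order tables after the split


def order_table_for(user_id: int) -> str:
    """Return the name of the order table that holds this user's rows."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    # Assumed naming scheme: order_0 ... order_255.
    return f"order_{user_id % TABLE_COUNT}"


print(order_table_for(257))  # order_1
print(order_table_for(256))  # order_0
```

Without a `user_id` there is nothing to feed into the modulo, which is why a query that omits it cannot be routed to a single table.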

Note: the number of tables after a split is generally a power of 2 (2^N), which is why the order table above is split into 256 tables.
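The article does not say why a power of 2 is preferred. Two properties that are often cited are sketched below as background, not as the author's stated reasoning. For non-negative IDs, a power-of-2 modulo equals a bit mask. And when the table count is doubled, every row either keeps its index or moves exactly one old table count higher. The second property holds for any doubling; starting from a power of 2 simply keeps each later doubling a power of 2. The function names are hypothetical.

```python
def shard_index(user_id: int, count: int) -> int:
    """Plain modulo routing."""
    return user_id % count


def shard_index_masked(user_id: int, count: int) -> int:
    """Bit-mask routing; only valid when count is a power of 2 and user_id >= 0."""
    return user_id & (count - 1)


OLD_COUNT = 256
NEW_COUNT = 512  # table count doubled

for user_id in range(100_000):
    old_index = shard_index(user_id, OLD_COUNT)
    new_index = shard_index(user_id, NEW_COUNT)
    # Masking the low bits gives the same answer as the modulo.
    assert old_index == shard_index_masked(user_id, OLD_COUNT)
    # After doubling, a row stays put or moves exactly OLD_COUNT tables up.
    assert new_index in (old_index, old_index + OLD_COUNT)

print("all checks passed")
```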

Assume that the order table structure is as follows:

create table order_ (
  order_id bigint(20) primary key auto_increment,
  user_id bigint(20),
  user_nick varchar(50),
  auction_id bigint(20),
  auction_title varchar(200),
  price bigint(20),
  auction_cat varchar(200),
  seller_id bigint(20),
  seller_nick varchar(50)
);

After table sharding, suppose you need to look up the order placed by user_id = 257 for auction_id = 100. The corresponding SQL statement is as follows:

select * from order_1 where user_id=257 and auction_id = 100; 

Here, order_1 is derived from 257 % 256 = 1, which selects the table with index 1 among the sharded order tables.
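In application code, the table name has to be computed before the statement is sent, because a table name cannot be passed as a bound parameter. The sketch below shows one way this might look. The helper name `build_order_query` is hypothetical, and the `%s` placeholders assume a MySQL driver that uses that parameter style (PyMySQL and mysqlclient do); no database connection is made here.

```python
TABLE_COUNT = 256  # number of order tables after the split


def build_order_query(user_id: int, auction_id: int):
    """Return (sql, params) targeting the order table for this user."""
    user_id = int(user_id)  # ensure the table suffix is derived from an integer
    table = f"order_{user_id % TABLE_COUNT}"
    # Only the computed table name is formatted in; values stay as parameters.
    sql = f"select * from {table} where user_id = %s and auction_id = %s"
    return sql, (user_id, int(auction_id))


sql, params = build_order_query(257, 100)
print(sql)     # select * from order_1 where user_id = %s and auction_id = %s
print(params)  # (257, 100)
```

The pair would then be handed to the driver, for example as `cursor.execute(sql, params)`.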

II. Database sharding

Scenario: table sharding relieves the poor query efficiency caused by a large amount of data in a single table, but it does not improve the concurrent processing capability of the database. Under highly concurrent read/write access, once the master server can no longer carry the write load, adding more slave servers does not help.

Therefore, we must take a different approach and split the database itself to improve its write capacity. This is called database sharding.

Similar to the table sharding policy, database sharding can route data access by taking a key field modulo the number of databases. For example:

Take the previous order table and assume the original single database is split into 256 databases. If the value of the user_id field is 258, the application's request is routed to database number 2 (258 % 256 = 2).
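The same modulo rule, applied to databases instead of tables, might look like the sketch below. The function names and the `order_db_<index>` naming scheme are assumptions made for illustration; the article does not name the databases.

```python
DB_COUNT = 256  # number of databases after the split


def order_db_index(user_id: int) -> int:
    """Return the index of the database that holds this user's orders."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return user_id % DB_COUNT


def order_db_name(user_id: int) -> str:
    """Hypothetical naming scheme: order_db_0 ... order_db_255."""
    return f"order_db_{order_db_index(user_id)}"


print(order_db_index(258))  # 2
print(order_db_name(258))   # order_db_2
```

In practice the index would typically be used to pick a connection to the matching database server, so that writes for different users land on different masters.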

 

III. Database/table sharding

Scenario: sometimes the database faces both high-concurrency access pressure and massive data storage. In this case, the table sharding policy and the database sharding policy must be applied together, so that the system's concurrent processing capability and the query performance of each table improve at the same time. This is called database/table sharding.

The database/table sharding policy is more complex than database sharding or table sharding alone. One routing policy for database/table sharding is as follows:

1. Intermediate variable = user_id % (number of databases * number of tables in each database)

2. Database = integer part of (intermediate variable / number of tables in each database)

3. Table = intermediate variable % number of tables in each database

As before, user_id is used as the routing field. First, user_id is taken modulo (number of databases * number of tables in each database) to obtain an intermediate variable. The intermediate variable is then divided by the number of tables in each database, and the integer part of the result gives the database, while the intermediate variable modulo the number of tables in each database gives the table.
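The three routing steps translate almost directly into code. The sketch below is a minimal illustration; the `Route` type and the `route` function are hypothetical names, and the small shard counts in the usage lines are chosen only to keep the arithmetic easy to follow.

```python
from typing import NamedTuple


class Route(NamedTuple):
    database: int  # index of the target database
    table: int     # index of the target table inside that database


def route(user_id: int, db_count: int, tables_per_db: int) -> Route:
    """Map a user_id to a (database, table) pair using the three steps above."""
    # Step 1: fold user_id into the total number of table slots.
    intermediate = user_id % (db_count * tables_per_db)
    # Step 2: integer division selects the database.
    database = intermediate // tables_per_db
    # Step 3: the remainder selects the table within that database.
    table = intermediate % tables_per_db
    return Route(database, table)


# 2 databases with 4 tables each: 13 % 8 = 5, 5 // 4 = 1, 5 % 4 = 1
print(route(13, db_count=2, tables_per_db=4))  # Route(database=1, table=1)
print(route(10, db_count=2, tables_per_db=4))  # Route(database=0, table=2)
```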

A worked example of the database/table sharding policy is as follows:

Assume that the original single-database, single-table order table is split into 256 databases, each containing 1024 tables. Then, according to the routing policy above, an access with user_id = 262145 is routed as follows:

1. Intermediate variable = 262145 % (256 * 1024) = 1

2. Database = integer part of (1 / 1024) = 0

3. Table = 1 % 1024 = 1

This means that queries and modifications of the order records with user_id = 262145 are routed to the order_1 table in database 0.
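The worked example can be checked mechanically. The sketch below hard-codes the 256 databases and 1024 tables from the example; the `route` function is a hypothetical helper, not code from the original article. It also checks a property the policy implies: over one full cycle of consecutive user_id values, every (database, table) slot is used exactly once.

```python
from collections import Counter

DB_COUNT = 256        # databases in the example
TABLES_PER_DB = 1024  # tables in each database


def route(user_id: int) -> tuple:
    """Return (database index, table index) for a user_id."""
    intermediate = user_id % (DB_COUNT * TABLES_PER_DB)
    return intermediate // TABLES_PER_DB, intermediate % TABLES_PER_DB


print(route(262145))  # (0, 1)

# Route one full cycle of consecutive IDs and count how often each slot is hit.
slots = Counter(route(uid) for uid in range(DB_COUNT * TABLES_PER_DB))
print(len(slots))           # 262144
print(set(slots.values()))  # {1}
```

Even coverage of the slots only means even data distribution if user IDs themselves are spread evenly and users generate similar amounts of data, which ties back to the hot and cold data caveat mentioned earlier.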
