Mysql horizontal segmentation

Source: Internet
Author: User
Tags field table md5 hash


In large and medium-sized projects, mysql horizontal segmentation usually splits the database or data table horizontally to reduce the pressure on a single database or table, considering the maximum data size of the database. Here I will introduce two common data table splitting methods in our projects. Of course, these methods use certain skills in the program to route to a specific table. First, we need to confirm based on what horizontal segmentation? In our system (SNS), the user's UID runs through the system, and the unique self-growth, according to this field table, it is better. Method 1: use MD5 hash to encrypt the UID using md5, and then retrieve the first few digits (Here we take the first two digits ), then we can hash different UIDs to different user tables (user_xx. Function getTable ($ uid) {$ ext = substr (md5 ($ uid), 0, 2); return "user _". $ ext;} with this technique, we can distribute different UIDs to the user_00, user_01 ...... User_ff. Because UID is a number and increments, according to the md5 algorithm, user data can be evenly divided into different user tables. However, there is a problem here: if more and more users are in our system, the data volume of a single table will inevitably increase, and the table cannot be extended according to this algorithm, this will return to the problem at the beginning of the article. Method 2: Use the shift method: www.2cto.com Php code public function getTable ($ uid) {return "user _". sprintf ("% 04d", ($ uid> 20);} Here, we move the uid 20 to the right, in this way, we can put about 1 million of the first user data in the first table user_0000, and the second 1 million of the user data in the second table user_0001. This keeps going, if we have more and more users, simply add a user table. Since the table suffix we keep is four bits, we can add 10 thousand user tables here, that is, user_0000, user_0001 ......

User_9999. There are 10 thousand million tables and 1 million data records in each table. We can store 10 billion user records. Of course, it doesn't matter if you have more user data than this. You just need to change the reserved table suffix to add scalable tables. If there are 100 billion data records, if you store 1 million data records in each table, you only need to keep the table suffix as 6 characters. The above algorithms can also be written flexibly: www.2cto.com Php code/*** according to UID table sharding algorithm * @ param int $ uid // user ID * @ param int $ bit // The table suffix retains several digits * @ param int $ seed/ /move the number of digits to the right */function getTable ($ uid, $ bit, $ seed) {return "user _". sprintf ("% 0 {$ bit} d", ($ uid >>$ seed);} conclusion: the above two methods are www.2cto.com, make the largest possible estimation of the user data volume in our current system, and estimate the maximum capacity of a single table in the database. For example, in the second solution, if we estimate that the number of users in our system is 10 billion, and the optimal data size for a single table is 1 million, we need to move the UID 20 to ensure that each table has 1 million data, the User table (user_xxxx) is reserved for expansion of 10 thousand tables.
Another example is the first solution. For each table, 1 million and the first two digits after md5 are obtained, there are only 256 tables. The total system database is 256*1 million; if the total data volume in your system is more than that, you must use MD5 to retrieve the first three or four or more digits. Both methods split data horizontally into different tables. Compared with the first method, the second method is more scalable...

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.