Summary of table-splitting algorithms for large-scale data storage

Source: Internet
Author: User
Tags hash md5

When an application handles a large amount of data, storing everything in a single table or database can seriously slow down operations. Take MySQL with the MyISAM engine as an example: in our tests, access was very fast below 2 million rows (200w), but once a table grew beyond 2 million rows the access rate dropped drastically, which slowed down our web application. With that much data in a single table the system also becomes quite unstable, and the MySQL service can easily hang. So once the data volume exceeds 2 million rows, we recommend that the system engineer consider splitting the table.

Here are a few common algorithms for splitting tables and databases.

1. Splitting tables/databases by calendar time.

For example, if an application accumulates about 2 million rows in a year, we can store each year's data in its own table or database. If the base table name is app, the data for 2010 goes into app_2010, the data for 2011 into app_2011, and so on. If the data reaches about 2 million rows in a single month, we can split by month instead: app_2010_01, app_2010_02.
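The naming rule above can be sketched as a small helper. This is only a minimal illustration: the function name table_for_date and the by_month flag are assumptions made for the example, not part of any library.

```python
from datetime import date


def table_for_date(base: str, day: date, by_month: bool = False) -> str:
    """Return the name of the table that holds rows for the given day.

    Hypothetical helper: yearly tables look like app_2010,
    monthly tables look like app_2010_01.
    """
    if by_month:
        return f"{base}_{day.year:04d}_{day.month:02d}"
    return f"{base}_{day.year:04d}"
```

A caller would compute the table name from the row's timestamp before building the query, so queries limited to one period only ever touch one table.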

2. Splitting tables/databases by hashing a numeric ID.

Suppose we want to store user information and the application has a very large number of registrations, so a single table cannot meet the storage requirements. We can then hash on the user's numeric ID; the most common approach is the modulo (remainder) operation. If we want to spread the user information over 30 tables, then for user number 1 we compute 1 % 30 = 1 and store the record in the user_01 table; for user number 500 we compute 500 % 30 = 20 and store the record in the user_20 table.
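A minimal sketch of the modulo rule, assuming non-negative integer user IDs and zero-padded two-digit table suffixes as in the example above. The function name user_table is hypothetical.

```python
def user_table(user_id: int, table_count: int = 30) -> str:
    """Map a numeric user ID to one of table_count tables via modulo."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return f"user_{user_id % table_count:02d}"
```

Note that IDs that are exact multiples of 30 land in user_00, so the 30 tables run from user_00 to user_29. Changing table_count later moves most existing rows to a different table, so the count is worth choosing carefully up front.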

3. Splitting tables/databases by MD5 value.

Suppose we need to store files uploaded by users. A large upload volume also creates a system bottleneck: in our experiments, file browsing became slower once a folder held more than 200 files. Folder layout is outside the scope of this article, but it calls for the same kind of hashing. For the table, we can compute the MD5 of the file name, or use the file's MD5 checksum, and take the first 5 characters of the digest as the hash. Each character is a hexadecimal digit with 16 possible values, so 5 characters give up to 16^5 = 1,048,576 tables; a shorter prefix, such as 2 characters for 256 tables, keeps the count manageable. Each time a file is stored, the leading characters of the MD5 value of its file name determine which table the record goes into.
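The MD5-prefix rule can be sketched with Python's standard hashlib module. The function names and the file_ table prefix are assumptions for illustration. Keep in mind that a prefix of n hexadecimal characters can address 16**n tables, which grows quickly with n.

```python
import hashlib


def file_table(filename: str, prefix_len: int = 5, base: str = "file") -> str:
    """Pick a table from the first prefix_len hex characters of the MD5 digest."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return f"{base}_{digest[:prefix_len]}"


def table_count(prefix_len: int) -> int:
    """Number of distinct tables a hex prefix of this length can address."""
    return 16 ** prefix_len
```

Because the digest depends only on the file name, the same name always maps to the same table, and no lookup index is needed to find it again.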

4. Example: a guess at a microblog's short-URL algorithm and storage strategy.

Many microblog services now use short URLs. Suppose the service's domain name is www.example.com; when you post a microblog entry, you will find that the URL you included has been turned into something like HTTP://T.CN/MX4JA1. How do they do this conversion? My guess is that they use the MD5 storage and lookup rule described above: they compute the MD5 of the URL you posted and, as the example suggests, use the first 6 characters to pick the table.
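The guess above might look roughly like the following sketch. It is purely illustrative and not how any real service is known to work: the ShortUrlStore class is hypothetical, an in-memory dictionary stands in for the tables, and a hexadecimal MD5 prefix can only contain 0-9 and a-f, so it could not produce a code like the one quoted above.

```python
import hashlib


class ShortUrlStore:
    """Hypothetical in-memory stand-in for MD5-keyed short-URL storage."""

    def __init__(self, code_len: int = 6) -> None:
        self.code_len = code_len
        self._by_code = {}  # short code -> original URL

    def shorten(self, url: str) -> str:
        """Derive the short code from the first code_len hex chars of the MD5."""
        code = hashlib.md5(url.encode("utf-8")).hexdigest()[: self.code_len]
        existing = self._by_code.setdefault(code, url)
        if existing != url:
            # Two different URLs share the same prefix; a real system
            # would need a strategy for this case.
            raise ValueError(f"prefix collision on {code!r}")
        return code

    def resolve(self, code: str) -> str:
        """Look up the original URL for a short code."""
        return self._by_code[code]
```

With only 6 hexadecimal characters there are 16**6 (about 16.7 million) possible codes, so collisions become likely well before that many URLs are stored.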

5. Problems introduced by splitting tables.

Splitting tables also brings a series of problems, such as implementing pagination and statistics. If we want to page through all the data, we have to go through every one of the tables, so access efficiency is very low. I tried to implement this with MySQL Proxy, and in the end implemented it with Tcsql.
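To see why pagination over split tables is expensive, consider one simple approach, sketched here with in-memory lists standing in for the tables. The function name global_page and the row layout (dictionaries with an "id" field) are assumptions for the example. To return one global page, every table has to supply its first offset + limit rows, which are then merged.

```python
import heapq
from itertools import islice


def global_page(shards, offset, limit, key=lambda row: row["id"]):
    """Return rows [offset, offset + limit) of the globally sorted data.

    Each shard is a list of rows; every shard must contribute up to
    offset + limit rows, which is what makes deep pages costly.
    """
    need = offset + limit
    per_shard = [sorted(rows, key=key)[:need] for rows in shards]
    merged = heapq.merge(*per_shard, key=key)
    return list(islice(merged, offset, need))
```

The work grows with both the number of tables and the page depth, which matches the low access efficiency described above.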

6. Choosing a table-splitting algorithm.

If your application's data volume is not particularly large, it is best not to split tables at all.
