Hash-based MySQL table sharding

Source: Internet
Author: User

This article describes how a hash algorithm can be used to split MySQL data across multiple tables (table sharding). Before looking at the sharding scheme itself, let's briefly review the hash table.

A hash table locates a value through a key computed by a hash function: the function turns the original identifier into a hash value, and that hash value is then used to find the stored data. Ideally each identifier maps to its own hash value.

Hash-based table sharding follows a similar idea: apply a hash algorithm to the ID or name of the target object to compute the name of the table that stores its data, and then access that table.

Continuing with the forum example: each forum has a forum name and a forum ID. Both values are fixed and unique, so we can apply some operation to either of them to obtain the name of a target table.

Suppose the system holds at most 100 million (0.1 billion) records in total and we want each table to store about 1 million records. The system then needs no more than 100 tables. Under this assumption, we hash the forum ID to obtain a key value, use that value as part of the table name, and then access the corresponding table.
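The table-count estimate above is simple arithmetic. As a quick check, here is a minimal sketch that assumes a total of 100 million records and about 1 million records per table (the variable names are illustrative, not from the original):

```php
<?php
// Illustrative figures taken from the paragraph above.
$maxRecords      = 100000000; // 0.1 billion records in total
$recordsPerTable = 1000000;   // target size of a single shard table

// Number of shard tables needed to hold all records.
$tableCount = (int) ceil($maxRecords / $recordsPerTable);

echo $tableCount, "\n"; // 100
```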

We construct a simple hash algorithm:

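function get_hash($id) {
    // Hex-encode the ID's string form: each character becomes two hex digits.
    $str = bin2hex((string) $id);
    // Keep the first four hex digits as the table suffix.
    $hash = substr($str, 0, 4);
    // Right-pad with "0" when the result is shorter than four characters.
    if (strlen($hash) < 4) {
        $hash = str_pad($hash, 4, "0");
    }
    return $hash;
}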

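The function takes a forum ID and returns a four-character string. If the string is shorter than four characters, it is padded on the right with 0. Because bin2hex encodes each character of the ID as two hexadecimal digits, only the first two characters of the ID determine the result.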

For example, get_hash(1) returns "3100" and get_hash(23819) returns "3233". Combining the result with a table prefix gives the name of the table to access. To read the content of the forum with ID 1, the combined table names are topic_3100 and reply_3100, so we can go directly to the target tables.
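The combination of prefix and hash can be sketched as follows. The helper shard_table and the column name forum_id are assumptions made for illustration; the article only names the prefixes topic and reply.

```php
<?php
// Same algorithm as get_hash in the article, with an explicit string cast.
function get_hash($id)
{
    $hash = substr(bin2hex((string) $id), 0, 4);
    return str_pad($hash, 4, '0');
}

// Hypothetical helper: joins a table prefix and the hash of the forum ID.
function shard_table($prefix, $forumId)
{
    return $prefix . '_' . get_hash($forumId);
}

$forumId    = 1;
$topicTable = shard_table('topic', $forumId); // "topic_3100"
$replyTable = shard_table('reply', $forumId); // "reply_3100"

// The suffix consists only of hex digits, so the table name can be
// interpolated; the forum ID itself is left as a bound parameter.
// The column name forum_id is an assumption.
$sql = "SELECT * FROM `{$topicTable}` WHERE forum_id = ?";
echo $sql, "\n";
```

The query string would then be passed to a prepared statement (for example through PDO) with the forum ID bound to the placeholder.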

Of course, with this hash algorithm the data of different forums may end up in the same table. This differs from a hash table, which tries hard to resolve such collisions; here we do not need to. We do, however, still need to predict and analyze which table names the data may be stored under.
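One way to do this kind of prediction is to run the hash over a sample range of IDs and count how many land on each suffix. The sketch below assumes forum IDs 1 to 1000; the range is arbitrary and chosen only for illustration.

```php
<?php
// Same algorithm as get_hash in the article, with an explicit string cast.
function get_hash($id)
{
    $hash = substr(bin2hex((string) $id), 0, 4);
    return str_pad($hash, 4, '0');
}

// Count how many forum IDs in the sample range map to each table suffix.
$counts = [];
for ($id = 1; $id <= 1000; $id++) {
    $suffix = get_hash($id);
    $counts[$suffix] = ($counts[$suffix] ?? 0) + 1;
}
ksort($counts);

echo count($counts), " distinct table suffixes\n";       // 99
echo "suffix 3100: ", $counts['3100'], " forum ID(s)\n"; // 1  (only ID 1)
echo "suffix 3130: ", $counts['3130'], " forum ID(s)\n"; // 12 (10, 100-109, 1000)
```

The counts show that the suffix depends only on the first two digits of the ID, so the set of possible table names is small and easy to enumerate in advance, but the load per table is uneven.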

If more data needs to be stored, the same hash operation can be applied to the forum name, again converting its binary form to hexadecimal as above. Because there are far more Chinese characters than digits and letters, collisions are less likely, but the number of possible tables is also larger, so some other problems must be considered.
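As a rough illustration, the same function can be applied to a name. The sample names below are made up, and the Chinese name is written with Unicode escapes so that the result does not depend on the encoding of the source file (the escapes produce UTF-8 bytes).

```php
<?php
// Same algorithm as get_hash in the article, with an explicit string cast.
function get_hash($id)
{
    $hash = substr(bin2hex((string) $id), 0, 4);
    return str_pad($hash, 4, '0');
}

// ASCII name: the first two characters "p" and "h" become 70 and 68.
echo get_hash('php'), "\n"; // 7068

// Chinese name (U+8BBA U+575B) in UTF-8: the bytes are e8 ae ba e5 9d 9b,
// so the first four hex digits cover only two bytes of the first character.
echo get_hash("\u{8BBA}\u{575B}"), "\n"; // e8ae
```

Since a UTF-8 Chinese character occupies three bytes, four hex digits do not even cover one full character; a longer prefix would be needed to tell such names apart, which in turn increases the number of possible tables.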

In the final analysis, the hash method depends on choosing a good hash algorithm. A well-chosen algorithm spreads the data over more tables, and the smaller tables make queries faster.
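As one possible alternative, and not the method used in this article, a checksum such as CRC32 can be reduced modulo a fixed table count. The function name get_hash_crc32 and the default of 100 tables are assumptions for illustration.

```php
<?php
// Hypothetical alternative, not from the original article:
// CRC32 of the key, reduced modulo a fixed number of tables.
// Assumes a 64-bit PHP build, where crc32() never returns a negative value.
function get_hash_crc32($key, $tableCount = 100)
{
    $bucket = crc32((string) $key) % $tableCount;
    // Zero-pad to two digits so every table suffix has the same length.
    return sprintf('%02d', $bucket);
}

// Build table names for a few sample forum IDs.
foreach ([1, 2, 23819] as $forumId) {
    echo $forumId, ' -> topic_', get_hash_crc32($forumId), "\n";
}
```

Unlike the bin2hex approach, every character of the key influences the result, and the number of tables is fixed up front. Changing the table count later moves most keys to a different table, so it has to be chosen with growth in mind.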
