Three ways to divide MySQL tables

Last Update:2015-10-20 Source: Internet

Author: User



First of all, why split a table?
When a table grows to millions of rows, each lookup takes longer, and a join against it can stall outright. The purpose of splitting tables is to reduce the load on the database and shorten query time.
In my experience, MySQL handles a SQL statement roughly as follows:
1. receive the SQL; 2. put the SQL into a queue; 3. execute the SQL; 4. return the result. Where does most of the time go? First, waiting in the queue; second, executing the SQL. These are really the same thing: while one statement waits, another must be executing. So the goal is to shorten SQL execution time.

MySQL has table locking and row locking to protect data integrity. For example, if two SQL statements try to modify the same row of the same table, can both modify it at the same time? Clearly not. MySQL handles this with table locks (the MyISAM storage engine) or row locks (the InnoDB storage engine). A table lock means nobody else can operate on the table until I have finished with it. A row lock works the same way at row level: other statements must wait until I am done with that row before they can touch it. The more data there is, the longer each statement runs and the longer everyone else waits, which is why we split tables.
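The difference between the two lock granularities can be sketched with a toy model. This is only an illustration in Python, not how MySQL schedules work internally: the function name `rounds_needed` and the idea of counting "rounds" are assumptions made for the example, and every write is assumed to take one round.

```python
from collections import Counter


def rounds_needed(row_ids, granularity):
    """Count the sequential rounds needed to apply one write per entry in row_ids.

    granularity="table": the whole table is locked, so writes run one at a time.
    granularity="row":   only writes to the same row have to wait for each other.
    """
    if granularity == "table":
        return len(row_ids)
    if granularity == "row":
        # The busiest row decides how long the whole batch takes.
        return max(Counter(row_ids).values(), default=0)
    raise ValueError("granularity must be 'table' or 'row'")


writes = [1, 2, 3, 1]  # four UPDATEs; two of them touch row 1
print(rounds_needed(writes, "table"))  # 4
print(rounds_needed(writes, "row"))    # 2
```

In this model a table lock serializes everything, while a row lock only serializes the two writes that collide on row 1.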

Sub-table Method One:

Use a MySQL cluster, for example MySQL Cluster, MySQL Proxy, MySQL replication, DRBD, and so on.
Some people will ask what a MySQL cluster has to do with splitting tables. It does not literally split a table, but it achieves the same effect. What is the point of a cluster? It takes load off each database, which simply means fewer SQL statements waiting in each queue. For example, 10 SQL requests placed in the queue of one database server will wait a long time; spread those 10 requests over the queues of 5 database servers and each server has only 2 in its queue, so the wait is much shorter. That is why I count clustering as a form of table splitting. I have written up some MySQL cluster setups:
Installation, configuration, and read/write splitting with MySQL Proxy on Linux
MySQL replication: master-slave installation, configuration, and data synchronization
Pros: scales well, and the application (PHP code) needs none of the complex handling that multiple tables require.
Cons: the amount of data in a single table is unchanged, so a single operation still takes just as long, and the hardware cost is high.
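The queue arithmetic in Method One (10 requests over 5 servers, 2 per queue) can be sketched as a simple round-robin assignment. This is a minimal Python illustration; the function `distribute` and the server names are hypothetical, and a real proxy or load balancer may use a different policy.

```python
def distribute(num_requests, servers):
    """Assign request numbers to servers round-robin and return each server's queue."""
    queues = {server: [] for server in servers}
    for request in range(num_requests):
        server = servers[request % len(servers)]
        queues[server].append(request)
    return queues


# Hypothetical server names, used only for the illustration.
servers = ["db1", "db2", "db3", "db4", "db5"]
for server, queue in distribute(10, servers).items():
    print(server, len(queue))  # each server ends up with 2 queued requests
```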

Sub-table Method Two:

Anticipate which tables will grow large and be accessed frequently, and split them into several tables in advance
This is usually easy to predict. In a forum, the posts table will certainly become very large over time; hundreds of thousands or millions of rows are quite possible. In a chat room, a few dozen people chatting all evening will, given enough time, make the message table very large too. There are many cases like these. Tables whose large size can be predicted should be split into N tables, where N depends on the actual situation. Take the chat message table as an example:
I create 100 such tables in advance: message_00, message_01, message_02, ..., message_98, message_99. The user's ID then decides which table that user's chat messages go into. You can derive the table with a hash, or with the modulo (remainder) method.

Hash method:

    function get_hash_table($table, $userid) {
        $str = crc32($userid);
        if ($str < 0) {
            $hash = "0" . substr(abs($str), 0, 1);
        } else {
            $hash = substr($str, 0, 2);
        }
        return $table . "_" . $hash;
    }

    echo get_hash_table('message', 'user18991');  // The result is message_10
    echo get_hash_table('message', 'user34523');  // The result is message_13

Modulo method:

    function hash_table($table_name, $user_id, $total) {
        return $table_name . '_' . (($user_id % $total) + 1);
    }

    echo hash_table("artice", 1234, 5);  // artice_5
    echo hash_table("artice", 3243, 5);  // artice_4

To explain: the method above tells us that all of user18991's messages are stored in the message_10 table and all of user34523's messages are stored in the message_13 table. To read them, just read from the corresponding table.
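The same routing idea can be sketched in Python. Note that this is a variant, not a port of the PHP above: it takes the CRC32 value modulo the number of tables instead of cutting the first two digits, so it will generally pick different table numbers. The function name `shard_table` and the `user_id` column in the query string are assumptions for the illustration.

```python
import zlib


def shard_table(base, user_id, total=100):
    """Return the sub-table name for user_id, for example 'message_07'."""
    # zlib.crc32 returns an unsigned 32-bit integer in Python 3,
    # so no negative-value branch is needed here.
    bucket = zlib.crc32(str(user_id).encode("utf-8")) % total
    width = len(str(total - 1))  # 2 digits for 100 tables: 00..99
    return f"{base}_{bucket:0{width}d}"


# Every read and write for this user goes to the same table.
table = shard_table("message", "user18991")
print(f"SELECT * FROM {table} WHERE user_id = %s")  # user_id is a hypothetical column
```

Taking the hash modulo the table count spreads users over all 100 tables, at the cost of being incompatible with data already placed by the digit-cutting rule.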
Pros: avoids millions of rows in a single table and shortens the execution time of each SQL statement.
Cons: once a rule is fixed, changing it is very troublesome. In the example above the hash algorithm is CRC32; if I later switch to MD5, the same user's messages will end up in a different table and the data becomes scattered. Extensibility is poor.
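The cost of changing the rule can be made concrete with a small Python sketch that counts how many users would be routed to a different table if CRC32 were swapped for MD5. The helper names and the generated user IDs are hypothetical, and the buckets are computed with a plain modulo rather than the article's digit-cutting rule.

```python
import hashlib
import zlib


def bucket_crc32(user_id, total):
    """Table number chosen by CRC32."""
    return zlib.crc32(user_id.encode("utf-8")) % total


def bucket_md5(user_id, total):
    """Table number chosen by MD5."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % total


def moved_users(user_ids, total):
    """Users whose table number differs between the two hash functions."""
    return [u for u in user_ids if bucket_crc32(u, total) != bucket_md5(u, total)]


users = [f"user{i}" for i in range(1000)]  # made-up IDs for the illustration
moved = moved_users(users, 100)
print(f"{len(moved)} of {len(users)} users would land in a different table")
```

With 100 tables, almost every user moves, so switching the algorithm means migrating nearly all existing rows.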

Sub-table Method Three:

Use the MERGE storage engine to implement the sub-tables
I think this method suits those who did not plan ahead and are already facing slow queries. Splitting an existing large table at that point is painful, and the most painful part is changing the code: the SQL statements in the program are already written, and now one table is to become dozens or even hundreds of tables, so does every SQL statement have to be rewritten? Let me give an example.
Run mysql> SHOW ENGINES; and you will see MRG_MYISAM, which is the MERGE engine.

mysql> CREATE TABLE IF NOT EXISTS `user1` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   `name` varchar(50) DEFAULT NULL,
    ->   `sex` int(1) NOT NULL DEFAULT '0',
    ->   PRIMARY KEY (`id`)
    -> ) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1;
Query OK, 0 rows affected (0.05 sec)

mysql> CREATE TABLE IF NOT EXISTS `user2` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   `name` varchar(50) DEFAULT NULL,
    ->   `sex` int(1) NOT NULL DEFAULT '0',
    ->   PRIMARY KEY (`id`)
    -> ) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1;
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO `user1` (`name`, `sex`) VALUES ('Zhang Ying', 0);
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO `user2` (`name`, `sex`) VALUES ('Tank', 1);
Query OK, 1 row affected (0.00 sec)

mysql> CREATE TABLE IF NOT EXISTS `alluser` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   `name` varchar(50) DEFAULT NULL,
    ->   `sex` int(1) NOT NULL DEFAULT '0',
    ->   INDEX(id)
    -> ) TYPE=MERGE UNION=(user1,user2) INSERT_METHOD=LAST AUTO_INCREMENT=1;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> SELECT id, name, sex FROM alluser;
+----+------------+-----+
| id | name       | sex |
+----+------------+-----+
|  1 | Zhang Ying |   0 |
|  1 | Tank       |   1 |
+----+------------+-----+
2 rows in set (0.00 sec)

mysql> INSERT INTO `alluser` (`name`, `sex`) VALUES ('Tank2', 0);
Query OK, 1 row affected (0.00 sec)

mysql> SELECT id, name, sex FROM user2;
+----+-------+-----+
| id | name  | sex |
+----+-------+-----+
|  1 | Tank  |   1 |
|  2 | Tank2 |   0 |
+----+-------+-----+
2 rows in set (0.00 sec)

TYPE= is the old spelling of ENGINE=. It was removed in MySQL 5.5, so on newer versions write ENGINE=MERGE instead.

Do you notice anything in the session above? Suppose I have a user table with 500,000 rows and split it into two tables, user1 and user2, with 250,000 rows each:
INSERT INTO user1 (id, name, sex) SELECT id, name, sex FROM user WHERE id <= 250000;
INSERT INTO user2 (id, name, sex) SELECT id, name, sex FROM user WHERE id > 250000;
That splits the user table into two, but now there is a problem: what about the SQL statements in the code? There used to be one table and now there are two, so the code would change a lot and the programmers would have a great deal of work. Is there a good way around this? Yes: back up the old user table and drop it, then take the alluser table created above and simply rename it to user. However, not every MySQL operation can be used on a MERGE table:
a. If you use ALTER TABLE to change a MERGE table to another table type, the mapping to the underlying tables is lost. Instead, the rows from the underlying MyISAM tables are copied into the altered table, which is then assigned the new type.
b. Some posts on the web say that REPLACE does not work. I tried it and it did work, which left me puzzled. (The test below calls the REPLACE() string function inside an UPDATE; it does not exercise the REPLACE statement.)

mysql> UPDATE alluser SET sex=REPLACE(sex, 0, 1) WHERE id=2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> SELECT * FROM alluser;
+----+------------+-----+
| id | name       | sex |
+----+------------+-----+
|  1 | Zhang Ying |   0 |
|  1 | Tank       |   1 |
|  2 | Tank2      |   1 |
+----+------------+-----+
3 rows in set (0.00 sec)

c. A MERGE table cannot maintain unique constraints across the whole table. When you execute an INSERT, the data goes into the first or last MyISAM table (depending on the value of the INSERT_METHOD option). MySQL ensures that unique key values remain unique within that MyISAM table, but not across all the tables in the collection.
d. When you create a MERGE table, there is no check that the underlying tables exist and have the same structure. When the MERGE table is used, MySQL checks that the record length of each mapped table is equal, but this is not foolproof. If you create a MERGE table from dissimilar MyISAM tables, you are very likely to run into strange problems.
I am getting sleepy and heading to bed. Points c and d are things I read online and have not tested, so try them yourself.
Pros: scales well, and the program code needs little change.
Cons: performs somewhat worse than the second method.
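The split of user into user1 and user2 in Method Three used two hand-written INSERT ... SELECT statements. For more sub-tables, the statements can be generated. The following Python sketch only builds the SQL text; the function name `split_statements` is hypothetical, ids are assumed to run from 1 to max_id, and the table and column names are interpolated as-is, so they must come from trusted input.

```python
def split_statements(source, targets, max_id, columns=("id", "name", "sex")):
    """Build one INSERT ... SELECT per target table, splitting ids 1..max_id evenly."""
    cols = ", ".join(columns)
    step = -(-max_id // len(targets))  # ceiling division: ids per target table
    statements = []
    for i, target in enumerate(targets):
        low = i * step                       # exclusive lower bound
        high = min((i + 1) * step, max_id)   # inclusive upper bound
        statements.append(
            f"INSERT INTO {target} ({cols}) SELECT {cols} FROM {source} "
            f"WHERE id > {low} AND id <= {high};"
        )
    return statements


for stmt in split_statements("user", ["user1", "user2"], 500000):
    print(stmt)
```

For two targets and 500,000 rows this produces the same 250,000 boundary as the statements in the text.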

Summary
Of the three methods above, I have actually used two, the first and the second. I have not used the third, which is why I covered it in more detail. Ha ha. Everything has its limits, and going past them makes things worse. Do not blindly build database server clusters, because hardware costs money. Do not blindly split tables either: if you split into 1,000 tables, remember that MySQL ultimately stores its data as files on disk, with three files per table, so 1,000 sub-tables means 3,000 files and retrieval becomes slow as well. My suggestions are:
Combine Method 1 and Method 2 to split tables
Combine Method 1 and Method 3 to split tables
These two suggestions suit different situations, so choose according to your own. I think many people will choose the combination of Method 1 and Method 3.

Address: http://www.blogjava.net/kelly859/archive/2012/06/08/380369.html


