I. Why split tables
Once a table holds millions of rows, every query against it takes longer, and a join on top of that can bring the server to a standstill. The purpose of splitting tables is to reduce the load on the database and shorten query time.
In my experience, MySQL handles a SQL statement roughly as follows:
1. Receive the SQL;
2. Put the SQL into the execution queue;
3. Execute the SQL;
4. Return the result.
Where does most of the time go in this process? First, the time spent waiting in the queue; second, the time spent executing the SQL. These are really the same thing: if a statement is waiting, it is because another statement is still executing. So the goal is to shorten SQL execution time.
MySQL has table locking and row locking, mechanisms that exist to protect data integrity. For example, suppose two SQL statements want to modify the same row of the same table. Should both be allowed to modify it at the same time?
Clearly not. MySQL handles this with table locks (MyISAM storage engine) or row locks (InnoDB storage engine). A table lock means nobody else can operate on the table until I have finished with it. A row lock works the same way at the row level: other statements must wait until I am done with a row before they can touch it. The more data there is, the longer each statement runs and the longer everything else waits, which is why we split tables.
II. Ways to split tables
1. Build a MySQL cluster, for example with MySQL Cluster, MySQL Proxy, MySQL replication, DRBD, and so on.
Some people will ask what a MySQL cluster has to do with splitting tables. It does not actually split any table, but it serves the same purpose. What is the point of a cluster? It reduces the load on each database, which simply means fewer SQL statements waiting in each queue.
For example, suppose there are 10 SQL requests. If they all sit in the queue of one database server, the last ones wait a long time. If the same 10 requests are spread over the queues of 5 database servers, each server has only 2 requests in its queue, so the waiting time is much shorter.
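The 10-request example can be put into numbers with a small Python sketch. It rests on simplifying assumptions that are not from the original text: every statement takes the same time, requests are dealt round-robin to the servers, and each server runs its queue strictly one statement at a time. The function name average_wait is made up for illustration.

```python
def average_wait(num_requests: int, num_servers: int, exec_time: float = 1.0) -> float:
    """Average time a request waits before it starts executing.

    Assumes round-robin distribution and strictly serial execution
    on each server (a simplification, not how MySQL schedules work).
    """
    waits = []
    for i in range(num_requests):
        # Number of requests ahead of this one in its server's queue.
        position_in_queue = i // num_servers
        waits.append(position_in_queue * exec_time)
    return sum(waits) / num_requests


one_server = average_wait(10, 1)    # ten requests queued on one server
five_servers = average_wait(10, 5)  # two requests per server
print(one_server, five_servers)
```

Under these assumptions the average wait drops from 4.5 execution times to 0.5, while the time to execute each individual statement stays the same.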
Pros: scales well, and the application (PHP code) needs none of the complicated handling that comes with multiple tables.
Cons: the amount of data in a single table is unchanged, so each operation still takes just as long, and the hardware cost is high.
2. Estimate in advance which tables will be large and frequently accessed, and split them into several tables
This kind of estimate is usually not far off. The posts table of a forum will certainly grow very large over time; hundreds of thousands or millions of rows are quite possible. The message table of a chat room is similar: with dozens of people chatting all evening, the table will certainly become very big after a while. There are many cases like this. For tables whose large size can be predicted, split them into N tables up front, where N depends on the actual situation. Take the chat message table as an example:
I create 100 such tables in advance: message_00, message_01, message_02, ..., message_98, message_99. The user's ID then decides which table that user's chat messages go into. You can derive the table with a hash, with a modulo (remainder), or in many other ways; pick whatever suits you. Here is a hash-based way to get the table name:
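<?php
// Map a user ID to one of the pre-created tables message_00 ... message_99.
function get_hash_table($table, $userid) {
    $str = crc32($userid);
    if ($str < 0) {
        // crc32() can return a negative integer on 32-bit PHP builds
        $hash = "0" . substr(abs($str), 0, 1);
    } else {
        // otherwise use the first two decimal digits of the checksum
        $hash = substr($str, 0, 2);
    }
    return $table . "_" . $hash;
}
echo get_hash_table('message', 'user18991'); // result: message_10
echo get_hash_table('message', 'user34523'); // result: message_13
?>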
To explain: with this function, all of user18991's messages are stored in message_10 and all of user34523's messages are stored in message_13. To read them back, just read from the corresponding table.
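For comparison, here is a minimal Python sketch of the modulo approach mentioned earlier. It is a variant, not a port of the PHP function: it takes the CRC32 checksum modulo the number of tables instead of its leading decimal digits, so the table names it produces will generally differ from the PHP results. The reason for trying it is that leading digits spread unevenly: an unsigned CRC32 lies between 0 and 4294967295, so most values have ten digits and begin with 10 through 42, which leaves the other tables with far fewer rows. The num_tables parameter is an assumption added for illustration.

```python
import zlib


def get_hash_table(table: str, user_id: str, num_tables: int = 100) -> str:
    """Return the name of the table that stores this user's rows."""
    # zlib.crc32 returns an unsigned 32-bit integer in Python 3.
    checksum = zlib.crc32(user_id.encode("utf-8"))
    # The remainder is always in 0..num_tables-1; pad it to two digits.
    return f"{table}_{checksum % num_tables:02d}"


print(get_hash_table("message", "user18991"))
print(get_hash_table("message", "user34523"))
```

The same user ID always maps to the same table, which is the only property reads and writes depend on.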
Pros: avoids having millions of rows in a single table and shortens the execution time of each SQL statement.
Cons: once the rule is fixed, changing it is very troublesome. The example above hashes with CRC32; if I later switch to MD5, the same user's messages will be routed to a different table and the data will be scattered. Scalability is poor.
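How bad is a change of hash rule? The Python sketch below gives a rough idea by routing the same made-up user IDs with CRC32 and with MD5 and counting how many land in a different table. Both routing functions reduce the hash modulo 100, which is an assumption for illustration and not the leading-digit rule of the PHP function above; the user IDs are synthetic.

```python
import hashlib
import zlib

NUM_TABLES = 100


def table_by_crc32(user_id: str) -> int:
    """Table index under the old rule (CRC32 modulo table count)."""
    return zlib.crc32(user_id.encode("utf-8")) % NUM_TABLES


def table_by_md5(user_id: str) -> int:
    """Table index under the new rule (MD5 digest modulo table count)."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_TABLES


# Synthetic user IDs, purely for illustration.
user_ids = [f"user{n}" for n in range(10000)]
moved = sum(1 for u in user_ids if table_by_crc32(u) != table_by_md5(u))
print(f"{moved} of {len(user_ids)} users would map to a different table")
```

With 100 tables, two unrelated hash functions agree on a user only by chance, so nearly every user's rows would have to be moved before the new rule could be used.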
3. Use the MERGE storage engine to split tables
I think this method suits cases where nobody planned ahead and queries have already become slow. Splitting an existing large table at that point is painful, and the most painful part is changing the code, because the SQL statements in the program are already written. If one table becomes dozens or even hundreds of tables, do all those SQL statements have to be rewritten? Let me give an example.
Run mysql> SHOW ENGINES; and you will see MRG_MYISAM, which is the MERGE engine.
mysql> CREATE TABLE IF NOT EXISTS `user1` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   `name` varchar(50) DEFAULT NULL,
    ->   `sex` int(1) NOT NULL DEFAULT '0',
    ->   PRIMARY KEY (`id`)
    -> ) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1;
Query OK, 0 rows affected (0.05 sec)
mysql> CREATE TABLE IF NOT EXISTS `user2` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   `name` varchar(50) DEFAULT NULL,
    ->   `sex` int(1) NOT NULL DEFAULT '0',
    ->   PRIMARY KEY (`id`)
    -> ) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1;
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO `user1` (`name`, `sex`) VALUES ('Zhang Ying', 0);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO `user2` (`name`, `sex`) VALUES ('Tank', 1);
Query OK, 1 row affected (0.00 sec)
mysql> CREATE TABLE IF NOT EXISTS `alluser` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   `name` varchar(50) DEFAULT NULL,
    ->   `sex` int(1) NOT NULL DEFAULT '0',
    ->   INDEX (`id`)
    -> ) ENGINE=MERGE UNION=(user1, user2) INSERT_METHOD=LAST AUTO_INCREMENT=1;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT id, name, sex FROM alluser;
+----+------------+-----+
| id | name       | sex |
+----+------------+-----+
|  1 | Zhang Ying |   0 |
|  1 | Tank       |   1 |
+----+------------+-----+
2 rows in set (0.00 sec)
mysql> INSERT INTO `alluser` (`name`, `sex`) VALUES ('Tank2', 0);
Query OK, 1 row affected (0.00 sec)
mysql> SELECT id, name, sex FROM user2;
+----+-------+-----+
| id | name  | sex |
+----+-------+-----+
|  1 | Tank  |   1 |
|  2 | Tank2 |   0 |
+----+-------+-----+
2 rows in set (0.00 sec)
Do you notice anything in the operations above? Suppose I have a user table named user with 500,000 rows, and I want to split it into two tables, user1 and user2, with 250,000 rows each:
INSERT INTO user1 (id, name, sex)
SELECT id, name, sex FROM user WHERE id <= 250000;
INSERT INTO user2 (id, name, sex)
SELECT id, name, sex FROM user WHERE id > 250000;
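The range split can be tried out end to end with the Python sketch below. It uses an in-memory SQLite database as a stand-in for MySQL, which is an assumption made only so the example runs anywhere; the table has 10 synthetic rows and the split point is 5 instead of 250000. It then checks that no row was lost and no row ended up in both halves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
for table in ("user", "user1", "user2"):
    conn.execute(
        f"CREATE TABLE {table} ("
        "id INTEGER PRIMARY KEY, name TEXT, sex INTEGER NOT NULL DEFAULT 0)"
    )

# Synthetic rows with ids 1..10.
rows = [(i, f"name{i}", i % 2) for i in range(1, 11)]
conn.executemany("INSERT INTO user (id, name, sex) VALUES (?, ?, ?)", rows)

split_at = 5  # plays the role of 250000 in the text
conn.execute(
    "INSERT INTO user1 (id, name, sex) "
    "SELECT id, name, sex FROM user WHERE id <= ?", (split_at,))
conn.execute(
    "INSERT INTO user2 (id, name, sex) "
    "SELECT id, name, sex FROM user WHERE id > ?", (split_at,))


def count(table: str) -> int:
    """Number of rows in the given table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


total, count1, count2 = count("user"), count("user1"), count("user2")
overlap = conn.execute(
    "SELECT COUNT(*) FROM user1 JOIN user2 USING (id)").fetchone()[0]
print(total, count1, count2, overlap)
```

Running the same two checks (row counts add up, no shared ids) against the real tables is a cheap safeguard before the original table is dropped.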
That splits the user table into two tables. The question now is what to do about the SQL statements in the code. There used to be one table and now there are two; changing the code everywhere is a lot of work for the programmers. Is there a better way?
There is: back up the original user table and drop it, then rename the alluser table created above to user. However, not every MySQL operation works on a MERGE table:
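a. If you use ALTER TABLE to change the MERGE table to another storage engine, the mapping to the underlying tables is lost. Instead, the rows from the underlying MyISAM tables are copied into the altered table, which is then assigned the new engine.
b. Some posts online say REPLACE does not work on a MERGE table. I tried the statement below and it worked, which left me a bit confused. Note that it exercises the REPLACE() string function inside an UPDATE, not the REPLACE statement, so it does not really settle the question.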
mysql> UPDATE alluser SET sex = REPLACE(sex, 0, 1) WHERE id = 2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
mysql> SELECT * FROM alluser;
+----+------------+-----+
| id | name       | sex |
+----+------------+-----+
|  1 | Zhang Ying |   0 |
|  1 | Tank       |   1 |
|  2 | Tank2      |   1 |
+----+------------+-----+
3 rows in set (0.00 sec)
c. A MERGE table cannot maintain unique constraints across the entire table. When you execute an INSERT, the row goes into the first or last MyISAM table (depending on the value of the INSERT_METHOD option). MySQL ensures that unique key values remain unique within that MyISAM table, but not across all the tables in the collection.
d. When you create a MERGE table, there is no check that the underlying tables exist and have identical structures. When the MERGE table is used, MySQL checks that the record length of each mapped table is equal, but this is not foolproof. If you create a MERGE table from dissimilar MyISAM tables, you are very likely to run into strange problems.
I read c and d online and have not tested them; try them yourself.
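Point c can be illustrated without a MySQL server. The Python sketch below is only an analogy: SQLite has no MERGE engine, so a UNION ALL view named alluser stands in for the MERGE table, and that substitution is an assumption of this example. It shows the same effect seen in the session earlier, where each underlying table enforces its own primary key but the combined result still contains id 1 twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("user1", "user2"):
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, name TEXT)")

# A UNION ALL view plays the role of the MERGE table (analogy only).
conn.execute(
    "CREATE VIEW alluser AS "
    "SELECT id, name FROM user1 UNION ALL SELECT id, name FROM user2")

conn.execute("INSERT INTO user1 (id, name) VALUES (1, 'Zhang Ying')")
conn.execute("INSERT INTO user2 (id, name) VALUES (1, 'Tank')")

# Within one underlying table the key is still enforced.
try:
    conn.execute("INSERT INTO user1 (id, name) VALUES (1, 'duplicate')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Across the combined tables the same id appears more than once.
duplicate_ids = [row[0] for row in conn.execute(
    "SELECT id FROM alluser GROUP BY id HAVING COUNT(*) > 1")]
print(rejected, duplicate_ids)
```

If the application relies on id being unique across the whole logical table, it has to guarantee that itself, for example by assigning ids before the insert.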
Pros: scales well, and the program code needs few changes.
Cons: less effective than the second method.
III. Summary
Of the three methods above, I have actually used two, the first and the second. I have not used the third, which is why I described it in more detail. Ha ha. Everything has its limits, and going past them makes things worse. Do not blindly build database server clusters; hardware costs money. Do not blindly split tables either. If you split into 1000 tables, remember that MySQL ultimately stores data as files on disk: each MyISAM table corresponds to three files, so 1000 tables mean 3000 files, and lookups become slow as well. My advice is:
Combine method 1 and method 2 to split tables
Combine method 1 and method 3 to split tables
These two suggestions fit different situations, so choose according to your own. I expect many people will choose the combination of method 1 and method 3.
3 ways to divide MySQL tables