With the arrival of the big data era, ever-growing data volumes are putting pressure on our systems, and many fragile systems have been slowed down or even brought down by the flood of data. Meanwhile, computer hardware has advanced rapidly: gigabit network cards, optical fiber, SSDs, DDR4 memory and other new hardware mean that raw hardware performance is no longer the main focus of system optimization. Increasingly, we find that the performance bottleneck of most web systems today is the database.
On the front end, even if you add more web application servers, buy a good load balancer and use the fastest backbone line, you are still let down by database performance: pages hang and a single query takes a long time to return.
Optimizing the database is therefore one of the cheapest and most effective ways to greatly improve system performance. There are many ways to do it, for example storing core data on SSDs or adding a database cluster. This article introduces one of them, table splitting ("sub-table" optimization). It is relatively simple and can be planned sensibly at the database design stage.
Table splitting simply means breaking a "big table" into "small tables", where "big" has two dimensions:
Horizontal dimension (table fields): the table has many fields, a dozen or even dozens.
Vertical dimension (table contents): the table holds a large amount of data, millions, tens of millions or even billions of rows.
Table splitting therefore works in two directions: vertically and horizontally.
In general, we execute an SQL statement in the following order:
1. The client sends SQL through the connection to the database server.
2. The database server parses the SQL statement and does a series of processing.
3. Execute the SQL statement.
4. Return the execution results to the client.
Operations such as UPDATE and DELETE cause some degree of locking. Depending on the storage engine and the isolation level, this may be a row lock, a table lock or even a lock on the whole database.
The result is long SQL execution times: statements waiting in the queue cannot proceed, which significantly increases the system's response time and causes other undesirable consequences.
We need sensible table splitting to reduce this locking.
Let's look at the vertical split ("vertical sub-table") first:
MySQL is said to support around 1,000 fields per table (InnoDB's limit is 1017 columns), but we rarely use that many and it is not recommended. A common guideline is to keep the total number of fields under 30, with 10-20 being more reasonable; beyond that, joins and other associated queries become inefficient. Of course, this is not absolute and depends on the project and its specific circumstances.
The following user table is only an example; do not worry about whether its design and contents are reasonable. The statement that creates it is:
SQL code
-- Note: the VARCHAR lengths are illustrative placeholders; the original values were lost.
-- Mobile, QQ and Address are declared as strings here (an assumption):
-- phone numbers overflow INT, and an address is text.
CREATE TABLE `USER` (
  `usr_id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'user number',
  `name` varchar(50) DEFAULT NULL COMMENT 'name',
  `Login_name` varchar(50) NOT NULL COMMENT 'login name',
  `PASSWD` varchar(100) NOT NULL COMMENT 'password',
  `custorm_id` int(11) DEFAULT NULL COMMENT 'client code',
  `Status` int(11) DEFAULT NULL COMMENT 'user state',
  `Safe_question` int(11) DEFAULT NULL COMMENT 'security question',
  `Safe_answer` varchar(100) DEFAULT NULL COMMENT 'security answer',
  `Create_date` datetime DEFAULT NULL COMMENT 'creation date',
  `Activ_code` varchar(50) DEFAULT NULL COMMENT 'activation code',
  `Usr_type` int(11) DEFAULT NULL COMMENT 'user type',
  `Mobile` varchar(20) DEFAULT NULL COMMENT 'phone',
  `QQ` varchar(20) DEFAULT NULL COMMENT 'QQ',
  `Address` varchar(200) DEFAULT NULL COMMENT 'contact address',
  `Age` int(10) DEFAULT NULL COMMENT 'age',
  PRIMARY KEY (`usr_id`)
);
This table is rich in content and lets us query all of a user's information in one place, so do we still need to split it? The answer is yes.
With little traffic, getting everything from a single table is easy and makes development very convenient. Once the number of users surges, however, this structure becomes a serious liability. For example, logins and updates to user information that happen at the same time can easily contend for locks on the same table and slow each other down, as can password changes and edits to other information. To cope with this kind of high concurrency in the future, we should design the tables sensibly.
First, we extract the user's security information: password, login name, security question and security answer. These fields form a table called "PASSWD". The benefit is that login and the related security information are isolated for single-purpose calls such as "login", which takes some pressure off the other tables: a login only needs to query "PASSWD", and the other tables are queried only when other information is needed. Another advantage is that view and modify permissions on "PASSWD" can be restricted so that only certain MySQL users can operate on it, which improves security.
Second, we extract the user's contact information: name, contact address, age, mobile phone and QQ. These fields form a table called "User_info", which is easy to extend with new fields and simpler to update and maintain in the future.
Finally, the remaining fields make up the new "user" table, which stores only fields related to the user's business, so it is more closely tied to the business logic. The three tables are linked through the field "usr_id".
After the split, the tables are created with the following statements:
SQL code
-- Note: the VARCHAR lengths are illustrative placeholders; the original values were lost.
CREATE TABLE `USER` (
  `usr_id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'user number',
  `custorm_id` int(11) DEFAULT NULL COMMENT 'client code',
  `Status` int(11) DEFAULT NULL COMMENT 'user state',
  `Create_date` datetime DEFAULT NULL COMMENT 'creation date',
  `Activ_code` varchar(50) DEFAULT NULL COMMENT 'activation code',
  `Usr_type` int(11) DEFAULT NULL COMMENT 'user type',
  PRIMARY KEY (`usr_id`)
);
CREATE TABLE `PASSWD` (
  `usr_id` int(11) NOT NULL COMMENT 'user number',
  `Login_name` varchar(50) NOT NULL COMMENT 'login name',
  `PASSWD` varchar(100) NOT NULL COMMENT 'password',
  `Safe_question` int(11) DEFAULT NULL COMMENT 'security question',
  `Safe_answer` varchar(100) DEFAULT NULL COMMENT 'security answer',
  PRIMARY KEY (`usr_id`)  -- added so that lookups and joins on usr_id are indexed
);
CREATE TABLE `User_info` (
  `usr_id` int(11) NOT NULL COMMENT 'user number',
  `name` varchar(50) DEFAULT NULL COMMENT 'name',
  `Address` varchar(200) DEFAULT NULL COMMENT 'contact address',  -- string type assumed
  `Age` int(10) DEFAULT NULL COMMENT 'age',
  `Mobile` varchar(20) DEFAULT NULL COMMENT 'phone',  -- string type assumed
  `QQ` varchar(20) DEFAULT NULL COMMENT 'QQ',  -- string type assumed
  PRIMARY KEY (`usr_id`)  -- added so that lookups and joins on usr_id are indexed
);
Of course, this split is not the only option; how you allocate the fields depends on your specific situation. I am only offering a line of thought here, so there is no need to scrutinize whether this particular split is reasonable. Opinions will differ.
Summary: vertical splitting suits tables with many fields that serve several business functions. The fields are distributed into separate, business-specific tables that are maintained independently, so that different business functions do not affect or constrain each other.
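As a rough illustration of how vertically split tables are used together, here is a minimal Python sketch. It uses SQLite's in-memory database as a stand-in for MySQL, purely so that it runs without a server; that substitution is an assumption. The column set is trimmed down, and the helper names build_schema, add_user, check_login and load_profile are hypothetical.

```python
import sqlite3


def build_schema(conn):
    """Create trimmed-down versions of the three split tables."""
    conn.executescript(
        """
        CREATE TABLE user (usr_id INTEGER PRIMARY KEY, status INTEGER);
        CREATE TABLE passwd (
            usr_id INTEGER PRIMARY KEY,
            login_name TEXT NOT NULL,
            passwd TEXT NOT NULL
        );
        CREATE TABLE user_info (usr_id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
        """
    )


def add_user(conn, usr_id, login_name, passwd, name, age):
    """Insert one user; the same usr_id links the three rows."""
    conn.execute("INSERT INTO user (usr_id, status) VALUES (?, 1)", (usr_id,))
    conn.execute("INSERT INTO passwd VALUES (?, ?, ?)", (usr_id, login_name, passwd))
    conn.execute("INSERT INTO user_info VALUES (?, ?, ?)", (usr_id, name, age))


def check_login(conn, login_name, passwd):
    """Login touches only the passwd table. Returns usr_id, or None on failure."""
    row = conn.execute(
        "SELECT usr_id FROM passwd WHERE login_name = ? AND passwd = ?",
        (login_name, passwd),
    ).fetchone()
    return row[0] if row else None


def load_profile(conn, usr_id):
    """Join user and user_info on usr_id only when the profile is needed."""
    return conn.execute(
        "SELECT u.usr_id, u.status, i.name, i.age "
        "FROM user u JOIN user_info i ON i.usr_id = u.usr_id "
        "WHERE u.usr_id = ?",
        (usr_id,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
build_schema(conn)
# "stored-hash" is a placeholder; a real system would store a salted hash.
add_user(conn, 1, "alice", "stored-hash", "Alice", 30)
uid = check_login(conn, "alice", "stored-hash")
profile = load_profile(conn, uid)
```

The point of the sketch is that check_login never reads the user or user_info tables, so login traffic and profile edits work on different tables.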
The other form of table splitting is the horizontal split ("horizontal sub-table"):
The vertical split applies to a table's fields, while the horizontal split applies to its contents (rows).
Once a table holds a lot of data, queries become slow. Indexing reduces query time, but it is not a fundamental solution. Although MySQL does not impose a rigid limit on the size of a single table, a table cannot grow indefinitely without affecting performance.
We can run the command SHOW TABLE STATUS in the SQL console to see the size and status of each table.
The idea of the horizontal split is to break a heavily loaded table into smaller tables to improve performance. Take a forum's table structure as an example.
A forum has many sub-forums, each sub-forum has many topics (posts), and each topic has many replies. The structure looks like this:
Forum > Sub-forum > Topic (post) > Reply
We can split the table by forum ID, with one table per forum, for example: forum_001, forum_002, forum_003 ...
When querying, the program builds the table name from the forum ID with a fixed rule and then operates on that table.
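The table-name rule can be sketched in Python as follows. This is only an illustration: the function names, the three-digit zero padding (inferred from names like forum_001) and the topic_id column are assumptions, not part of the original design.

```python
def forum_table_name(forum_id: int, prefix: str = "forum", width: int = 3) -> str:
    """Build the per-forum table name, e.g. forum_id 1 -> 'forum_001'."""
    if not isinstance(forum_id, int) or forum_id < 1:
        raise ValueError("forum_id must be a positive integer")
    # Zero-pad to at least `width` digits; larger IDs simply use more digits.
    return f"{prefix}_{forum_id:0{width}d}"


def topic_query(forum_id: int) -> str:
    """Return a parameterized query against the table for this forum.

    The table name comes from a validated integer, so it is safe to
    interpolate; the topic_id value is left as a driver placeholder (%s).
    """
    table = forum_table_name(forum_id)
    return f"SELECT * FROM `{table}` WHERE topic_id = %s"
```

Keeping the rule in one function means that a change to the naming scheme touches a single place in the program.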
Of course, this is a rather extreme approach and it is also quite restrictive: once the table structure changes, every one of the split tables has to be changed, which is a daunting amount of work. Horizontal splitting is similar to partitioning, and partitioned tables are covered later.
A horizontal split must first solve the primary key problem. When all data lives in one table, MySQL's built-in AUTO_INCREMENT mechanism makes an auto-incrementing primary key easy. After a horizontal split, however, each table maintains its own key generation, so the same key value can appear in more than one table. We therefore need to settle how primary keys are generated before splitting.
Here are some solutions:
1. Set the starting value of the primary key
Suppose we split the table horizontally into three tables, TABLE1, TABLE2 and TABLE3, each holding 100,000 rows. Then the primary key range of TABLE1 is 1-100,000, that of TABLE2 is 100,001-200,000, that of TABLE3 is 200,001-300,000, and so on.
When creating the tables, we can manually specify the starting value of each table's auto-increment key: 1 for TABLE1, 100001 for TABLE2, 200001 for TABLE3, and so on.
The CREATE TABLE statements are:
SQL code
CREATE TABLE `TABLE1` (
  ...
) AUTO_INCREMENT=1;
CREATE TABLE `TABLE2` (
  ...
) AUTO_INCREMENT=100001;
CREATE TABLE `TABLE3` (
  ...
) AUTO_INCREMENT=200001;
Specifying the starting value when the table is created is straightforward, has minimal maintenance cost and needs no further attention to key generation. However, declaring it inside the CREATE TABLE statement is somewhat rigid. We can instead separate table creation from the statement that sets the starting value, which is more convenient and flexible to maintain:
SQL code
CREATE TABLE `TABLE1` (
  ...
);
ALTER TABLE `TABLE1` AUTO_INCREMENT=1;
CREATE TABLE `TABLE2` (
  ...
);
ALTER TABLE `TABLE2` AUTO_INCREMENT=100001;
CREATE TABLE `TABLE3` (
  ...
);
ALTER TABLE `TABLE3` AUTO_INCREMENT=200001;
The SQL above is only meant to convey the general idea, so do not take any mistakes in it too seriously.
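The arithmetic behind the range scheme can be sketched in Python. The constant ROWS_PER_TABLE and the function names are hypothetical; the figure of 100,000 rows per table comes from the example above.

```python
ROWS_PER_TABLE = 100_000  # rows per table, as in the TABLE1..TABLE3 example


def start_value(table_no: int) -> int:
    """First primary key of table number `table_no` (1-based)."""
    return (table_no - 1) * ROWS_PER_TABLE + 1


def table_for_id(row_id: int) -> int:
    """Number of the table whose key range contains `row_id`."""
    return (row_id - 1) // ROWS_PER_TABLE + 1


def alter_statement(table_no: int) -> str:
    """ALTER TABLE statement that sets the starting value for one table."""
    return f"ALTER TABLE `TABLE{table_no}` AUTO_INCREMENT={start_value(table_no)};"
```

Note that AUTO_INCREMENT only sets where counting starts. Nothing in this sketch stops a table from growing past the end of its range, so the application would still need to watch for that.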
2. Use a concatenated key
With this approach we do not need a different primary key range for each table: every table can use exactly the same range, 1-100,000. When the data is used, the program adds another identifier, such as the table name:
Example
Primary key table1_1001 refers to row 1001 in TABLE1
Primary key table2_1001 refers to row 1001 in TABLE2
The advantage is that all tables are created with identical statements and no per-table key ranges need to be maintained. The cost is that the application must determine dynamically which table a row belongs to.
For example, the displayed primary key ID can become 1_1001, 2_1001 and so on, which keeps the table identifier and the row ID separate.
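A minimal Python sketch of building and parsing such concatenated keys follows. The function names are hypothetical, and the short form 1_1001 is used for the key.

```python
def make_key(table_no: int, row_id: int) -> str:
    """Combine table number and row ID into one key, e.g. (1, 1001) -> '1_1001'."""
    return f"{table_no}_{row_id}"


def parse_key(key: str) -> tuple[int, int]:
    """Split a concatenated key back into (table_no, row_id)."""
    table_part, sep, id_part = key.partition("_")
    if not sep:
        raise ValueError(f"not a concatenated key: {key!r}")
    # int() raises ValueError if either part is not a number.
    return int(table_part), int(id_part)


def table_name(table_no: int) -> str:
    """Name of the physical table for a table number, e.g. 2 -> 'table2'."""
    return f"table{table_no}"
```

The application would call parse_key on an incoming ID, pick the table with table_name, and then query that table by the plain row ID.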
3. Use an external implementation
That is, the primary key values are maintained in the program or in a separate table, and a new value is fetched each time a row is inserted.
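As one possible reading of "maintained in the program", here is a small thread-safe counter in Python. The class IdAllocator and the function pick_table are hypothetical, and the round-robin routing rule is an assumption added for illustration. A counter like this lives in a single process, so a deployment with several application servers would need the separate-table variant or a shared service instead.

```python
import threading


class IdAllocator:
    """Hands out unique, increasing primary key values within one process."""

    def __init__(self, start: int = 1):
        self._next = start
        self._lock = threading.Lock()

    def next_id(self) -> int:
        # The lock keeps two threads from receiving the same value.
        with self._lock:
            value = self._next
            self._next += 1
            return value


def pick_table(row_id: int, table_count: int) -> int:
    """Spread rows over the tables round-robin by ID (tables are 1-based)."""
    return (row_id - 1) % table_count + 1
```

Because the IDs are unique across all tables, any routing rule can be used without key conflicts.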