MySQL partition and sub-table

Last Update:2016-12-20 Source: Internet

Author: User



Partition

Partitioning distributes the data and indexes of a single table across several physical files.

MySQL supports the following partition types: RANGE, LIST, HASH, and KEY. RANGE is the most commonly used:

Range partitioning: assigns rows to partitions based on column values that fall within a given contiguous interval.

List partitioning: similar to range partitioning, except that the partition is selected by matching the column value against a set of discrete values.

Hash partitioning: the partition is selected based on the value returned by a user-defined expression, evaluated on the column values of the row being inserted. The expression can be any valid MySQL expression that yields a non-negative integer.

Key partitioning: similar to hash partitioning, except that you supply only one or more columns to be evaluated and the MySQL server provides its own hashing function. The columns need not contain integer values, because MySQL's hashing function returns an integer regardless of the column type.
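As a rough mental model of the selection rules above, the following Python sketch mimics how a row's value picks a partition under range, list, and hash partitioning. The function names and sample boundaries are illustrative assumptions, not MySQL internals, and key partitioning is left out because MySQL's own hashing function is not reproduced here.

```python
from bisect import bisect_right


def range_partition(value, upper_bounds):
    """Index of the first partition whose exclusive upper bound is greater than value."""
    idx = bisect_right(upper_bounds, value)
    if idx == len(upper_bounds):
        raise ValueError(f"no range partition accepts {value}")
    return idx


def list_partition(value, value_sets):
    """Index of the partition whose set of discrete values contains value."""
    for idx, values in enumerate(value_sets):
        if value in values:
            return idx
    raise ValueError(f"no list partition accepts {value}")


def hash_partition(value, num_partitions):
    """Plain HASH partitioning: the expression value modulo the partition count."""
    return value % num_partitions


# Boundaries 10 and 20 model "VALUES LESS THAN (10)" and "VALUES LESS THAN (20)".
print(range_partition(5, [10, 20]))   # 0
print(range_partition(10, [10, 20]))  # 1, because 10 is not less than 10
```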

Case:

Create a user table partitioned by id: rows with an id less than 10 go into the user_1 partition, and rows with an id less than 20 go into the user_2 partition.

CREATE TABLE user (
    id INT NOT NULL AUTO_INCREMENT,
    username VARCHAR(10),
    PRIMARY KEY (id)
) ENGINE = InnoDB CHARSET = utf8
PARTITION BY RANGE (id) (
    PARTITION user_1 VALUES LESS THAN (10),
    PARTITION user_2 VALUES LESS THAN (20)
);

Adding a partition after the table has been created:

MAXVALUE stands for the largest possible value of id, so rows with an id greater than or equal to 20 are stored in the user_3 partition.

ALTER TABLE user ADD PARTITION (
    PARTITION user_3 VALUES LESS THAN MAXVALUE
);

Deleting a partition:

ALTER TABLE user DROP PARTITION user_3;

Now open the MySQL data directory

You will see two additional files, user#p#user_1.ibd and user#p#user_2.ibd.

If the table uses the MyISAM storage engine, the files are:

user#p#user_1.MYD, user#p#user_1.MYI and user#p#user_2.MYD, user#p#user_2.MYI

As you can see, partitioning makes MySQL store the data in separate files, and the indexes are partitioned as well. Each data and index file is significantly smaller than in the unpartitioned table, which noticeably improves efficiency. To verify this, insert a row and then analyze a query with EXPLAIN:

INSERT INTO user VALUES (NULL, 'test');
EXPLAIN PARTITIONS SELECT * FROM user WHERE id = 1;

You can see that this query was executed only in the user_1 partition.
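The pruning behavior seen here can be imitated in a few lines of Python. This is only a sketch of the idea, assuming the range boundaries 10 and 20 from the example plus a user_3 partition defined with MAXVALUE; the function name is hypothetical and the real optimizer is far more involved.

```python
import math
from bisect import bisect_right

NAMES = ["user_1", "user_2", "user_3"]
BOUNDS = [10, 20, math.inf]  # math.inf stands in for VALUES LESS THAN MAXVALUE


def partitions_for_range(lo, hi, names=NAMES, upper_bounds=BOUNDS):
    """Names of the range partitions that may hold ids in the inclusive range [lo, hi]."""
    first = bisect_right(upper_bounds, lo)
    last = bisect_right(upper_bounds, hi)
    return names[first:last + 1]


print(partitions_for_range(1, 1))    # ['user_1']
print(partitions_for_range(5, 15))   # ['user_1', 'user_2']
```

A query on id = 1 only needs user_1, while a range that crosses the boundary at 10 touches two partitions.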

How much partitioning helps depends on the volume of data. You can further improve the I/O throughput of your system by placing different partitions on different disks, using the DATA DIRECTORY and INDEX DIRECTORY options when defining the partitions.

As for the choice of partition type, range is the usual choice. In some cases, however, it brings little benefit. In a master-slave setup, for example, the master server rarely serves SELECT queries, so range partitioning on the master is usually of little use and hash partitioning works better. For example:

PARTITION BY HASH (id) PARTITIONS 10;

When rows are inserted, they are spread evenly across the partitions according to id. Because each file is small and efficient to work with, update operations become faster.
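The even spread can be checked with a short Python sketch. It assumes plain HASH partitioning on an integer id, where the partition number is the id modulo the partition count, and sequential ids from 1 to 1000; the function name is illustrative.

```python
from collections import Counter


def hash_distribution(ids, num_partitions=10):
    """Count how many ids land in each partition under id % num_partitions."""
    return Counter(i % num_partitions for i in ids)


counts = hash_distribution(range(1, 1001))
for partition in range(10):
    print(f"p{partition}: {counts[partition]} rows")
```

With sequential ids every partition receives the same number of rows; skewed or sparse ids would give a less even result.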

The partitioning column is usually a time column, but the choice depends on your requirements. An application can be divided in many ways, for example by time or by user, and you must decide which one to partition on. A master-slave setup allows more flexibility: some slave servers can partition by time and others by user. In that case, however, the application is responsible for choosing the right server for each query; a MySQL Proxy script should be able to do this transparently.

Restrictions on partitioning:

1. Every primary key or unique index must include the partitioning column, for example PRIMARY KEY (id, username). Note, however, that InnoDB performs poorly with large primary keys.

2. In many cases a partitioned table is therefore defined without a primary key, since the key may otherwise hurt performance.

3. You can only partition by a column of integer type or by an expression that returns an integer, typically using a function such as YEAR or TO_DAYS (MySQL 5.6 begins to relax this restriction).

4. Each table can have at most 1024 partitions, and a large number of partitions consumes a lot of memory.

5. Partitioned tables do not support foreign keys, so the related logical constraints must be enforced in application code.

6. Partitioning may prevent indexes from being used, so verify that a partitioning scheme is workable before adopting it.
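The first restriction is easy to check mechanically before writing any DDL. The following Python sketch is a hypothetical helper, not a MySQL tool: given the partitioning columns and the column lists of the primary and unique keys, it reports the keys that would violate the rule.

```python
def keys_missing_partition_columns(partition_columns, unique_keys):
    """Return the unique keys (primary key included) that omit a partitioning column."""
    required = set(partition_columns)
    return [key for key in unique_keys if not required <= set(key)]


# A table partitioned by user_type whose primary key is only (id) breaks the rule.
print(keys_missing_partition_columns(["user_type"], [("id",)]))
# Widening the primary key to (id, user_type) satisfies it.
print(keys_missing_partition_columns(["user_type"], [("id", "user_type")]))
```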

Partitioning modes in detail:

* Range: this mode lets the DBA divide data into ranges. For example, a table can be split into three partitions by year: data from the 1980s, data from the 1990s, and all data from 2000 onward.

-- The VARCHAR lengths are placeholders; the original values were lost.
CREATE TABLE users (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    usersname VARCHAR(30) NOT NULL DEFAULT '',
    email VARCHAR(30) NOT NULL DEFAULT ''
)
PARTITION BY RANGE (id) (
    PARTITION p0 VALUES LESS THAN (3000000),
    PARTITION p1 VALUES LESS THAN (6000000),
    PARTITION p2 VALUES LESS THAN (9000000),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);

Here the users table is divided into 4 partitions with a boundary every 3 million records. Each partition has its own data and index files.

You can also increase disk I/O throughput by placing these partitions on completely separate physical disks:

-- The VARCHAR lengths are placeholders; the original values were lost.
CREATE TABLE users (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    usersname VARCHAR(30) NOT NULL DEFAULT '',
    email VARCHAR(30) NOT NULL DEFAULT ''
)
PARTITION BY RANGE (id) (
    PARTITION p0 VALUES LESS THAN (3000000)
        DATA DIRECTORY = '/data0/data' INDEX DIRECTORY = '/data0/index',
    PARTITION p1 VALUES LESS THAN (6000000)
        DATA DIRECTORY = '/data1/data' INDEX DIRECTORY = '/data1/index',
    PARTITION p2 VALUES LESS THAN (9000000)
        DATA DIRECTORY = '/data2/data' INDEX DIRECTORY = '/data2/index',
    PARTITION p3 VALUES LESS THAN MAXVALUE
        DATA DIRECTORY = '/data3/data' INDEX DIRECTORY = '/data3/index'
);

* List (predefined list): this mode assigns rows to partitions according to lists of values defined by the DBA. For example, the DBA can partition by user type.

-- The VARCHAR length is a placeholder; the original value was lost.
-- The primary key includes user_type because every unique key must contain the partitioning column.
CREATE TABLE user (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name VARCHAR(30) NOT NULL DEFAULT '',
    user_type INT NOT NULL,
    PRIMARY KEY (id, user_type)
)
PARTITION BY LIST (user_type) (
    PARTITION p0 VALUES IN (0, 4, 8, 12),
    PARTITION p1 VALUES IN (1, 5, 9, 13),
    PARTITION p2 VALUES IN (2, 6, 10, 14),
    PARTITION p3 VALUES IN (3, 7, 11, 15)
);

The table is divided into 4 partitions, each of which can also be placed on a separate disk.

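* Key (key value): an extension of the hash mode in which the hash key is generated by the MySQL server.

-- The VARCHAR lengths are placeholders; the original values were lost.
CREATE TABLE user (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(30) NOT NULL DEFAULT '',
    email VARCHAR(30) NOT NULL DEFAULT ''
)
PARTITION BY KEY (id) PARTITIONS 4 (
    PARTITION p0,
    PARTITION p1,
    PARTITION p2,
    PARTITION p3
);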

* Hash: this mode lets the DBA compute a hash key from one or more columns of the table; rows are then assigned to partitions according to the resulting hash value. For example, a DBA can create a table partitioned on its primary key.

-- The VARCHAR lengths are placeholders; the original values were lost.
CREATE TABLE user (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(30) NOT NULL DEFAULT '',
    email VARCHAR(30) NOT NULL DEFAULT ''
)
PARTITION BY HASH (id) PARTITIONS 4 (
    PARTITION p0,
    PARTITION p1,
    PARTITION p2,
    PARTITION p3
);

The table is divided into 4 partitions, each of which can also be placed on a separate disk.

= Partition Management =

Delete Partition

ALTER TABLE users DROP PARTITION p0;

Rebuilding partitions

RANGE Partition rebuild

ALTER TABLE users REORGANIZE PARTITION p0, p1 INTO (
    PARTITION p0 VALUES LESS THAN (6000000)
);

This merges the original p0 and p1 partitions into a new p0 partition.

LIST Partition rebuild

ALTER TABLE users REORGANIZE PARTITION p0, p1 INTO (
    PARTITION p0 VALUES IN (0, 1, 4, 5, 8, 9, 12, 13)
);

This merges the original p0 and p1 partitions into a new p0 partition.

Hash/key Partition reconstruction

ALTER TABLE users COALESCE PARTITION 2;

COALESCE PARTITION 2 removes 2 partitions, so a table with 4 hash or key partitions ends up with 2. This operation can only reduce the number of partitions, never increase it. To add partitions, use ADD PARTITION.
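Changing the number of hash partitions means rows have to be redistributed. The sketch below estimates which ids change partition when a table goes from 4 partitions to 2. It assumes plain HASH partitioning, where the partition is the id modulo the partition count (not LINEAR HASH and not KEY), and the function name is illustrative.

```python
def moved_ids(ids, old_count, new_count):
    """Ids whose partition number changes when the partition count changes."""
    return [i for i in ids if i % old_count != i % new_count]


print(moved_ids(range(8), old_count=4, new_count=2))  # [2, 3, 6, 7]
```

Rows that lived in the two removed partitions are folded into the remaining ones, which is why the operation rewrites data rather than just dropping files.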

New Partition

New RANGE partition

ALTER TABLE user ADD PARTITION (PARTITION user_3 VALUES LESS THAN MAXVALUE);

New LIST partition

-- The value list is garbled in the source; use values not assigned to any existing partition.
ALTER TABLE category ADD PARTITION (PARTITION p4 VALUES IN (...));

New Hash/key partition

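ALTER TABLE users ADD PARTITION PARTITIONS 8;

This adds 8 partitions to those that already exist. ADD PARTITION PARTITIONS n adds n partitions rather than setting the total, so to grow a table from 4 partitions to a total of 8 you would add 4.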

Partitioning an existing table

ALTER TABLE results PARTITION BY RANGE (MONTH(ttime)) (
    PARTITION p0 VALUES LESS THAN (1),
    PARTITION p1 VALUES LESS THAN (2),
    PARTITION p2 VALUES LESS THAN (3),
    PARTITION p3 VALUES LESS THAN (4),
    PARTITION p4 VALUES LESS THAN (5),
    PARTITION p5 VALUES LESS THAN (6),
    PARTITION p6 VALUES LESS THAN (7),
    PARTITION p7 VALUES LESS THAN (8),
    PARTITION p8 VALUES LESS THAN (9),
    PARTITION p9 VALUES LESS THAN (10),
    PARTITION p10 VALUES LESS THAN (11),
    PARTITION p11 VALUES LESS THAN (12),
    PARTITION p12 VALUES LESS THAN (13)
);
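Long partition lists like this one are tedious to type and easy to get wrong, so they are often generated. The following Python sketch builds the same kind of statement as a string; the function name and its parameters are assumptions for illustration, and the output should be reviewed before it is run against a real server.

```python
def monthly_range_partitions(table, column, first_bound=1, last_bound=13):
    """Build an ALTER TABLE statement with one RANGE partition per upper bound."""
    parts = [
        f"PARTITION p{i} VALUES LESS THAN ({bound})"
        for i, bound in enumerate(range(first_bound, last_bound + 1))
    ]
    return (
        f"ALTER TABLE {table} PARTITION BY RANGE (MONTH({column})) (\n    "
        + ",\n    ".join(parts)
        + "\n);"
    )


print(monthly_range_partitions("results", "ttime"))
```

Since MONTH() returns values from 1 to 12 for valid dates, the last bound of 13 covers December.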

Sub-table

Splitting a table into sub-tables is similar to partitioning. The difference is that a partitioned table is one logical table stored in several physical files, whereas splitting divides the original table into several separate tables. To query across the sub-tables you can use UNION or a view.

Tables can be split vertically or horizontally; horizontal splitting is the most common. Horizontal splitting usually means distributing rows across other databases or tables. For example, a member table can be split by taking the id modulo 3:

table = id % 3

If id % 3 = 0, the user's data goes into the user_0 table; if id % 3 = 1, it goes into user_1; and so on.
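In application code this routing rule is a one-liner. The Python sketch below assumes the user_0, user_1, user_2 naming from the example; the function name and defaults are illustrative.

```python
def sub_table_for(member_id, prefix="user", table_count=3):
    """Name of the sub-table that stores the given member id."""
    return f"{prefix}_{member_id % table_count}"


print(sub_table_for(9))  # user_0
print(sub_table_for(7))  # user_1
```

The table count is baked into the rule, so changing it later means moving existing rows to their new tables.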

This raises a question: the id must keep increasing across all members, so where does it come from? AUTO_INCREMENT does not work well here, because each sub-table would keep its own counter, so a sequence is used instead.
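The idea of a shared sequence can be sketched in Python as a single counter that hands out ids for every sub-table. This in-memory class is only an illustration under the assumption of one process; a real deployment would more likely keep the counter in a dedicated sequence table or an ID service, and the class name is hypothetical.

```python
import itertools
import threading


class Sequence:
    """Hands out globally increasing ids, independent of any sub-table."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._lock = threading.Lock()

    def next_id(self):
        # The lock keeps ids unique when several threads ask at once.
        with self._lock:
            return next(self._counter)


seq = Sequence()
for _ in range(4):
    member_id = seq.next_id()
    print(member_id, "-> user_%d" % (member_id % 3))
```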

Some systems, such as traffic statistics, accumulate a large amount of data while older data is rarely consulted. Such data can be split by year, month, or day, with each day's statistics stored in a table named after the date. Alternatively, tables can be split by row count, for example 1 million rows per table, with rows beyond the first million going into a second table. Splitting by hash is also possible, but splitting by date or by modulo is the most common and the easiest to extend.

Splitting tables introduces new problems with queries, paging, and statistics. The usual approach is to handle them in application code, assisted by views.
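Both naming schemes mentioned above, by date and by row count, can be expressed as small helper functions. This Python sketch uses a hypothetical "stats" prefix and assumes row ids start at 1.

```python
from datetime import date


def daily_table(day, prefix="stats"):
    """Table named after the date, for example stats_20161220."""
    return f"{prefix}_{day:%Y%m%d}"


def table_by_row_count(row_id, rows_per_table=1_000_000, prefix="stats"):
    """Table chosen by row id: ids 1 to 1,000,000 go to table 1, the next million to table 2."""
    return f"{prefix}_{(row_id - 1) // rows_per_table + 1}"


print(daily_table(date(2016, 12, 20)))   # stats_20161220
print(table_by_row_count(1_000_001))     # stats_2
```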

Sub-table use cases:

Case 1:

Member data is split into 5 tables by id modulo 5. How is member data queried?

1. Query member data by a known id. The code is as follows:

<?php
// Query a single member's data. The id is cast to an integer before it is embedded in SQL.
$id = (int) $id;
$customer_table = 'customer' . ($id % 5);
$sql = 'SELECT * FROM ' . $customer_table . ' WHERE customer_id = ' . $id;

// Query all member data by combining the five tables with UNION.
$tables = ['customer0', 'customer1', 'customer2', 'customer3', 'customer4'];
$selects = [];
foreach ($tables as $v) {
    $selects[] = 'SELECT * FROM ' . $v;
}
$sql = implode(' UNION ', $selects);
?>

This lets you query a single member or all members. Paging works the same way: apply LIMIT to the combined result set. There is a catch, though: if every query joins all the tables together, there was little point in splitting them. In practice, nobody views all member records at once; a page shows about 20 records at a time. There is no need for a UNION at all, since each page can be served from a single table. The only thing to consider is the transition from one table to the next when paging. But is a seamless transition really that important? Even if a page occasionally comes up a few records short, the business is not affected.

2. Joining with other tables works the same way as in 1.

3. Search for members by name. This requires searching every table and combining the results. Although this takes multiple queries, it is not necessarily inefficient: a good SQL statement executed 10 times can still be faster than a bad SQL statement executed once.

Case 2:

In a traffic monitoring system, network traffic is huge and so is the volume of statistics, so the data must be split into one table per day. The first requirement is to retrieve data for any day, week, or month.

1. Data for a single day: query that day's table directly.

2. Data for several days: query each day's table separately, then aggregate the results.

3. Data for one week: each week's data is periodically aggregated into a week table, and queries run against that table. The aggregation can be done by an external program or a scheduled script.

4. Data for one month: aggregate the month's data into a month table and query that table.

5. Detailed data for the past 5 months: not supported. Detailed data is available for at most 3 months; older data is archived. Processing large volumes of data requires some sacrifices. For data older than 3 months only aggregate statistics are provided, and detailed records must be looked up in the archive. Setting a retention limit of 90 or 180 days is common practice for such systems, and detailed data past that limit is no longer available online. Mobile call logs, for example, are kept for up to six months (180 days), and records outside that window cannot be queried. If you really need them, you may have to contact the mobile operator's engineers.
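The lookups in this case mostly come down to working out which daily tables a request touches and refusing requests that reach past the retention limit. The Python sketch below assumes a hypothetical "traffic" table prefix and a 90-day limit.

```python
from datetime import date, timedelta

RETENTION_DAYS = 90


def daily_tables(start, end, today, prefix="traffic"):
    """Daily tables covering [start, end]; rejects ranges older than the retention limit."""
    if start > end:
        raise ValueError("start must not be after end")
    oldest = today - timedelta(days=RETENTION_DAYS)
    if start < oldest:
        raise ValueError("detailed data older than the retention limit is archived")
    names = []
    for offset in range((end - start).days + 1):
        day = start + timedelta(days=offset)
        names.append(f"{prefix}_{day:%Y%m%d}")
    return names


print(daily_tables(date(2016, 12, 18), date(2016, 12, 20), today=date(2016, 12, 20)))
```

The application then queries each returned table and aggregates the results, as described in item 2.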

Before splitting, base the split on the actual business as far as possible: look at which fields matter in queries and split on those fields. Also estimate the data volume beforehand. In other words, settle the splitting rules first, then split.

After splitting, the basic operations are still UNION queries and views, or combining the data with the MERGE storage engine and querying the merged table. Complex operations require stored procedures, and the tables are managed with external tools.

With large data sets, functionality and efficiency must be balanced whether or not tables are split, and some functionality has to be given up. We cannot accommodate users in everything; features that hurt efficiency should be restricted. Examples include the mobile operator's 180-day limit and forums that forbid replies to old posts.
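A view over the sub-tables is usually just a chain of SELECT statements joined with UNION, so it can be generated from the list of table names. The Python sketch below is illustrative: the view and table names are hypothetical, and it uses UNION ALL on the assumption that the sub-tables hold disjoint rows.

```python
def union_view_sql(view_name, tables, columns="*"):
    """Build a CREATE VIEW statement that combines the sub-tables with UNION ALL."""
    selects = [f"SELECT {columns} FROM {table}" for table in tables]
    return f"CREATE VIEW {view_name} AS\n" + "\nUNION ALL\n".join(selects) + ";"


print(union_view_sql("customer_all", [f"customer{n}" for n in range(5)]))
```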


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
