MySQL database partition function tutorial, mysql tutorial

Source: Internet
Author: User

MySQL database partition function tutorial, mysql tutorial

Zero: What is database partition?
Let's talk about what is a database partition. Take mysql as an example. The data in the mysql database is stored on the disk as a file, which is stored under/mysql/data by default (you can use my. in cnf), a table corresponds to three files, one is the frm table structure, the other is the myd table data, and the other is the myi table index. If the data volume of a table is too large, myd and myi will become very large, and the data query will become very slow. At this time, we can use the mysql partition function, physically, the three files corresponding to this table are divided into many small pieces. In this way, when we look for a piece of data, we don't need to look for it all, you only need to know where the data is, and then find it. If the data in the table is too large, it may not be stored on a disk. In this case, we can allocate data to different disks.

1. Horizontal partitioning
What is a horizontal partition? It means Partitioning in a horizontal manner. For example, if there are million pieces of data, divide the data into ten portions, and put the first 10 million pieces of data into the first partition, the second 10 million data records are placed in the second partition, and so on. That is to say, we divide the table into very many tables, and use merge as the root table. When a piece of data is retrieved, the data contains all fields in the table structure, that is, the horizontal partition does not change the table structure.

ALTER TABLE `yl_hospital_url` PARTITION BY RANGE(ID) ( PARTITION `p0` VALUES LESS THAN (100000) ,  PARTITION `p1` VALUES LESS THAN (200000) ,  PARTITION `p2` VALUES LESS THAN (300000) ,  PARTITION `p3` VALUES LESS THAN (400000) ,  PARTITION `p4` VALUES LESS THAN (500000) ,  PARTITION `p5` VALUES LESS THAN (600000) ,  PARTITION `p6` VALUES LESS THAN (700000) , PARTITION `p6` VALUES LESS THAN (700000) ,  PARTITION `p7` VALUES LESS THAN (MAXVALUE)  ) ; 

Pre-partition query speed

Pre-partition query speed

Ii. mysql Partition
I think there is only one way to partition mysql. Instead, we use different algorithms to distribute data to different blocks.
1. MySQL and above support partition functions
During installation and installation, we can check

[root@BlackGhost mysql-5.1.50]# ./configure --help |grep -A 3 Partition  === Partition Support ===  Plugin Name:   partition  Description:   MySQL Partitioning Support  Supports build:  static  Configurations:  max, max-no-ndb 

Check that if the above is found, it indicates that it supports partitioning and is enabled by default. If you have installed mysql

mysql> show variables like "%part%"; +-------------------+-------+ | Variable_name   | Value | +-------------------+-------+ | have_partitioning | YES  | +-------------------+-------+ 1 row in set (0.00 sec) 

Check the variables. If they are supported, the above prompt will be displayed.

2. range partitioning
A table partitioned by RANGE is partitioned by the following method. Each partition contains rows whose partition expression values are located in a given continuous interval.

Create a range Partition Table

Mysql> create table if not exists 'user' (-> 'id' int (11) not null AUTO_INCREMENT COMMENT 'user id',-> 'name' varchar (50) not null default ''comment' name',-> 'sex' int (1) not null default '0' COMMENT '0 is male, 1 is female ', -> primary key ('id')->) ENGINE = MyISAM default charset = utf8 AUTO_INCREMENT = 1-> partition by range (id) (-> PARTITION p0 values less than (3),-> PARTITION p1 values less than (6),-> PARTITION p2 values less than (9 ), -> PARTITION p3 values less than (12),-> PARTITION p4 values less than maxvalue->); Query OK, 0 rows affected (0.13 sec)


Insert some data

Mysql> insert into 'test '. 'user' ('name', 'sex') VALUES ('tank', '0')->, ('zhang ', 1), ('ying ', 1), ('zhang ', 1), ('ying', 0), ('test1', 1), ('tank2', 1)->, ('tank1', 1), ('test2', 1), ('test3', 1), ('test4', 1), ('test5', 1 ), ('tank3', 1)->, ('tank3', 1), ('tank5', 1), ('tank6', 1), ('tank7 ', 1), ('tank8', 1), ('tank9', 1)->, ('tank10', 1), ('tank11', 1 ), ('tank12', 1), ('tank13', 1), ('tank21', 1), ('tank42', 1); Query OK, 25 rows affected (0.05 sec) Records: 25 Duplicates: 0 Warnings: 0


Go to the location where the database table files are stored. There is a configuration in my. cnf, which is followed by datadir.

[root@BlackGhost test]# ls |grep user |xargs du -sh 4.0K  user#P#p0.MYD 4.0K  user#P#p0.MYI 4.0K  user#P#p1.MYD 4.0K  user#P#p1.MYI 4.0K  user#P#p2.MYD 4.0K  user#P#p2.MYI 4.0K  user#P#p3.MYD 4.0K  user#P#p3.MYI 4.0K  user#P#p4.MYD 4.0K  user#P#p4.MYI 12K  user.frm 4.0K  user.par 


Retrieve Data

mysql> select count(id) as count from user; +-------+ | count | +-------+ |  25 | +-------+ 1 row in set (0.00 sec) 


Delete the fourth Partition

mysql> alter table user drop partition p4; Query OK, 0 rows affected (0.11 sec) Records: 0 Duplicates: 0 Warnings: 0 


/** The data stored in the partition is lost. The fourth partition contains 14 data records, and the remaining three partitions have only 11 data records, however, the size of the collected files is 4.0 K. from this we can see that the minimum partition size is 4 K */mysql> select count (id) as count from user; + ------- + | count | + ------- + | 11 | + ------- + 1 row in set (0.00 sec)


The fourth block has been deleted.

[root@BlackGhost test]# ls |grep user |xargs du -sh 4.0K  user#P#p0.MYD 4.0K  user#P#p0.MYI 4.0K  user#P#p1.MYD 4.0K  user#P#p1.MYI 4.0K  user#P#p2.MYD 4.0K  user#P#p2.MYI 4.0K  user#P#p3.MYD 4.0K  user#P#p3.MYI 12K  user.frm 4.0K  user.par 


/* Partitions can be performed on an existing table, and the data in the table is automatically allocated to the corresponding partition according to the rules. This makes it better and saves a lot of trouble, see the following operations */mysql> alter table aa partition by RANGE (id)-> (PARTITION p1 VALUES less than (1),-> PARTITION p2 VALUES less than (5 ), -> PARTITION p3 VALUES less than MAXVALUE); Query OK, 15 rows affected (0.21 sec) // PARTITION 15 data Records: 15 Duplicates: 0 Warnings: 0


A total of 15

mysql> select count(*) from aa; +----------+ | count(*) | +----------+ |    15 | +----------+ 1 row in set (0.00 sec) 


Delete a partition

mysql> alter table aa drop partition p2; Query OK, 0 rows affected (0.30 sec) Records: 0 Duplicates: 0 Warnings: 0 


There are only 11 entries, indicating that the existing table partition is successful.

mysql> select count(*) from aa; +----------+ | count(*) | +----------+ |    11 | +----------+ 1 row in set (0.00 sec)  

3. list partitions
In LIST partitions, the definition and selection of each partition are based on the value of a column from a value in a Value LIST set, and the RANGE partition is a set of continuous RANGE values.

// This method fails mysql> create table if not exists 'list _ part' (-> 'id' int (11) not null AUTO_INCREMENT COMMENT 'user id ', -> 'province _ id' int (2) not null default 0 COMMENT 'province ',-> 'name' varchar (50) not null default ''comment 'name ', -> 'sex' int (1) not null default '0' COMMENT '0 is male, 1 is female,-> primary key ('id')->) ENGINE = innodb default charset = utf8 AUTO_INCREMENT = 1-> partition by list (province_id) (-> PARTITION p0 values in (1, 2, 3, 4, 5, 6, 7, 8 ), -> PARTITION p1 values in (9,10, 11,12, 16,21),-> PARTITION p2 values in (13,14, 15,19),-> PARTITION p3 values in (17,18, 20,22) ->); ERROR 1503 (HY000): a primary key must include all columns in the table's partitioning function
// This method succeeds mysql> create table if not exists 'list _ part' (-> 'id' int (11) not null comment 'user id ', -> 'province _ id' int (2) not null default 0 COMMENT 'province ',-> 'name' varchar (50) not null default ''comment 'name ', -> 'sex' int (1) not null default '0' COMMENT '0 is male, 1 is female '->) ENGINE = innodb default charset = utf8-> partition by list (province_id) (-> PARTITION p0 values in (1, 2, 3, 4, 5, 6, 7, 8 ), -> PARTITION p1 values in (9,10, 11,12, 16,21),-> PARTITION p2 values in (13,14, 15,19),-> PARTITION p3 values in (17,18, 20,22) ->); Query OK, 0 rows affected (0.33 sec)


If a primary shard exists during list partition creation, the primary key must be in the primary partition; otherwise, an error is reported. If I don't need a primary key, the partition is successfully created. Generally, a table will have a primary key. This is a partition limitation.
If you want to test the data, see test the range partition.

4. hash Partition
HASH partitions are mainly used to ensure that data is evenly distributed among pre-defined partitions. All you need to do is to specify a column value or expression based on the column value to be hashed, the number of partitions to be split into and the specified table to be partitioned.

Mysql> create table if not exists 'hash _ part' (-> 'id' int (11) not null AUTO_INCREMENT COMMENT 'comment id',-> 'comment' varchar (1000) not null default ''comment' COMMENT,-> 'IP' varchar (25) not null default ''comment' source ip',-> primary key ('id ') ->) ENGINE = innodb default charset = utf8 AUTO_INCREMENT = 1-> partition by hash (id)-> PARTITIONS 3; Query OK, 0 rows affected (0.06 sec)

For testing, see operations on range partitions.

5. key Partition
Partitioning by KEY is similar to partitioning by HASH. Except for the User-Defined expression used by HASH partition, the HASH function of KEY partition is provided by the MySQL server.

Mysql> create table if not exists 'key _ part' (-> 'news _ id' int (11) not null comment 'news id ', -> 'content' varchar (1000) not null default ''' COMMENT 'news content',-> 'U _ id' varchar (25) not null default ''comment' source IP address ',-> 'create _ time' date not null default '2017-00-00 00:00:00 'comment' time'->) ENGINE = innodb default charset = utf8-> partition by linear hash (YEAR (create_time)-> PARTITIONS 3; Query OK, 0 rows affected (0.07 sec)

For testing, see operations on range partitions.

6. subpartition
A subpartition is the re-division of each partition in a partition table. A subpartition can use both HASH and KEY partitions. This is also called composite partitioning ).
1. If a subpartition is created in one partition, other partitions must also have subpartitions.
2. If a partition is created, the number of subpartitions in each partition must be the same
3. subpartitions in the same partition have different names. subpartitions in different partitions can have the same names (5.1.50 is not applicable)

Mysql> create table if not exists 'sub _ part' (-> 'news _ id' int (11) not null comment 'news id ', -> 'content' varchar (1000) not null default ''' COMMENT 'news content',-> 'U _ id' int (11) not null default 0 s COMMENT 'source ip',-> 'create _ time' date not null default '2017-00-00 00:00:00 'comment' time'->) ENGINE = innodb default charset = utf8-> partition by range (YEAR (create_time)-> subpartition by hash (TO_DAYS (create_time) (-> PARTITION p0 values less than (1990) (SUBPARTITION s0, SUBPARTITION s1, SUBPARTITION s2),-> PARTITION p1 values less than (2000) (SUBPARTITION s3, SUBPARTITION s4, SUBPARTITION good ), -> PARTITION p2 values less than maxvalue (SUBPARTITION tank0, SUBPARTITION tank1, SUBPARTITION tank3)->); Query OK, 0 rows affected (0.07 sec)

The official website says that subpartitions in different partitions can have the same name, but the following error will be prompted if mysql5.1.50 does not work.
ERROR 1517 (HY000): Duplicate partition name s1
7. Columns Partition
Columns partitions are the partition types introduced in MySQL 5.5. The introduction of Columns partitions solves the problem that RANGE partitions and LIST partitions only support integer partitions before MySQL 5.5, this leads to the need for additional function calculations to obtain integers or to convert them to Integers and then partition through an additional conversion table. Columns partitions can be subdivided into RANGE Columns partitions and LIST Columns partitions. Both RANGE Columns partitions and LIST Columns partitions support integer, date and time, and string data types.
Application scenarios:
Product Sales daily report, annual report, etc.
Every day is divided into one table. The table name is divided into 24 partitions by year, month, and day, and each hour's data is divided into one partition.

CREATE TABLE `year_log` (  `id` int(11) DEFAULT NULL,  `money` int(11) unsigned NOT NULL,  `date` datetime DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 PARTITION BY RANGE (year(date)) ( PARTITION p2007 VALUES LESS THAN (2008), PARTITION p2008 VALUES LESS THAN (2009), PARTITION p2009 VALUES LESS THAN MAXVALUE );  CREATE TABLE `daily_log` ( `id` int(11) NOT NULL, `sid` char(36) NOT NULL, `sname` char(20) DEFAULT NULL, `date` datetime NOT NULL, PRIMARY KEY (`id`,`date`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 PARTITION BY RANGE COLUMNS(`date`) (PARTITION p1 VALUES LESS THAN ('2000-01-02'), PARTITION p2 VALUES LESS THAN ('2000-01-03'), PARTITION p3 VALUES LESS THAN ('2000-01-04'), PARTITION p4 VALUES LESS THAN ('2000-01-05'), PARTITION p5 VALUES LESS THAN ('2000-01-06'), PARTITION p6 VALUES LESS THAN ('2000-01-07'), PARTITION p7 VALUES LESS THAN ('2000-01-08'), PARTITION p367 VALUES LESS THAN (MAXVALUE)); 

Iii. Partition Management
1. delete partitions.

mysql> alter table user drop partition p4;  

2. Add partitions.

Range add new partition mysql> alter table user add partition (partition p4 values less than MAXVALUE); Query OK, 0 rows affected (0.06 sec) Records: 0 Duplicates: 0 Warnings: 0


Add new partition in list
Mysql> alter table list_part add partition (partition p4 values in (25, 26, 28 ));
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0

Hash re-partitioning 

mysql> alter table hash_part add partition partitions 4; Query OK, 0 rows affected (0.12 sec) Records: 0 Duplicates: 0 Warnings: 0 


Key re-partitioning

Mysql> alter table key_part add partition partitions 4; Query OK, 1 row affected (0.06 sec) // If data exists, Records: 1 Duplicates: 0 Warnings: 0


Although I do not specify a subpartition, the system will name the subpartition

mysql> alter table sub1_part add partition(partition p3 values less than MAXVALUE); Query OK, 0 rows affected (0.02 sec) Records: 0 Duplicates: 0 Warnings: 0 


Mysql> show create table subpartition part \ G; ***************************** 1. row *************************** Table: subpartition part Create Table: create table 'sub1 _ part' ('news _ id' int (11) not null comment 'news id', 'content' varchar (1000) not null default ''comment' news content', 'U _ id' varchar (25) not null default ''comment' source ip ', 'create _ time' date not null default '2017-00-00 'comment' Time') ENGINE = InnoDB DEFAUL T charset = utf8! 50100 partition by range (YEAR (create_time) subpartition by hash (TO_DAYS (create_time) (PARTITION p0 values less than (1990) (SUBPARTITION s0 ENGINE = InnoDB, SUBPARTITION s1 ENGINE = InnoDB, SUBPARTITION s2 ENGINE = InnoDB), PARTITION p1 values less than (2000) (SUBPARTITION s3 ENGINE = InnoDB, SUBPARTITION s4 ENGINE = InnoDB, SUBPARTITION good ENGINE = InnoDB), PARTITION p2 values less than (3000) (SUBPARTITION tank0 ENGINE = InnoDB, SUBPARTITION tank1 ENGINE = InnoDB, SUBPARTITION tank3 ENGINE = InnoDB ), PARTITION p3 values less than maxvalue (SUBPARTITION p3sp0 ENGINE = InnoDB, // SUBPARTITION name is automatically generated SUBPARTITION p3sp1 ENGINE = InnoDB, SUBPARTITION p3sp2 ENGINE = InnoDB )) 1 row in set (0.00 sec)

3. repartition

Range re-partitioning

mysql> ALTER TABLE user REORGANIZE PARTITION p0,p1,p2,p3,p4 INTO (PARTITION p0 VALUES LESS THAN MAXVALUE); Query OK, 11 rows affected (0.08 sec) Records: 11 Duplicates: 0 Warnings: 0  

List partitioning

mysql> ALTER TABLE list_part REORGANIZE PARTITION p0,p1,p2,p3,p4 INTO (PARTITION p0 VALUES in (1,2,3,4,5)); Query OK, 0 rows affected (0.28 sec) Records: 0 Duplicates: 0 Warnings: 0 


Hash and key partitions cannot be REORGANIZE, which is clearly stated on the official website.

mysql> ALTER TABLE key_part REORGANIZE PARTITION COALESCE PARTITION 9; ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'PARTITION 9' at line 1 

Iv. Advantages of partitioning
1. partitions can be divided into multiple disks to store more disks.
2. Based on the search condition, that is, the condition after the where clause, you do not need to find all the corresponding partitions.
3. Parallel processing can be performed for big data search.
4. distribute data queries across multiple disks to achieve higher query Throughput

Differences and connections between MySQL sub-tables and partitions


I. What are MySQL table shards and partitions?
What is table sharding? On the surface, it means dividing a table into N small tables.
What is a partition? partition refers to dividing data in a table into N blocks, which can be on the same disk or on different disks, for more information, see mysql Partition Function and instance

Ii. What is the difference between mysql sub-tables and partitions?
1. Implementation Method
A) mysql sub-tables are real sub-tables. After a table is divided into many tables, each small table is a positive table, which corresponds to three files. MYD data file ,. MYI index file ,. frm table structure file.

[root@BlackGhost test]# ls |grep user  alluser.MRG  alluser.frm  user1.MYD  user1.MYI  user1.frm  user2.MYD  user2.MYI  user2.frm  

The preceding table sharding uses the merge storage engine (one for table sharding). alluser is a summary table. There are two table shards, user1 and user2. Both of them are stand-alone tables. We can retrieve the data through the summary table. There are no. MYD and. MYI files in the summary table. That is to say, the summary table is not a table and has no data. The data is placed in the table. Let's see what MRG is.

[root@BlackGhost test]# cat alluser.MRG |more  user1  user2  #INSERT_METHOD=LAST  

From the above we can see that alluser. MRG has some table sharding relationships and data insertion methods. The table can be understood as a shell or a connection pool.
B) the partitions are different. After a large table is partitioned, it is still a table and will not be changed to two tables, but the data storage blocks become more.

[root@BlackGhost test]# ls |grep aa  aa#P#p1.MYD  aa#P#p1.MYI  aa#P#p3.MYD  aa#P#p3.MYI  aa.frm  aa.par 

We can see from the above that the aa table is divided into two partitions, p1 and p3, which are originally three partitions and are deleted by me. We all know that a table corresponds to three files. MYD,. MYI,. frm. Partitions split data files and index files according to certain rules. open the par file. after the par file, you can see that the partition information of this table is in the root table. MRG is a bit like. After partitioning, it is still one, rather than multiple tables.

2. Data Processing
A) after table sharding, the data is stored in the table sharding table. The total table is only a shell. The accessed data is stored in one table shard. See the following example:
Select * from alluser where id = '12' on the surface, operations are performed on the table alluser, but they are not. Is to operate the table shards in alluser.
B) Partition. There is no table sharding concept. partitions only divide the files that store data into many small pieces. The partitioned table is still a table. Data processing is done by yourself.

3. Improve Performance
A) after table sharding, The concurrency of a single table is improved, and the disk I/O performance is also improved. Why is the concurrency improved? Because it takes a short time to query a table. If there is a high concurrency, the total table can allocate the concurrency pressure to different small tables based on different queries. How can I achieve high disk I/O performance? A very large. MYD file is now allocated to. MYD in each small table.
B) mysql puts forward the partition concept. I think I want to break through the disk I/O bottleneck and improve the disk read/write capability to increase mysql performance. Eg: a million-row table is divided into 10 partitions, and each partition contains 100,000 rows of data. Therefore, it takes only one tenth of the total table scan time to query the partition, which is obviously compared. At the same time, the indexing speed for 100,000 rows of tables is much faster than that for millions of rows. If you can create these partitions on different disks, the I/O read/write speed will be "unimaginable" at this time (no incorrect word, it's really too fast, in theory, the speed is improved by 100 times. How fast the response speed is, so it is a bit unimaginable)
At this point, the testing focus of partitions and table shards is different. The focus of table shards is on how to improve mysql's concurrency when accessing data. How can partitions break through disk read/write capabilities, to improve mysql performance.

4 ).
A) There are many table sharding methods. Using merge for table sharding is the simplest method. In this way, the root partition is almost easy and transparent to the program code. If other table sharding methods are used, the score zone is troublesome.
B) Partition implementation is relatively simple. There is no difference between creating a partition table and creating a common root table, and it is transparent to the code.

Iii. What is the relationship between mysql table shards and partitions?
1. Both of them can improve mysql's high performance and have a good performance in high concurrency.
2. There is no conflict between table shards and partitions. You can work with each other. For tables with large traffic volumes and large table data volumes, we can combine table shards and partitions (if the merge table sharding method cannot be used with partitions, we can try other table shards). Tables with low access traffic but with a lot of table data, we can adopt the partitioning method.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.