MySQL Table Partitioning in Detail


I. What is table partitioning
Put simply, table partitioning divides a large table into several smaller tables according to some criterion. MySQL has supported table partitioning since version 5.1.
For example, if a user table holds more than 6 million records, it can be partitioned by the date on which the records were stored or by the users' location. Other criteria can of course be used as well.

II. Why partition tables
Partitioning improves the scalability, manageability, and efficiency of large tables and of tables with varied access patterns. Some of the benefits of partitioning include:
      1) A table can store more data than fits on a single disk or file system partition.
      2) Data that has lost its usefulness can easily be removed by dropping the partitions that hold it. Conversely, new data can in some cases be added easily by creating a new partition specifically for it.
Other benefits typically associated with partitioning include those listed below. Before the 5.1 production release, these features were not yet implemented in MySQL partitioning, although they were high on the priority list for inclusion in that release.
      3) Some queries can be greatly optimized, because rows satisfying a given WHERE clause can be stored only in one or more particular partitions, so the remaining partitions need not be searched. Because partitions can be altered after a partitioned table has been created, you can reorganize the data to speed up frequently used queries even if you did not anticipate them when the partitioning scheme was first set up.
      4) Queries involving aggregate functions such as SUM() and COUNT() can easily be parallelized. A simple example is "SELECT salesperson_id, COUNT(orders) AS order_total FROM sales GROUP BY salesperson_id;". "Parallelized" here means that the query can run on each partition at the same time, and the final result is obtained simply by summing the results from all partitions.
      5) Greater query throughput can be achieved by spreading data seeks across multiple disks.
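
The idea behind benefit 4 can be illustrated with a small Python sketch. It is not MySQL code: the sample rows and the helper names count_orders and merge_counts are made up for illustration, and the per-partition steps run one after another here rather than truly in parallel.

```python
from collections import Counter

# Hypothetical sample data: each inner list stands for the rows held by one
# partition of a "sales" table; every row is (salesperson_id, order_id).
partitions = [
    [(1, 101), (2, 102), (1, 103)],
    [(2, 201), (3, 202)],
    [(1, 301), (3, 302), (3, 303)],
]


def count_orders(rows):
    """Count orders per salesperson within a single partition."""
    return Counter(salesperson_id for salesperson_id, _order_id in rows)


def merge_counts(partial_counts):
    """Add up the per-partition counts to get the table-wide result."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return dict(total)


# Each count_orders call is independent of the others, so the calls could run
# at the same time; in this sketch they simply run sequentially.
order_totals = merge_counts(count_orders(rows) for rows in partitions)
print(order_totals)  # {1: 3, 2: 2, 3: 3}
```

Because every partial count depends only on its own partition, the final GROUP BY result needs nothing more than a sum over the partial results.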

III. Types of partitioning

· RANGE partitioning: Assigns rows to partitions based on column values falling within a given contiguous interval.
· LIST partitioning: Similar to RANGE partitioning, except that the partition is selected based on the column value matching one of a set of discrete values.
· HASH partitioning: The partition is selected based on the value returned by a user-defined expression that operates on the column values of the rows to be inserted. The expression may be any valid MySQL expression that yields a non-negative integer value.
· KEY partitioning: Similar to HASH partitioning, except that only one or more columns to be evaluated are supplied, and the MySQL server provides its own hashing function. One or more of the columns must contain integer values.

1. RANGE partition

Rows are assigned to partitions based on column values falling within a given contiguous interval. The intervals must be contiguous and must not overlap, and they are defined using the VALUES LESS THAN operator. The following is an example.

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT NOT NULL,
    store_id INT NOT NULL
)
PARTITION BY RANGE (store_id) (
    PARTITION p0 VALUES LESS THAN (6),   -- stores 1 to 5
    PARTITION p1 VALUES LESS THAN (11),  -- stores 6 to 10
    PARTITION p2 VALUES LESS THAN (16),  -- stores 11 to 15
    PARTITION p3 VALUES LESS THAN (21)   -- stores 16 to 20
);

In this partitioning scheme, all rows for employees working at stores 1 through 5 are stored in partition p0, those for employees at stores 6 through 10 are stored in p1, and so on. Note that the partitions are defined in order, from lowest to highest. This is a requirement of the PARTITION BY RANGE syntax, which in this respect resembles a "switch ... case" statement in C or Java.
For a new row containing the data ('Michael', 'Widenius', '1998-06-25', NULL, 13), it is easy to see that it is inserted into partition p2. But what happens when a 21st store is added? Under this scheme there is no rule covering a store_id greater than 20, so the server does not know where to place the row and an error results. To avoid this, you can add a "catch-all" VALUES LESS THAN clause to the CREATE TABLE statement that covers all values greater than the highest value explicitly specified:

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT NOT NULL,
    store_id INT NOT NULL
)
PARTITION BY RANGE (store_id) (
    PARTITION p0 VALUES LESS THAN (6),
    PARTITION p1 VALUES LESS THAN (11),
    PARTITION p2 VALUES LESS THAN (16),
    PARTITION p3 VALUES LESS THAN MAXVALUE  -- catch-all for store_id >= 16
);


MAXVALUE represents the largest possible integer value. Now all rows whose store_id value is greater than or equal to 16 (the highest value defined) are stored in partition p3. At some point in the future, when the number of stores has grown to 25, 30, or more, you can use an ALTER TABLE statement to add new partitions for stores 21-25, 26-30, and so on.
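
The way a row is matched against the VALUES LESS THAN bounds can be sketched in Python. This only models the rule described above and is not how the server is implemented; the function name range_partition and the use of float("inf") as a stand-in for MAXVALUE are assumptions made for the illustration.

```python
from bisect import bisect_right

MAXVALUE = float("inf")  # stand-in for MySQL's MAXVALUE in this sketch

# (partition name, exclusive upper bound), in ascending order, mirroring the
# VALUES LESS THAN clauses of the two CREATE TABLE statements above.
WITHOUT_CATCHALL = [("p0", 6), ("p1", 11), ("p2", 16), ("p3", 21)]
WITH_CATCHALL = [("p0", 6), ("p1", 11), ("p2", 16), ("p3", MAXVALUE)]


def range_partition(value, bounds):
    """Return the first partition whose upper bound is greater than value."""
    uppers = [upper for _name, upper in bounds]
    index = bisect_right(uppers, value)  # first bound strictly above value
    if index == len(bounds):
        # No VALUES LESS THAN clause covers the value: the insert would fail.
        raise ValueError(f"no partition accepts value {value}")
    return bounds[index][0]


print(range_partition(13, WITHOUT_CATCHALL))  # p2
print(range_partition(21, WITH_CATCHALL))     # p3
```

With the first scheme, range_partition(21, WITHOUT_CATCHALL) raises ValueError, which corresponds to the error the server reports when no partition covers store 21.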
In much the same way, you can partition the table by employee job code, that is, by ranges of job_code column values. For example, assuming that two-digit job codes are used for regular (in-store) workers, three-digit codes for office and support personnel, and four-digit codes for management positions, you can create the partitioned table with the following statement:
CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT NOT NULL,
    store_id INT NOT NULL
)
PARTITION BY RANGE (job_code) (
    PARTITION p0 VALUES LESS THAN (100),    -- two-digit codes: in-store workers
    PARTITION p1 VALUES LESS THAN (1000),   -- three-digit codes: office and support
    PARTITION p2 VALUES LESS THAN (10000)   -- four-digit codes: management
);


In this example, all rows relating to in-store workers are stored in partition p0, those relating to office and support staff in p1, and those relating to managers in p2.
It is also possible to use an expression in the VALUES LESS THAN clause. The main restriction is that MySQL must be able to evaluate the expression's return value as part of a LESS THAN (<) comparison; the value of the expression therefore cannot be NULL. For this reason, the hired, separated, job_code, and store_id columns of the employees table have been defined as NOT NULL.
Besides splitting up the table data by store number, you can use an expression based on one of the two DATE columns. For example, suppose you want to partition based on the year in which each employee left the company, that is, the value of YEAR(separated). A CREATE TABLE statement that implements this scheme is shown here:
CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY RANGE (YEAR(separated)) (
    PARTITION p0 VALUES LESS THAN (1991),
    PARTITION p1 VALUES LESS THAN (1996),
    PARTITION p2 VALUES LESS THAN (2001),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);


In this scheme, the rows for all employees who left before 1991 are stored in partition p0; for those who left from 1991 through 1995, in p1; for those who left from 1996 through 2000, in p2; and for any who left after 2000, in p3.
RANGE partitioning is particularly useful in the following situations:
1) You want or need to delete "old" data. With the partitioning scheme from the last example above, you can simply use "ALTER TABLE employees DROP PARTITION p0;" to delete all rows for employees who stopped working before 1991. For a table with a great many rows, this is far more efficient than running a DELETE query such as "DELETE FROM employees WHERE YEAR(separated) <= 1990;".
2) You want to use a column containing date or time values, or values arising from some other series.
3) You frequently run queries that depend directly on the column used for partitioning the table. For example, when executing a query of the form "SELECT COUNT(*) FROM employees WHERE YEAR(separated) = y GROUP BY store_id;" for a year y from 1996 through 2000, MySQL can quickly determine that only partition p2 needs to be scanned, because the remaining partitions cannot contain any rows satisfying the WHERE clause.
Note: This optimization was not yet enabled in the MySQL 5.1 source at the time of writing, but work on it was in progress.
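
The reasoning behind situation 3 can be sketched in Python: given the bounds on YEAR(separated) from the last CREATE TABLE statement, only the partitions whose interval overlaps the years named in the WHERE clause need to be scanned. The function partitions_to_scan is a hypothetical helper written for this illustration and does not describe how the server actually performs the check.

```python
MAXVALUE = float("inf")  # stand-in for MySQL's MAXVALUE in this sketch

# (partition name, exclusive upper bound on YEAR(separated)), ascending.
YEAR_BOUNDS = [("p0", 1991), ("p1", 1996), ("p2", 2001), ("p3", MAXVALUE)]


def partitions_to_scan(low, high, bounds):
    """Return the partitions that can hold values in the range [low, high]."""
    selected = []
    lower = float("-inf")  # inclusive lower edge of the current partition
    for name, upper in bounds:
        # The partition holds values v with lower <= v < upper, so it overlaps
        # [low, high] exactly when low < upper and high >= lower.
        if low < upper and high >= lower:
            selected.append(name)
        lower = upper
    return selected


# WHERE YEAR(separated) = 1999 touches a single partition.
print(partitions_to_scan(1999, 1999, YEAR_BOUNDS))  # ['p2']
# WHERE YEAR(separated) BETWEEN 1994 AND 1997 touches two.
print(partitions_to_scan(1994, 1997, YEAR_BOUNDS))  # ['p1', 'p2']
```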

2. LIST partition

LIST partitioning is similar to RANGE partitioning, except that the partition is selected based on the column value matching one of a set of discrete values.
LIST partitioning is done with PARTITION BY LIST (expr), where "expr" is a column value or an expression based on a column value that returns an integer. Each partition is then defined with VALUES IN (value_list), where "value_list" is a comma-separated list of integers.
Note: In MySQL 5.1, LIST partitioning can match only against lists of integers.

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
);

Suppose there are 20 video stores, distributed among 4 regions, as shown in the following table:
====================================
Region     Store ID numbers
------------------------------------
North      3, 5, 6, 9, 17
East       1, 2, 10, 11, 19, 20
West       4, 12, 13, 14, 18
Central    7, 8, 15, 16
====================================
To partition the table so that rows for stores belonging to the same region are stored in the same partition, you can use the following CREATE TABLE statement:
CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY LIST (store_id) (
    PARTITION pnorth VALUES IN (3,5,6,9,17),
    PARTITION peast VALUES IN (1,2,10,11,19,20),
    PARTITION pwest VALUES IN (4,12,13,14,18),
    PARTITION pcentral VALUES IN (7,8,15,16)
);


This makes it easy to add or drop employee records for specific regions. For instance, suppose that all video stores in the West region are sold to another company. All rows relating to employees working at stores in that region can then be deleted with the query "ALTER TABLE employees DROP PARTITION pwest;", which executes much more efficiently than the equivalent DELETE statement "DELETE FROM employees WHERE store_id IN (4,12,13,14,18);".
Important: If you try to insert a row whose column value (or partitioning expression return value) is not found in any of the partition value lists, the INSERT query fails with an error. For example, given the LIST partitioning scheme above, the following query fails:
INSERT INTO employees VALUES (224, 'Linus', 'Torvalds', '2002-05-01', '2004-10-12', 42, 21);

This is because the store_id value 21 is not found in any of the value lists used to define partitions pnorth, peast, pwest, or pcentral. Note that LIST partitioning has no "catch-all" definition, comparable to VALUES LESS THAN MAXVALUE, that covers other values. Any value to be matched must appear in one of the value lists.
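
The matching rule for LIST partitioning can be sketched in Python as a lookup over the value lists. The dictionary below copies the VALUES IN lists from the CREATE TABLE statement above; the function name list_partition is a hypothetical helper, and the sketch only models the rule rather than the server's implementation.

```python
# Value lists copied from the VALUES IN clauses above.
LIST_PARTITIONS = {
    "pnorth": {3, 5, 6, 9, 17},
    "peast": {1, 2, 10, 11, 19, 20},
    "pwest": {4, 12, 13, 14, 18},
    "pcentral": {7, 8, 15, 16},
}


def list_partition(store_id, partitions=LIST_PARTITIONS):
    """Return the partition whose value list contains store_id."""
    for name, values in partitions.items():
        if store_id in values:
            return name
    # There is no catch-all list, so an unlisted value cannot be stored.
    raise ValueError(f"store_id {store_id} is not in any value list")


print(list_partition(12))  # pwest
```

Calling list_partition(21) raises ValueError, mirroring the failed INSERT for store 21.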

As with RANGE partitioning, LIST partitioning can be combined with HASH or KEY partitioning to produce composite partitioning (subpartitioning).

3. HASH partition

With HASH partitioning, the partition is selected based on the value returned by a user-defined expression that operates on the column values of the rows to be inserted. The expression may be any valid MySQL expression that yields a non-negative integer value.
To partition a table using HASH partitioning, append a PARTITION BY HASH (expr) clause to the CREATE TABLE statement, where "expr" is an expression that returns an integer. It can simply be the name of a column whose type is one of MySQL's integer types. In addition, you will most likely want to follow this with a PARTITIONS num clause, where num is a positive integer giving the number of partitions into which the table is to be divided.

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY HASH (store_id)
PARTITIONS 4;

If you do not include a PARTITIONS clause, the number of partitions defaults to 1. Exception: for NDB Cluster tables, the default number of partitions is the same as the number of cluster data nodes, possibly adjusted for any MAX_ROWS setting to ensure that all rows fit into the partitions.
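
For regular HASH partitioning, the partition number is the value of the partitioning expression modulo the number of partitions. A minimal Python sketch of that rule follows; hash_partition is a hypothetical helper name, and the sketch assumes non-negative integer expression values, as the text above requires.

```python
def hash_partition(expr_value, num_partitions):
    """Regular HASH partitioning: expr_value modulo the partition count."""
    if num_partitions < 1:
        raise ValueError("the number of partitions must be a positive integer")
    if expr_value < 0:
        raise ValueError("this sketch assumes a non-negative expression value")
    return expr_value % num_partitions


# PARTITION BY HASH (store_id) PARTITIONS 4, for a few sample store IDs.
for store_id in (1, 2, 3, 4, 13):
    print(store_id, "->", hash_partition(store_id, 4))
```

With 4 partitions, stores 1 and 13 both land in partition 1, and store 4 lands in partition 0.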
1) LINEAR HASH
MySQL also supports linear hashing, which differs from regular hashing in that linear hashing uses a linear powers-of-two algorithm, whereas regular hashing takes the modulus of the hashing function's value.
The only syntactic difference between linear hash partitioning and regular hash partitioning is the addition of the LINEAR keyword to the PARTITION BY clause.

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY LINEAR HASH (YEAR(hired))
PARTITIONS 4;

Given an expression expr, the partition in which a record is stored when linear hashing is used is partition number N out of num partitions, where N is derived with the following algorithm:
1. Find the smallest power of 2 that is not less than num. We call this value V; it can be calculated with the following formula:
   V = POWER(2, CEILING(LOG(2, num)))
   (For example, suppose that num is 13. Then LOG(2,13) is 3.7004397181411, CEILING(3.7004397181411) is 4, and V = POWER(2,4), which is 16.)
2. Set N = F(column_list) & (V - 1).
3. While N >= num:
   · Set V = CEIL(V / 2)
   · Set N = N & (V - 1)
For example, suppose that the table t1, using linear hash partitioning with 6 partitions, is created by the following statement:
CREATE TABLE t1 (col1 INT, col2 CHAR(5), col3 DATE)
    PARTITION BY LINEAR HASH (YEAR(col3))
    PARTITIONS 6;
Now suppose you want to insert two records into t1, one with the col3 value '2003-04-14' and the other with the col3 value '1998-10-19'. The partition number for the first record is determined as follows:
V = POWER(2, CEILING(LOG(2,6))) = 8
N = YEAR('2003-04-14') & (8 - 1)
  = 2003 & 7
  = 3

(3 >= 6 is FALSE: the record is stored in partition #3)
The partition number for the second record is calculated as follows:
V = 8
N = YEAR('1998-10-19') & (8 - 1)
  = 1998 & 7
  = 6

(6 >= 6 is TRUE: an additional step is required)

V = CEILING(8 / 2) = 4
N = 6 & (4 - 1)
  = 6 & 3
  = 2

(2 >= 6 is FALSE: the record is stored in partition #2)
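
The linear hashing steps described above can be sketched in Python as follows. The function name linear_hash_partition is hypothetical, and the power of two is found by repeated doubling rather than with the POWER/CEILING/LOG formula, simply to avoid floating-point rounding in the sketch.

```python
def linear_hash_partition(expr_value, num_partitions):
    """Return the partition number chosen by the linear hashing steps."""
    if num_partitions < 1:
        raise ValueError("the number of partitions must be a positive integer")

    # Step 1: V is the smallest power of two that is not less than num.
    v = 1
    while v < num_partitions:
        v *= 2

    # Step 2: keep only the low-order bits of the expression value.
    n = expr_value & (v - 1)

    # Step 3: while N is not a valid partition number, halve V and mask again.
    while n >= num_partitions:
        v //= 2
        n &= v - 1
    return n


# The two rows from the example: YEAR(col3) with PARTITIONS 6.
print(linear_hash_partition(2003, 6))  # 3
print(linear_hash_partition(1998, 6))  # 2
```

The second call goes through step 3 once (6 >= 6), which is the additional step shown in the worked example.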
The advantage of linear hash partitioning is that adding, dropping, merging, and splitting partitions is much faster, which helps when dealing with tables that contain extremely large amounts of data (on the order of 1,000 gigabytes). The disadvantage is that, compared with regular hash partitioning, data is less likely to be evenly distributed between partitions.

4. KEY partition

KEY partitioning is similar to HASH partitioning, except that only one or more columns to be evaluated are supplied, and the MySQL server provides its own hashing function. One or more of the columns must contain integer values.

CREATE TABLE tk (
    col1 INT NOT NULL,
    col2 CHAR(5),
    col3 DATE
)
PARTITION BY LINEAR KEY (col1)
PARTITIONS 3;

The LINEAR keyword has the same effect on KEY partitioning as on HASH partitioning: the partition number is derived using a powers-of-two algorithm rather than modulo arithmetic.
