First, database table segmentation Technology
The database table partitioning technique includes the following:
- Split horizontally
- Vertical split
- Library Table Hash
1.1. Horizontal Split
What is horizontal segmentation? To play a comparative image of the metaphor, in the cafeteria to eat, there is only one window, line up the queue of rice is too long, are lined up in the S-type, it is easy to queue people to produce anxiety, prone to confusion, then a manager stand out, add more dozen rice window, the long team waist truncated into several teams. More image a bit of understanding, you take a "scalpel", a big table fierce cut a few knives, the result of this big table, turned into a few small table.
Horizontal segmentation places data into two or more separate tables based on certain criteria . That is, by recording the split, different records can be saved separately, each child table has the same number of columns. Horizontal cutting divides a table into multiple tables. Each table contains the same number of columns, but fewer rows of data. For example, a table with 1 billion rows can be partitioned horizontally into 12 tables, each representing one months of data in a particular year. Any query that requires specific month data only needs to refer to the table for the corresponding month.
The criteria used to split a table horizontally are: DateTime dimension, Region dimension, and, of course, more business dimensions.
Let me give you a few examples to explain
- Case 1: The amount of data on a company's sales record is too large, we can split it horizontally by month, and the sales record for each month is a single table.
- Case 2: A group has branch offices in each region, and the order data sheet is too large for us to cut horizontally according to the area where the branch is located.
- Case 3: A telecommunications company's words by the date, the city level cut, found that the amount of data is too large, and then they are by brand, number segment for horizontal cutting
Horizontal segmentation is typically used in the following situations:
(1) The table data is very large, after segmentation can reduce the number of data and index to be read at the time of query, but also reduce the index layer, speed up the query speed.
(2) The data in the table is inherently independent, for example, the table records data from each region or different periods of time, especially when some of the data is commonly used, while others are not used.
(3) The data needs to be stored on multiple media.
(4) It is necessary to separate the historical data from the current data.
Advantages of horizontal segmentation:
1: Reduce the number of pages of data and indexes that need to be read at query time, as well as reduce the index layer and speed up the query.
Horizontal Segmentation Disadvantages:
1: Horizontal segmentation adds complexity to your application, which usually requires multiple table names at query time, and the union operation is required to query all data. In many database applications, this complexity outweighs its advantages because, as long as the index keyword is not large, when the index is used for querying, the table adds two to three times times the amount of data, and the query increases the number of disks that read an index layer.
1.2, Vertical Division
What is vertical segmentation? To make an image of the metaphor, a small company through a few years to develop into a large multinational enterprises, the previous department structure obviously can not meet the current business development, the CEO crackling the company into the financial department, personnel, production department, sales department ... This is a comparative image of it, there are wood? Oh
split the table vertically (without breaking the third paradigm), place the primary key column and some columns into a table, and then place the primary key column and some other columns in the other table. Divide the original table into multiple tables that contain only fewer columns. If some columns in a table are common, and others are not, you can use vertical partitioning .
Advantages of Vertical partitioning:
1: Vertical segmentation can make the row data smaller, a block of data can hold more data, the query will reduce the number of I/O times (each query when the block is read less).
2: Vertical partition table can be used to maximize the use of the cache.
Vertical Segmentation Disadvantages:
1: After the table is split vertically, the main code (primary key) is redundant and you need to manage redundant columns
2: Causes table join JOIN operation (increases CPU overhead) needs to be circumvented from the business
1.3. Library Table Hash
The table hash is similar to horizontal segmentation, but there is no obvious dividing line such as horizontal segmentation, the hash algorithm is used to disperse the data into each sub-table, so IO is more balanced. In general, we will follow the business or functional modules to separate the database, different modules corresponding to different databases or tables, and then according to a certain policy on a page or function of a smaller database hash, such as user table, according to the user ID of the table hash, hash 128 tables, should be able to improve the performance of the system at a low cost and have a good scalability
(reprint) Analysis of database table segmentation technology (horizontal split/vertical split/library table hash)