Database sub-library and sub-table splitting: horizontal and vertical

Source: Internet
Author: User

First, database bottlenecks
Whether the bottleneck is IO or CPU, the end result is the same: the number of active database connections climbs until it approaches or reaches the maximum number of active connections the database can carry. From the business service's point of view, few or no connections remain available. The consequences are easy to imagine: reduced concurrency, reduced throughput, and crashes.

IO bottleneck
The first kind: disk read IO bottleneck. There is too much hot data for the database cache to hold, so each query generates a large amount of random IO and query speed drops. Remedy: sub-library and vertical sub-table.

The second kind: network IO bottleneck. Too much data is requested and network bandwidth is insufficient. Remedy: sub-library.

CPU bottlenecks
The first kind: SQL problems, such as SQL that contains join, group by, order by, or conditions on non-indexed fields, all of which increase CPU work. Remedy: optimize the SQL, create appropriate indexes, and do business computation in the business service layer.

The second kind: a single table holds too much data, so queries scan too many rows, SQL is inefficient, and CPU work increases. Remedy: horizontal sub-table.

Second, sub-database sub-table
Horizontal Sub-Library

Concept: Taking a field as the basis, split the data in one library into multiple libraries according to a strategy (hash, range, etc.).
Results:
The structure of each library is the same;
The data in each library is different, with no intersection;
The union of all libraries is the full data set;
Scenario: The system's absolute concurrency has risen, splitting tables cannot fundamentally solve the problem, and there is no obvious business ownership by which to split the library vertically.
Analysis: With more libraries, IO and CPU pressure can naturally be relieved several-fold.
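The hash strategy for a horizontal sub-library can be sketched as follows. This is a minimal illustration, not the method of any particular tool: the function name `route_library`, the library count of 4, and the use of CRC32 as the hash are all assumptions made for the example.

```python
import zlib


def route_library(shard_key: str, library_count: int) -> int:
    """Return the index of the library that stores rows with this key."""
    # CRC32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(shard_key.encode("utf-8")) % library_count


LIBRARY_COUNT = 4  # hypothetical number of libraries

# Every library has the same structure; each key lands in exactly one of them.
libraries = {index: [] for index in range(LIBRARY_COUNT)}
for user in ("alice", "bob", "carol", "dave", "erin"):
    libraries[route_library(user, LIBRARY_COUNT)].append(user)
```

Because the route depends only on the key and the library count, the libraries never overlap and their union is the full data set.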
Horizontal Sub-table

Concept: Taking a field as the basis, split the data in one table into multiple tables according to a strategy (hash, range, etc.).
Results:
The structure of each table is the same;
The data in each table is different, with no intersection;
The union of all tables is the full data set;
Scenario: The system's absolute concurrency has not risen, but a single table holds too much data, which hurts SQL efficiency and increases the CPU burden until it becomes the bottleneck.
Analysis: With less data in each table, a single SQL statement executes more efficiently, and the CPU burden is naturally reduced.
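The two strategies named above, hash and range, might route a row to a table as in this sketch. The table names `t_user` and `t_order`, and the idea of fixing a number of rows per range table, are assumptions for illustration only.

```python
def table_by_hash(user_id: int, table_count: int, base_name: str = "t_user") -> str:
    """Hash strategy: spread rows evenly by taking the key modulo the table count."""
    return f"{base_name}_{user_id % table_count}"


def table_by_range(order_id: int, rows_per_table: int, base_name: str = "t_order") -> str:
    """Range strategy: consecutive ids share a table until the range is full."""
    if order_id < 0:
        raise ValueError("order_id must be non-negative")
    return f"{base_name}_{order_id // rows_per_table}"
```

Hash tends to spread load evenly, while range keeps neighbouring ids together and lets new tables be appended as ids grow.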
Vertical Sub-Library

Concept: Taking tables as the basis, split different tables into different libraries according to business ownership.
Results:
The structure of each library is different;
The data in each library is different, with no intersection;
The union of all libraries is the full data set;
Scenario: The system's absolute concurrency has risen, and individual business modules can be abstracted out.
Analysis: At this point the system can basically move toward services. For example, as the business grows, common configuration tables, dictionary tables, and the like accumulate; these can be split into a separate library and even turned into a service. Likewise, once business growth incubates a distinct business model, the related tables can be split into a separate library and even turned into a service.
Vertical Sub-table

Concept: Taking fields as the basis, split the fields of a table into different tables (a main table and an extension table) according to how active the fields are.
Results:
The structure of each table is different;
The data in each table is different. In general, the tables share at least one column, usually the primary key, which is used to associate the data;
The union of all tables is the full data set;
Scenario: The system's absolute concurrency has not risen and the table does not have many records, but it has many fields, and hot and non-hot data are stored together, so a single row needs a lot of storage space. As a result, the database cache holds fewer rows, and queries must read from disk, generating a large amount of random read IO and an IO bottleneck.
Analysis: A list page and a details page are a helpful way to think about it. The principle of a vertical sub-table split is to put hot data (data that is frequently queried together) in the main table and non-hot data in the extension table. More hot data can then be cached, which reduces random read IO. After the split, getting all the data requires associating the two tables. But never use join, because a join not only increases the CPU burden but also couples the two tables (they must be on the same database instance). Instead, associate the data in the business service layer: fetch the main table data and the extension table data separately, then combine them through the associated field.
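Associating the main table and the extension table in the service layer, rather than with a join, could look like the sketch below. It uses two separate in-memory SQLite connections to stand in for two database instances; the table names `item_main` and `item_ext` and their columns are hypothetical.

```python
import sqlite3

# Two connections stand in for two independent database instances.
main_db = sqlite3.connect(":memory:")
ext_db = sqlite3.connect(":memory:")
main_db.execute("CREATE TABLE item_main (id INTEGER PRIMARY KEY, title TEXT, price REAL)")
ext_db.execute("CREATE TABLE item_ext (id INTEGER PRIMARY KEY, description TEXT)")
main_db.executemany("INSERT INTO item_main VALUES (?, ?, ?)",
                    [(1, "keyboard", 49.0), (2, "mouse", 19.0)])
ext_db.executemany("INSERT INTO item_ext VALUES (?, ?)",
                   [(1, "long text for keyboard"), (2, "long text for mouse")])


def list_page():
    """The list page reads only the hot columns in the main table."""
    return main_db.execute("SELECT id, title, price FROM item_main ORDER BY id").fetchall()


def detail_page(item_id: int):
    """The details page fetches both tables and merges them by primary key."""
    main_row = main_db.execute(
        "SELECT id, title, price FROM item_main WHERE id = ?", (item_id,)).fetchone()
    if main_row is None:
        return None
    ext_row = ext_db.execute(
        "SELECT description FROM item_ext WHERE id = ?", (item_id,)).fetchone()
    return {"id": main_row[0], "title": main_row[1], "price": main_row[2],
            "description": ext_row[0] if ext_row else None}
```

The merge costs one extra round trip on the details page, but the two tables stay free to live on different instances.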
Third, sub-database sub-table tools
Sharding-Sphere: jar, formerly known as Sharding-JDBC;
TDDL: jar, Taobao Distribute Data Layer;
Mycat: middleware.
Note: Research the pros and cons of each tool yourself; consult the official website and community first.

Fourth, sub-database sub-table steps
Assess the number of sub-libraries or sub-tables based on capacity (current capacity and growth) -> select the key (it must distribute evenly) -> choose the table-splitting rule (hash, range, etc.) -> execute (usually with double write) -> handle the scaling problem (minimize data movement).
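The first step, sizing from capacity, is simple arithmetic. The sketch below is one possible way to do it; the function name, the linear growth model, and the choice to round up to a power of two (which keeps later doubling simple) are assumptions, and the numbers in the usage line are made up.

```python
def estimate_table_count(current_rows: int, monthly_growth_rows: int,
                         months: int, max_rows_per_table: int) -> int:
    """Estimate how many tables are needed, rounded up to a power of two."""
    projected_rows = current_rows + monthly_growth_rows * months
    # Integer ceiling division, with at least one table.
    needed = max(1, -(-projected_rows // max_rows_per_table))
    # Round up to the next power of two (an assumption, not a requirement).
    return 1 << (needed - 1).bit_length()


# Hypothetical figures: 30M rows today, 2M new rows a month, a 3-year horizon,
# and a self-imposed ceiling of 10M rows per table.
table_count = estimate_table_count(30_000_000, 2_000_000, 36, 10_000_000)
```

With these figures the projection is 102M rows, which needs 11 tables and rounds up to 16.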

Fifth, sub-database sub-table problems
Queries on a non-partition key (horizontal sub-library/sub-table, with the commonly used hash split strategy)
On the client side, only one non-partition key besides the partition key is used as a query condition
Mapping method

Gene method

Note: On write, the gene method generates user_id. As for the x-bit gene: to split into 8 tables, for example, 2^3 = 8, so x is 3, that is, a 3-bit gene. When querying by user_id, take the modulus directly to route to the corresponding sub-library or sub-table. When querying by user_name, first generate user_name_code with the user_name_code generation function, then take its modulus to route to the corresponding sub-library or sub-table. The snowflake algorithm is commonly used for id generation.
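The gene method described in the note can be sketched as follows. CRC32 stands in for the user_name_code generation function, and a plain `sequence` integer stands in for the high bits that an id generator such as snowflake would supply; both are assumptions for illustration.

```python
import zlib

GENE_BITS = 3                 # 2^3 = 8 tables, so a 3-bit gene
TABLE_COUNT = 1 << GENE_BITS


def user_name_code(user_name: str) -> int:
    """Hypothetical user_name_code generation function (CRC32 as a stand-in)."""
    return zlib.crc32(user_name.encode("utf-8"))


def make_user_id(sequence: int, user_name: str) -> int:
    """Embed the gene (low bits of user_name_code) in the low bits of user_id."""
    gene = user_name_code(user_name) % TABLE_COUNT
    return (sequence << GENE_BITS) | gene


def table_for_user_id(user_id: int) -> int:
    return user_id % TABLE_COUNT


def table_for_user_name(user_name: str) -> int:
    return user_name_code(user_name) % TABLE_COUNT
```

Since the low 3 bits of user_id equal the low 3 bits of user_name_code, both queries route to the same table without a mapping lookup.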

On the client side, more than one non-partition key besides the partition key is used as a query condition
Mapping method

Redundancy method

Note: Queries by order_id or buyer_id are routed to the db_o_buyer library, and queries by seller_id are routed to the db_o_seller library. It feels a bit like putting the cart before the horse! Is there a better way? Change the technology stack?
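A rough sketch of the redundancy method follows, with in-memory dictionaries standing in for the db_o_buyer and db_o_seller libraries. The shard count and the way order_id carries the buyer's shard in its low digits (so that order_id can route to db_o_buyer) are assumptions, not details given in the note.

```python
SHARDS = 4  # hypothetical number of shards in each library family

db_o_buyer = [{} for _ in range(SHARDS)]   # sharded by buyer_id
db_o_seller = [{} for _ in range(SHARDS)]  # redundant copy, sharded by seller_id


def make_order_id(sequence: int, buyer_id: int) -> int:
    """Assumed id layout: order_id % SHARDS equals buyer_id % SHARDS."""
    return sequence * SHARDS + buyer_id % SHARDS


def save_order(sequence: int, buyer_id: int, seller_id: int) -> dict:
    order = {"order_id": make_order_id(sequence, buyer_id),
             "buyer_id": buyer_id, "seller_id": seller_id}
    db_o_buyer[buyer_id % SHARDS][order["order_id"]] = order
    db_o_seller[seller_id % SHARDS][order["order_id"]] = dict(order)  # redundant write
    return order


def find_by_order_id(order_id: int):
    return db_o_buyer[order_id % SHARDS].get(order_id)


def find_by_buyer_id(buyer_id: int) -> list:
    shard = db_o_buyer[buyer_id % SHARDS]
    return [o for o in shard.values() if o["buyer_id"] == buyer_id]


def find_by_seller_id(seller_id: int) -> list:
    shard = db_o_seller[seller_id % SHARDS]
    return [o for o in shard.values() if o["seller_id"] == seller_id]
```

Every query touches a single shard, at the cost of storing each order twice and keeping the two copies consistent.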

In the back office, queries combine the partition key with various non-partition keys as conditions
NoSQL method

Redundancy method

Note: Consider how to solve this when the range method is used for splitting.

Cross-library, cross-table paging queries on a non-partition key (horizontal sub-library/sub-table, with the commonly used hash split strategy)
Note: Solve this with the NoSQL method (ES, etc.).

Scaling problem (horizontal sub-library/sub-table, with the commonly used hash split strategy)
Horizontal library expansion (promoting the replica (slave) library)

Note: Capacity expands in multiples.
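Why expansion in multiples suits the hash strategy can be checked with a few lines. Assuming routing by `key % library_count` (an assumption for this sketch), doubling the count sends every key in old library i to either library i or library i + old_count, so a promoted replica of library i already holds all the data its new index needs.

```python
def library_index(key: int, library_count: int) -> int:
    return key % library_count


def moves_after_doubling(keys, old_count: int) -> set:
    """Return the set of (old_index, new_index) pairs seen when the count doubles."""
    new_count = old_count * 2
    return {(library_index(k, old_count), library_index(k, new_count)) for k in keys}


pairs = moves_after_doubling(range(1000), old_count=2)
# Each old library splits into exactly two targets: itself and itself + old_count.
only_expected_moves = all(new in (old, old + 2) for old, new in pairs)
```

After the switch, each library only needs to delete the rows that no longer route to it; no rows have to be copied between libraries.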

Horizontal table expansion (double-write migration method)

First step: (synchronous double write) configure the application for double write and deploy;
Second step: (synchronous double write) copy the old data from the old library to the new library;
Third step: (synchronous double write) verify the old data in the new library, treating the old library as the source of truth;
Fourth step: (synchronous double write) remove double write from the application and deploy;
Note: Double write is a general-purpose scheme. Consider how to handle this when the range method is used for splitting.
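The four steps above can be sketched with two dictionaries standing in for the old and new libraries. The class and method names are hypothetical, and a real migration would also have to handle concurrent writes, failures, and reads during cut-over, which this sketch ignores.

```python
class DoubleWriteStore:
    """Toy model of a double-write migration from an old store to a new one."""

    def __init__(self):
        self.old = {}
        self.new = {}
        self.mode = "old_only"  # "old_only" -> "double" -> "new_only"

    def write(self, key, value):
        if self.mode in ("old_only", "double"):
            self.old[key] = value
        if self.mode in ("double", "new_only"):
            self.new[key] = value

    def enable_double_write(self):
        """Step 1: the application writes to both stores."""
        self.mode = "double"

    def backfill(self):
        """Step 2: copy old data across without touching double-written rows."""
        for key, value in self.old.items():
            self.new.setdefault(key, value)

    def verify(self):
        """Step 3: list keys whose new copy differs, with the old store as truth."""
        return [key for key, value in self.old.items() if self.new.get(key) != value]

    def cut_over(self):
        """Step 4: remove double write; only the new store is written."""
        self.mode = "new_only"
```

A typical run calls `enable_double_write()`, `backfill()`, checks that `verify()` returns an empty list, and only then calls `cut_over()`.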

Summary of sub-database sub-table
Before splitting, first find out where the bottleneck is, and then split sensibly (sub-library or sub-table? horizontal or vertical? into how many parts?). Do not split merely for the sake of splitting.
Choosing the key is important: consider both an even split and queries on non-partition keys.
As long as the requirements are met, the simpler the split rule, the better.
