Using database and table sharding (sub-database, sub-table) in MySQL

Source: Internet
Author: User

1 What is database and table sharding (sub-database, sub-table)?
Data that was originally stored in a single database is spread across multiple databases, and data that was originally stored in a single table is spread across multiple tables.
2 Why split the database and its tables?

The amount of data in a database is not necessarily under control. Without sharding, as time passes and the business grows, the number of tables in the database keeps increasing and so does the amount of data in each table. The cost of data operations (inserts, deletes, updates, and queries) grows accordingly. In addition, an unsharded database cannot be deployed in a distributed way, and the resources of a single server (CPU, disk, memory, I/O, and so on) are limited, so the amount of data the database can hold and its processing capacity will eventually hit a bottleneck.
3 Implementation strategies for database and table sharding

There are two ways to split: vertical segmentation and horizontal segmentation.
3.1 What is vertical slicing?
Vertical slicing divides tables into different databases according to functional module and how closely the tables are related. For example, we might create a definition database workdb, a commodity database paydb, a user database userdb, and a log database logdb, which store the project data definition tables, commodity definition tables, user data tables, and log data tables respectively.
3.2 What is horizontal slicing?
When a single table holds too much data, we can split its rows according to some rule, such as a hash of the user ID, and store them in multiple tables with the same structure on different databases. For example, if every table in userdb holds a large amount of data, userdb can be split into several databases with the same structure (part0db, part1db, and so on), the user data table usertable can be cut into many tables (usertable0, usertable1, and so on), and these tables are then distributed across the databases according to a fixed rule.
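The routing rule for horizontal slicing can be sketched in a few lines of Python. This is only an illustration under assumptions: the database and table names follow the part0db / usertable0 pattern from the text, while the function name locate_shard, the shard counts, and the use of a plain modulo as the "hash" are hypothetical choices, not something the original project specifies.

```python
# Assumed layout for illustration: 2 databases, 4 tables in each.
NUM_DATABASES = 2
TABLES_PER_DATABASE = 4


def locate_shard(user_id: int) -> tuple[str, str]:
    """Map a user ID to the (database, table) pair that stores its row."""
    total_tables = NUM_DATABASES * TABLES_PER_DATABASE
    # Simplest possible hash: the remainder of the numeric user ID.
    slot = user_id % total_tables
    # Tables 0..3 live in part0db, tables 4..7 in part1db.
    db_index = slot // TABLES_PER_DATABASE
    return f"part{db_index}db", f"usertable{slot}"


if __name__ == "__main__":
    for uid in (0, 5, 13):
        print(uid, locate_shard(uid))
```

Every read and write for a user goes through the same function, so a given user ID always lands in the same table. Changing either shard count moves existing rows, which is one reason the splitting granularity has to be chosen carefully up front.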
3.3 Which method should be used?
This depends on where the database bottleneck lies, combined with the business type of the project.
If the bottleneck comes from having many tables and a lot of data overall, and the project's business logic is clearly divided and loosely coupled, then vertical segmentation, whose rules are simple and easy to implement, should be the first choice.
If the database has few tables but a single table holds a very large amount of data, or its data is very hot, horizontal segmentation should be chosen. Horizontal segmentation is more complex than vertical segmentation because it physically separates data that logically belongs together. Besides choosing the splitting granularity carefully and keeping the data and the load evenly distributed, it also imposes extra data management work on the project staff and the application later on.
In real-world projects both situations often occur together, which calls for a trade-off and may even require both vertical and horizontal segmentation. Our game project uses a combination of the two: we first split the database vertically and then split a subset of tables, usually the user data tables, horizontally.
4 Problems with database and table sharding

4.1 Transaction issues
After sharding, the data is stored in different databases, so transaction management becomes difficult. Relying on the distributed transaction management of the database itself carries a high performance cost. Having the application coordinate the work as a logical transaction in program code adds a programming burden instead.
4.2 Cross-database and cross-table join issues
After sharding, it is hard to avoid data that was logically associated being split into different tables and different databases. Join operations are then restricted: we cannot join tables located in different databases, nor tables that were split at different granularities. As a result, business that a single query could complete before may now require multiple queries.
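The "multiple queries instead of one join" pattern might look like the sketch below. It is an assumption-laden illustration: two in-memory SQLite connections stand in for two separate MySQL databases, and the ordertable table, its columns, and the function name are hypothetical.

```python
import sqlite3

# Two independent connections play the role of two databases on different servers.
user_db = sqlite3.connect(":memory:")
order_db = sqlite3.connect(":memory:")

user_db.execute("CREATE TABLE usertable (user_id INTEGER PRIMARY KEY, name TEXT)")
user_db.executemany("INSERT INTO usertable VALUES (?, ?)", [(1, "alice"), (2, "bob")])

order_db.execute("CREATE TABLE ordertable (order_id INTEGER, user_id INTEGER, amount INTEGER)")
order_db.executemany(
    "INSERT INTO ordertable VALUES (?, ?, ?)",
    [(10, 1, 30), (11, 2, 45), (12, 1, 5)],
)


def orders_with_user_names(min_amount):
    """Join orders to user names in the application, using one query per database."""
    # Query 1: fetch the orders from the order database.
    orders = order_db.execute(
        "SELECT order_id, user_id, amount FROM ordertable "
        "WHERE amount >= ? ORDER BY order_id",
        (min_amount,),
    ).fetchall()
    user_ids = sorted({user_id for _, user_id, _ in orders})
    if not user_ids:
        return []
    # Query 2: fetch only the users those orders refer to, from the user database.
    placeholders = ",".join("?" for _ in user_ids)
    names = dict(
        user_db.execute(
            f"SELECT user_id, name FROM usertable WHERE user_id IN ({placeholders})",
            user_ids,
        ).fetchall()
    )
    # The join itself happens in application memory.
    return [(order_id, names.get(user_id), amount) for order_id, user_id, amount in orders]
```

The second query is driven by the result of the first, so the application pays two round trips and does the matching itself, which is exactly the extra work the text describes.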
4.3 Additional data management burden and computation pressure
The most obvious extra management burdens are locating the data and repeating insert, delete, update, and query operations across the shards. These can be handled in the application, but they inevitably cause extra logic. For example, consider a user data table usertable that records user scores, where the business needs the top 100 users. Before sharding, a single ORDER BY statement is enough. After sharding, N ORDER BY statements are needed, one per table to fetch that table's top 100 users, and the results must then be merged to compute the final answer.
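The merge step for the top-100 example can be sketched as follows. Each inner list stands in for the rows of one usertableN, and the per-shard heapq.nlargest call plays the role of one ORDER BY ... LIMIT query; the function name and the (user_id, score) row shape are assumptions made for the illustration.

```python
import heapq


def top_n_across_shards(shards, n):
    """Return the n highest-scoring (user_id, score) rows over all shards."""
    candidates = []
    for rows in shards:
        # Stand-in for: SELECT user_id, score FROM usertableN
        #               ORDER BY score DESC LIMIT n
        candidates.extend(heapq.nlargest(n, rows, key=lambda row: row[1]))
    # Merge step: at most n rows per shard remain, pick the global top n.
    return heapq.nlargest(n, candidates, key=lambda row: row[1])


if __name__ == "__main__":
    shards = [
        [(1, 50), (2, 90)],
        [(3, 70), (4, 95)],
        [(5, 10)],
    ]
    print(top_n_across_shards(shards, 2))  # [(4, 95), (2, 90)]
```

Taking the top n from every shard is sufficient because any row in the global top n must also be in the top n of its own shard. The cost is N queries and up to N × n rows transferred for a result of n rows.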

The following lists some further sharding problems and precautions.

    1. Choosing the sharding dimension
      Suppose a user buys a product and the transaction record must be saved. If tables are split by the user dimension, each user's transactions are stored in the same table, so a user's purchases can be found quickly and conveniently. However, the purchases of a given product are likely to be spread across multiple tables, which makes them more troublesome to find. Conversely, splitting by the product dimension makes it easy to find the purchases of a product, but harder to find a buyer's transaction records.
      Common solutions are:
      A. Scan all the tables. This is basically unusable because it is far too inefficient.
      B. Store two copies of the data, one sharded by the user dimension and one by the product dimension.
      C. Use a search engine. If the real-time requirements are very high, this in turn requires real-time search.
    2. Problems with join queries
      Join queries are basically impossible, because the associated tables may not be in the same database.
    3. Avoid cross-database transactions
      Avoid modifying a table in db1 in the same transaction that modifies a table in db0. Such operations are more complex and also hurt efficiency.
    4. Try to put related data on the same DB server
      For example, put seller A's products and transaction information into db0, so that when db1 goes down, everything related to seller A still works normally. In other words, avoid making the data in one database depend on data in another.

One master, multiple slaves
In practical applications, reads far outnumber writes in the vast majority of cases. MySQL replication makes read/write separation possible: all writes must go to the master, while reads can be served by either the master or the slaves. A slave has the same structure as the master, a master can have multiple slaves, and a slave can even have slaves of its own. This effectively raises the QPS of the DB cluster.
All writes happen on the master first and are then synchronized to the slaves, so there is a delay between master and slave. When the system is very busy the delay gets worse, and increasing the number of slaves also makes it worse.
The master is also the bottleneck of the cluster. Too many writes seriously affect its stability, and if the master goes down the entire cluster stops working properly.
Therefore:
1. When read pressure is high, consider relieving it by adding slaves. Once the number of slaves reaches a certain level, however, sharding has to be considered.
2. When write pressure is high, sharding is required.
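Read/write separation is usually enforced by the application or a proxy in front of the cluster. The sketch below shows one naive way to do the routing, purely as an illustration: the ReadWriteRouter class, the server names, the round-robin policy, and the require_fresh flag are all hypothetical, and real statements such as SELECT ... FOR UPDATE would need extra handling.

```python
import itertools


class ReadWriteRouter:
    """Pick a server for each statement: writes to the master, reads to slaves."""

    READ_VERBS = {"SELECT", "SHOW"}

    def __init__(self, master, slaves):
        self.master = master
        # Rotate through the slaves in turn; None means there are no slaves.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql, require_fresh=False):
        words = sql.split()
        verb = words[0].upper() if words else ""
        # Anything that is not a plain read goes to the master. So does a read
        # that cannot tolerate replication delay (require_fresh=True).
        if verb not in self.READ_VERBS or require_fresh or self._slaves is None:
            return self.master
        return next(self._slaves)


if __name__ == "__main__":
    router = ReadWriteRouter("master", ["slave1", "slave2"])
    print(router.route("INSERT INTO usertable VALUES (1, 'alice')"))
    print(router.route("SELECT * FROM usertable"))
    print(router.route("SELECT * FROM usertable", require_fresh=True))
```

The require_fresh flag is there because of the synchronization delay described above: a read issued right after a write may not see that write on a slave, so such reads are sent to the master at the price of adding to its load.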

Why does MySQL need database and table sharding?
It is fair to say that wherever MySQL is used, as soon as the data volume grows large, one problem comes up immediately: the need to shard.
A question here: why shard at all? Can't MySQL handle a big table?
It can. In projects I have worked on, a single table's physical file was larger than 80 GB and held more than 500 million records, and it was a very core table: the friend relationship table.
But that is not the best way to do it. File systems such as ext3 also have many problems handling very large files, although at this level the XFS file system can be used instead. One problem with an oversized MySQL table, however, is hard to solve: operations that adjust the table structure become basically impossible. So large projects end up having to shard.
As far as InnoDB itself is concerned, the B-tree of the data file has only two locks, the leaf node lock and the child node lock, so you can imagine that a page split or the addition of a new leaf blocks writes to the table.
So sharding is the better choice.
How much data per shard is appropriate?
In testing, read and write performance is still fairly good with 10 million records in a single table. Leaving some buffer, keep a single table below 8 million records if all its columns are numeric, and below 5 million records if it has character columns.
If you plan for 100 databases with 100 tables each, for a user business for example:
5 million × 100 × 100 = 50 billion records.
With a figure like this in mind, planning around the business is relatively easy.
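The planning arithmetic is easy to re-run for other layouts. In this small sketch the per-table limit is the 5 million record guideline for tables with character columns given above; the function and constant names are made up for the illustration.

```python
ROWS_PER_TABLE = 5_000_000   # guideline for tables with character columns
TABLES_PER_DATABASE = 100
DATABASES = 100


def total_capacity(rows_per_table, tables_per_database, databases):
    """Total records the planned layout can hold at the per-table guideline."""
    return rows_per_table * tables_per_database * databases


if __name__ == "__main__":
    capacity = total_capacity(ROWS_PER_TABLE, TABLES_PER_DATABASE, DATABASES)
    print(f"{capacity:,}")  # 50,000,000,000
```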
