Splitting databases and tables (sub-database, sub-table)

Source: Internet
Author: User
Tags: mysql, index

Traditional database and table splitting

Traditional database and table splitting is implemented in application-layer logic; at the database level, the results are just ordinary databases and tables.
Splitting databases
Why split databases

First, when a single database server has sufficient performance, splitting data across several databases has no effect on performance. As far as storage is concerned, a database acts only as a namespace: the table files of a database are stored in a directory named after the database. For example, take the following employees database:

mysql> show tables in employees;
+---------------------+
| Tables_in_employees |
+---------------------+
| departments         |
| dept_emp            |
| dept_manager        |
| employees           |
| salaries            |
| titles              |
+---------------------+

On the operating system, it looks like this:


ls /usr/local/var/mysql/employees
db.opt           dept_emp.frm     dept_manager.ibd salaries.frm     titles.ibd
departments.frm  dept_emp.ibd     employees.frm    salaries.ibd
departments.ibd  dept_manager.frm employees.ibd    titles.frm



A database is not a file; it only plays the role of a namespace. MySQL therefore imposes no limit on the size of a database, and no limit on the number of tables inside it.

So why split databases?

The answer is to solve the performance problem of a single server. When a single database server can no longer support the current amount of data, the tables need to be grouped by how closely they are related in the business logic, and each group placed on a different database server to reduce the load on any single server.

Database splitting generally starts with vertical partitioning. Horizontal partitioning is considered only if, after vertical partitioning, the amount of data is still more than a single server can carry.

For example, suppose a forum system needs to split its database because the current server can no longer meet demand. Partition vertically first, following the business logic: user-related tables such as user information, points, and private messages go into a user database; forum-related tables such as boards, posts, and replies go into a forum database. The two databases are placed on different servers.

After the split, the tables are rarely completely unrelated. For example, the author of a post or a reply lives in the user database. Before the split, a single join query could return the replies to a post together with the poster and the repliers. After the split, a join cannot span databases on different servers, so the final result can only be assembled from multiple queries.
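The multi-query pattern described above can be sketched as follows. This is a minimal illustration, not the author's implementation: two in-memory SQLite connections stand in for the user and forum servers, and the table names, columns, and the `replies_with_authors` helper are all hypothetical.

```python
import sqlite3

# Two separate in-memory databases stand in for the two servers (assumption).
user_db = sqlite3.connect(":memory:")
forum_db = sqlite3.connect(":memory:")

user_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
user_db.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

forum_db.execute(
    "CREATE TABLE replies ("
    "id INTEGER PRIMARY KEY, post_id INTEGER, user_id INTEGER, body TEXT)"
)
forum_db.executemany(
    "INSERT INTO replies VALUES (?, ?, ?, ?)",
    [(10, 100, 2, "first reply"), (11, 100, 1, "second reply")],
)


def replies_with_authors(post_id):
    """Assemble replies and author names with two queries instead of one JOIN."""
    # Query 1: fetch the replies from the forum database.
    replies = forum_db.execute(
        "SELECT id, user_id, body FROM replies WHERE post_id = ? ORDER BY id",
        (post_id,),
    ).fetchall()
    if not replies:
        return []

    # Query 2: fetch only the referenced users from the user database.
    user_ids = sorted({user_id for _, user_id, _ in replies})
    placeholders = ",".join("?" for _ in user_ids)
    rows = user_db.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})", user_ids
    ).fetchall()
    names = dict(rows)

    # Merge the two result sets in application code.
    return [(reply_id, names.get(user_id), body) for reply_id, user_id, body in replies]


print(replies_with_authors(100))
# [(10, 'bob', 'first reply'), (11, 'alice', 'second reply')]
```

Batching the user lookup into one `IN (...)` query keeps the cost at two round trips per page rather than one per reply.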

To sum up: the purpose of splitting databases is to reduce the load on a single server; the splitting principle is to group tables by how closely they are related in the business; the disadvantage is that join queries across the split databases are no longer possible.


Splitting tables
Reasons for splitting tables

When the amount of data becomes very large, a B-Tree index becomes far less effective. Unless the query is served by a covering index, the database server has to fetch every matching row based on the result of the index scan. With a large amount of data this causes a great deal of random I/O, and the database response time becomes unacceptably long. In addition, the cost of maintaining the index (disk space, I/O operations) is also very high.

Vertical table splitting

Reason:

1. From the way MySQL indexes are implemented and the related optimization strategies, we know that the leaf nodes of the InnoDB primary (clustered) index store the entire row. Reducing the number of columns therefore lets memory hold more rows, which benefits queries.

2. A table's file is subject to the file size limit of the operating system.

Splitting principle:

Move columns that are rarely used, that are only loosely related to the table's core business logic, or that store large content into a new table. The original table can then hold more rows.
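As a rough sketch of the vertical split described above, the example below moves a large column out of a wide table into a companion table that shares the same primary key. SQLite is used only so the example is self-contained; it does not reproduce InnoDB page behavior, and the table and column names (`post_wide`, `post`, `post_body`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original wide table: small, frequently read columns next to a large body column.
conn.execute(
    "CREATE TABLE post_wide ("
    "id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER, body TEXT)"
)
conn.execute("INSERT INTO post_wide VALUES (1, 'Hello', 7, 'a very long body ...')")

# Vertical split: hot columns stay in post, the large column moves to post_body.
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER)")
conn.execute("CREATE TABLE post_body (post_id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO post SELECT id, title, author_id FROM post_wide")
conn.execute("INSERT INTO post_body SELECT id, body FROM post_wide")

# Listing queries touch only the narrow table.
listing = conn.execute("SELECT id, title FROM post").fetchall()

# The body is fetched by primary key only when a single post is opened.
body = conn.execute(
    "SELECT body FROM post_body WHERE post_id = ?", (1,)
).fetchone()[0]

print(listing, body)
```

The trade-off is that reading a full post now takes two lookups, so the split pays off mainly when the narrow table is queried much more often than the large column.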


Horizontal table splitting


Reason:

1. As the amount of data grows, the table ends up with a huge number of rows and queries become less efficient.

2. The table is also subject to the file size limit of the operating system, so the amount of data cannot grow indefinitely. Once it reaches a certain size, the table needs to be split horizontally to reduce the size of each single table (file).

Splitting principle: by increasing ranges, by hash, or by other business logic.

Which method to use depends on the actual business logic.

For example, if most accesses to a table hit recently created data and historical data is rarely read, consider splitting by time, with one table per period (such as one per year).
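A time-based split like the one above needs only a small routing function in the application layer. The sketch below assumes a yearly period and a `<base>_<year>` naming scheme; both the scheme and the `table_for_date` helper are hypothetical.

```python
from datetime import date


def table_for_date(base_name, created):
    """Return the name of the yearly split table for a row created on `created`."""
    return f"{base_name}_{created.year}"


print(table_for_date("posts", date(2015, 3, 14)))  # posts_2015
```

Queries for recent data then touch only the newest table, while older tables can be archived or moved to cheaper storage.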

If access to the table is fairly uniform, with no obvious hot spots, consider splitting by range (such as 5 million rows per table), by an ordinary hash, or by a consistent hash.
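Range and ordinary hash routing can be sketched as below. The figure of 5 million rows per table comes from the text; the table count of 4, the naming scheme, and the choice of CRC32 as the hash are assumptions made for illustration.

```python
import zlib

ROWS_PER_TABLE = 5_000_000  # range size from the text
TABLE_COUNT = 4             # assumed number of hash-split tables


def range_table(base_name, row_id):
    """Route by ID range: IDs 1..5,000,000 go to table 0, the next 5 million to table 1."""
    return f"{base_name}_{(row_id - 1) // ROWS_PER_TABLE}"


def hash_table(base_name, key):
    """Route by an ordinary hash: a stable hash of the key modulo the table count."""
    digest = zlib.crc32(str(key).encode("utf-8"))
    return f"{base_name}_{digest % TABLE_COUNT}"


print(range_table("posts", 5_000_000))  # posts_0
print(range_table("posts", 5_000_001))  # posts_1
print(hash_table("posts", 12345))
```

With the modulo approach, changing `TABLE_COUNT` remaps most keys, which is the limitation that consistent hashing is meant to reduce.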

Global primary key issues:

Tables that originally relied on the database to generate primary keys (such as auto-increment columns) need their own primary key generation after the split. Split rules are generally built on the primary key, so when inserting new data the primary key must be determined first; only then can the target table be found.

In practice, fairly mature schemes already exist. For tables that use an auto-increment column as the primary key, Flickr's global primary key generation scheme handles both the performance problem and the single-point-of-failure problem well; for the implementation details, refer to this post. There are also global primary key generation schemes similar to UUID, such as Instagram's ID generator, which can serve as a reference.
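As publicly described, Flickr's scheme runs two ticket databases whose auto-increment columns use an increment of 2 with offsets 1 and 2, so one hands out odd IDs and the other even IDs. The sketch below only simulates that idea in memory; the `TicketServer` and `TicketClient` classes are hypothetical stand-ins, not Flickr's code, and failover handling is left out.

```python
import itertools


class TicketServer:
    """In-memory stand-in for one ticket database with a stepped auto-increment."""

    def __init__(self, offset, increment):
        self._counter = itertools.count(offset, increment)

    def next_id(self):
        return next(self._counter)


class TicketClient:
    """Alternates between ticket servers so that ID generation is not tied to one server."""

    def __init__(self, servers):
        self._servers = itertools.cycle(servers)

    def next_id(self):
        return next(self._servers).next_id()


# Two servers: one generates 1, 3, 5, ... and the other 2, 4, 6, ...
client = TicketClient([TicketServer(1, 2), TicketServer(2, 2)])
print([client.next_id() for _ in range(6)])  # [1, 2, 3, 4, 5, 6]
```

Because the two sequences never overlap, IDs stay unique even if requests are not balanced evenly; they are just no longer strictly sequential.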

Consistent hash:

Splitting by consistent hash scales better than splitting by ordinary hash, and makes it easier to add and remove split tables. For the principles of consistent hashing, refer to this post. If the split tables are stored on different server nodes, you can hash the node name or IP, as in the post; if the split tables all live on one server, you can hash the split table names.
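A minimal consistent hash ring over split table names might look like the sketch below. It is an illustration under assumptions, not the method from the referenced post: the `HashRing` class, the use of SHA-256, and the 100 virtual nodes per table are all choices made for this example.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring mapping keys to split tables (or node names)."""

    def __init__(self, nodes=(), replicas=100):
        self._replicas = replicas  # virtual nodes per real node (assumed value)
        self._positions = []       # sorted ring positions
        self._owners = {}          # ring position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        """Place the node's virtual nodes on the ring."""
        for i in range(self._replicas):
            pos = self._hash(f"{node}#{i}")
            self._owners[pos] = node
            bisect.insort(self._positions, pos)

    def remove(self, node):
        """Take the node's virtual nodes off the ring."""
        for i in range(self._replicas):
            pos = self._hash(f"{node}#{i}")
            del self._owners[pos]
            self._positions.remove(pos)

    def get(self, key):
        """Return the node owning the first ring position after the key's hash."""
        if not self._positions:
            raise LookupError("hash ring is empty")
        idx = bisect.bisect(self._positions, self._hash(str(key)))
        return self._owners[self._positions[idx % len(self._positions)]]


ring = HashRing(["posts_0", "posts_1", "posts_2"])
print(ring.get(42))
ring.add("posts_3")  # only keys that now fall to posts_3 change owner
print(ring.get(42))
```

When a table is added, a key changes owner only if one of the new table's virtual nodes lands between the key and its previous owner, so existing keys move only to the new table and the rest stay where they are.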
