Mysql encounters horizontal table sharding when the data volume is large.
1. Split tables based on business attributes
This table sharding algorithm is roughly modulo, hash, md5, and so on.
If you want to split a table using a business attribute and the business relationship is complex, other conditions must be associated with the attribute to be queried based on other conditions. The query condition must contain this attribute.
Example:
The user profile table is split horizontally based on the user ID modulo.
There are groups in the community, applications in the group, and various types of applications. You can split the table by group ID and Application ID.
Problem:
The table sharding attribute cannot be obtained during query based on a condition.
1) The condition contains table sharding information.
For example, if a user places an order on a website, we perform table sharding based on the user ID, so that the user can conveniently query the order associated with the user. However, when a user reports a complaint, the customer service needs to query the order according to the order number. The order number can contain table sharding information, such as splitting the order into 100 tables, there can be two orders in the order number to indicate which table the Order is in
2) store the corresponding association with key-value store
The principle is to use key value store for index tables.
3) data redundancy
Data redundancy can be performed on tables to be associated. This avoids queries.
Example:
Buy a gift. When purchasing a virtual gift, we split the table based on the purchaser's ID, and the order number also contains table sharding information. However, the user may also perform query based on the recipient. In this case, we can generate a record for the recipient's redundancy after the purchase is successful.
4) cache, NOSQL
Similar to data redundancy. In the case of table sharding for a group application, we have split tables by group ID and application type. However, it is difficult for me to query all types of applications recently. We need to query all the application types of the group, sort the group, and pagination. In fact, you can store hundreds of applications in the cache mode.
2. Split the table by Time
When the table relationship is complex, you cannot perform table sharding based on a dimension. However, there is obvious timeliness.
Example:
I think everyone will use meager resources. The meager content sent by someone will be pushed to thousands of households. Therefore, you cannot perform table sharding queries based on the user ID. However, meager values have a strong timeliness. The default dynamic information of a year ago will not be concerned any more. We split the meager table by time, one table in three months. The row-level cache (memcached) is only stored for one month. The user's meager inbox (meager ID list) is generally limited. When the cache server is restarted or does not hit, you need to query Mysql and mysql by time and table. When the cache does not hit, most of the cases are the meager results of the last three months. So we can store our meager resources on database servers with good physical resources in the last year.
3. Split the table based on the auto-increment ID
This split method is not a modulo score, but a certain amount of data is stored in each table. If the data volume reaches, it is stored in the new table. In this way, the data volume of each table can be fully controlled. The relationship is very simple and time-sensitive.
4. data migration methods
When some data is queried for a long time ago, it is rarely queried. For example, the employee payroll can only be saved for this year's salary. Historical data can be migrated to a salary_old table to prevent data loss. But it can also be used for query.
The principle of database sharding is similar.
By the zeroth Space