Data storage evolution, stage one: single database, single table
A single database with a single table is the most common design. For example, the user table lives in database db, and every user can be found in that one user table.
Data storage evolution, stage two: single database, multiple tables
As the number of users grows, the user table holds more and more data. Once the data reaches a certain volume, queries against the user table gradually slow down and drag down the performance of the whole database. With MySQL there is a more serious problem: when you need to add a column, MySQL locks the table, and all reads and writes have to wait.
The users can be split horizontally by some rule, producing tables with exactly the same structure: user_0000, user_0001, and so on. The data in user_0000 + user_0001 + ... adds up to exactly one complete data set.
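As a minimal sketch of such a horizontal split, the snippet below distributes user rows over identically structured tables. The rule used here (id mod table count), the table count, and the helper names table_name and split_users are assumptions chosen for illustration; the text only says the split happens "by some rule".

```python
# Hypothetical sketch: split one logical user table into N physical tables.
TABLE_COUNT = 2  # assumption: two tables, user_0000 and user_0001


def table_name(index):
    """Build a physical table name such as user_0000."""
    return f"user_{index:04d}"


def split_users(users, table_count=TABLE_COUNT):
    """Assign each user row to exactly one table (assumed rule: id mod N)."""
    tables = {table_name(i): [] for i in range(table_count)}
    for user in users:
        tables[table_name(user["id"] % table_count)].append(user)
    return tables


users = [{"id": i, "name": f"user{i}"} for i in range(1, 7)]
tables = split_users(users)

# Each row lands in one table, and the tables together hold the full data set.
merged = sorted(
    (row for rows in tables.values() for row in rows),
    key=lambda row: row["id"],
)
```

Here merged equals the original users list, which is the "one complete data set" property described above.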
Data storage evolution, stage three: multiple databases, multiple tables
As the data keeps growing, a single database may run out of storage space, and as query volume grows, a single database server can no longer carry the load. At that point the databases themselves have to be split horizontally.
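One possible way to picture the two-level layout is a function that maps a user ID to both a database and a table. The counts, the db0/db1 naming, and the packing scheme below are assumptions for illustration, not a layout prescribed by the text.

```python
DB_COUNT = 2        # assumption: two databases, db0 and db1
TABLES_PER_DB = 2   # assumption: two user tables in each database


def locate(user_id):
    """Return (database name, table name) for a user id.

    Assumed rule: number the tables globally, then pack them into
    databases in order, so db0 holds tables 0-1 and db1 holds tables 2-3.
    """
    slot = user_id % (DB_COUNT * TABLES_PER_DB)  # global table number, 0..3
    db_index = slot // TABLES_PER_DB             # database holding that table
    return f"db{db_index}", f"user_{slot:04d}"
```

With these assumed counts, locate(123) gives ("db1", "user_0003") and locate(4) gives ("db0", "user_0000").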
Rules for splitting MySQL databases and tables
When you design a table, you need to decide which rule will be used to split it. For example, when a new user registers, the program has to decide which table the user's record goes into. Likewise, at login the program has to find the matching record through the user's account. Both operations must follow the same rule.
Routing
Routing is the process of using the splitting rule to find the right table and database. Take the rule user_id mod 4: when a user registers a new account with ID 123, computing 123 mod 4 = 3 tells us the account should be saved in table user_0003. When user 123 logs in, the same calculation tells us the record is in user_0003.
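The routing step can be sketched end to end as follows. To keep the example runnable anywhere, Python's built-in sqlite3 module stands in for MySQL; the column names and the helpers route, register, and find_user are hypothetical.

```python
import sqlite3

TABLE_COUNT = 4  # matches the "user_id mod 4" rule from the text


def route(user_id):
    """Apply the rule user_id mod 4 and return the physical table name."""
    return f"user_{user_id % TABLE_COUNT:04d}"


# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
for i in range(TABLE_COUNT):
    conn.execute(
        f"CREATE TABLE user_{i:04d} (id INTEGER PRIMARY KEY, account TEXT)"
    )


def register(user_id, account):
    """Insert a new user into the table chosen by the routing rule."""
    conn.execute(
        f"INSERT INTO {route(user_id)} (id, account) VALUES (?, ?)",
        (user_id, account),
    )


def find_user(user_id):
    """Look a user up in the only table that can contain it."""
    cur = conn.execute(
        f"SELECT id, account FROM {route(user_id)} WHERE id = ?",
        (user_id,),
    )
    return cur.fetchone()


register(123, "alice")
```

Registration and lookup call the same route function, so both always agree on user_0003 for ID 123. The table name is built from a computed integer only, never from user input.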
The following are the problems caused by splitting databases and tables, and points to watch
1. The question of which dimension to split on
Suppose a user buys a product and the transaction record needs to be saved. If tables are split on the user dimension, all of a user's transactions are stored in the same table, so finding one user's purchases is easy, but the purchases of a given product are likely to be scattered across several tables and are harder to find. Conversely, if tables are split on the product dimension, finding the purchases of a product is easy, but finding one buyer's transactions is harder.
The common solutions are:
A. Scan every table. This is so inefficient that it is basically not an option.
B. Store the data twice, one copy split on the user dimension and one copy split on the product dimension.
C. Use a search engine. If the real-time requirements are high, though, this also brings in real-time search.
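Option B can be sketched as below. Plain dictionaries stand in for the two sets of physical tables, and the shard count and function names are assumptions for illustration. The sketch does not address keeping the two copies consistent, which a real system would also have to handle.

```python
from collections import defaultdict

SHARDS = 4  # assumption: four tables per dimension

# Each dict stands in for a set of physical tables: shard index -> rows.
by_user = defaultdict(list)     # split on user_id
by_product = defaultdict(list)  # split on product_id


def record_purchase(user_id, product_id, amount):
    """Store the same transaction twice, once per dimension (option B)."""
    row = {"user_id": user_id, "product_id": product_id, "amount": amount}
    by_user[user_id % SHARDS].append(row)
    by_product[product_id % SHARDS].append(dict(row))  # independent copy


def purchases_of_user(user_id):
    """Read one shard instead of scanning every table."""
    return [r for r in by_user[user_id % SHARDS] if r["user_id"] == user_id]


def purchases_of_product(product_id):
    """Read one shard of the product-dimension copy."""
    shard = by_product[product_id % SHARDS]
    return [r for r in shard if r["product_id"] == product_id]


record_purchase(123, 7, 30)
record_purchase(124, 7, 45)
record_purchase(123, 9, 10)
```

Both purchases_of_user(123) and purchases_of_product(7) return two rows here, each from a single shard, at the cost of storing every transaction twice.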
2. The question of join queries
Join queries are basically impossible, because the tables being joined may not be in the same database.
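To make the consequence concrete, the sketch below shows what replaces a JOIN when the two tables sit in different databases: two separate lookups combined in application code. The data, the table roles, and the function name are hypothetical, and plain Python structures stand in for the two databases.

```python
# Assumption: these two structures stand in for tables in different databases.
db0_users = {123: {"id": 123, "name": "alice"}}
db1_orders = [
    {"order_id": 1, "user_id": 123, "amount": 30},
    {"order_id": 2, "user_id": 123, "amount": 10},
]


def orders_with_user(user_id):
    """Emulate a JOIN with two separate lookups merged in application code."""
    user = db0_users.get(user_id)  # lookup 1, against db0
    if user is None:
        return []
    # lookup 2, against db1
    orders = [o for o in db1_orders if o["user_id"] == user_id]
    return [{**order, "name": user["name"]} for order in orders]
```

The database can no longer do this work in one statement, so the application pays for the extra round trip and the merge.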
3. Avoid cross-database transactions
Avoid modifying a table in db0 and a table in db1 within the same transaction. It is more complex to implement, and it also hurts efficiency.
4. Try to keep data that belongs together on the same DB server
For example, if seller A's products and transaction records are all stored in db0, then when db1 goes down, everything related to seller A still works normally. In other words, avoid making data in one database depend on data in another database.
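A rough sketch of this idea is to route every table that belongs to a seller by the same key, so that all of it lands in one database. The database count, the seller IDs, and the table names goods and trade are assumptions for illustration.

```python
DB_COUNT = 2  # assumption: two databases, db0 and db1


def db_for_seller(seller_id):
    """Route by seller id only, so all of a seller's data shares a database."""
    return f"db{seller_id % DB_COUNT}"


def locate(table, seller_id):
    """Return (database, table) for a seller's rows in a hypothetical table."""
    return db_for_seller(seller_id), table


def seller_available(seller_id, down_databases):
    """A seller stays usable while the one database holding its data is up."""
    return db_for_seller(seller_id) not in down_databases
```

With this rule, locate("goods", 2) and locate("trade", 2) both point at db0, so seller_available(2, {"db1"}) is True even while db1 is down.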
One master, multiple slaves
In real applications, reads usually far outnumber writes. MySQL's master-slave replication makes read/write splitting possible: all writes must go to the master, while reads can be served by the master or by the slaves. A slave has exactly the same structure as the master, one master can have several slaves, and a slave can even have slaves of its own. This is an effective way to raise the QPS of the DB cluster.
All writes are applied on the master first and then synchronized to the slaves, so there is some delay between the master and the slaves. The delay gets worse when the system is busy, and adding more slaves makes it worse as well.
The master is also clearly the bottleneck of the cluster. Too many writes seriously affect the master's stability, and if the master goes down, the whole cluster stops working properly.
So:
1. When read pressure is high, you can consider adding slaves to spread the load, but once the number of slaves reaches a certain point you have to consider splitting the database.
2. When write pressure is high, you have to split the database.
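Read/write splitting is usually decided in the application or a proxy layer. The class below is a minimal sketch of that decision only; the server names, the round-robin policy, and the need_fresh flag for reads that cannot tolerate replication delay are assumptions, not part of MySQL itself.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the slaves; use the master if there are none.
        self._readers = itertools.cycle(slaves or [master])

    def pick(self, sql, need_fresh=False):
        """Return the name of the server that should run this statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        if not is_read or need_fresh:
            # Writes, and reads that must not see stale data, go to the master.
            return self.master
        return next(self._readers)


router = ReadWriteRouter("master", ["slave1", "slave2"])
```

A read issued right after a write can pass need_fresh=True to avoid the replication delay described above, at the price of adding load to the master.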
Why does MySQL need databases and tables to be split?
It is fair to say that wherever MySQL is used, as soon as the data volume gets large you run straight into one question: splitting databases and tables.
That raises a question. Why split at all? Can't MySQL handle a large table?
It actually can. I have worked on a project where the physical files of a single database were over 80 GB and a single table held more than 500 million records, and that table was a core one: the friend relationship table.
But this is hardly the best approach. File systems such as ext3 also have plenty of problems handling very large files. At that level you can switch to the XFS file system. What remains hard to solve once a single MySQL table grows too large is that operations that change the table structure become basically impossible. That is why large projects end up splitting databases and tables.
As for InnoDB itself, there are only two locks on the B-tree of a data file, a leaf node lock and a child node lock, so you can imagine that when a page splits or a new leaf is added, the table cannot accept writes.
So splitting databases and tables is the better choice.
So how much data per table is appropriate?
In testing, a single table with 10 million records still gave fairly good read and write performance. Leaving some headroom, a single table whose columns are all numeric types should stay below 8 million records, and a single table with character columns should stay below 5 million.
If you plan for 100 databases with 100 tables each, for example for the user business:
5 million * 100 * 100 = 50,000,000,000 = 50 billion records.
With that number in mind, planning around the business becomes fairly easy.
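The capacity estimate can be checked with a few lines of arithmetic. The constant names are made up for this sketch; the figures are the ones given in the text.

```python
ROWS_PER_TABLE = 5_000_000  # limit from the text for tables with character columns
DATABASES = 100
TABLES_PER_DATABASE = 100

total_tables = DATABASES * TABLES_PER_DATABASE
capacity = ROWS_PER_TABLE * total_tables

print(f"{total_tables:,} tables, {capacity:,} rows")
# prints "10,000 tables, 50,000,000,000 rows"
```

Changing any of the three constants gives the planned capacity for a different layout.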