Database Optimization-horizontal segmentation-and Application in actual projects, splitting projects
Horizontal database partitioning, relatively vertical partitioning, requires more work and tasks, but it is necessary for tables with a particularly large number of rows of data.
During Continuous optimization in the BDC project, I summarized the following common methods for horizontal database Splitting:
1. Table Partitioning;
2. Table sharding;
3. Table sharding;
Table Partitioning
Table Partitioning is a very powerful feature in ORACLE and MYSQL databases of the new version. It is worth learning.
However, if the table partition is not used well, the performance will decrease. I remember another project team, they tried to use table partitions, so they arranged a developer with no relevant experience to study and study for a few days. Then they found that the speed was slower, therefore, the table partition plan is terminated. This is a digress.
After selecting the table Partition Scheme, the first thing to do is to build confidence that the table partition must be effective when processing big table data, if the effect is not obvious, the effect is ineffective, or even the effect is worse, you should first consider whether it is not suitable for use.
First, let's talk about the actual situation of our project. The initial size of the core telephone table is 0.12 billion. The solution we selected is Table Partitioning. The next step is based on the data, the data is differentiated by different dimensions to better serve the business.
Our table has a key content, that is, the region. Each telephone information corresponds to a region, which is a basic feature of our national telecommunications data.
Is it classified to the provincial level, or to the municipal level? We have 32 provincial administrative regions with data, and more than 300 prefecture-level cities. Partitions cannot be too fragmented. Unless necessary, O & M will be a headache in the future. Then we use the province-level zoning as the partition condition.
Partition design. If the partition is not reflected in SQL, it may reduce the efficiency.
Partitioning is relatively simple. One is to define the partition directly, for example
Select * from tel PARTITION (310000); PARTITION (310000) indicates the Data PARTITION in Shanghai, where 310000 is the National Standard code corresponding to Shanghai;
This is a way of writing. I personally prefer another method:
Select * from tel t where t. prov_code = 310000
The partition annotation is created on the prov_code field. ORACLE automatically calls the partition feature.
As a result, after a similar partition transformation, the peak throughput is from 4 million data in the previous day, the database will be suspended, up to 12 million data is processed in one day, and the database seems to have crashes, the results are good.
Table partitioning involves a lot of concepts and functions, so you can sort out this content when you are free.
Table sharding
Table sharding is an old and practical technology. We did China Telecom yellow pages a long time ago. At that time, Table Partitioning was not mature. The most common technology we used to process large tables was table splitting.
Table splitting is logically simple. We have a 90 million Data Book table that contains the number book data of 31 provincial companies, which are maintained by 31 provincial companies at the same time.
90 million of data, whether retrieval, modification, or deletion, has a high impact on performance. You can split the data by province. 31 tables are created to replace this one, and a national view is created for the group.
Table sharding
We mentioned in Table partitions that a project team next door failed to use table partitions, but living people could not be suffocated. They also found a suitable method. This method is table sharding.
What is table sharding? In fact, our database Master/Slave mechanism is the prototype of a table/sub-database. How did they do this? First, prepare hardware, purchase a set of Database hardware, and copy the existing database to the new hardware to clean up the business data, keep constant data, and then make simple adjustments to the system. The south of the Yangtze River enters the old system and accesses the old database. The north of the Yangtze River enters the new system and accesses the new database.
In addition to spending a little money, the risk of transformation is indeed quite small. Of course, not all systems can split databases like this.
To sum up, when the database reaches a certain scale, Vertical Split and horizontal split are all issues that must be considered by the project owner. No matter which method is used, the project owner must understand the business, select the most appropriate method based on the actual situation of the system.