Mycat Enlightenment: The Evolution of Database Architecture in Distributed Systems

Last Update:2018-06-19 Source: Internet

Author: User


Mycat is database middleware for sharding, that is, for splitting data across multiple databases and tables. It makes sharded queries straightforward to implement and reduces the amount of data-access code in a project. This article traces the evolution of database architecture to explain the background that led to Mycat and the role it plays at each stage.

Single-Database Architecture

In the early days of a project, the priority is to validate the market as quickly as possible, so the main requirement for the business system is rapid implementation. To move fast, developers typically write the business code for every layer (MVC) in a single project and store all business data in a single database.

In this architecture, the business code for the three modules (registration, login, and shopping) lives in one project, and all three modules read the same business database.

As the project grows, user volume increases and a single application server can no longer handle the traffic. The common practice at this point is to deploy the application across multiple servers to spread the load, which temporarily relieves the pressure that user growth puts on the application tier.

But as more application servers are deployed, the single database server behind them can no longer withstand the traffic. To relieve the pressure quickly, we usually add a cache layer between the application servers and the database server, so that the cache absorbs some of the database queries. The result is a distributed deployment with a cache in front of a single database.

However, a cache layer only relieves database pressure by intercepting some of the requests. As user traffic continues to grow, the database becomes the bottleneck again, and this time the architecture of the data layer itself has to change.
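The cache layer described above can be sketched as follows. This is a minimal, single-threaded illustration that assumes an in-memory map in place of a real cache server and a function in place of a real database query; the class and method names are hypothetical and do not come from the original project.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache-aside helper (illustrative names, not from the article).
class CacheAsideSketch<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> databaseLoader; // stands in for a real database query
    private int databaseHits = 0;                // counts queries that reached the database

    CacheAsideSketch(Function<K, V> databaseLoader) {
        this.databaseLoader = databaseLoader;
    }

    // Return the cached value if present; otherwise query the database and cache the result.
    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        databaseHits++;
        V loaded = databaseLoader.apply(key);
        if (loaded != null) {
            cache.put(key, loaded);
        }
        return loaded;
    }

    // After a write, drop the cached entry so that the next read goes back to the database.
    void invalidate(K key) {
        cache.remove(key);
    }

    int databaseHits() {
        return databaseHits;
    }

    public static void main(String[] args) {
        CacheAsideSketch<Integer, String> users = new CacheAsideSketch<>(id -> "user-" + id);
        users.get(1); // first read reaches the database
        users.get(1); // second read is served from the cache
        System.out.println(users.databaseHits()); // prints 1
    }
}
```

Only the first read for a key reaches the database, which is why a cache helps with repeated queries but does nothing for writes or for keys that are rarely read twice.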

Master-Slave Database Architecture

The common solution at this stage is to turn the single database server into a master-slave pair: one database acts as the master and handles writes, while another acts as the slave and serves queries.

Master-slave synchronization gives us read/write separation: all read operations are directed to the slave, and all write operations are directed to the master.

Because the database layer now requires every read to go to the slave and every write to go to the master, the original code has to be changed.

// Before the change: reads and writes share one data source.
public User selectUser() {
    return dataTemplate.selectById(...);
}
public void insertUser(User user) {
    dataTemplate.insert(user);
}
The code above is the version before the change: reads and writes both go through the same data source. To fit the new database architecture, the code must now choose explicitly which data source each operation uses.

// After the change: the developer picks the data source by hand.
public User selectUser() {
    return readTemplate.selectById(...);   // reads go to the slave
}
public void insertUser(User user) {
    writeTemplate.insert(user);            // writes go to the master
}
In the modified code, developers decide from their own experience which data source each operation should use: readTemplate for a read operation and writeTemplate for a write operation.

As programmers, though, we sense that choosing the data source should not be a manual decision; the code should make it automatically. After all, the rule is simple: a SELECT goes to the read data source, and anything else goes to the write data source.
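The rule above is simple enough to sketch in a few lines. The following is only an illustration of the idea, not Mycat's implementation; the class, enum, and method names are hypothetical, and a real router would also have to consider cases such as transactions or SELECT ... FOR UPDATE.

```java
import java.util.Locale;

// Hypothetical router that applies the rule "SELECT reads, everything else writes".
class ReadWriteRouterSketch {
    enum Target { SLAVE, MASTER }

    // Inspect the first keyword of the statement to choose a data source.
    static Target route(String sql) {
        String head = sql.trim().toLowerCase(Locale.ROOT);
        return head.startsWith("select") ? Target.SLAVE : Target.MASTER;
    }

    public static void main(String[] args) {
        System.out.println(route("SELECT * FROM user WHERE id = 1"));      // prints SLAVE
        System.out.println(route("INSERT INTO user (name) VALUES ('a')")); // prints MASTER
    }
}
```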

This is one of the things Mycat does: as database middleware, it takes over the choice of data source. With Mycat in place, the application no longer needs to care which data source to use. Mycat hides the differences between the back-end data sources, so the application sees a single data source that handles both writes and reads. The query and insert code above can go back to the following:

// With Mycat: a single data source again; Mycat routes each statement.
public User selectUser() {
    return dataTemplate.selectById(...);
}
public void insertUser(User user) {
    dataTemplate.insert(user);
}
With a master-slave architecture behind Mycat, very little code needs to change: the application only has to point its data source at the Mycat address. Mycat automatically forwards our statements to the back-end MySQL servers.

The master-slave architecture lets us support more users and requests. As the business develops further, however, new problems appear:

Modifying the registration module means releasing the entire project, which disrupts normal use of the login and shopping modules.

Even when a change is small, the whole project package has to be published, so every release package is very large.

As business volume grows, the database remains under heavy pressure even with master-slave read/write separation, and it starts to look unable to cope.

These are only some of the problems met in practice. There will be more, not fewer, and they become more prominent as the business grows.

Vertically Split Database Architecture

To keep the business modules from affecting one another, we split the application layer vertically: the registration, login, and shopping modules each become a separate application that reads and writes its own database server.

After the vertical split, the problems above are resolved: modules no longer interfere with one another's releases, release packages are smaller, and no single database bears the whole load.

But as the business expands further, many more modules are added: a customer service module, a wallet module, a personal center module, a collection module, an order module, and so on. Under the database design described above, this leaves us with many data sources scattered across projects:

User Database 192.168.0.1

Commodity Database 192.168.0.2

SMS Database 192.168.0.3

Customer Service Database 192.168.0.4

Wallet Database 192.168.0.5

......

For a project manager, managing so many data sources scattered across different projects in a unified way is a problem. It is often hard to remember which project connects to which database.

Using Mycat as database middleware solves this problem. Every project connects to a single address exposed by Mycat, and Mycat connects to all the back-end MySQL databases on their behalf. The front-end projects only need to know about Mycat; they do not need to know which physical database they are talking to, because Mycat handles that through its own configuration.
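To make the idea concrete, here is a small sketch of the kind of mapping that the middleware keeps on behalf of the applications. It is not Mycat's configuration format or API; the class and the logical names are hypothetical, and the addresses are the example addresses listed earlier.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical registry: applications ask for a logical database name,
// and only the middleware knows the physical address behind it.
class BackendRegistrySketch {
    private final Map<String, String> backends = new LinkedHashMap<>();

    void register(String logicalName, String host) {
        backends.put(logicalName, host);
    }

    String resolve(String logicalName) {
        String host = backends.get(logicalName);
        if (host == null) {
            throw new IllegalArgumentException("Unknown logical database: " + logicalName);
        }
        return host;
    }

    public static void main(String[] args) {
        BackendRegistrySketch registry = new BackendRegistrySketch();
        registry.register("user", "192.168.0.1");
        registry.register("commodity", "192.168.0.2");
        registry.register("wallet", "192.168.0.5");
        System.out.println(registry.resolve("wallet")); // prints 192.168.0.5
    }
}
```

Keeping this mapping in one place means a back-end address can change without touching any of the application projects.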


Horizontally Split Database Architecture

After the master-slave and vertical-split stages, the database architecture handles ordinary read and write loads without trouble. For some core business data, however, bottlenecks can remain; the user module is a typical example.

In a system with as many as 100 million users, even after the master-slave and vertical-split optimizations, a single table in the user database still has to hold on the order of 100 million rows. If all of that data sits in one table, both the inserts at registration and the queries at login inevitably become very slow.

At this point we have to split these high-volume core business tables horizontally, spreading the records across multiple tables. For example, we might start with a single user table and split it by user ID modulo 1000, which gives 1000 tables, user_000 through user_999.

When the code queries user data, it first determines the table from the user ID and then queries that table. With 1000 tables, the table suffix is the last three digits of the ID: a user with a userid of 90749738 is in the user_738 table, and a user with a userid of 74847383 is in the user_383 table.
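The table lookup can be sketched as a single function. This is an illustration that assumes the modulo-1000 split and the user_000 to user_999 naming described above; the class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper that maps a user ID to its shard table name.
class ShardTableSketch {
    static final int TABLE_COUNT = 1000; // user_000 through user_999

    // Take the ID modulo the table count and zero-pad the result to three digits.
    static String tableFor(long userId) {
        long index = Math.floorMod(userId, (long) TABLE_COUNT);
        return String.format(Locale.ROOT, "user_%03d", index);
    }

    public static void main(String[] args) {
        System.out.println(tableFor(1234567L)); // prints user_567
        System.out.println(tableFor(5L));       // prints user_005
    }
}
```

Every module that touches user data would need this same function, which is the duplication that a shared component or middleware is meant to remove.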

The horizontal split resolves the read and write bottleneck on the large core business tables. But it introduces a problem at the code level: before every query, the code must work out from the userid which table to use. This logic is the same for every business module and should be extracted into a shared component.

As with choosing between read and write data sources, this kind of mechanical task should be left to the machine rather than to programmers. This is what Mycat does for us: we configure a set of sharding rules, and Mycat automatically determines which table each query should go to. With Mycat as database middleware, the redundant table-selection code disappears from the application, and developers can focus on business logic.

Summary

From the single-database architecture, to master-slave read/write separation, to vertical and horizontal splitting, we have seen Mycat take over three kinds of mechanical, repetitive work: choosing between read and write data sources, managing a large number of data source addresses, and choosing the table for sharded data.

Mycat has since grown well beyond these three functions. For example, it supports master-slave switching: when the master suffers a network problem or another failure, Mycat can automatically switch to the slave so that reads and writes continue to work. Mycat positions itself as database middleware, and in principle it can take on any task that sits between the application layer and the data layer.
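The switching behaviour can be pictured with a small sketch. It only illustrates the decision itself, under the assumption that some health check is available; it is not how Mycat implements switching, and the class name, method names, and host names are hypothetical.

```java
import java.util.function.Predicate;

// Hypothetical failover selector: use the master while it is healthy, else the slave.
class FailoverSketch {
    private final String master;
    private final String slave;
    private final Predicate<String> isHealthy; // stands in for a real health check

    FailoverSketch(String master, String slave, Predicate<String> isHealthy) {
        this.master = master;
        this.slave = slave;
        this.isHealthy = isHealthy;
    }

    // Choose the host that should currently receive writes.
    String activeWriteHost() {
        return isHealthy.test(master) ? master : slave;
    }

    public static void main(String[] args) {
        // Simulate a master outage: every host except "db-master" is healthy.
        FailoverSketch failover =
                new FailoverSketch("db-master", "db-slave", host -> !host.equals("db-master"));
        System.out.println(failover.activeWriteHost()); // prints db-slave
    }
}
```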


This article covered the background behind Mycat and its most basic roles. The next article will walk through a minimal working demo of Mycat.


This article is an English version of an article originally published in Chinese on aliyun.com.
