Data splitting and integration solutions

Source: Internet
Author: User

Use MySQL proxy for data splitting and integration

MySQL proxy is a database proxy layer product officially provided by MySQL. Like MySQL server, it is also an open source product based on the GPL open source protocol. It can be used to monitor, analyze, or transmit communication information between them. Its flexibility allows you to use it to the maximum extent. Currently, it provides connection routing, query analysis, query filtering and modification, Server Load balancer, and basic ha mechanisms.

In fact, MySQL proxy does not have all of the above functions, but provides the basis for implementing the above functions. To implement these functions, we also need to write our own Lua script.

MySQL proxy actually establishes a connection pool between the client request and the MySQL server. All client requests are sent to the MySQL proxy and then analyzed through the MySQL proxy to determine whether it is a read or write operation and distribute it to the corresponding MySQL server. Multi-node slave clusters can also achieve load balancing. For example, the basic architecture of MySQL proxy (Figure 14-4 ):

 
Figure 14-4 MySQL proxy Architecture

Through the architecture diagram above, we can clearly see the position of MySQL proxy in practical application and the basic things that can be done. Detailed Implementation Rules of MySQL proxy are described in detail and examples in the MySQL official documentation. Interested readers can download them from the MySQL official website for free or read them online.

Use amoeba for data splitting and integration

Amoeba is developed based on Java and focuses on proxy integration for distributed database data sources.ProgramOpen-source framework, based on the gpl3 open-source protocol. Currently, amoeba supports query routing, query filtering, read/write splitting, load balancing, and HA mechanisms, as shown in Figure 14-5.

Amoeba mainly solves the following problems:

(1) integrate complex data sources after data splitting;

(2) provide data splitting rules and reduce the impact of data splitting rules on the database;

(3) Reduce the number of connections between the database and the client;

(4) read/write splitting route.

 
Figure 14-5 amoeba for MySQL Architecture

We can see that what amoeba does is exactly what is needed to improve database scalability through data splitting.

Amoeba is not a proxy program at the proxy layer, but a framework for developing proxy programs at the database proxy layer. Currently, the proxy programs developed based on amoeba include amoeba for MySQL and Amoeba for Aladin.

Amoeba for MySQL is a solution dedicated to MySQL databases. The Protocol requested by the front-end application and the data source database connected to the backend must be MySQL. For any client application, amoeba for MySQL is no different from a MySQL database. Any client requests using the MySQL protocol can be parsed and processed by amoeba for MySQL. Amoeba for tells us the Architecture Information of amoeba for MySQL (from the amoeba developer blog ):

Amoeba for Aladin is a more widely used and powerful proxy program. It can connect to data sources of different databases at the same time to provide services for front-end applications, but only accept client application requests that comply with the MySQL protocol. That is, as long as the front-end application is connected through the MySQL protocol, amoeba for Aladin will automatically analyze the query statement, automatically identifies the type of database on which the query data source is located based on the data requested in the query statement. Amoeba for Aladdin architecture diagram (Figure 14-6) shows the architectural details of amoeba for Aladin (from the amoeba developer blog ).

At first glance, the two seem to be exactly the same. The main difference between the two is that after the MySQL protocal adapter is processed, the data source database is determined based on the analysis results, and then the specific JDBC driver and the corresponding protocol are selected to connect to the backend database.

In fact, you may have discovered the characteristics of amoeba through the above two architecture diagrams. It is just a development framework. Apart from choosing the products for MySQL and for Aladin that it already provides, you can also perform secondary development based on your own needs to obtain a proxy program that is more suitable for your application features.

However, for MySQL databases, both amoeba for MySQL and Amoeba for Aladin can be used well. Of course, considering the complexity of any system, the performance of the system will certainly suffer a certain loss, and the maintenance cost will naturally be higher. Therefore, we recommend that you use amoeba for MySQL when you only need to use the MySQL database.

Amoeba for MySQL is very simple to use. All configuration files are standard XML files, with a total of four, as shown below:

Amoeba. XML-the main configuration file that configures all data sources and the parameters of amoeba;

Rule. xml -- configure information about all query routing rules;

Functionmap. xml -- configure the JAVA Implementation class for parsing the functions in the query;

Rullfunctionmap. xml -- configure the implementation class of the specific function to be used in the routing rule.

 
Figure 14-6 amoeba for Aladdin Architecture

If your rules are not too complex, you can use only the first two of the above four configuration files to complete all the work. Common functions of proxy programs, such as read/write splitting and Server Load balancer, are configured in Amoeba. xml. In addition, amoeba supports automatic routing for vertical and horizontal data splitting. You can set routing rules in rule. xml.

At present, amoeba lacks the online management function and transaction support, which has been put forward during communication with related developers, we hope to provide a command line management tool for online maintenance and management to facilitate online maintenance and use. The feedback is that the management module has been included in the development agenda. In terms of transaction support, amoeba cannot do so for the time being. Even if the client application contains transaction information when submitting a request to Amoeba, amoeba ignores transaction-related information. Of course, after continuous improvement, I believe transaction support must be a key feature of amoeba.

For more detailed usage of amoeba, you can use the user manual provided in the amoeba developer blog (http: // amoeba. sf.net.

Use hivedb for data splitting and integration

Like MySQL proxy and amoeba, hivedb is also an open-source framework that provides data splitting and integration for MySQL databases based on Java. Currently, hivedb only supports horizontal data splitting. It mainly solves the scalability of databases with large data volumes and high-performance access to data. It also supports data redundancy and basic ha mechanisms.

The implementation mechanism of hivedb is different from that of MySQL proxy and amoeba. Instead of using the replication function of MySQL to achieve data redundancy, hivedb implements its own data redundancy mechanism, the underlying layer is mainly based on Hibernate shards for data splitting.

In hivedb, data is distributed to multiple MySQL servers through various custom partition keys. When a query request is run during access, the filter conditions are automatically analyzed, data is read from multiple MySQL servers in parallel, and the result set is merged and returned to the client application.

In terms of functions, hivedb may not be as powerful as MySQL proxy and amoeba, but its data splitting idea is essentially different from the previous two. In addition, hivedb is not just shared by open-source enthusiasts, but an open-source project supported by commercial companies.

The hivedb architecture on the official website of hivedb (Figure 14-7) describes how hivedb organizes basic information about data, although it cannot show detailed information about the architecture, however, it also shows its unique aspect of data splitting.

 
Figure 14-7 hivedb architecture Diagram

Other solutions for data splitting and integration

In addition to the data splitting and integration solutions described above, there are also many other solutions, such as the hscale that is further extended on the basis of MySQL proxy, spock proxy built through rails, and pathon-based pyshards.

No matter which solution you choose, the overall design concept should not be changed. That is, the overall database service capability should be enhanced through vertical and horizontal data splitting, improve the overall scalability of the application system as much as possible, and make the expansion method as convenient as possible.

As long as the middle layer proxy application can better solve the problem of data splitting and Data Source Integration, the linear scalability of the database will be as convenient as the application: as long as you add a cheap PC server, you can linearly increase the overall service capabilities of the Database Cluster, so that the database does not easily become the performance bottleneck of the application system.

 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.