Hands-on experience with several MySQL cluster solutions

Last Update:2016-08-03 Source: Internet

Author: User

Tags serialization percona haproxy

Source: keeplearning's Column

Http://www.2cto.com/database/201504/387166.html

1. Background

MySQL has many official and third-party cluster solutions, and the abundance of choice makes selection difficult. We therefore defined three requirements that the MySQL database must meet and surveyed the viable solutions on the market:

High availability: automatically switch to a backup server after the primary server fails.
Scalability: make it easy to add DB servers through scripting.
Load balancing: support manually switching a company's data requests to another server, with configuration of which server serves which company's data.

We needed to choose a solution that meets the requirements above. The MySQL official website provides a comparison of the pros and cons of several solutions.

After consideration, we decided to evaluate MySQL Fabric and MySQL Cluster, as well as another, more mature clustering solution, Galera Cluster.

2. MySQL Cluster

Brief introduction:

MySQL Cluster is the official MySQL cluster deployment solution and has a comparatively long history. It scales reads and writes through automatic sharding and keeps redundant copies of data in real time. It is the most highly available of the solutions, claiming 99.999% availability.

Architecture and implementation principles:

MySQL Cluster consists mainly of three types of nodes:

NDB Management Server: manages the other node types in the cluster (data nodes and SQL nodes). Node information is configured through it, and it is used to start and stop nodes.
SQL node: a MySQL server process that uses the NDB engine and gives external applications access to the cluster data.
Data node: stores the cluster data; the system tries to keep the data in memory.

Disadvantages and Limitations:

Tables that need to be sharded must have their engine changed from InnoDB to NDB; tables that do not need sharding can be left unchanged.
NDB supports only the Read Committed transaction isolation level; that is, changes made within a transaction can be queried by other transactions only after the transaction commits. InnoDB supports all transaction isolation levels and uses Repeatable Read by default, so it does not have this limitation.
Foreign keys: although the latest Cluster version supports foreign keys, performance suffers because the records referenced by a foreign key may be on another shard node, so removing all foreign keys is recommended.
Data nodes keep data in memory as much as possible and therefore need a lot of memory.

Database systems provide four transaction isolation levels:
A. Serializable: a transaction sees none of the updates that other transactions make to the database while it runs. Transactions are serialized: they execute one after another, not concurrently.
B. Repeatable Read: a transaction can see records newly inserted and committed by other transactions while it runs, but cannot see their updates to existing records.
C. Read Committed: a transaction can see records newly inserted and committed by other transactions while it runs, and can also see their committed updates to existing records.
D. Read Uncommitted: a transaction can see records newly inserted by other transactions that have not yet been committed, and can also see their uncommitted updates to existing records.
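
The difference between Read Committed and Repeatable Read for updates to existing records can be illustrated with a small Python sketch. This is a toy in-memory model, not how NDB or InnoDB implement isolation; the names ToyStore, ToyTransaction and demo are hypothetical, the snapshot is taken when the transaction object is created, and newly inserted records are not modeled at all.

```python
class ToyStore:
    """Toy committed key-value state; stands in for a table (hypothetical)."""

    def __init__(self, rows):
        self.committed = dict(rows)

    def commit_update(self, key, value):
        # Models another transaction committing an update to an existing row.
        self.committed[key] = value


class ToyTransaction:
    """Reads from the store under one of two isolation levels."""

    def __init__(self, store, level):
        if level not in ("READ COMMITTED", "REPEATABLE READ"):
            raise ValueError("unsupported level in this sketch: " + level)
        self.store = store
        self.level = level
        # Assumption of this sketch: REPEATABLE READ copies the committed
        # state once, when the transaction is created.
        self.snapshot = (
            dict(store.committed) if level == "REPEATABLE READ" else None
        )

    def read(self, key):
        if self.level == "REPEATABLE READ":
            return self.snapshot[key]
        # READ COMMITTED: every read sees the latest committed value.
        return self.store.committed[key]


def demo(level):
    """Read a row twice while another transaction commits in between."""
    store = ToyStore({"balance": 100})
    txn = ToyTransaction(store, level)
    first = txn.read("balance")
    store.commit_update("balance", 50)  # committed by another transaction
    second = txn.read("balance")
    return first, second
```

Calling demo("READ COMMITTED") returns two different values for the same row, while demo("REPEATABLE READ") returns the same value twice, which is the behavior an application moving tables from InnoDB to NDB would have to account for.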

3. MySQL Fabric

Brief introduction:

To simplify the management of MySQL shards and enable highly available deployments, Oracle released MySQL Fabric in May 2014. It is intended to manage MySQL services as a scalable, easy-to-use system. Fabric currently implements two features: high availability, and scalability and load balancing through data sharding. The two can be used separately or together.

MySQL Fabric is implemented as a set of Python scripts.

Application cases: because the solution was released only last year, an online search turned up no large companies using it yet.

Architecture and implementation principles:

The architecture diagram of Fabric's high-availability support is as follows:

Fabric uses HA groups for high availability. One server in a group is the primary and the others are backups, which keep redundant copies of the data through replication. The application uses a Fabric-aware driver to connect to Fabric's connector component. When the primary server fails, the connector automatically promotes one of the backup servers to primary, and the application does not need to be modified.

Fabric supports scalability and load balancing with the following architecture:

Sharding is implemented with multiple HA groups, each holding a different shard of the data (the data within a group is redundant, as described under high availability).
The application simply sends statements such as queries and inserts to the connector. Through the Mastergroup, the connector automatically distributes the data to the groups, or combines the matching data from each group and returns it to the application.

Disadvantages and Limitations:
The two limitations with the greatest impact are:

An auto-increment key cannot be used as the shard key.
Transactions and queries are supported only within a single shard: data updated in one transaction cannot span shards, and the data returned by a query cannot span shards.
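
The single-shard restriction can be sketched as follows. This is a minimal illustration of range-based routing on a shard key, not Fabric's implementation; the class RangeShardMap, the helper check_single_shard, the key ranges and the group name group_id-2 are all hypothetical.

```python
import bisect


class RangeShardMap:
    """Maps a shard key to an HA group by lower bound (toy model)."""

    def __init__(self, lower_bounds_to_groups):
        # lower_bounds_to_groups: {lower_bound_of_range: group_name}
        items = sorted(lower_bounds_to_groups.items())
        self.bounds = [bound for bound, _ in items]
        self.groups = [group for _, group in items]

    def group_for(self, shard_key):
        # Find the last range whose lower bound is <= shard_key.
        idx = bisect.bisect_right(self.bounds, shard_key) - 1
        if idx < 0:
            raise KeyError(f"no shard covers key {shard_key}")
        return self.groups[idx]


def check_single_shard(shard_map, keys):
    """Return the one group that holds all keys, or raise ValueError.

    Models the limitation that a transaction or query may only touch
    rows that live in the same shard.
    """
    groups = {shard_map.group_for(key) for key in keys}
    if len(groups) != 1:
        raise ValueError(f"keys span {len(groups)} shards: {sorted(groups)}")
    return next(iter(groups))
```

With a map such as RangeShardMap({1: "group_id-1", 10000: "group_id-2"}), keys 5 and 42 resolve to the same group and can share a transaction, while keys 5 and 20000 land in different groups and would be rejected.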

3.1. Testing high availability

Server architecture:

Function	IP	Port
Backing store (save each server configuration information)	200.200.168.24	3306
Fabric Management process (Connector)	200.200.168.24	32274
HA Group 1--Master	200.200.168.23	3306
HA Group 1--Slave	200.200.168.25	3306

The installation process is omitted. The following steps describe how to set up a high-availability group, add backup servers, and so on.

First, create a high-availability group, for example one named group_id-1:

mysqlfabric group create group_id-1

Add machines 200.200.168.25 and 200.200.168.23 to group group_id-1:

mysqlfabric group add group_id-1 200.200.168.25:3306

mysqlfabric group add group_id-1 200.200.168.23:3306

Then check the status of the machines in the group:

Because no primary server has been set, both servers have the status SECONDARY.
Promote one of them to primary:
mysqlfabric group promote group_id-1 --slave_id 00F9831F-D602-11E3-B65E-0800271119CB
Then check the status again:

The server that was promoted now has the status PRIMARY.
In addition, the Mode property indicates whether a server is read-write (READ_WRITE) or read-only (READ_ONLY). Read-only servers can share the query load; only the primary server can be set to READ_WRITE.
Now check the slave status on server 25:

Its master now points to server 23.

Then activate automatic failover:
mysqlfabric group activate group_id-1
After activation, test the high availability of the service.
First, test the status change by stopping the primary server, 23.

Then check the status:

As you can see, server 25 has been automatically promoted to primary.
However, after server 23 is restored, it has to be set as the primary server again manually.
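
The sequence observed in this test (add, promote, activate, fail, restore) can be summarized in a toy state model. The sketch below only mirrors the behavior described above and is not Fabric code; the class HAGroup and its methods are hypothetical, and the rule that the first remaining secondary is promoted is an assumption of the sketch.

```python
class HAGroup:
    """Toy model of an HA group's server states (not Fabric code)."""

    def __init__(self, name):
        self.name = name
        self.status = {}     # address -> "PRIMARY" | "SECONDARY" | "FAULTY"
        self.active = False  # has automatic failover been activated?

    def add(self, address):
        # Newly added servers start as secondaries.
        self.status[address] = "SECONDARY"

    def promote(self, address):
        if self.status.get(address) != "SECONDARY":
            raise ValueError(f"{address} is not a secondary in {self.name}")
        for addr, state in self.status.items():
            if state == "PRIMARY":
                self.status[addr] = "SECONDARY"
        self.status[address] = "PRIMARY"

    def activate(self):
        self.active = True

    def primary(self):
        for addr, state in self.status.items():
            if state == "PRIMARY":
                return addr
        return None

    def report_failure(self, address):
        was_primary = self.status[address] == "PRIMARY"
        self.status[address] = "FAULTY"
        if was_primary and self.active:
            # Assumption: promote the first remaining secondary.
            for addr, state in self.status.items():
                if state == "SECONDARY":
                    self.status[addr] = "PRIMARY"
                    break

    def restore(self, address):
        # A restored server rejoins as a secondary; as in the test above,
        # it does not become primary again unless promoted manually.
        self.status[address] = "SECONDARY"
```

Replaying the test with this model, stopping 200.200.168.23:3306 leaves 200.200.168.25:3306 as primary, and restoring server 23 leaves it as a secondary until promote is called for it again.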

Real-time testing:
Purpose: test how soon data updated on the primary server becomes visible on the backup server.
Test case: open a connection from Java code, insert 100 records into a table, and see how long the backup server takes to synchronize them.
Test results:
There are 101 rows in the table. After running the program, check the row count on the primary server:

The primary server is, as expected, updated immediately.

Check the row count on the backup server:

The backup server, however, needed 1-2 minutes to finish synchronizing. This shows that Fabric uses asynchronous replication, the default: it performs better because the primary server does not wait for the backup server to respond, but synchronization is slower.
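
A delay like this can be measured by polling the backup server until the expected row count appears. The following Python sketch shows the idea only; it is not the Java program used in the test. The function name measure_sync_delay and its parameters are hypothetical, and count_rows is assumed to be any callable that returns the current row count on the backup server (for example, one that runs SELECT COUNT(*) through a database driver).

```python
import time


def measure_sync_delay(count_rows, expected, timeout=180.0, interval=1.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll count_rows() until it reports at least `expected` rows.

    Returns the elapsed time in seconds, or None if `timeout` seconds
    pass first. `clock` and `sleep` are injectable so the function can
    be exercised without a real database or real waiting.
    """
    start = clock()
    while True:
        if count_rows() >= expected:
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)
```

The result is only as precise as the polling interval, so a one-second interval is adequate for delays measured in tens of seconds or minutes but too coarse for comparing sub-second replication.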

The following approach addresses the reliability of synchronizing data to the slave server:

Use semi-synchronous replication to strengthen data consistency. Asynchronous replication offers better performance, but the master simply sends the binlog to the slave and finishes without verifying that the slave received it, which is riskier. With semi-synchronous replication, the master returns only after it has sent the log to the slave and received an acknowledgment from it. You can also configure how the slave updates its synchronization logs, which is meant to reduce slave lag and speed up synchronization. To install semi-synchronous replication:
Run in MySQL:
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_master_enabled=ON;
SET GLOBAL rpl_semi_sync_slave_enabled=ON;
Modify my.cnf:
rpl_semi_sync_master_enabled=1
rpl_semi_sync_slave_enabled=1
sync_relay_log=1
sync_relay_log_info=1
sync_master_info=1

Stability test:
Test case: open a connection from Java code and insert 10,000 records into a table. Stop the primary server during the insertion, then check whether the backup server has all 10,000 records.
Test result: after the primary server is stopped, the Java program throws an exception:

However, when the SQL command is sent again, it returns successfully. This shows that only the transaction in progress failed: the connection switched to the backup server and remained usable.
The MySQL documentation has a chapter that explains this behavior:

It says that when the primary server fails, the application does not need any changes, but some transactions will be lost before the backup server replaces the primary. These can be handled like a normal MySQL error.
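
One common way to handle such a failure as a normal error is to retry the failed transaction. The sketch below illustrates that pattern in Python under stated assumptions; it is not taken from the MySQL documentation or from the test program. TransientDBError is a hypothetical stand-in for whatever exception the driver raises when the primary is lost, and run_with_retry is a hypothetical helper.

```python
class TransientDBError(Exception):
    """Hypothetical stand-in for the driver error seen during failover."""


def run_with_retry(transaction, max_attempts=3, on_retry=None):
    """Call transaction() and retry it when a transient error occurs.

    `transaction` is a callable that performs one complete transaction
    and returns its result. The last error is re-raised once
    `max_attempts` attempts have failed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except TransientDBError as exc:
            last_error = exc
            if on_retry is not None and attempt < max_attempts:
                on_retry(attempt, exc)  # e.g. log the error or back off
    raise last_error
```

Retrying is only safe when the transaction can be repeated without side effects: if the commit reached the server but the acknowledgment was lost, a blind retry of an insert could create duplicate rows, so the transaction should be idempotent or check for its own earlier success.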

Data integrity check:
Test whether the backup server can synchronize all the data after the primary server stops.
After restarting the primary server, check its number of records:

You can see that it stopped after 1059 records had been inserted.

Now check the number of records on the backup server to see whether all the data was synchronized after the primary server went down:

After a few tens of seconds the data was complete: it is not synchronized immediately, but it is not lost.

3.2. Sharding: how to support scalability and load balancing

Introduction to Fabric sharding: when one machine or one group cannot withstand the load, servers can be added to spread the read and write pressure. Fabric's sharding feature spreads the data of selected tables across different servers, and the rule for distributing the data is defined by setting a shard key on the table. Some tables may not need sharded storage and must be kept whole on one server; for these you can set up a global group, and the data stored in the global group is automatically copied to all the shard groups.

4. Galera Cluster

Brief introduction:

Galera Cluster claims to be the world's most advanced open source database cluster solution.

Main advantages and characteristics:

True multi-master model: multiple servers can be read from and written to at the same time, unlike Fabric, where some servers act only as backups.
Synchronous replication: no replication delay and no data loss.
Hot standby: when a server goes down, a standby server takes over automatically, without any downtime.
Automatic node provisioning: when a server is added, the database does not have to be copied to the new node manually.
Supports the InnoDB engine.
Transparent to the application: the application does not need to be modified.

Architecture and implementation principles:
First, let's look at the traditional architecture based on MySQL Replication:

Replication works by starting a replication thread that copies the update log from the primary server and replays it on the backup server. With this approach there is a risk that transactions are lost and that synchronization lags. Fabric and traditional master-slave replication both use this implementation.

Galera uses the following architecture to keep transactions consistent across all machines:

The client accesses the database through the Galera Load Balancer. Every committed transaction is executed on all servers through the wsrep API: it either succeeds on all servers or is rolled back on all of them. This keeps the data on all servers consistent, and all servers are synchronized in real time.

Disadvantages and Limitations:

Because the same transaction must be executed on several machines in the cluster, network transfer and concurrent execution cost some performance.
The same data is stored on every machine, so it is fully redundant.
Because each machine acts as both a primary and a backup server, optimistic locking makes rollbacks more likely, so programs must be written with care.
Unsupported SQL: LOCK/UNLOCK TABLES, GET_LOCK(), RELEASE_LOCK(), ...
XA transactions are not supported.
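
Why optimistic locking leads to rollbacks can be shown with a toy first-committer-wins model. The sketch below is a simplified illustration, not Galera's actual replication protocol; the classes ToyCluster and ToyTxn are hypothetical, and the whole cluster is reduced to one shared set of rows with a version counter per row.

```python
class ToyCluster:
    """Toy cluster state: every node is assumed to hold the same rows."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.versions = {key: 0 for key in rows}  # commits seen per row

    def begin(self):
        return ToyTxn(self)


class ToyTxn:
    """Transaction that takes no locks and is validated at commit time."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.seen = {}    # row -> version when this transaction first wrote it
        self.writes = {}  # row -> new value, buffered until commit

    def write(self, key, value):
        self.seen.setdefault(key, self.cluster.versions.get(key, 0))
        self.writes[key] = value

    def commit(self):
        cluster = self.cluster
        # Conflict check: was any of these rows committed by someone else
        # after this transaction first touched it?
        for key, version in self.seen.items():
            if cluster.versions.get(key, 0) != version:
                self.writes.clear()
                return False  # rolled back; the application must retry
        for key, value in self.writes.items():
            cluster.rows[key] = value
            cluster.versions[key] = cluster.versions.get(key, 0) + 1
        return True
```

If two transactions started on different nodes both write the same row, the first commit succeeds and the second returns False. With ordinary row locks on a single server the second writer would simply wait, so application code that never had to handle a failed commit may need retry logic when every node accepts writes.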
Currently there are three implementations based on Galera Cluster: Galera Cluster for MySQL, Percona XtraDB Cluster, and MariaDB Galera Cluster.
We use Percona XtraDB Cluster, which is more mature and has more use cases.
Application case:
It is used by more than 2000 companies abroad.

Cluster deployment architecture:

Function	IP	Port
Backing store (save each server configuration information)	200.200.168.24	3306
Fabric Management process (Connector)	200.200.168.24	32274
HA Master 1	200.200.168.24	3306
HA Master 2	200.200.168.25	3306
HA Master 3	200.200.168.23	3306

4.1. Testing data synchronization

Create a table on machine 24:

Check immediately on machine 25; the table has already been created there:

Insert 100 records on server 24 using Java code:

Now check the number of records on server 25:

As you can see, data synchronization takes effect immediately.

4.2. Testing adding a cluster node
Adding a cluster node is simple: deploy Percona XtraDB Cluster on the new machine and start it, and the system automatically synchronizes the data of the existing cluster to the new machine.

For this test, first stop the service on one of the nodes:

Then insert 1,000,000 rows into the cluster using Java code:

Check the database size with 1,000,000 rows:

Now start the stopped node; the cluster's data is synchronized to it automatically at startup:

Startup took only about 20 seconds. The data size is then consistent, and the table's record count shows that the data has been synchronized.

5. Comparative summary

	MySQL Fabric	Galera Cluster
Use cases	Released in May 2014; an online search found no large companies using it yet	More mature; used by many Internet companies abroad
Timeliness of data backup	Asynchronous replication is used, so the delay is typically a few tens of seconds, but no data is lost	Real-time synchronization with no data loss
Data redundancy	With sharding, different data from the same table can be distributed across multiple machines by setting a shard key rule	Every node is fully redundant; no sharding
High availability	Automatic switchover after a primary server failure is implemented through the Fabric connector, but because of backup latency the latest data may not be queryable immediately after switching	Implemented with HAProxy. Because synchronization is real-time, availability after switching is higher
Scalability	After adding a node, the cluster data must first be copied manually	Nodes are easy to add; cluster data is synchronized automatically when a node starts, and 1,000,000 rows (100M) took only about 20 seconds
Load balancing	Implemented through HA groups and sharding	Implemented with HAProxy
Program modification	The JDBC driver class and URL must be switched to jdbc:mysql:fabric	No program changes needed
Performance comparison	Inserting 100 records from Java directly with JDBC takes roughly 2000+ ms	Same as operating MySQL directly: inserting 100 records directly with JDBC takes about 600 ms

6. Practical Application

Weighing the advantages and disadvantages of the solutions above, we prefer Galera. If there are only two database servers, consider the following database architecture for high availability, load balancing, and dynamic scaling:

If there are three machines, consider:
