Experience several MySQL cluster Solutions

Source: Internet
Author: User
Tags percona haproxy

Experience several MySQL cluster Solutions
1. Background

The cluster solution of MySQL has many official and third-party options, which is a headache. Therefore, we consider that the MySQL database can meet the following three requirements and investigate feasible solutions on the market:

High Availability: the master server can be automatically switched to the backup server after a fault occurs. Scalability: the DB Server Load balancer can be easily added through scripts. Server Load balancer supports manual switching of data requests from a company to another server, which companies can be configured for data services to access?

You need to select a solution to meet the above requirements. The advantages and disadvantages of several solutions are described on the official MySQL Website:

 

Taking into consideration, we decided to adopt the MySQL Fabric and MySQL Cluster solutions, as well as another mature Cluster solution Galera Cluster for pre-research.

 

2. MySQLCluster

Introduction:

MySQL Cluster is the official MySQL Cluster deployment solution, which has a long history. Supports read/write scaling through automatic sharding, and backs up redundant data in real time. This is the most available solution and claims to achieve 99.999% availability.

Architecture and implementation principles:

 

MySQL cluster consists of three types of services:

 

NDB Management Server: the Management Server is mainly used to manage other types of nodes (Data Node and SQL Node) in the cluster. It can be used to configure Node information to start and stop nodes. SQL Node: in MySQL Cluster, an SQL Node is a mysql server process using the NDB engine. It is used for external applications to provide access to Cluster data. Data Node: used to store cluster Data. The system will try to store the Data in the memory.

 

 

Disadvantages and restrictions:

 

For tables that need to be sharded, the engine Innodb needs to be modified to NDB. You do not need to modify the table that does not need to be sharded. The transaction isolation level of NDB only supports Read Committed, that is, a transaction cannot query the modifications made in the transaction before it is Committed. Innodb supports all transaction isolation levels. By default, Repeatable Read is used, this problem does not exist. Foreign key support: although the latest Cluster version already supports foreign keys, the performance is poor (because the records associated with foreign keys may be in other partition nodes ), we recommend that you remove all foreign keys. The Data on the Data Node is stored in the memory whenever possible, with high memory requirements.

 

The database system provides four transaction isolation levels:
A. Serializable: A transaction does not see any updates made by other transactions to the database during execution (other transactions cannot be concurrently executed when the transaction is executed. The transaction is serialized and can only be executed one after another, rather than concurrently .).
B. Repeatable Read (Repeatable Read): during execution of a transaction, you can see the newly inserted records submitted by other transactions, but cannot see the updates of existing records by other transactions.
C. read Commited (Read committed data): During the execution of a transaction, you can see the newly inserted records of other transactions, you can also see the updates to existing records that have been committed by other transactions.
D. read Uncommitted (Read Uncommitted data): During the execution of a transaction, You can see records of New inserts not committed by other transactions, in addition, we can see the updates to existing records that are not committed by other transactions.

 

3. MySQL Fabric

 

Introduction:

To achieve and facilitate the management of MySQL partitions and achieve high-availability deployment, Oracle launched a MySQL product-MySQL Fabric, which has high hopes for all parties in May 2014, to manage MySQL services, fabric provides a scalable and easy-to-use system. Fabric implements two features: high scalability and data sharding for scalability and load balancing. These two features can be used independently or in combination.

MySQL Fabric is implemented using a series of python scripts.

Application Case: Since this solution was launched last year, no application cases of large companies have been found on the Internet.

 

Architecture and implementation principles:

Fabric supports the following architecture to achieve high availability:


Fabric uses HA groups for high availability. One of them is the master server and the other is the backup server. The Backup Server achieves data redundancy through synchronous replication. The application uses a specific driver to connect to the Fabric Connector component. When the master server fails, Connector automatically upgrades one of the backup servers as the master server, and the application does not need to be modified.

Fabric supports the following Architecture:

Use multiple HA groups to implement sharding. Each group shares different sharding data (the data in the group is redundant, which has been mentioned in high availability)
The application only needs to send query, insert, and other statements to connector. Connector automatically allocates the data to each group through MasterGroup, or combines qualified data from each group to return it to the application.

Disadvantages and restrictions:
The following are two restrictions:

The auto-increment key cannot be used as the sharding key. Transactions and queries can only be in the same shard. The data updated in the transaction cannot be sharded across partitions, and the data returned by the query statement cannot be sharded across partitions.

 

 

Test High Availability

Server Architecture:

Function

IP

Port

Backing store (Save the configuration information of each server)

200.200.168.24

3306

Fabric management process (Connector)

200.200.168.24

32274

HA Group 1 -- Master

200.200.168.23

3306

HA Group 1 -- Slave

200.200.168.25

3306

 

The installation process is omitted. The following describes how to set a high-availability group and add a backup server.

First, create a high-availability group, such as the group name group_id-1, command:

Mysqlfabric group create group_id-1

Add machines 200.200.168.25 and 200.200.168.23 to the group_id-1 in the group:

Mysqlfabric group add group_id-1 200.200.168.25: 3306

Mysqlfabric group add group_id-1 200.200.168.23: 3306

Then, view the machine status in the group:

Because no master server is set, both services are in the SECONDARY status.
Upgrade one of them to the master server:
Mysqlfabric group promote group_id-1 -- slave_id 00f9831f-d602-11e3-b65e-0800271119cb
Then check the status:

The Service set to the master server has become Primary.
In addition, the mode attribute indicates that the server is read/write (READ_WRITE) or read-only (READ_ONLY). Read-only indicates that the pressure on data query can be apportioned. Only the master server can be set to READ_WRITE ).
Check the slave status of the 25 servers:

It can be seen that its master server has been directed to 23


Then enable the Failover function:
Group_id-1 for mysqlfabric group activate
After activation, you can test the high availability of the service.
First, perform the status test:
Stop master server 23

Then check the status:

As you can see, 25 is automatically upgraded to the master server.
However, after 23 is restored, You need to manually reset 23 as the master server.


Real-time test:
Objective: To test how long the backup server displays the data after the primary service updates the data
Test Case: use java code to create a connection and insert 100 records to a table to check how long the backup server can synchronize the 100 data records.
Test results:
The table originally contains 101 pieces of data. After running the program, check the number of data entries on the master server:

Obviously, the master server is updated immediately.

View the number of data entries on the backup server:

However, the backup server waits for 1-2 minutes to complete synchronization. (fabric uses asynchronous replication. This is the default mode and has good performance. The master server does not have to wait for the backup server to return, but the synchronization speed is slow)


There are the following solutions to the stability of Data Synchronization from the server:

Use semi-sync to enhance data consistency: asynchronous replication can provide better performance, but the main database only sends binlog logs to the slave database, and the Action ends. It does not verify whether the slave database has received the logs, high risk. Semi-synchronous replication will be sent to the slave database, and will not be returned until the slave database sends a confirmation message. You can set the update method for synchronizing logs from the database to reduce the synchronization latency and speed up synchronization. Install semi-synchronous replication:
Run in mysql
Install plugin rpl_semi_sync_master soname 'semisync _ master. so ';
Install plugin rpl_semi_sync_slave soname 'semisync _ slave. so ';
Set global rpl_semi_sync_master_enabled = ON;
Set global rpl_semi_sync_slave_enabled = ON;
Modify my. cnf:
Rpl_semi_sync_master_enabled = 1
Rpl_semi_sync_slave_enabled = 1
Sync_relay_log = 1
Sync_relay_log_info = 1
Sync_master_info = 1

Stability test:
Test Case: use java code to create a connection, insert records to a table, and stop the master server during the insertion process to check whether the backup server has the records
Test Results: After the master server is stopped, the java program throws an exception:

 

However, if you send the SQL command again, you can return it successfully. It turns out that the transaction failed at the time. The connection is switched to the backup server and is still available.
I have read the mysql document and read some chapters to illustrate this problem:

It is mentioned that when the master server is on the machine, although our applications do not need to be modified, some transactions will be lost before the master server is replaced by the backup server, these can be handled as normal mysql errors.

Data Integrity Verification:
Test whether the backup server can synchronize all data after the master server is stopped.
After restarting the master server, view the number of records

 

We can see that the system stops inserting 1059 records.

Now let's take a look at the number of records on the backup server to see if all the data can be synchronized after the master server is in service.

 

It took about dozens of seconds to complete the synchronization. Although the data was not synchronized immediately, it was not lost.

1.2. Slice: How to support scalability and Server Load balancer

Fabric partition Introduction: when a machine or a group cannot withstand the service pressure, you can add servers to share the read/write pressure. Fabirc's partition function can distribute data in some tables to different servers. We can set the rules for allocating data storage, and set the rules for allocating shard keys in the table. In addition, data in some tables may not need to be stored in parts. You need to store the entire table on the same server. You can set a Global Group to store the data, data stored in the global group is automatically copied to all other sharding groups.


 

4. Galera Cluster

 

Introduction:

Galera Cluster is the world's most advanced open-source database Cluster solution.

Main advantages and features:

Real multi-master service mode: multiple services can be read and written at the same time. Unlike Fabric, some services can only be used for backup and synchronous replication: no delay replication, no data loss, hot backup: when a server becomes a machine, the slave server automatically takes over and does not generate any nodes automatically extended when the server is added, you do not need to manually copy the database to a new node. the InnoDB engine is transparent to applications: applications do not need to be modified.

 

Architecture and implementation principles:
First, let's look at the traditional architecture Diagram Based on mysql Replication:

Replication means to start the Replication thread to copy the Update log from the master server, and then transfer it to the backup server for execution. This method may cause the risk of lost transactions and untimely synchronization. Fabric and traditional master-slave replication use this method.



Galera adopts the following architecture to ensure the consistency of transactions on all machines:

The client accesses the database through Galera Load Balancer, and each transaction committed is executed on all servers through the wsrep API. Otherwise, all servers are successfully executed, or all servers are rolled back, ensure data consistency of all services, and all servers are updated in real time.


Disadvantages and restrictions:

Because the same transaction needs to be executed on multiple machines in the cluster, network transmission and concurrent execution will consume a certain amount of performance. The same data is stored on all machines for full redundancy. If a machine acts as both the master server and backup server, the probability of rollback will increase due to optimistic locks. Be careful when writing programs. Unsupported SQL: LOCK/UNLOCK TABLES/GET_LOCK (), RELEASE_LOCK ()... XA Transaction not supported
Currently, there are three implementation solutions based on Galera Cluster: Galera Cluster for MySQL, Percona XtraDB Cluster, and MariaDB Galera Cluster.
We adopt Percona XtraDB Cluster, which is mature and has many application cases.
Application case:
More than 2000 foreign enterprises:

 

 

Including:


 

Cluster deployment architecture:

Function

IP

Port

Backing store (Save the configuration information of each server)

200.200.168.24

3306

Fabric management process (Connector)

200.200.168.24

32274

HA Master 1

200.200.168.24

3306

HA Master 2

200.200.168.25

3306

HA Master 3

200.200.168.23

3306

 

4.1 Test Data Synchronization

Create a table on machine 24:

View it in 25 immediately. It can be seen that it has been created synchronously.

 

Use Java code to insert 100 records on 24 servers

View the number of records on 25 servers now

The data synchronization takes effect immediately.

4.2. Test adding a cluster node
The process of adding a Cluster node is simple. As long as Percona XtraDB Cluster is deployed on the newly added machine, and then started, the system will automatically synchronize the data in the existing Cluster to the new machine.

To test the service, stop one of the nodes:

Then, use java code to insert million data into the cluster.

View the database size of million data:

When another node is started, the cluster data is automatically synchronized at startup:

It takes about 20 seconds to start. Check that the data size is consistent and the number of table records is also synchronized.

 

5. comparison summary

 

 

 

MySQL Fabric

Galera Cluster

Use Cases

It was launched only in May 2014. Currently, no application cases of large companies are found on the Internet.

The solution is mature and used by many foreign Internet companies

Real-time Data Backup

Due to the use of asynchronous replication, the delay is generally several dozen seconds, but data will not be lost.

Real-time synchronization without data loss

Data redundancy

Using sharding, you can set sharding key rules to distribute different data in the same table across multiple machines.

Each node is fully redundant and has no partitions.

High Availability

Fabric Connector Ctor is used to automatically switch the master server to the slave server. However, due to backup delay, data cannot be queried immediately after switchover.

Use HAProxy. Because of real-time synchronization, the switchover availability is higher.

Scalability

After adding a node, You need to manually copy the cluster data

It is very convenient to expand the node. Cluster data is automatically synchronized when the node is started. 100 million data (MB) takes about 20 seconds.

Server Load balancer

Implementation through HASharding

Load Balancing using HAProxy

Program modification

Switch to jdbc: mysql: fabric jdbc class and url

The program does not need to be modified.

Performance Comparison

Use java to insert 100 records directly with jdbc, about 2000 + ms

Like directly operating mysql, 100 records are inserted directly using jdbc, about 600 ms.

6. Practical Application

Considering the advantages and disadvantages of the above solution, we prefer to select Galera. If there are only two database servers, consider using the following database architecture to achieve high availability, load balancing, and dynamic expansion:


 

If you have three machines, consider:

 

 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.