Youku Tudou Senior Engineer: MySQL High Availability with MaxScale and MHA

Last Update:2016-10-16 Source: Internet

Author: User



This article is based on the 67th online sharing session of the DBAplus community.

This talk covers the following:

1. MySQL high-availability solutions

2. Why choose MHA

3. Read/write splitting solutions and why choose MaxScale

I. MySQL failover solutions

Common failover solutions

MMM

MMM drawbacks:

The monitor node is a single point of failure; it can be combined with keepalived to make it highly available
keepalived carries a split-brain risk
Data may be lost under heavy read/write workloads

Our current MySQL failover solution:

MHA + an ssh -o heartbeat test + masterha_secondary_check (two checks)

MHA

MHA Advantages:

Saves the binary log events (binlog events) from the crashed master
Identifies the slave that contains the latest updates
Applies the differential relay log events to the other slaves
Applies the binary log events saved from the master
Promotes a slave to be the new master
Points the other slaves at the new master for replication
MariaDB Replication Manager (MRM)

Supports only MariaDB with GTID-based replication topologies

II. MHA

Today's focus is MHA. MHA is a master-slave switchover tool with strong consistency guarantees, and a switchover completes in under 30 seconds, which makes it well suited to production use.

How it works

Save the binary log events (binlog events) from the crashed master
Identify the slave that contains the latest updates
Apply the differential relay log events to the other slaves
Apply the binary log events saved from the master
Promote a slave to be the new master
Point the other slaves at the new master for replication
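
The six steps above can be walked through on in-memory lists. This is only a minimal sketch, not MHA's actual code: the Server class and the failover function are hypothetical names, each log is modelled as a plain list of events, and the sketch assumes the crashed master's binlog is still readable. It also promotes the most up-to-date slave, whereas a real deployment may prefer a configured candidate_master.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Server:
    name: str
    events: List[str] = field(default_factory=list)  # events applied so far, in binlog order
    master: Optional[str] = None                      # name of the server this one replicates from


def failover(dead_master_binlog, slaves):
    """Walk through the six failover steps on in-memory lists (illustrative model only)."""
    # 1. Save the binlog events from the crashed master.
    saved = list(dead_master_binlog)
    # 2. Identify the slave that has received the most events.
    latest = max(slaves, key=lambda s: len(s.events))
    # 3. Apply the differential relay log events to the slaves that are behind.
    for s in slaves:
        s.events.extend(latest.events[len(s.events):])
    # 4. Apply the saved master events that no slave had received yet.
    missing = saved[len(latest.events):]
    for s in slaves:
        s.events.extend(missing)
    # 5. Promote a slave to be the new master (here: the most up-to-date one).
    new_master = latest
    new_master.master = None
    # 6. Point the remaining slaves at the new master.
    for s in slaves:
        if s is not new_master:
            s.master = new_master.name
    return new_master


binlog = ["e1", "e2", "e3", "e4", "e5"]
s1 = Server("slave1", binlog[:4], master="master")  # one event behind the master
s2 = Server("slave2", binlog[:2], master="master")  # three events behind the master
new_master = failover(binlog, [s1, s2])
print(new_master.name, s2.master, s1.events == s2.events == binlog)
```

After the call, both slaves hold every event the old master had written, and slave2 replicates from slave1.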

MHA components

rpm -ql mha4mysql-manager-0.56-0.el6.noarch

1. Manager node

2. Data node

3. MySQL configuration essentials

Installing and configuring MHA

1) MySQL master/slave

One MySQL master and two slaves (one slave is the candidate_master)

Master

Slave

Building MySQL master-slave replication (one master, two slaves)

1) MySQL master-slave configuration

Create user

Backup

mysqldump --master-data=2 --single-transaction -A > bk.sql (we do not allow functions or stored procedures in production)

Tip: if this is not a newly created database, we recommend backing up with mydumper (slave) or innobackupex (master)

Run on the slave

A. Restore bk.sql on the slave

B. Execute the replication statements

C. If GTID is enabled

Tips:

1. Reduce the risk of data loss

innodb_flush_log_at_trx_commit=1

innodb_support_xa=1

sync_binlog=1

GTID

Semi-synchronous replication, or the enhanced semi-synchronous replication in 5.7

binlog_format set to row-based replication (RBR)

2. Master-slave consistency checks

pt-table-checksum

pt-table-sync

pt-online-schema-change: although 5.6 supports online DDL, I still recommend pt-osc. Note that if binlog_format is statement-based (SBR), pt-osc carries a risk of primary key conflicts.

4. MHA Configuration

1) SSH Configuration

Done with Ansible

2) Install MHA

yum install -y --nogpgcheck mha4mysql-* (packages already downloaded); run on every node

3) Edit the configuration file

4) Purge relay logs

relay_log_purge is set to OFF on the slaves. In an MHA environment, failover uses the relay logs to work out the differential events and sends those events to the other slaves for replay, so the relay logs must not be purged automatically and have to be cleaned up separately.

5. MHA environment checks

Check the MHA environment

Start MHA Manager

6. MHA switchover tests

1) Simulate an instance crash

/etc/init.d/mysql stop

2) Simulate a host crash

echo c > /proc/sysrq-trigger

3) Block traffic with iptables

iptables -A INPUT -p tcp -m iprange --src-range 192.168.10.1-192.168.10.241 --dport 3306 -j DROP

P.S. You can run sysbench against the master node and carry out the tests above while the load test is running

Tips:

MHA does not run arping by default; add it yourself, otherwise servers wait until their cached entry for the VIP expires and the VIP is unavailable for a while
Add --ignore_last_failover to the masterha_manager command line, otherwise the next switchover fails unless the app1.failover.complete file is deleted
We do not use keepalived for the VIP. Both hosts have a network cable plugged into a dedicated NIC, for example eth1, and the VIP is added to eth1 on the master. The corresponding NIC on the master must be brought up
report_script=/usr/bin/send_report adds e-mail notification
Do not let the slaves lag too far behind; the longer the delay, the longer the switchover takes
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.10.100 -s 192.168.10.101 --user=root --master_host=maven119 --master_ip=192.168.10.88 --master_port=3306 With this, a switchover happens only when neither of the two checking hosts can reach the master, which prevents inconsistent data (note: change the SSH port in masterha_secondary_check, which is hard-coded to 22)
grep -i 'change master' manager.log finds the CHANGE MASTER TO statement, which can be executed on the old master after the switchover
Set ignore_fail=1 so that the switchover proceeds even if a slave has errors
Set an SSH timeout so that a slow SSH connection does not trigger an MHA switchover
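
The secondary check mentioned in the tips above boils down to a simple decision rule: the master is declared dead only when no vantage point can reach it. The sketch below illustrates that rule only; should_fail_over is a hypothetical helper, not part of MHA, and the reachability results are passed in as plain booleans instead of being probed over SSH.

```python
def should_fail_over(manager_sees_master, secondary_sees_master):
    """Return True only if no vantage point can reach the master.

    manager_sees_master:   bool, result of the manager's own check
    secondary_sees_master: dict of secondary host -> bool, one entry per -s host
    """
    if manager_sees_master:
        return False
    # If any secondary host still reaches the master, treat the failure as a
    # network problem between manager and master, and do not switch.
    # With no secondary hosts configured, the manager's view alone decides.
    return not any(secondary_sees_master.values())


# The manager lost the master, but one secondary host still reaches it: no switchover.
partial_outage = should_fail_over(False, {"192.168.10.100": False, "192.168.10.101": True})
# Nobody can reach the master: switch.
full_outage = should_fail_over(False, {"192.168.10.100": False, "192.168.10.101": False})
print(partial_outage, full_outage)
```

Requiring agreement from several hosts is what guards against switching on a partial network failure while the old master is still accepting writes.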

7. Managing multiple instances with one MHA Manager

This completes the deployment of multiple instances.

Tips:

If you find MHA troublesome to deploy and do not want to write your own scripts, you can use mha-helper

Web: https://github.com/ovaistariq/mha-helper

SQL-aware load balancers:

MySQL Proxy: no longer officially maintained
MySQL Router: officially maintained, relatively simple
MaxScale: plugin-based, customizable and flexible, automatically detects MySQL master failure
Amoeba: supports SQL filtering, read/write splitting, and sharding; does not support MySQL failover
Cobar: supports splitting by database, but not by table
Mycat: developed on top of Cobar
TDDL (Taobao Distributed Data Layer): Alibaba's in-house, client-side middleware for read/write splitting

III. MaxScale

The middleware I want to introduce here is MaxScale.

What are MaxScale's advantages? In a word: it has most of the advantages of the middleware listed above.

Read/write splitting with weights (load balancing)
SQL firewall
Multiple routing strategies (connection-based, statement-based, schema-based)
Automatic detection of MySQL master failover (with MHA or MRM)
Detection of master-slave replication lag
Multi-tenant sharding architecture

Architecture comparison

The DB architecture of most Internet companies

Hidden risk: a typical Internet company uses MHA for failover and LVS for load balancing across the read replicas. But LVS works at the TCP layer, so when a read replica hangs, LVS does not remove it from the pool. LVS also handles long-lived connections poorly: its check interval is typically around 30s, while the long-connection timeout is typically set to 30 minutes or not set at all. When the application side disconnects, LVS still treats the connection as alive, so the number of threads connected to the DB does not drop. Eventually Threads_connected reaches its limit and MySQL becomes unavailable.

DB-layer architecture using MaxScale

This avoids the long-connection timeout problem seen with LVS.

MaxScale configuration is simple

yum -y install maxscale (run only on the MaxScale host)
cp maxscale_template.cnf maxscale.cnf
Generate the password:
maxkeys /var/lib/maxscale/
maxpasswd /var/lib/maxscale/ maven119
Modify the configuration file
The configuration file is too long to include here; contact me directly if you need it ...

View status with the admin commands

You can see that there are currently two databases, along with their running state

See which services are enabled: read/write splitting and client

Here's a statement test:

MySQL Master node:

4 rows in set (0.00 sec)

On the MySQL slave node, add one more record.

The read was served by the slave.

If you want to read from the master, put the SELECT statement inside a transaction.

The actual read and write routing can be observed with general_log.

On the master node, execute:

set global general_log=1;

On the MaxScale node, execute:

The write was routed to the master.
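
The routing behaviour observed in this test can be summarised as a small rule: a SELECT outside a transaction goes to a slave, and everything else, including a SELECT between begin and commit, goes to the master. The following is a minimal sketch of that rule and not MaxScale's implementation: the Router class is a hypothetical name, slaves are picked round-robin rather than by weight, and a real router classifies statements far more carefully than by their first keyword.

```python
import itertools


class Router:
    """Toy read/write splitter: reads go to slaves, everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)  # simple round-robin over the slaves
        self.in_transaction = False

    def route(self, sql):
        # Classify by the first keyword only (assumes a non-empty statement).
        verb = sql.strip().split(None, 1)[0].lower().rstrip(";")
        if verb in ("begin", "start"):
            self.in_transaction = True
            return self.master
        if verb in ("commit", "rollback"):
            self.in_transaction = False
            return self.master
        if verb == "select" and not self.in_transaction:
            return next(self._slaves)
        # Writes, and reads inside a transaction, go to the master.
        return self.master


router = Router("master", ["slave1", "slave2"])
statements = [
    "select * from t",
    "insert into t values (1)",
    "begin",
    "select * from t",
    "commit",
    "select * from t",
]
targets = [router.route(sql) for sql in statements]
print(targets)
```

Only the first and last SELECT reach a slave; the SELECT wrapped in begin/commit stays on the master, which matches the tip about reading from the master.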

Tips:

If you want to read from the master, put the SELECT statement inside a transaction: begin; select; commit
MaxScale performs a health check on every slave, using the same principle as pt-heartbeat: the master inserts a timestamp, and on the slave that timestamp is compared with the server time.
If you see "Ignoring secrets file /var/lib/maxscale/.secrets, invalid permissions", the ownership of .secrets is wrong; fix it with chown maxscale:maxscale .secrets
MaxScale 1.4 has made a lot of improvements.
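
The heartbeat-style health check described in the tips above (the master writes a timestamp, the slave compares the replicated value with its own clock) can be sketched as follows. This is an illustration of the principle only: write_heartbeat, replication_lag and is_healthy are hypothetical helpers, the heartbeat table is modelled as a dict, replication is modelled as copying that dict, and the max_lag threshold is an assumed value. The comparison is only meaningful if the master and slave clocks are synchronised.

```python
import time


def write_heartbeat(table, now=None):
    """On the master: store the current timestamp in the heartbeat row."""
    table["ts"] = time.time() if now is None else now


def replication_lag(replicated_table, now=None):
    """On the slave: lag is the local time minus the replicated timestamp."""
    now = time.time() if now is None else now
    return max(0.0, now - replicated_table["ts"])


def is_healthy(lag_seconds, max_lag=1.0):
    """Treat a slave as usable for reads only while its lag stays within max_lag."""
    return lag_seconds <= max_lag


master_table = {}
write_heartbeat(master_table, now=100.0)     # master writes at t = 100.0
slave_table = dict(master_table)             # the row arrives on the slave via replication
lag = replication_lag(slave_table, now=103.5)  # slave checks at t = 103.5
print(lag, is_healthy(lag))
```

A slave whose lag exceeds the threshold can then be skipped when routing reads until it catches up.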

An important concept: the DCB

The diagram shows how important the DCB is: the callbacks all end up in dcb.h

So what is a DCB?

A DCB (Descriptor Control Block) represents the state of a connection within MaxScale. Every connection from a client, every connection to a backend server, and every listener is assigned a DCB, and the status statistics for these connections are kept by the DCBs. A DCB has no name to look it up by; its unique memory address is used directly instead.

High availability for MaxScale

The official recommendation is to use lsyncd or Corosync/Pacemaker.

Personally, I think some of MaxScale's ideas are good, and Percona also considers MaxScale the best read/write splitting middleware currently available. It is not yet very mature, though, so small projects can try it, while for large projects I still recommend production-proven middleware such as TDDL.

MaxScale and MHA integration

Integrating MaxScale with MHA is actually very simple. With MHA alone, developers usually connect through a VIP: when the master goes down, a slave takes over, and the change is transparent to the front end.

When combined with MaxScale, nothing in the maxscale.cnf file needs to change; MHA only needs to be deployed on the backend, because MaxScale can monitor changes in the MySQL master and slave roles.

Summary: to integrate MaxScale with MHA, you only need to install MHA.

Finally, deploying MHA and MaxScale requires almost no configuration changes to an existing MySQL master-slave environment. You only need to change the configuration in the development framework, replacing the original IP and port with the IP and port of the MaxScale server.

Q&A

Q1: Is 10.10.111.1 the physical IP of the server where MaxScale is deployed? Does MaxScale need two servers for HA? With a single server, couldn't a crash take down the entire application? Or have I misunderstood?

A1: The official recommendation is to use lsyncd or Corosync/Pacemaker for MaxScale HA.

Q2: Is the monitoring system developed in-house or open source? Which metrics do you monitor for performance and stability?

A2: We use pt-heartbeat to monitor master-slave status in real time; pt-heartbeat can report at one-second granularity.

Q3: I have never understood why better options go unused. Why switch back and forth between master and slave?

A3: The scenario may be different. We generally run four databases in a master-slave setup, mainly because we need to scale out the read replicas. Youku's workload is basically read-heavy.

Q4: slave-skip-errors = 1062,1032,1060 Do you use this kind of configuration?

A4: Yes, but 1062 and 1032 must not be included.


Source: http://www.cnblogs.com/zengkefu/p/5720831.html
