Galera Cluster mysql+keepalived Cluster deployment

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

1. Uninstall MySQL

Find native-installed MySQL
Rpm-qa | Grep-i MySQL

--nodeps--force

Rpm-ev mysql-server-5.6.15-1.el6.x86_64
Rpm-ev--nodeps mysql-libs-5.1.71-1.el6.x86_64 (forced unload)

Find previous versions of MySQL directories, and delete old versions of MySQL files and libraries

Find/-name MySQL

Rm-rf/usr/lib64/mysql

/ETC/MY.CNF will not be removed after uninstallation, manual removal is required
Rm-rf/etc/my.cnf

2. Install MySQL
RPM-IVH libaio-0.3.107-10.el6.x86_64.rpm
RPM-IVH mysql-server-5.6.38-1.el6.x86_64.rpm
RPM-IVH mysql-client-5.6.38-1.el6.x86_64.rpm
RPM-IVH mysql-shared-5.6.38-1.el6.x86_64.rpm
RPM-IVH mysql-shared-compat-5.6.38-1.el6.x86_64.rpm

Configuring SELinux
Gedit/etc/sysconfig/selinux
Selinux=permissive

Shell Terminal execution
Setenforce Permissive

Open port
/sbin/iptables-i input-p TCP--dport 3306-j ACCEPT
/sbin/iptables-i input-p TCP--dport 4444-j ACCEPT
/sbin/iptables-i input-p TCP--dport 4567-j ACCEPT
/sbin/iptables-i input-p TCP--dport 4568-j ACCEPT

/etc/rc.d/init.d/iptables Save

/etc/init.d/iptables status

3. Install Galera patch

Rpm-qa|grep Galera (Fuzzy Lookup installer)
RPM-E--nodeps openssl-1.0.1e-42.el6.x86_64 (uninstall)

RPM-IVH openssl-1.0.1e-57.el6.x86_64.rpm

RPM-IVH lsof-4.82-5.el6.x86_64.rpm--force
RPM-IVH boost-program-options-1.41.0-28.el6.x86_64.rpm--force
RPM-IVH rsync-3.0.6-12.el6.x86_64.rpm--force

RPM-IVH galera-3-25.3.22-2.el6.x86_64.rpm
RPM-IVH mysql-wsrep-server-5.6-5.6.38-25.21.el6.x86_64.rpm--force
RPM-IVH mysql-wsrep-client-5.6-5.6.38-25.21.el6.x86_64.rpm--force
RPM-IVH mysql-wsrep-5.6-5.6.38-25.21.el6.x86_64.rpm
RPM-IVH mysql-wsrep-libs-compat-5.6-5.6.38-25.21.el6.x86_64.rpm
RPM-IVH mysql-wsrep-shared-5.6-5.6.38-25.21.el6.x86_64.rpm--force

Configure/ETC/MY.CNF
[Mysqld]
server_id = 1
Datadir=/var/lib/mysql
Socket=/var/lib/mysql/mysql.sock
port=3306
User=mysql
Binlog_format=row
Log_bin =/var/lib/mysql/logs/mysql-bin
Log_slave_updates=1
#expire_logs_days =7
max_binlog_size=64m
Log_bin_trust_function_creators=1
bind-address=0.0.0.0
Default_storage_engine=innodb
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=2
innodb_buffer_pool_size=2048m
innodb_log_buffer_size=16m
innodb_log_file_size=128m
Innodb_log_files_in_group=3
Innodb_max_dirty_pages_pct=90
innodb_lock_wait_timeout=120
Innodb_use_sys_malloc=1
Lower_case_table_names=1
Explicit_defaults_for_timestamp=true
Character-set-server=utf8
Sql_mode=no_engine_substitution,strict_trans_tables
Symbolic-links=0
Wsrep_provider=/usr/lib64/galera-3/libgalera_smm.so
Wsrep_provider_options= "Gcache.dir=/var/lib/mysql/galera/;gcache.name=galera.cache;gcache.size=3g;gcache.page_ SIZE=3G "
Wsrep_cluster_name= "Mysql_cluster"
Wsrep_cluster_address= "gcomm://192.168.40.20,192.168.40.21,192.168.40.22"
Wsrep_sst_method=rsync
Wsrep_node_name= "Node1"
Wsrep_node_address= "192.168.40.20"
Wsrep_slave_threads=4
Wsrep_log_conflicts=on
max_connections=200
Log-error=/var/lib/mysql/mysqld.log
Skip-name-resolve
Local-infile=1
[Mysql_safe]
!includedir/etc/my.cnf.d/
#sql_mode =no_engine_substitution,strict_trans_tables

#########################################################

4. Start the cluster
After all nodes have been configured, the master node executes

Mkdir/var/lib/mysql/logs
Mkdir/var/lib/mysql/galera

Chown-r Mysql.mysql/var/lib/mysql/logs
Chown-r Mysql.mysql/var/lib/mysql/galera

Service MySQL start--wsrep-new-cluster

5, check the MySQL default password, and login to change the password

Cat/root/.mysql_secret

Mysql-uroot-p default Password

mysql> SET PASSWORD = PASSWORD (' root ');

After execution, log in to MySQL and execute

Mysql>show Global status like ' Wsrep_cluster_size ';
Mysql>show Global status like ' Wsrep_ready ';

In normal words, return size to 1,wsrep_ready on.

Other node execution
Mysql>service MySQL Start

Normal returns a size of 3.

6, the installation of keepalived

Rpm-qa|grep LIBNL (Fuzzy Lookup installer)

RPM-IVH keepalived-1.2.13-5.el6_6.x86_64.rpm

RPM-IVH net-snmp-libs-5.5-60.el6.x86_64.rpm
RPM-IVH lm_sensors-3.1.1-17.el6.x86_64.rpm
RPM-IVH libnl-1.1.4-2.el6.x86_64.rpm

7. Configuring the Keepalived File

Gedit/etc/keepalived/keepalived.conf

##########################################
! Configuration File for keepalived
global_defs {
router_id lvs_devel
}
vrrp_script chk {
Script "/etc/ Keepalived/checkmysql.sh "
Interval 2
weight-50
}
Vrrp_script chk2 {
Script"/etc/keepalived/icmp.sh
Interval
weight-50
}
vrrp_instance vi_1 {
State MASTER #其他两台配置为BACKUP
Interface eth0
Virtu AL_ROUTER_ID
Priority #其他两台的优先级分别为140,
Advert_int 1
Authentication {
Auth_type PASS
Auth_pa SS 1111
}
Track_script {
chk
Chk2
}
virtual_ipaddress {
192.168.40.51/24
}
Notify_ Master "/root/master.sh"
}
#####################################################
Create user
Mysql> Create user ' keepalived ' @ ' localhost ';

Configuring the Checkmysql.sh File
gedit/etc/keepalived/checkmysql.sh
##########################################
#!/bin/sh
Mysql-ukeepalived-e "Select Variable_value from INFORMATION_SCHEMA. Global_status where variable_name= ' Wsrep_ready ' \g "|grep-c" on "1>/dev/null 2&>1
If [$?-ne 0];then
Service keepalived stop;
#curl perform POST request, notify alarm platform MySQL exception, implement definition
Exit 1;
Fi
Exit 0;
###############################################

Configuring the Master.sh File
gedit/etc/keepalived/master.sh
##########################################
#!/bin/sh
#curl Notification Alarm Platform virtual address complete switch
###############################################

Configuring the Icmp.sh File
gedit/etc/keepalived/icmp.sh
##########################################
#!/bin/sh
host= ' 192.168.40.254 ';
temp= '/etc/keepalived/ping.temp ';
If [-f ${temp}];then
Rm-f ${temp};
Fi
Ping ${host}-C 5-w 1 > ${temp};
Loss= ' grep-i "packet Loss" ${temp} |grep-o-E ' [0-9]\{0,3\}% ' |sed ' s/%//g ';
If [${loss}-gt];then
Exit 1;
Fi
Exit 0;
###############################################

8. Import the production environment data structure
Galera cluster for MySQL and keepalived are both deployed and completed, the prepared production environment table structure Struct.sql imported into the new database.
Mysql>source Struct.sql
After the import succeeds, execute the SQL script to check whether the database structure meets the following requirements:
? The storage engine is InnoDB;
? All tables contain primary keys;
? You cannot have a full-text index and a spatial index.
SQL script:
SELECT DISTINCT
CONCAT (T.table_schema, '. ', T.table_name) as TBL,
T.engine,
IF (ISNULL (c.constraint_name), ' nopk ', ') as NOPK,
IF (s.index_type = ' fulltext ', ' Fulltext ', ') as FTIDX,
IF (S.index_type = ' spatial ', ' spatial ', ') as Gisidx
From Information_schema.tables as T
Left JOIN Information_schema.key_column_usage as C
On (T.table_schema = C.constraint_schema and t.table_name = C.table_name
and c.constraint_name = ' PRIMARY ')
Left JOIN Information_schema.statistics as S
On (T.table_schema = S.table_schema and t.table_name = S.table_name
and S.index_type in (' Fulltext ', ' SPATIAL '))
WHERE T.table_schema not in (' Information_schema ', ' performance_schema ', ' MySQL ')
and T.table_type = ' BASE table '
and (T.engine <> ' InnoDB ' or c.constraint_name are NULL OR s.index_type in (' Fulltext ', ' SPATIAL '))
ORDER by T.table_schema,t.table_name;
The above script lists the tables that do not meet the requirements and modifies them according to the results.

9, Galera cluster maintenance
Restart Cluster
1. Normal restart of cluster
Before restarting the entire cluster, it is a good idea to stop the keepalived service so that external applications cannot continue to access MySQL
#service keepalived Stop
The benefit of this is to ensure that the data in the cluster is consistent before and after the restart, that there are no differences, and that it can take a lot of time to avoid data synchronization when restarting.
After you close the keepalived service, execute it on all servers in turn
#service MySQL Stop
After you turn off MySQL, check the #vi/var/lib/mysql/grastate.dat file on the three servers, which contains the cluster UUID and sequence number. We need to find the node with the largest parameter seqno value.
In the case where all servers are normally shut down, we see that the Grastate.dat file should be similar to the following content
# Galera saved state
version:2.1
uuid:d7b6d215-d935-11e5-bd08-36ff08c40499
seqno:361475525
Cert_index:
We need to start the node with the largest seqno value as the first node in the cluster
#service MySQL start--wsrep-new-cluster
#service keepalived Start
Other node execution
#service MySQL start #service keepalived start
2. Cluster exception restart
The steps to restart the cluster are slightly more complicated if you encounter abnormal conditions such as breakpoints in the machine room that cause all nodes to crashed at the same time.
In this case, the #vi/var/lib/mysql/grastate.dat file for all nodes in the cluster is typically similar to the following
# Galera saved state
version:2.1
uuid:d7b6d215-d935-11e5-bd08-36ff08c40499
Seqno:-1
Cert_index:
The seqno value in the file is-1, at this point, for the sake of insurance, we first set a node as a normal MySQL service startup, will be my.cnf in the file about Wsrep content comments, and then execute
#service MySQL Start
#service keepalived Start
Resumes all logging operations after execution, and then restores the binary log records
#mysqlbinlog--server-id=## mysql-bin.000### |mysql-uroot-p
Server-id the server_id,mysql-bin.000## #为二进制日志文件 in the configuration in the My.cnf file.
After the node data recovery is successful, restore the configuration item my.cnf about Wsrep, and then start the node as the first node in the cluster starts
#service MySQL start--wsrep-new-cluster
#service keepalived Start
Other node execution
#service MySQL Start
#service keepalived Start

(iii) deleting and adding nodes
There may be a server hardware failure, need to move it out of the room, it is necessary to replace the removed device with a new device.
In this case, we only need to reinstall the Galera cluster for MySQL deployment process on the new device after the faulty device has been removed, including installing MySQL, installing the Galera patch, and installing the keepalived service.
At this time there are two cases, one is the new device IP address is unchanged, the other is the new device IP address changes.
For the same IP address, we only need to execute the following statement after the configuration is complete to allow the new device to join the cluster and automatically complete the data synchronization.
#service MySQL Start
#service keepalived Start
For cases where the IP address changes, you first need to modify several configurations of the MYSQL.CNF.
Install the deployment diagram, assuming the Node1 node 192.168.40.20 failure, the new device IP is 192.168.40.24 the new MY.CNF configuration file The following parameter values are:
Wsrep_cluster_address= "gcomm://192.168.40.21,192.168.40.22,192.168.40.24"
Wsrep_node_name= "Node1"
Wsrep_node_address= "192.168.40.24"
server_id = 1
The mysql.cnf file wsrep_cluster_address parameters of other nodes Node2 and Node3 also need to be modified according to the new cluster environment
#wsrep_cluster_address = "gcomm://192.168.40.21,192.168.40.22,192.168.40.24"
Of course, Node2 and node3 modify the configuration and do not immediately take effect, we log in Node2 and Node3 respectively, and then perform
Mysql>set global wsrep_cluster_address= ' gcomm://192.168.40.21,192.168.40.22,192.168.40.24 ';
The new Node1 node is then
#service MySQL Start
#service keepalived Start
Node1 will automatically join the cluster and complete data synchronization.
About Unknown Command Errors
When a cluster network fails, a unknown command error may occur when query is executed. However, if the node failure, we can be timely alarm and processing, we can effectively avoid this error.
Of course, if this error occurs, it can be handled very quickly.
Log on to the database and execute on the node
SHOW STATUS like ' wsrep_last_committed ';
The query returns the value of last committed and executes on the node with the highest return value.
SET GLOBAL wsrep_provider_options= ' Pc.bootstrap=yes ';
After execution, the other nodes will automatically join and synchronize the data.

10. Backup and Recovery
First, backup
Backups of the database are backed up using binary logs, and you can periodically back up the binary log files to the path of the specified server.
In our my.cnf configuration file, there are several log options
Binlog_format=row
Log_bin =/var/lib/mysql/logs/mysql-bin
log-bin-index=/var/lib/mysql/logs/mysql-bin.index
Log_slave_updates=1
max_binlog_size=64m
The binary log format is row, the save path is the #vi/var/lib/mysql/logs directory, the log begins with Mysql-bin, The index file is Mysql-bin.index, and the slave operation log is saved, we set the log file up to 64M to prevent the log file from being too large.
In this way, we can save the data operation log, and in the event of an unexpected situation, we can restore the database based on the database log we backed up.
For security reasons, we can make an incremental backup of the binary log stored in #vi/var/lib/mysql/logs to a local directory or to a remote NFS server.
For example, if the locally-mounted NFS path is/MNT/NFS, we can synchronize/var/lib/mysql/logs to/mnt/nfs in the following way:
#rsync-avzp/var/lib/mysql/logs/mnt/nfs
Configure this command as a scheduled task, synchronized every 5 minutes
*/5****/usr/bin/rsync-avzp/var/lib/mysql/logs/mnt/nfs 1>/dev/null 2&>1
For binary logs, it is important to note that the following statements are not easily executed:

Mysql>reset master;
This command empties and resets all generated log records. Do not perform a full backup unless you have done so.
Second, recovery
In the event of an unexpected situation, we may need to recover the database system based on binary log records.
The binary log file format is row, so if we need to view the log file, we can execute the following command
Mysqlbinlog--base64-output=decode-row-v mysql-bin.00000#
When using the binary log for recovery, please shut down the other nodes in the MySQL cluster, start the database as a normal standalone database, comment out the configuration item about Wsrep in My.cnf, and start the database first.
#service MySQL Start
After startup, we recover from start dataetime, stop datetime, or start position, stop position, which need to be recovered as found
For example, to restore a record after the mysql-bin.00003 start position is 25845, execute
Mysqlbinlog--server-id=2-j 25845 mysql-bin.00003 |mysql-uroot-p
After recovering the data, restore the configuration item wsrep the configuration file, and then start the node as the first node in the cluster
#service MySQL start--wsrep-new-cluster
#service keepalived Start
Other node execution
#service MySQL Start
#service keepalived Start

MySQL Galera Monitoring
View MySQL version: mysql> SHOW GLOBAL VARIABLES like ' version ';
View Wsrep version: Mysql> SHOW GLOBAL STATUS like ' wsrep_provider_version ';
See all variables related to Wsrep: mysql> SHOW VARIABLES like ' wsrep% ' \g
View Galera cluster Status: mysql> Show status like ' wsrep% ';

Monitoring Status Parameter Description:
Cluster integrity check:
Wsrep_cluster_state_uuid: The values for all nodes in the cluster should be the same, with different values for the nodes, indicating that they are not connected to the cluster.
WSREP_CLUSTER_CONF_ID: Normally, this value is the same on all nodes. If the value is different, the node is temporarily "partitioned". The same value should be restored when the network connection between nodes is restored.
Wsrep_cluster_size: If this value is consistent with the expected number of nodes, all cluster nodes are connected.
Wsrep_cluster_status: The state of the cluster. If it is not "Primary", a "Partition" or "Split-brain" condition appears.
Node status check:
Wsrep_ready: This value is on, which indicates that the SQL payload can be accepted. If off, you need to check for wsrep_connected.
Wsrep_connected: If the value is off and the value of Wsrep_ready is off, then the node is not connected to the cluster. (This may be caused by a misconfiguration such as wsrep_cluster_address or Wsrep_cluster_name.) The error log is required for specific errors
Wsrep_local_state_comment: If wsrep_connected is on, but Wsrep_ready is off, you can see the reason from that item.
Replication Health Check:
Wsrep_flow_control_paused: Indicates how long replication has stopped. That is, the extent to which the cluster is slow due to slave latency. A value of 0~1, the closer to 0, the better, and a value of 1 means that replication stops completely. Optimize the value of wsrep_slave_threads to improve .
Wsrep_cert_deps_distance: How many transactions can be processed in parallel. The value set by wsrep_slave_threads should not be higher than the value.
Wsrep_flow_control_sent: Indicates how many times the node has stopped replicating.
Wsrep_local_recv_queue_avg: Represents the average length of the slave transaction queue. Slave the bottleneck.

The wsrep_flow_control_sent and wsrep_local_recv_queue_avg of the slowest nodes are the highest. These two values are relatively better if they are lower.
To detect slow network problems:
Wsrep_local_send_queue_avg: A harbinger of network bottlenecks. If this value is higher, there may be a network bottle
Number of conflicts or deadlocks:
wsrep_last_committed: Number of last committed transactions
Wsrep_local_cert_failures and Wsrep_local_bf_aborts: Rollback, number of conflicts detected

11. Supplement
Mysqlbinlog--server-id=1/var/lib/mysql/logs/mysql-bin.000010 |mysql-uroot-proot

IP add LS dev eth5

GRANT all privileges on * * to ' root ' @ ' 192.168.40.51 ' identified by ' root ' with GRANT OPTION;

Grant all privileges on * * to [e-mail protected] "%" identified by "."; (Open database Operations permissions)

Galera Cluster mysql+keepalived Cluster deployment

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Galera Cluster mysql+keepalived Cluster deployment

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

Galera Cluster mysql+keepalived Cluster deployment

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support