MySQL + DRBD + Heartbeat


Overview: DRBD

1. DRBD introduction
DRBD (Distributed Replicated Block Device) is a distributed block-device replication system consisting of a kernel module and related scripts, used to build high-availability clusters. It works by mirroring an entire block device over the network; you can think of it as a network RAID. It lets a user keep a real-time image of a local block device on a remote machine.

2. How DRBD works
The primary node (drbd primary) receives data, writes it to the local disk, and sends it to the other host (drbd secondary), which then saves the data to its own disk. Currently DRBD allows read/write access on only one node at a time, which is sufficient for the usual failover in a highly available cluster. Later versions may support read/write access on both nodes.

3. DRBD and HA
A DRBD system consists of two nodes and, like an HA cluster, has a primary node and a standby node. On the node holding the primary device, applications and the operating system can run and access the DRBD device (/dev/drbd*). Data written by the primary node is stored on its disk through the DRBD device and automatically sent to the standby node's DRBD device, which finally writes it to the standby node's disk; on the standby node, DRBD simply writes the data arriving at the DRBD device to the local disk. Most high-availability clusters use shared storage, and DRBD can act as a shared storage device without much hardware investment: because it runs over a TCP/IP network, DRBD as shared storage is much cheaper than a dedicated storage network, while its performance and stability are good.

4. DRBD replication modes
Protocol A: asynchronous replication. A write is considered complete once the local disk write has finished and the packet is in the send queue. If the node fails, data still in the send queue may be lost; the data on the failover node is consistent but not up to date. This mode is typically used for geographically separated nodes.
Protocol B: memory-synchronous (semi-synchronous) replication. A write is considered complete on the primary once the local disk write has finished and the replication packet has reached the peer node. Data can be lost if both participating nodes fail at the same time, because data in transit may not yet be committed to disk.
Protocol C: synchronous replication. A write is considered complete only when both the local and the remote disk have confirmed the write. No data is lost, so this is the popular mode for cluster nodes, but I/O throughput depends on network bandwidth. Protocol C is generally used, but it increases network traffic and therefore latency; for data reliability, choose the protocol carefully in production.
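To make the choice concrete, here is a minimal sketch of where the protocol is selected in the DRBD configuration (the same setting appears in the global_common.conf configured later in this article):

    net {
        protocol C;    # A = asynchronous, B = memory-synchronous, C = fully synchronous
    }

Once a resource is connected, the active protocol is visible in /proc/drbd, e.g. the "C" in:

    0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----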
Heartbeat

1. Heartbeat introduction
Heartbeat is a component of the Linux-HA project. Many versions have been released since 1999; it is currently the most successful example of the open-source Linux-HA project and is widely used in industry. The version analyzed here is 2.0.8, released January 18, 2007. With the growing use of Linux in key industries, services from large commercial companies such as IBM and Sun will certainly follow, and a key feature of the services these companies offer is the highly available cluster.

2. How heartbeat works
Heartbeat has two core parts: heartbeat monitoring and resource takeover. Heartbeat monitoring can run over network links and serial ports, with support for redundant links; the nodes send each other messages reporting their current state. If no message is received from the peer within the specified time, the peer is considered failed, and the resource-takeover module is started to take over the resources or services that were running on the peer host.

High-availability cluster
A high-availability cluster is a set of independent computers, connected by hardware and software, that behaves as a single system in front of the user. When one or more nodes in such a system stop working, the service switches from the failed node to a normally working node without causing a service outage. From this definition it follows that the cluster must detect when nodes and services fail and when they become available again. This task is usually done by a set of code called a "heartbeat"; in Linux-HA it is done by the program named Heartbeat.

Environment:

Operating system     IP address      Hostname    Packages
CentOS 6.6 x86_64    192.168.0.200   server-1    DRBD, Heartbeat, MySQL
CentOS 6.6 x86_64    192.168.0.201   server-2    DRBD, Heartbeat, MySQL

Configuration process:

Preparation before installation (on all machines):
[root@server-1 ~]# fdisk /dev/sdb            (on both master and slave; no formatting required)
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
Last cylinder, +cylinders or +size{K,M,G} (1-2610, default 2610): +10G
Command (m for help): w
[root@server-1 ~]# partprobe /dev/sdb
[root@server-1 ~]# vim /etc/sysconfig/network
HOSTNAME=server-1
[root@server-1 ~]# hostname server-1         (on the slave, use server-2 instead)
[root@server-1 ~]# bash
[root@server-1 ~]# vim /etc/hosts
192.168.0.200   server-1
192.168.0.201   server-2
[root@server-1 ~]# service iptables stop
[root@server-1 ~]# setenforce 0

Heartbeat installation (needed on both master and slave):
[root@server-1 ~]# yum -y install perl-TimeDate cluster-glue-libs kernel-devel kernel-headers flex
[root@server-1 ~]# rpm -ivh cluster-glue-1.0.5-6.el6.x86_64.rpm
[root@server-1 ~]# yum -y install heartbeat
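Before building DRBD from source in the next step, a quick sanity check of the preparation can save time; a small sketch (these checks are not in the original post, but they use only the names configured above):

[root@server-1 ~]# fdisk -l /dev/sdb | grep sdb1    # the partition created above should be listed
[root@server-1 ~]# ping -c 2 server-2               # name resolution through /etc/hosts
[root@server-1 ~]# uname -r                         # must match the KDIR used in the DRBD build below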
Install and configure DRBD (needed on both master and slave):
[root@server-1 ~]# wget http://oss.linbit.com/drbd/8.4/drbd-8.4.3.tar.gz
[root@server-1 ~]# tar xf drbd-8.4.3.tar.gz
[root@server-1 ~]# cd drbd-8.4.3
[root@server-1 drbd-8.4.3]# ./configure --prefix=/usr/local/drbd --with-km --with-heartbeat
[root@server-1 drbd-8.4.3]# make KDIR=/usr/src/kernels/2.6.32-642.el6.x86_64/ && make && make install
[root@server-1 drbd-8.4.3]# mkdir -p /usr/local/drbd/var/run/drbd
[root@server-1 drbd-8.4.3]# cp /usr/local/drbd/etc/rc.d/init.d/drbd /etc/rc.d/init.d/
[root@server-1 drbd-8.4.3]# chkconfig --add drbd
[root@server-1 drbd-8.4.3]# cd drbd
[root@server-1 drbd]# make clean
[root@server-1 drbd]# make KDIR=/usr/src/kernels/2.6.32-642.el6.x86_64/
[root@server-1 drbd]# cp drbd.ko /lib/modules/2.6.32-642.el6.x86_64/kernel/lib/
[root@server-1 drbd]# depmod
[root@server-1 drbd]# cp -r /usr/local/drbd/etc/ha.d/resource.d/* /etc/ha.d/resource.d/
[root@server-1 drbd]# cd /usr/local/drbd/etc/drbd.d/
[root@server-1 drbd.d]# cat /usr/local/drbd/etc/drbd.conf
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
include "drbd.d/global_common.conf";
include "drbd.d/*.res";                 // all files ending in .res in this directory are resource files

Configure the global_common.conf file (master and slave):
[root@server-1 drbd.d]# vim global_common.conf
global {
    usage-count yes;            // whether to report usage statistics; the default is yes
}
common {
    startup {
        wfc-timeout 120;        // wait-for-connection timeout
        degr-wfc-timeout 120;
    }
    disk {
        on-io-error detach;     // action to take when an I/O error occurs
    }
    net {
        protocol C;             // replication mode, the third protocol described above
    }
}

Configure the resource file (master and slave):
[root@server-1 drbd.d]# vim r0.res
resource r0 {                                // r0 is the resource name
    on server-1 {
        device     /dev/drbd0;               // logical device path
        disk       /dev/sdb1;                // physical device
        address    192.168.0.200:7788;       // primary node
        meta-disk  internal;
    }
    on server-2 {
        device     /dev/drbd0;
        disk       /dev/sdb1;
        address    192.168.0.201:7788;       // standby node
        meta-disk  internal;
    }
}

Create the metadata (on both nodes):
[root@server-1 drbd.d]# modprobe drbd
[root@server-1 drbd.d]# dd if=/dev/zero bs=1M count=1 of=/dev/sdb1
[root@server-1 drbd.d]# drbdadm create-md r0
New drbd meta data block successfully created.

Start DRBD (on both master and slave):
[root@server-1 drbd.d]# /etc/init.d/drbd start
Starting DRBD resources: [
     create res: r0
   prepare disk: r0
    adjust disk: r0
     adjust net: r0 ] ...
[root@server-1 drbd.d]# netstat -anpt | grep 7788
tcp    0    0 192.168.0.200:35654    192.168.0.201:7788     ESTABLISHED  -
tcp    0    0 192.168.0.200:7788     192.168.0.201:33034    ESTABLISHED  -

Manually verify master-slave switchover. Initialize the network disk (on the master node):
[root@server-1 drbd.d]# drbdadm -- --overwrite-data-of-peer primary r0
[root@server-1 drbd.d]# watch -n 2 cat /proc/drbd        // refresh every 2 seconds
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@server-1, 2016-12-04 13:39:22
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:116024 nr:0 dw:0 dr:123552 al:0 bm:7 lo:0 pe:1 ua:7 ap:0 ep:1 wo:f oos:10374340
    [>...................] sync'ed: 1.2% (10128/...
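At any point after both nodes are up, the resource state can also be queried directly instead of watching /proc/drbd; a brief sketch using standard drbdadm subcommands:

[root@server-1 ~]# drbdadm cstate r0    # connection state, e.g. Connected or SyncSource
[root@server-1 ~]# drbdadm role r0      # expected Primary/Secondary on the master
[root@server-1 ~]# drbdadm dstate r0    # expected UpToDate/UpToDate once the initial sync finishes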
Data synchronization test (the first six steps are on the primary node, the last three on the secondary):
[root@server-1 drbd.d]# mkfs.ext4 /dev/drbd0
[root@server-1 drbd.d]# mkdir /mysqldata
[root@server-1 drbd.d]# mount /dev/drbd0 /mysqldata
[root@server-1 drbd.d]# hostname > /mysqldata/file      // create a test file
[root@server-1 ~]# umount /dev/drbd0
[root@server-1 ~]# drbdadm secondary r0                 // demote the primary to secondary
[root@server-2 drbd.d]# drbdadm primary r0              // promote the secondary to primary
[root@server-2 drbd.d]# mount /dev/drbd0 /mysqldata
[root@server-2 drbd.d]# ls /mysqldata                   // view the data on the standby node
file  lost+found                                        // the file created on the primary is visible

Install MySQL and change the database storage location to the shared directory (both master and slave):
[root@server-1 ~]# yum -y install mysql mysql-server
[root@server-1 ~]# vim /etc/my.cnf
datadir=/mysqldata/mysql
[root@server-1 ~]# chown -R mysql.mysql /mysqldata
[root@server-1 ~]# chkconfig mysqld off
[root@server-1 ~]# /etc/init.d/mysqld start

Note: we have just changed the data directory and its owner and permissions, and this operation sometimes prevents the database from starting. Workarounds: first, check whether SELinux is enabled and turn it off; second, on systems with AppArmor, the /etc/apparmor.d/usr.sbin.mysqld file contains two lines that specify the path permissions for MySQL's data files; change them accordingly and restart with /etc/init.d/apparmor restart.

Database test: because of the previous operations, the server-2 node must now be demoted back to secondary:
[root@server-2 ~]# umount /dev/drbd0
[root@server-2 ~]# drbdadm secondary r0
Make server-1 the primary node:
[root@server-1 ~]# drbdadm primary r0
[root@server-1 ~]# mount /dev/drbd0 /mysqldata
Create a database named accp on server-1, then demote the master to standby, promote server-2 to primary, and check whether the database has been synchronized.
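A minimal sketch of that test (the mysql client invocations below are an assumption; the original post does not show them):

[root@server-1 ~]# mysql -uroot -e "CREATE DATABASE accp;"
[root@server-1 ~]# mysql -uroot -e "SHOW VARIABLES LIKE 'datadir';"   # should report /mysqldata/mysql/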
Switch over and verify on server-2:
[root@server-1 ~]# service mysqld stop                 // on server-1
[root@server-1 ~]# umount /dev/drbd0                   // on server-1
[root@server-1 ~]# drbdadm secondary r0                // on server-1
[root@server-2 drbd.d]# drbdadm primary r0             // on server-2
[root@server-2 drbd.d]# mount /dev/drbd0 /mysqldata    // on server-2
[root@server-2 drbd.d]# service mysqld start           // on server-2
[root@server-2 drbd.d]# ls /mysqldata/mysql/           // on server-2
accp  ibdata1  ib_logfile0  ib_logfile1  mysql  test

Configure heartbeat:

1. Configure the ha.cf file (largely the same on master and slave):
[root@server-1 ~]# cd /usr/share/doc/heartbeat-3.0.4/
[root@server-1 heartbeat-3.0.4]# cp ha.cf authkeys haresources /etc/ha.d/
[root@server-1 heartbeat-3.0.4]# cd /etc/ha.d/
[root@server-1 ha.d]# vim ha.cf
logfile /var/log/ha-log
logfacility     local0
keepalive 2                     // interval between heartbeats
deadtime 10                     // seconds without contact before the peer is considered dead
warntime 5                      // seconds without contact before a warning is issued
initdead 100                    // grace period reserved after a restart
udpport 694                     // UDP port
ucast eth0 192.168.0.201        // the peer's IP (this line differs between master and slave)
auto_failback on                // whether resources switch back to a node after it recovers
node    server-1                // node name
node    server-2                // node name
respawn hacluster /usr/lib64/heartbeat/ipfail    // program that controls IP switching; on x86_64, use lib64

2. Configure the haresources file (identical on master and slave):
[root@server-1 ha.d]# vim haresources
server-1 IPaddr::192.168.0.50/24/eth0:0 drbddisk::r0 Filesystem::/dev/drbd0::/mysqldata::ext4 mysqld
[root@server-1 ha.d]# ln -s /etc/init.d/mysqld /etc/ha.d/resource.d/mysqld

3. Configure the authkeys file (identical on master and slave):
[root@server-1 ha.d]# vim authkeys
auth 1
1 crc
[root@server-1 ha.d]# chmod 600 authkeys

Validation: start heartbeat on both master and slave, then check whether the VIP exists on the primary node:
[root@server-1 ha.d]# service heartbeat start
[root@server-1 ha.d]# ip a
inet 192.168.0.50/24 brd 192.168.0.255 scope global secondary eth0:0
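Besides stopping the service outright (as done below), heartbeat also ships helper scripts for a graceful role switch; a hedged sketch (the path is the usual one in heartbeat packages and may differ on other builds):

[root@server-1 ~]# /usr/share/heartbeat/hb_standby     # hand all resources over to the peer
[root@server-1 ~]# /usr/share/heartbeat/hb_takeover    # take the resources back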
Failover verification: first stop the heartbeat service on server-1 and check whether the VIP is transferred. At this point MySQL on server-2 is not running yet:
[root@server-2 ha.d]# mysqladmin -uroot ping            // on the standby node
mysqladmin: connect to server at 'localhost' failed
error: 'Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)'
Check that mysqld is running and that the socket: '/var/lib/mysql/mysql.sock' exists!
[root@server-1 ha.d]# service heartbeat stop            // on the master node
Stopping High-Availability services: Done.
[root@server-2 ha.d]# ip a                              // on the standby node
inet 192.168.0.50/24 brd 192.168.0.255 scope global secondary eth0:0
[root@server-2 ha.d]# mysqladmin -uroot ping            // on the standby node; mysqld was started along with the takeover
mysqld is alive

At this point, however, the VIP does not drift when MySQL itself dies. A script is needed so that when the MySQL service is found to be down, the heartbeat service is stopped and the VIP is transferred (run it in the background on both nodes; see the usage note after the script):
[root@server-1 ~]# vim chk_mysql.sh
#!/bin/bash
# If mysqld is not running, try to start it once; if it is still not
# running after that, stop heartbeat so that the VIP and the DRBD
# resources fail over to the peer node.
mysql="/etc/init.d/mysqld"
mysqlpid=$(ps -C mysqld --no-header | wc -l)
if [ $mysqlpid -eq 0 ]; then
    $mysql start
    sleep 3
    mysqlpid=$(ps -C mysqld --no-header | wc -l)
    if [ $mysqlpid -eq 0 ]; then
        /etc/init.d/heartbeat stop
        echo "heartbeat stopped, please check your mysql!" | tee -a /var/log/messages
    fi
fi
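Since the script performs a single check, it has to be invoked repeatedly; one minimal way to do that (an assumption, not shown in the original post) is a background loop on each node, or an equivalent cron entry:

[root@server-1 ~]# ( while true; do bash /root/chk_mysql.sh; sleep 10; done ) &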



This article is from the "Cool Breeze Blog"; if you repost it, please keep the source: http://amunlinux.blog.51cto.com/13112118/1946781

