MySQL + DRBD + Heartbeat for High Availability


1. What is DRBD?
DRBD (Distributed Replicated Block Device), often described as "network RAID", is open-source software developed by LINBIT.
2. Main Functions of DRBD
DRBD is a block-device implementation used mainly in high-availability (HA) solutions on Linux. It consists of a kernel module and the related userspace tools. It mirrors an entire block device over the network, which makes it somewhat similar to network RAID: when you write data to a file system on a local DRBD device, the data is simultaneously sent to the peer host on the network and recorded there in exactly the same form (even creating the file system is propagated through DRBD's synchronization). Data on the local node (host) and the remote node (host) stays synchronized in real time, and I/O consistency is guaranteed. Therefore, when the local node fails, the remote node still holds an identical copy of the data, which is what makes DRBD usable for high availability.
3. Main Applications of DRBD
If the primary server goes down, the loss can be immeasurable, so the master server needs redundancy to keep its services uninterrupted. Among the many ways to implement server redundancy, heartbeat provides a cheap and scalable high-availability cluster solution. We use heartbeat + DRBD to build a high-availability (HA) cluster on Linux, using DRBD in place of a shared disk array. Because the data exists on both the local host and the remote host, when a switchover is needed the remote host can continue to provide service using its copy of the data.
4. Relationship between DRBD and MySQL
MySQL and LINBIT formed a partnership and promoted it heavily with the "12 Days of Scale-Out" campaign. Driven by this commercial cooperation, DRBD backs MySQL and is claimed to deliver four-nines (99.99%) availability, which is no less than any commercial database product.
The appearance of DRBD has indeed greatly improved the availability of MySQL clusters, and its characteristics make it very suitable for Internet-facing applications. Because synchronization happens at the block level in the storage layer, it is easy to add I/O load balancing at the application layer (letting the standby machine absorb some of the read traffic). It not only survives database failures but can also take over the failed IP address, with a takeover time of under 30 seconds. It is an excellent cluster solution on a small budget.
The test environment mentioned in this article is:
Operating System:
Red Hat Enterprise Linux AS release 4 (Nahant Update 4)
Software:
drbd-8.2.6.tar.gz
heartbeat-2.1.3-3.el4.centos
heartbeat-pils-2.1.3-3.el4.centos
heartbeat-stonith-2.1.3-3.el4.centos
mysql-5.1.26-rc-linux-i686-icc-glibc23.tar.gz
Host environment:
DRBD host            IP address       Host name
Host 1 (primary)     192.168.1.241    drbd-1
Host 2 (secondary)   192.168.1.242    drbd-2
In addition, both hosts reserve a blank partition, /dev/sdb1, on which no file system should be created yet.
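If the spare partition does not yet exist, it can be created with fdisk; a minimal sketch, assuming the second disk is /dev/sdb and the whole disk becomes one partition:
[root@drbd-1 ~]# fdisk /dev/sdb   # n (new), p (primary), 1, accept the default sizes, w (write)
Repeat the same on drbd-2, and make sure the kernel has re-read the partition table (partprobe or a reboot) before continuing.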
1. Compile and install DRBD and heartbeat
Install DRBD on both the master and the slave machine.
[root@drbd-1 ~]# tar -xvzf drbd-8.2.6.tar.gz
[root@drbd-1 ~]# cd drbd-8.2.6 && make rpm
[root@drbd-1 ~]# cd dist/RPMS/i386
[root@drbd-1 ~]# ls
drbd-8.2.6-3.i386.rpm
drbd-debuginfo-8.2.6-3.i386.rpm
drbd-km-2.6.9_42.EL-8.2.6-3.i386.rpm
[root@drbd-1 ~]# rpm -ivh drbd-8.2.6-3.i386.rpm
[root@drbd-1 ~]# rpm -ivh drbd-debuginfo-8.2.6-3.i386.rpm
[root@drbd-1 ~]# rpm -ivh drbd-km-2.6.9_42.EL-8.2.6-3.i386.rpm
[root@drbd-1 ~]# yum install heartbeat
The packages that yum downloads for this installation are cached in /var/cache/yum/extras/packages.
Installing MySQL is straightforward, so it is not covered in detail here.
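For completeness, a minimal sketch of the usual binary-tarball installation; the /usr/local/mysql prefix and the choice of /drbddata/mysql as the datadir are assumptions made here, and the datadir can only be initialized after /dev/drbd0 has been mounted on /drbddata (later in this article). Install the MySQL software itself on both hosts:
[root@drbd-1 ~]# groupadd mysql
[root@drbd-1 ~]# useradd -g mysql mysql
[root@drbd-1 ~]# tar -xvzf mysql-5.1.26-rc-linux-i686-icc-glibc23.tar.gz -C /usr/local
[root@drbd-1 ~]# ln -s /usr/local/mysql-5.1.26-rc-linux-i686-icc-glibc23 /usr/local/mysql
# later, on the current primary only, once /drbddata is mounted:
[root@drbd-1 ~]# /usr/local/mysql/scripts/mysql_install_db --basedir=/usr/local/mysql --datadir=/drbddata/mysql --user=mysql
[root@drbd-1 ~]# chown -R mysql:mysql /drbddata/mysql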
2. Load the DRBD Module
[root@drbd-1 ~]# modprobe drbd
[root@drbd-1 ~]# lsmod | grep drbd
drbd                  242924  2
If the module appears in the lsmod output, it has been loaded successfully.
3. Configure /etc/drbd.conf
Edit the configuration file; its content is identical on both hosts:
# Let LINBIT collect anonymous DRBD usage statistics; "yes" opts in.
global {
    usage-count yes;
}
# The common section holds settings shared by all resources managed by DRBD,
# such as protocol and syncer options.
common {
    syncer { rate 100M; }
}
# Define a resource named "db"
resource db {
    # Use protocol C: a write is considered complete only after the
    # remote host has confirmed receiving it.
    protocol C;
    startup {
        wfc-timeout 0;
        degr-wfc-timeout 120;
    }
    # Because the disks of the two servers may differ in size in this test
    # environment, the usable size of the DRBD device is fixed explicitly.
    disk {
        on-io-error detach;
        size 6G;
    }
    net {
        max-buffers 2048;
        ko-count 4;
    }
    syncer {
        rate 100M;
    }
    # Per-node sections, named after each host name
    on drbd-1 {
        # Map the DRBD device /dev/drbd0 onto the physical partition /dev/sdb1
        device /dev/drbd0;
        disk /dev/sdb1;
        # Listen address and port
        address 192.168.1.241:8888;
        # Metadata storage: "internal" keeps the metadata on the same
        # physical partition; a separate partition can also be used.
        meta-disk internal;
    }
    on drbd-2 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.1.242:8888;
        meta-disk internal;
    }
}
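As an optional sanity check, drbdadm can print the configuration as it parses it; a syntax error in /etc/drbd.conf would be reported instead:
[root@drbd-1 ~]# drbdadm dump db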
4. Start DRBD
Before starting the service, the metadata area must be created on the /dev/sdb1 partition of both hosts:
[root@drbd-1 ~]# drbdadm create-md db
[root@drbd-2 ~]# drbdadm create-md db
Answer "yes" to the confirmation prompts (it is asked twice). Output similar to the following indicates that the metadata was created successfully:
[root@drbd-1 /]# drbdadm create-md db
md_offset 8587153408
al_offset 8587120640
bm_offset 8586858496
Found ext3 filesystem which uses 6291456 KB
Current configuration leaves usable 8385604 KB
==> This might destroy existing data! <==
Next, start the DRBD service on both hosts. Because wfc-timeout is set to 0, the init script waits for its peer indefinitely ("wait forever") and prints "To abort waiting enter 'yes' [47]:" in case you want to stop waiting manually.
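A minimal sketch of starting the service, assuming the standard init script installed by the DRBD RPM:
[root@drbd-1 ~]# /etc/init.d/drbd start
[root@drbd-2 ~]# /etc/init.d/drbd start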
Once the service is running on both machines, check that the DRBD processes exist:
[root@drbd-1 /]# ps aux | grep drbd
root      3758 14.5  0.0      0     0 ?        S    [drbd0_worker]
root      3762  9.6  0.0      0     0 ?        S    [drbd0_receiver]
root      3787  2.4  0.0      0     0 ?        S    [drbd0_asender]
root      3794  0.0  0.2    644   128 pts/0    R+   grep drbd
We can see that the processes on both nodes are up. Each DRBD device has three kernel threads: drbd0_worker is the main worker thread of drbd0, drbd0_asender is the thread that sends data on the primary, and drbd0_receiver is the thread that receives data on the secondary.
Check the DRBD status after startup:
[root@drbd-1 /]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by root@drbd-1, 2008-09-17 17:46:45
 0: cs:Connected st:Secondary/Secondary ds:Inconsistent/Inconsistent C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 oos:6291456
[root@drbd-2 /]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by root@drbd-2, 2008-09-17 17:51:50
 0: cs:Connected st:Secondary/Secondary ds:Inconsistent/Inconsistent C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 oos:6291456
Note: at this point both servers are in the Secondary role (st:Secondary/Secondary), because no primary node has been designated yet.
Next, designate one node as the primary; here drbd-1 is made the primary:
[root@drbd-1 /]# drbdadm primary db
State change failed: (-2) Refusing to be Primary without at least one UpToDate disk
Command 'drbdsetup /dev/drbd0 primary' terminated with exit code 11
[root@drbd-1 /]# drbdsetup /dev/drbd0 primary -o
As shown, drbdadm fails the first time a primary is set, because neither disk is UpToDate yet. For this first promotion drbdsetup must be used with the -o (overwrite-data-of-peer) option; afterwards drbdadm can be used as usual.
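The same first-time override can also be expressed through drbdadm, which some may find clearer (a drbd 8 alternative to the drbdsetup call above):
[root@drbd-1 /]# drbdadm -- --overwrite-data-of-peer primary db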
Check the DRBD status of both servers again:
[root@drbd-1 /]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by root@drbd-1, 2008-09-17 17:46:45
 0: cs:SyncSource st:Primary/Secondary ds:UpToDate/Inconsistent C r---
    ns:3483280 nr:0 dw:0 dr:3491456 al:0 bm:212 lo:1 pe:8 ua:256 ap:0 oos:2808416
    [===========>........] sync'ed: 55.5% (2742/6144)M
    finish: 0:11:24 speed: 4,084 (4,648) K/sec
[root@drbd-2 /]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by root@drbd-2, 2008-09-17 17:51:50
 0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate C r---
    ns:0 nr:3556832 dw:3556832 dr:0 al:0 bm:217 lo:1 pe:2464 ua:0 ap:0 oos:2734624
    [===========>........] sync'ed: 56.7% (2670/6144)M
    finish: 0:07:35 speed: 5,856 (4,128) K/sec
Data synchronization has now started. The initial synchronization takes a long time, because all of the data in the entire partition has to be copied.
After the initial synchronization has completed, create a file system on the DRBD device (on the primary node):
[root@drbd-1 /]# mkfs.ext3 /dev/drbd0
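The mount point has to exist before the device can be mounted; create /drbddata on both hosts, so that the secondary can also mount the device after a switchover:
[root@drbd-1 /]# mkdir /drbddata
[root@drbd-2 /]# mkdir /drbddata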
Mount the file system:
[root@drbd-1 /]# mount /dev/drbd0 /drbddata
Test writing data on the primary node:
[root@drbd-1 drbddata]# ll
total 4
drwx------ 4 mysql root 4096 Oct 13 mysql
Next, demote the primary to secondary and promote the secondary to primary:
[root@drbd-1 /]# umount /drbddata/
[root@drbd-1 /]# drbdadm secondary db
The device must be unmounted before the primary can be demoted. Then promote the secondary:
[root@drbd-2 /]# drbdadm primary db
[root@drbd-2 /]# mount /dev/drbd0 /drbddata/
[root@drbd-2 drbddata]# ll
total 4
drwx------ 4 mysql root 4096 Oct 13 mysql
We can see that the data has been completely synchronized.
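Before handing control over to heartbeat, it is convenient to switch the roles back so that drbd-1 is the primary again, since the haresources file below lists drbd-1 as the preferred node; a sketch (simply the reverse of the steps above):
[root@drbd-2 /]# umount /drbddata/
[root@drbd-2 /]# drbdadm secondary db
[root@drbd-1 /]# drbdadm primary db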
Now the integration with heartbeat begins. Heartbeat was installed earlier, so only its configuration files need to be set up. Copy the sample files into /etc/ha.d:
[root@drbd-1 ha.d]# cp /usr/share/doc/heartbeat-2.1.3/ha.cf .
[root@drbd-1 ha.d]# cp /usr/share/doc/heartbeat-2.1.3/authkeys .
[root@drbd-1 ha.d]# cp /usr/share/doc/heartbeat-2.1.3/haresources .
Configure ha.cf (the main heartbeat configuration file):
[root@drbd-1 ha.d]# more ha.cf
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
bcast eth0
auto_failback off
node drbd-1
node drbd-2
ping_group group1 192.168.1.1 192.168.1.254
respawn root /usr/lib/heartbeat/ipfail
apiauth ipfail gid=root uid=root
Configure authkeys (node authentication):
[root@drbd-1 ha.d]# more authkeys
auth 1
1 crc
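Heartbeat refuses to start if authkeys is readable by anyone other than root, so tighten its permissions on both nodes:
[root@drbd-1 ha.d]# chmod 600 authkeys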
Configure the haresources resource file:
drbd-1 drbddisk Filesystem::/dev/drbd0::/drbddata::ext3 mysql 192.168.1.243
Note:
This file lists the resources that must be managed during a switchover. One key point is the order of the resources within a group: when heartbeat acquires the resources of a group it processes them from left to right, and when it releases them it goes from right to left.
The first column is one of the node names from ha.cf, and it should be the node intended to act as the primary.
The fields of the resource group above mean the following:
drbd-1          the current primary node name (as reported by uname -n)
drbddisk        tells heartbeat to manage the DRBD resource (promote/demote it)
Filesystem      tells heartbeat to manage the file system resource; it essentially runs mount/umount, and the parameters after the "::" separators are the device, the mount point and the file system type
mysql           tells heartbeat to start and stop MySQL (see the wrapper-script sketch after this list)
192.168.1.243   a service IP address managed by heartbeat; it floats with whichever node is currently the primary
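Heartbeat looks for a start/stop script named mysql in /etc/ha.d/resource.d/ or /etc/init.d/. A minimal sketch of such a wrapper, assuming the tarball install under /usr/local/mysql with its datadir on the DRBD-backed /drbddata/mysql (both paths are assumptions, not part of the original setup):

#!/bin/sh
# /etc/ha.d/resource.d/mysql - start/stop MySQL for heartbeat (sketch)
BASEDIR=/usr/local/mysql
DATADIR=/drbddata/mysql
case "$1" in
  start)
    # mysqld_safe supervises mysqld; run it in the background
    $BASEDIR/bin/mysqld_safe --basedir=$BASEDIR --datadir=$DATADIR --user=mysql &
    ;;
  stop)
    # shut down cleanly so the DRBD device can be unmounted
    # (assumes root can shut down without a password, or credentials in ~/.my.cnf)
    $BASEDIR/bin/mysqladmin shutdown
    ;;
  status)
    $BASEDIR/bin/mysqladmin ping
    ;;
  *)
    echo "Usage: $0 {start|stop|status}"
    exit 1
    ;;
esac
exit 0

Make the script executable (chmod +x) and identical on both nodes. With all of the configuration in place, start heartbeat on both nodes (for example with /etc/init.d/heartbeat start) and proceed to the switchover tests.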

Test switch:
1) Manually invoke the heartbeat node-switch script:
Running /usr/lib/heartbeat/hb_standby on the primary tells the peer node that this node wants to become the standby and asks the peer to take over as primary. The switchover completes in roughly 10 seconds.
2) Unplug the network cable and test the switchover when the primary node loses its network connection.
After the cable is unplugged, the primary node notices that it can no longer reach the standby node and writes warnings to the log. Once the outage has lasted longer than the deadtime configured in ha.cf, it releases its resources. Meanwhile the standby node, having been unable to reach the primary for the same configured interval, starts the resources and promotes itself to primary. Apart from the deadtime set in ha.cf, the switchover itself is very quick.
3) Shut down the primary host and check that the switchover behaves normally. The result is essentially the same as in test 2.
4) Power-loss test of the primary node. This has not yet been performed in the data center; it will be tested later.
Test results:
1. Switchover normal, data complete.
2. Switchover normal, but some differences between the master and slave data.
3. Switchover normal, data complete.
4. Switchover normal, data complete.
