Heartbeat split-brain caused by an iptables firewall

Source: Internet
Author: User

When heartbeat is deployed in a production environment there are many details to watch. An oversight can leave heartbeat unable to fail over, or can produce a split-brain. This article describes a split-brain caused by iptables.

Master: 192.168.3.218

192.168.4.218 (heartbeat IP)

usvr-218 (host name)

Standby: 192.168.3.128

192.168.4.128 (heartbeat IP)

usvr-128 (host name)


Symptom: When heartbeat is started on the master, the VIP comes up on 218. When heartbeat is then started on the standby, the VIP also comes up on 128.

Troubleshooting steps:

1. Check the logs on the master and the standby

The log on master 218 reads as follows (excerpt only):

HEARTBEAT[27330]: 2015/01/27_09:05:29 error:message hist queue is filling up ($ messages in queue)
HEARTBEAT[27330]: 2015/01/27_09:05:30 error:message hist queue is filling up ($ messages in queue)
HEARTBEAT[27330]: 2015/01/27_09:05:30 error:message hist queue is filling up ($ messages in queue)
HEARTBEAT[27330]: 2015/01/27_09:05:31 error:message hist queue is filling up ($ messages in queue)
HEARTBEAT[27330]: 2015/01/27_09:05:32 error:message hist queue is filling up ($ messages in queue)
HEARTBEAT[27330]: 2015/01/27_09:05:32 error:message hist queue is filling up ($ messages in queue)
HEARTBEAT[27330]: 2015/01/27_09:05:33 warn:node usvr-128: is dead
HEARTBEAT[27330]: 2015/01/27_09:05:33 info:cancelling pending standby operation
HEARTBEAT[27330]: 2015/01/27_09:05:33 info:dead node usvr-128 gave up resources.
HEARTBEAT[27330]: 2015/01/27_09:05:33 info:all clients is now resumed
HEARTBEAT[27330]: 2015/01/27_09:05:33 error:lowseq cannnot be greater than ackseq
HEARTBEAT[27330]: 2015/01/27_09:05:33 info:hist->ackseq =74575, old_ackseq=0
HEARTBEAT[27330]: 2015/01/27_09:05:33 info:hist->lowseq =74576, hist->hiseq=74824, send_cluster_msg_level=1
HEARTBEAT[27333]: 2015/01/27_09:05:34 crit:emergency shutdown:master Control process died.
HEARTBEAT[27333]: 2015/01/27_09:05:34 crit:killing pid 27330 with SIGTERM
HEARTBEAT[27333]: 2015/01/27_09:05:34 crit:killing pid 27334 with SIGTERM
HEARTBEAT[27333]: 2015/01/27_09:05:34 crit:killing pid 27335 with SIGTERM
HEARTBEAT[27333]: 2015/01/27_09:05:34 crit:killing pid 27336 with SIGTERM
HEARTBEAT[27333]: 2015/01/27_09:05:34 crit:killing pid 27337 with SIGTERM
HEARTBEAT[27333]: 2015/01/27_09:05:34 crit:emergency Shutdown (MCP dead): killing ourselves.

The log on standby 128 reads as follows (excerpt only):

Jan 10:11:35 Heartbeat: [15999]: Info:glib:ucast:bound receive socket to Device:eth0
Jan 10:11:35 Heartbeat: [15999]: Info:glib:ucast:set so_reuseport (W)
Jan 10:11:35 Heartbeat: [15999]: info:glib:ucast:started on Ports 694 interface eth0 to 192.168.4.218
Jan 10:11:35 Heartbeat: [15999]: info:glib:ping Heartbeat started.
Jan 10:11:35 Heartbeat: [15999]: info:G_main_add_TriggerHandler:Added Signal Manual Handler
Jan 10:11:35 Heartbeat: [15999]: info:G_main_add_TriggerHandler:Added Signal Manual Handler
Jan 10:11:35 Heartbeat: [15999]: info:G_main_add_SignalHandler:Added signal handler for signal 17
Jan 10:11:35 Heartbeat: [15999]: info:local status now set to: ' Up '
Jan 10:11:35 Heartbeat: [15999]: Info:link 192.168.3.1:192.168.3.1 up.
Jan 10:11:35 Heartbeat: [15999]: Info:status Update for node 192.168.3.1:status Ping
Jan 10:13:35 Heartbeat: [15999]: warn:node usvr-218: is dead
Jan 10:13:35 Heartbeat: [15999]: info:comm_now_up (): Updating status to Active
Jan 10:13:35 Heartbeat: [15999]: info:local status now set to: ' Active '
Jan 10:13:35 Heartbeat: [15999]: info:starting child Client "/usr/lib64/heartbeat/ipfail" (498,498)
Jan 10:13:35 Heartbeat: [15999]: Warn:no STONITH device configured.
Jan 10:13:35 Heartbeat: [15999]: warn:shared disks is not protected.
Jan 10:13:35 Heartbeat: [15999]: Info:resources being acquired from localsv218.

As the logs show, each node declares the other node dead and takes over the VIP, which produces the split-brain.
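The comparison of the two logs can be sketched as a small helper that lists which nodes a given heartbeat log declares dead. The function name dead_nodes is hypothetical, and the log path in the usage note is an assumption (the real path is whatever the logfile setting in ha.cf points to).

```shell
#!/bin/sh
# dead_nodes FILE
# Print the lowercase name of every node that FILE reports as dead,
# one per line, without duplicates. Matching is case-insensitive and
# tolerates missing spaces around the colon, as in the excerpts above.
dead_nodes() {
    awk '{
        line = tolower($0)
        if (match(line, /node +[^ :]+ *: *is dead/)) {
            s = substr(line, RSTART, RLENGTH)
            sub(/^node +/, "", s)   # drop the leading "node "
            sub(/ *:.*/, "", s)     # drop ": is dead"
            print s
        }
    }' "$1" | sort -u
}
```

Run on each node, for example as dead_nodes /var/log/ha-log (path assumed). If the master lists the standby and the standby lists the master, both sides believe the peer is gone, which is the split-brain pattern seen here.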

2. The first guess was that the master and standby could not communicate, or that network delay had left their clocks out of sync. A small clock difference has little effect on heartbeat, but a large one will certainly cause problems, so synchronize the time on both nodes:

/usr/sbin/ntpdate ntp.api.bz && hwclock -w

echo "0 * * * * root /usr/sbin/ntpdate ntp.api.bz && hwclock -w >/dev/null 2>&1" >> /etc/crontab
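To confirm how far apart the two clocks are, the epoch time of each node can be compared. This is only a sketch: the function name clock_skew is hypothetical, and fetching the remote time over ssh assumes ssh access between the nodes, which the article does not mention.

```shell
#!/bin/sh
# clock_skew LOCAL_EPOCH REMOTE_EPOCH
# Print the absolute difference, in seconds, between two epoch timestamps.
clock_skew() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "$diff"
}
```

On the master this could be called as clock_skew "$(date +%s)" "$(ssh usvr-128 date +%s)". The result includes the ssh round-trip time, so it is only a rough indication.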

3. After the clocks were synchronized, the logs still reported errors. Checking the main configuration file again turned up no problem. The one remaining factor was the firewall running on the master and the standby. Heartbeat is configured to communicate over UDP port 694, so UDP port 694 has to be allowed through the firewall.

On master 218, add:

/sbin/iptables -A INPUT -i eth0 -p udp -s 192.168.4.128 --dport 694 -m comment --comment "heartbeat-slave" -j ACCEPT

On standby 128, add:

/sbin/iptables -A INPUT -i eth0 -p udp -s 192.168.4.218 --dport 694 -m comment --comment "heartbeat-master" -j ACCEPT

Note: 1. If the firewall policy is strict, the heartbeat IP itself must also be allowed, otherwise the UDP communication will still fail.

2. The interface named in the rule (eth0 here) must be the network adapter that carries the heartbeat IP.
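After adding the rules, it can help to check that the INPUT chain really contains an ACCEPT rule for UDP 694 from the peer. The following is a minimal sketch that reads rules in the format printed by iptables -S INPUT. The function name has_heartbeat_rule is hypothetical, and it only checks that such a rule exists, not where it sits in the chain.

```shell
#!/bin/sh
# has_heartbeat_rule PEER_IP
# Read rules in `iptables -S INPUT` format from stdin. Return 0 if any
# rule accepts UDP traffic to port 694 from PEER_IP, otherwise return 1.
has_heartbeat_rule() {
    peer=$1
    while IFS= read -r rule; do
        # Only look at rules appended to the INPUT chain.
        case "$rule" in
            "-A INPUT "*) ;;
            *) continue ;;
        esac
        # Pad with spaces so every option can be matched as a whole word.
        padded=" $rule "
        case "$padded" in *" -s $peer/32 "*|*" -s $peer "*) ;; *) continue ;; esac
        case "$padded" in *" -p udp "*) ;; *) continue ;; esac
        case "$padded" in *" --dport 694 "*) ;; *) continue ;; esac
        case "$padded" in *" -j ACCEPT "*) return 0 ;; esac
    done
    return 1
}
```

On the master this would be run as root with iptables -S INPUT | has_heartbeat_rule 192.168.4.128. Because -A appends to the end of the chain, a rule added after an existing catch-all DROP or REJECT rule is never reached. iptables -L INPUT -n --line-numbers shows the order, and -I inserts a rule at a chosen position instead. Watching tcpdump -n -i eth0 udp port 694 on each node is another way to see whether heartbeat packets arrive.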


Once the firewall is configured, the master and standby communicate normally and the master node holds the VIP. When the master node goes down, or the heartbeat service on the master is stopped, the standby node takes over the VIP.
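Whether only one node holds the VIP can be checked by looking for the address in the output of ip -o -4 addr show on each node. The sketch below assumes a VIP of 192.168.3.200, which is a placeholder because the article does not state the actual VIP. The function name holds_vip is hypothetical.

```shell
#!/bin/sh
# holds_vip VIP
# Read `ip -o -4 addr show` output from stdin. Return 0 if VIP is
# configured on any interface, otherwise return 1.
holds_vip() {
    # Each line of the one-line format contains " inet ADDRESS/PREFIX".
    # The trailing slash stops 192.168.3.20 from matching 192.168.3.200.
    grep -qF " inet $1/"
}
```

For example, ip -o -4 addr show | holds_vip 192.168.3.200 && echo "VIP is here" (VIP assumed). In normal operation exactly one of the two nodes should report the VIP. If both do, the cluster is in a split-brain again.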
