RabbitMQ High Availability Scenario Summary

Source: Internet
Author: User
Tags: rabbitmq

RabbitMQ can be clustered in the following ways:
1. Ordinary cluster (normal mode)
Exchanges and bindings are stored on every node, but the contents of a queue are stored on only one node; every other node holds just a copy of the queue's metadata. This has two benefits:
1) Storage space. If every node held every message, the total space used would be multiplied by the number of nodes. For example, if the messages in a queue take up 1 GB, three nodes would need 3 GB.
2) Performance. Copying every message between nodes adds significant network overhead, and if messages are durable (persisted), it also adds a heavy disk load.
The node that stores a queue is the node the client was connected to when it created the queue. If a producer is connected to a different node, its messages are forwarded to the node that stores the queue. If a consumer is connected to a node that does not store the queue, the data is pulled from the node that does. As a result:
1) If every client that creates queues is connected to the same node, all queues end up stored on that one node.
2) If the node that stores the messages goes down, consumers have to wait until it recovers before they can read them.
3) Take two nodes, A and B, with the queue data on A. Clients can produce or consume through either A or B. But if A goes down while a client is publishing through B, the client receives no error and can keep sending, while the messages are actually discarded. Only after the client disconnects and reconnects to B does it get an error saying that node A does not exist. Reading through B is similar: a client that reads in batches does not notice that A is down until it fetches the next batch, which then fails.
So this kind of cluster is characterized by:
1) High throughput
2) No high availability
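
The behaviour described in this section can be illustrated with a small toy model. This is only a sketch in Python, not RabbitMQ code: the ToyCluster class and its method names are hypothetical, and it models nothing beyond where queue contents and queue metadata live.

```python
class ToyCluster:
    """Toy model of an ordinary (non-mirrored) cluster. Not RabbitMQ code."""

    def __init__(self, nodes):
        self.up = {n: True for n in nodes}       # node -> is it running?
        self.metadata = {n: {} for n in nodes}   # on every node: queue -> owner node
        self.contents = {n: {} for n in nodes}   # only the owner holds the messages

    def declare_queue(self, via_node, queue):
        # The queue lives on the node the declaring client is connected to.
        self.contents[via_node][queue] = []
        for n in self.metadata:                  # metadata is copied to every node
            self.metadata[n][queue] = via_node

    def publish(self, via_node, queue, msg):
        owner = self.metadata[via_node][queue]
        if not self.up[owner]:
            return False                         # owner is down: message is dropped
        self.contents[owner][queue].append(msg)  # forwarded to the owner node
        return True

    def consume(self, via_node, queue):
        owner = self.metadata[via_node][queue]
        if not self.up[owner]:
            raise RuntimeError(f"queue {queue!r} is on node {owner}, which is down")
        return self.contents[owner][queue].pop(0)


cluster = ToyCluster(["A", "B"])
cluster.declare_queue("A", "orders")
cluster.publish("B", "orders", "m1")     # forwarded from B to A
first = cluster.consume("B", "orders")   # pulled from A through B
cluster.up["A"] = False                  # the owner node goes down
dropped = not cluster.publish("B", "orders", "m2")
```

In this model node B never stores any message of its own, which is why losing A makes the queue unusable even though B still knows that the queue exists.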

2. Mirrored mode
The difference between mirrored mode and normal mode is that the data in a queue is mirrored to all nodes, so the failure of any single node does not affect the availability of the cluster as a whole.
Internally, a mirrored queue uses an election algorithm to choose one master and several slaves. The master and slaves detect broken connections by continually sending heartbeats to each other. You can control how often these checks happen with net_ticktime. Note that each net_ticktime period contains four ticks, so a node is considered dead when it has not responded for net_ticktime seconds (±25%). Also note that net_ticktime must be set to the same value on all nodes.
Configuration example (rabbitmq.config):
[
  {rabbit, [{tcp_listeners, [5672]}]},
  {kernel, [{net_ticktime, 120}]}
].
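
To make the timing concrete, the following Python sketch computes the tick interval and the detection window for a given net_ticktime, using only the rule stated above (four ticks per period, detection within ±25%). The function name ticktime_window is made up for this illustration.

```python
def ticktime_window(net_ticktime):
    """Return (tick_interval, earliest, latest) in seconds.

    A tick is sent every net_ticktime / 4 seconds, and an unresponsive node
    is detected somewhere within net_ticktime +/- 25%.
    """
    tick_interval = net_ticktime / 4
    return tick_interval, net_ticktime * 0.75, net_ticktime * 1.25


interval, earliest, latest = ticktime_window(120)
print(interval, earliest, latest)   # prints: 30.0 90.0 150.0
```

With the value 120 from the configuration example, a dead node is therefore noticed after roughly 90 to 150 seconds.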
Consumers can connect to any node. If that node is not the master, the request is forwarded to the master. To make message delivery reliable, the consumer sends its ack to the master, which deletes the message and broadcasts the deletion to all slaves.
Publishers can also connect to any node. If that node is not the master, the message is forwarded to the master, which stores it and forwards it to the slaves for storage.
If the master goes down, the oldest slave is promoted to master. Messages or acks that had not been replicated at that moment may cause messages to be redelivered (replication is asynchronous by default). In more detail, the following happens:

1) The oldest slave (the one with the longest queue) is promoted to master. If no slave is synchronized with the master, messages are lost.
2) The newly promoted master assumes that all consumers connected to the old master have disconnected. An ack from a client may still have been in flight when the master went down, or the master may have received the ack but failed before broadcasting it to the slaves, so the new master has no choice but to treat those messages as unacknowledged. It requeues every message it believes has not been acked, so clients may receive a duplicate message and have to ack it again.
3) Clients that consume from the mirrored queue and support consumer cancellation notifications are notified that their subscription was cancelled because a slave was promoted to master. At that point the client needs to re-consume from the mirrored queue, which keeps it from sending acks to the old, failed master. The client should also be prepared to receive messages from the new master that it has already seen.
4) If a client consumes from a mirrored queue with noAck=true, the server treats each message as acknowledged as soon as it is sent, before the consumer has received it, so a disconnection during failover can lose data.
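
Because a failover can redeliver messages that were already processed, consumers are often written so that handling the same message twice is harmless. The Python sketch below shows one way to do that. It assumes every message carries a unique identifier chosen by the publisher (the message_id argument here is hypothetical), and the DedupingConsumer class is an illustration, not part of any RabbitMQ client library.

```python
class DedupingConsumer:
    """Run the handler at most once per message id (in-memory sketch)."""

    def __init__(self, handler):
        self.handler = handler
        self.seen = set()   # ids already processed; grows without bound in this sketch

    def on_message(self, message_id, body):
        """Process a delivery once; return True so the caller always acks."""
        if message_id not in self.seen:
            self.handler(body)
            self.seen.add(message_id)
        return True


processed = []
consumer = DedupingConsumer(processed.append)
consumer.on_message("id-1", "hello")   # first delivery
consumer.on_message("id-1", "hello")   # redelivered by the new master after failover
consumer.on_message("id-2", "world")
# processed is now ["hello", "world"]
```

A real consumer would need to bound or persist the set of seen identifiers; that is left out here to keep the idea visible.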

If a slave goes down, the state of the other nodes in the cluster does not change, and clients that are not connected to the failed node receive no failure notification. Publishing may be delayed while the failure of the slave is being detected. If the high availability policy is configured for automatic synchronization and the queue holds a large number of messages when the slave comes back up, synchronizing them blocks reads and writes on that queue for a long time, until synchronization finishes.
Both kinds of failure require clients to be fault tolerant, for example by reconnecting when the connection breaks. The official Java and .NET clients provide callbacks that are invoked when a connection fails: the Java client provides the ShutdownListener callback on the Connection and Channel classes, and the .NET client provides the ConnectionShutdown event on IConnection and the ModelShutdown event on IModel. You can also put a load balancer, such as HAProxy, between the clients and the servers.
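
A client-side reconnection loop might look like the following Python sketch. It is deliberately independent of any client library: the connect argument is a hypothetical, caller-supplied function that takes a host name and either returns a connection or raises ConnectionError, and the host names in the usage lines are invented.

```python
import time


def connect_with_failover(hosts, connect, attempts=3, delay=1.0, sleep=time.sleep):
    """Try each host in turn, for several rounds, until one connection succeeds.

    `connect` is a caller-supplied function (hypothetical here) that takes a
    host name and either returns a connection or raises ConnectionError.
    """
    last_error = None
    for round_no in range(attempts):
        for host in hosts:
            try:
                return connect(host)
            except ConnectionError as exc:
                last_error = exc
        if round_no < attempts - 1:
            sleep(delay * (2 ** round_no))   # back off before the next round
    raise ConnectionError(f"no node reachable: {last_error}")


def demo_connect(host):
    # Stand-in for a real client call: pretend that rabbit-a is down.
    if host == "rabbit-a":
        raise ConnectionError(f"{host} is down")
    return f"connection to {host}"


conn = connect_with_failover(["rabbit-a", "rabbit-b"], demo_connect)
```

Such a function would typically be called from the shutdown callback mentioned above, so that a broken connection is replaced by one to a surviving node.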

Specifying the mirroring policy:
There are three policy modes (ha-mode):
all: The queue is mirrored to all nodes in the cluster, and to any new node when it joins.
exactly (with a count): If the cluster has fewer than count nodes, the queue is mirrored to all of them. If it has more than count nodes, no mirror is created on new nodes (and if a node holding a mirror goes down, no replacement mirror is created either).
nodes (with a list of node names): The queue is mirrored to the specified nodes. If none of the specified nodes is running, the queue is declared only on the node the client is connected to. A migration rule applies when the policy changes: if the queue is on [A, B] with A as master and the new policy is nodes [C, D], then to prevent data loss the queue temporarily exists on [A, C, D], and A is not shut down until C and D are synchronized.
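
The three modes can be summarized with a small Python function that returns the nodes expected to hold a copy of a queue. This is a simplified model written for this summary, not RabbitMQ's placement logic: the function name mirror_nodes is hypothetical, picking the first count nodes in list order is an assumption, and the fallback for the nodes mode when no listed node is running is not modelled.

```python
def mirror_nodes(ha_mode, cluster_nodes, ha_params=None):
    """Return the nodes that should hold a copy of the queue (toy model)."""
    if ha_mode == "all":
        return list(cluster_nodes)
    if ha_mode == "exactly":
        # ha_params is the count; fewer nodes than count means all nodes.
        return list(cluster_nodes)[:ha_params]
    if ha_mode == "nodes":
        # ha_params is a list of node names; only running nodes can hold a mirror.
        return [n for n in ha_params if n in cluster_nodes]
    raise ValueError(f"unknown ha-mode: {ha_mode}")


running = ["rabbit@a", "rabbit@b", "rabbit@c"]
print(mirror_nodes("all", running))
print(mirror_nodes("exactly", running, 2))
print(mirror_nodes("nodes", running, ["rabbit@c", "rabbit@d"]))
```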
Configuration examples:
Mirror every queue whose name starts with "ha." to all nodes:
Linux: rabbitmqctl set_policy ha-all "^ha\." '{"ha-mode":"all"}'
Windows: rabbitmqctl set_policy ha-all "^ha\." "{""ha-mode"":""all""}"
HTTP API: PUT /api/policies/%2f/ha-all {"pattern":"^ha\\.", "definition":{"ha-mode":"all"}}
Web UI:
1: Navigate to Admin > Policies > Add / update a policy.
2: Enter "ha-all" next to Name, "^ha\." next to Pattern, and "ha-mode" = "all" in the first line next to Policy.
3: Click Add policy.
Example 2: mirror every queue whose name starts with "two." to exactly two nodes, with automatic synchronization:
rabbitmqctl set_policy ha-two "^two\." \
  '{"ha-mode":"exactly","ha-params":2,"ha-sync-mode":"automatic"}'
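
The HTTP API variant can also be scripted. The Python sketch below only builds the PUT request for the policy from Example 2 and does not send it. The base URL http://localhost:15672 (the management plugin's default port) is an assumption about the setup, authentication is left out, and build_policy_request is a helper name invented for this example.

```python
import json
import urllib.parse
import urllib.request


def build_policy_request(base_url, vhost, name, pattern, definition):
    """Build (but do not send) a PUT request for the management HTTP API."""
    url = "{}/api/policies/{}/{}".format(
        base_url.rstrip("/"),
        urllib.parse.quote(vhost, safe=""),   # the default vhost "/" becomes "%2F"
        urllib.parse.quote(name, safe=""),
    )
    body = json.dumps({"pattern": pattern, "definition": definition}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_policy_request(
    "http://localhost:15672", "/", "ha-two", r"^two\.",
    {"ha-mode": "exactly", "ha-params": 2, "ha-sync-mode": "automatic"},
)
print(req.get_method(), req.full_url)
```

Sending the request would additionally require credentials for a user with permission to manage policies, for example via HTTP basic authentication.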

Automatic or manual synchronization:
To see which slaves are synchronized:
rabbitmqctl list_queues name slave_pids synchronised_slave_pids
To synchronize a queue manually (synchronization is manual by default):
rabbitmqctl sync_queue name
To cancel a synchronization that is in progress:
rabbitmqctl cancel_sync_queue name
A mirror that has not been synchronized still receives messages published to the queue from then on, but it lacks the messages that were already at the head of the queue. As consumers drain the queue, those missing messages are eventually consumed, at which point the mirror becomes synchronized as well.
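
The way an unsynchronized mirror catches up on its own can be shown with a toy simulation in Python. The two deques stand in for the master's and the mirror's copy of one queue; all names are invented for this sketch and nothing here talks to a broker.

```python
from collections import deque

master = deque(["m1", "m2", "m3"])   # messages present before the mirror joined
mirror = deque()                     # a new, unsynchronized mirror starts empty


def publish(msg):
    master.append(msg)
    mirror.append(msg)               # new messages are replicated to the mirror


def consume():
    msg = master.popleft()
    if mirror and mirror[0] == msg:  # the mirror drops the message only if it has it
        mirror.popleft()
    return msg


def in_sync():
    return list(master) == list(mirror)


publish("m4")
before = in_sync()    # False: m1..m3 exist only on the master
for _ in range(3):
    consume()         # drain the messages that predate the mirror
after = in_sync()     # True: both copies now hold only m4
```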

3. Active/passive cluster
In an active/passive (primary/standby) setup only one node serves traffic at a time. It can be built with Pacemaker and DRBD.
Shovel simply consumes messages from a queue on one broker and forwards them to an exchange on another broker.
The active/passive approach is rarely used. See http://www.rabbitmq.com/pacemaker.html

Reposted from: http://houlinyan.iteye.com/blog/2261704
