RabbitMQ clustering and failure handling

Source: Internet
Author: User
Tags: rabbitmq

RabbitMQ's built-in clustering is designed to accomplish two goals: allowing consumers and producers to keep running when a RabbitMQ node crashes, and scaling message throughput linearly by adding more nodes. When a RabbitMQ node is lost, clients can connect to any other node in the cluster and continue to produce or consume messages. Similarly, if the cluster is struggling to cope with a large volume of message traffic, its capacity can be increased by adding more nodes.
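The failover behaviour described above can be sketched on the client side as follows. This is a minimal illustration, not RabbitMQ client code: `connect_with_failover` and the `connect` callable are hypothetical names, and `connect` is assumed to raise `ConnectionError` when a node is unreachable.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def connect_with_failover(nodes: Iterable[str], connect: Callable[[str], T]) -> T:
    """Try each cluster node in order and return the first successful connection.

    `connect` is a hypothetical callable that opens a connection to one node
    and raises ConnectionError if that node cannot be reached.
    """
    last_error: Optional[Exception] = None
    for node in nodes:
        try:
            return connect(node)
        except ConnectionError as exc:
            # This node is down; remember the error and try the next node.
            last_error = exc
    raise ConnectionError("no cluster node is reachable") from last_error
```

In a real client, `connect` would wrap whatever connection call the chosen AMQP library provides, and the node list would hold the addresses of the cluster members.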

A RabbitMQ cluster does not guarantee that messages are safe from loss, because by default RabbitMQ does not replicate the contents of a queue across the cluster. Without special configuration, messages exist only on the node that owns the queue.

RabbitMQ cluster architecture

RabbitMQ always keeps track of the following four types of internal metadata:

    • Queue metadata: queue names and their properties
    • Exchange metadata: exchange name, type, and properties
    • Binding metadata: a simple table showing how to route messages to queues
    • Vhost metadata: namespaces and security attributes for the queues, exchanges, and bindings within a vhost

On a single node, RabbitMQ keeps track of this metadata, and the queues and exchanges (and their bindings) that are marked as durable are written to disk so that they are rebuilt after RabbitMQ restarts. When clustering is introduced, RabbitMQ has to keep track of a new type of metadata: the location of the cluster nodes, and the relationship of each node to the other types of metadata already being recorded.

Queues in a cluster

In a RabbitMQ cluster, not every node has a full copy of every queue. If you create a queue in a cluster, the cluster creates the complete queue information (metadata, state, and contents) on a single node rather than on all nodes. As a result, only the owner node knows everything about the queue. All non-owner nodes know only the queue's metadata and a pointer to the node where the queue lives. So when a cluster node crashes, that node's queues and their associated bindings are gone. Consumers attached to those queues lose their subscriptions, and any new messages that would have matched the queues' bindings are lost. You can re-create a queue by having consumers reconnect to the cluster and declare it again. However, this only works if the queue was not declared as durable in the first place.
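The ownership model described above can be illustrated with a toy, in-memory simulation. It is only a sketch of the idea (metadata known everywhere, contents on one node); `ToyCluster`, `Node`, and their methods are hypothetical names, not part of RabbitMQ.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Iterable


@dataclass
class Node:
    name: str
    # Full queue contents live only on the node that owns the queue.
    queues: Dict[str, Deque[str]] = field(default_factory=dict)
    alive: bool = True


class ToyCluster:
    """Toy model: every node knows queue metadata, one node holds the content."""

    def __init__(self, node_names: Iterable[str]) -> None:
        self.nodes = {name: Node(name) for name in node_names}
        # Cluster-wide metadata: queue name -> name of the owner node.
        self.queue_owner: Dict[str, str] = {}

    def declare_queue(self, queue: str, on_node: str) -> None:
        self.queue_owner[queue] = on_node
        self.nodes[on_node].queues[queue] = deque()

    def publish(self, queue: str, message: str) -> bool:
        """Deliver to the owner node; return False if the owner is down."""
        owner = self.nodes[self.queue_owner[queue]]
        if not owner.alive:
            return False  # nowhere to put the message, so it is lost
        owner.queues[queue].append(message)
        return True

    def crash(self, node_name: str) -> None:
        self.nodes[node_name].alive = False
```

After `crash("node1")`, a publish to a queue owned by `node1` returns `False` even though the other nodes are still up, which mirrors the message loss described in the text.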

Why does RabbitMQ not copy the queue contents and state to all nodes by default?

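    • Storage space: if every node held a full copy of every queue, adding nodes would not add storage capacity.
    • Performance: keeping queue contents on one node reduces network and disk load.

Distributing exchanges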

An exchange is just a name and a list of queue bindings. When a message is published to an exchange, what actually happens is that the channel you are connected to compares the message's routing key against the exchange's list of bindings and then routes the message.

When a new exchange is created, all RabbitMQ has to do is add its lookup table to every node in the cluster.
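To make the "name plus lookup table" idea concrete, here is a minimal sketch that assumes the simplest matching rule, an exact match between routing key and binding key (as in a direct exchange). `ToyDirectExchange` is a hypothetical class for illustration only.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


class ToyDirectExchange:
    """An exchange modelled as a name plus a binding lookup table."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Lookup table: binding key -> names of the queues bound with that key.
        self.bindings: DefaultDict[str, Set[str]] = defaultdict(set)

    def bind(self, queue: str, binding_key: str) -> None:
        self.bindings[binding_key].add(queue)

    def route(self, routing_key: str) -> List[str]:
        """Return the queues whose binding key equals the routing key."""
        return sorted(self.bindings.get(routing_key, ()))
```

Because the exchange holds no messages of its own, copying this small table to every node is cheap, which is why exchanges, unlike queue contents, can exist on all nodes.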

What happens if the message has been published to the channel, but the node fails before the message is routed?

The AMQP basic.publish command does not return the status of the message, so in this situation the message can be lost without the publisher noticing. One solution is to use an AMQP transaction, which blocks until the message has been routed to a queue. Another is to use publisher confirms (send acknowledgements) to keep track of which messages were still unacknowledged when the connection was interrupted.
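The bookkeeping behind the acknowledgement approach can be sketched as below. This is an illustrative model only: `ConfirmTracker` is a hypothetical helper, and it assumes the broker acknowledges messages by sequence number, optionally acknowledging everything up to that number at once (`multiple=True`).

```python
from typing import Dict, List


class ConfirmTracker:
    """Remember published messages until the broker acknowledges them."""

    def __init__(self) -> None:
        self._next_seq = 1
        self._pending: Dict[int, str] = {}

    def on_publish(self, body: str) -> int:
        """Record a message at publish time and return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._pending[seq] = body
        return seq

    def on_ack(self, seq: int, multiple: bool = False) -> None:
        """Forget one message, or every message up to `seq` if `multiple`."""
        if multiple:
            for acked in [s for s in self._pending if s <= seq]:
                del self._pending[acked]
        else:
            self._pending.pop(seq, None)

    def unconfirmed(self) -> List[str]:
        """Messages to republish after a connection loss, in publish order."""
        return [self._pending[s] for s in sorted(self._pending)]
```

When the connection drops, whatever `unconfirmed()` returns is the set of messages the publisher cannot assume were routed, so those are the ones to send again after reconnecting.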

Memory nodes and disk nodes

A memory node keeps the metadata definitions for all queues, exchanges, bindings, users, permissions, and vhosts in memory only. A disk node also stores the metadata on disk. A single-node system allows only a disk node; otherwise, all configuration information about the system would be lost every time RabbitMQ restarted.

RabbitMQ only requires that there be at least one disk node in the cluster; all other nodes can be memory nodes. When a node joins or leaves the cluster, it must be able to notify at least one disk node of the change. If there is only one disk node and it crashes, the cluster can continue to route messages (that is, keep running), but nothing can be changed until that node recovers. For this reason, a cluster is typically set up with two disk nodes, so that at least one is available at any time.
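The rule above (routing needs any running node, metadata changes need a running disk node) can be captured in a small toy model. `ToyMembership` and its methods are hypothetical names used only to illustrate the reasoning; they do not correspond to a RabbitMQ API.

```python
from typing import Dict, Iterable


class ToyMembership:
    """Track which disk and memory nodes of a cluster are currently running."""

    def __init__(self, disk_nodes: Iterable[str], ram_nodes: Iterable[str]) -> None:
        self.up: Dict[str, str] = {name: "disk" for name in disk_nodes}
        self.up.update({name: "ram" for name in ram_nodes})
        self.down: Dict[str, str] = {}

    def crash(self, name: str) -> None:
        self.down[name] = self.up.pop(name)

    def recover(self, name: str) -> None:
        self.up[name] = self.down.pop(name)

    def can_route(self) -> bool:
        # Any running node can keep routing messages.
        return bool(self.up)

    def can_change_metadata(self) -> bool:
        # Changes must be recorded by at least one running disk node.
        return any(kind == "disk" for kind in self.up.values())
```

With one disk node, crashing it leaves `can_route()` true but `can_change_metadata()` false until it recovers; with two disk nodes, losing either one leaves both checks true.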
