RabbitMQ Concepts and Environment Setup (III): RabbitMQ Cluster

Last Update:2014-12-12 Source: Internet

Author: User


Test environment: VMS00781, VMS00782, VMS00386 (CentOS 5.8)
1. Install RabbitMQ Server on each of the three machines

2. Read the Erlang cookie from one node and copy it to the other nodes (nodes use the cookie to decide whether they are allowed to communicate with each other)
The cookie is in one of these two locations:
sudo vim /var/lib/rabbitmq/.erlang.cookie
sudo vim $HOME/.erlang.cookie
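
After copying the cookie, it can help to confirm that two copies really are byte-identical, since even a stray trailing newline makes the cookies differ. The following is a minimal sketch, not part of the original procedure; the function name same_cookie and the /tmp path in the usage note are hypothetical.

```shell
#!/bin/sh
# same_cookie FILE_A FILE_B
# Succeeds (exit status 0) only when both cookie files are byte-identical.
# cmp -s compares silently and reports the result through its exit status.
same_cookie() {
    cmp -s "$1" "$2"
}
```

Assuming a remote node's cookie has been fetched to /tmp/remote.cookie, `same_cookie /var/lib/rabbitmq/.erlang.cookie /tmp/remote.cookie && echo "cookies match"` would report whether the local copy agrees with it.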

3. Start nodes individually
sudo service rabbitmq-server start

4. Check the RabbitMQ broker status on each node
sudo rabbitmqctl cluster_status

5. Build the cluster
Run the following on VMS00386 and VMS00782, respectively. The first sequence joins the node as a RAM node (--ram), the second as a disc node.
sudo rabbitmqctl stop_app
sudo rabbitmqctl join_cluster --ram [email protected]
sudo rabbitmqctl start_app
sudo rabbitmqctl stop_app
sudo rabbitmqctl join_cluster [email protected]
sudo rabbitmqctl start_app
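
The join sequence is always stop_app, join_cluster, start_app, with --ram as the only variation. As a sketch under that assumption, the helper below only prints the commands for review (a dry run) rather than executing them; the function name join_cmds and the node name rabbit@node1 are hypothetical placeholders.

```shell
#!/bin/sh
# join_cmds SEED_NODE [--ram]
# Prints the three rabbitmqctl commands needed to join SEED_NODE's cluster.
# Nothing is executed; the output is meant to be reviewed or piped to sh.
join_cmds() {
    seed=$1
    ram_flag=${2:-}
    echo "sudo rabbitmqctl stop_app"
    if [ "$ram_flag" = "--ram" ]; then
        # RAM node: cluster metadata is kept in memory only
        echo "sudo rabbitmqctl join_cluster --ram $seed"
    else
        # Disc node (the default)
        echo "sudo rabbitmqctl join_cluster $seed"
    fi
    echo "sudo rabbitmqctl start_app"
}
```

For example, `join_cmds rabbit@node1 --ram` prints the RAM-node variant, and `join_cmds rabbit@node1` prints the disc-node variant.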

6. Troubleshooting
The following error was encountered while building the cluster:
sudo rabbitmqctl join_cluster --ram [email protected]
Clustering node [email protected] with [email protected] ...
Error: unable to connect to nodes [[email protected]]: nodedown
DIAGNOSTICS
===========
attempted to contact: [[email protected]]
[email protected]:
* unable to connect to epmd (port 4369) on vms00386: nxdomain (non-existing domain)
current node details:
- node name: '[email protected]'
- home dir: /var/lib/rabbitmq
- cookie hash: 50yo3zk+hjhos0tab1vhjg==
Solution:
Cluster nodes must be able to reach each other by hostname, so the hosts file on every node should list all nodes in the cluster, ensuring the names resolve on every machine.
vim /etc/hosts
<IP of 781> VMS00781
<IP of 782> VMS00782
<IP of 386> vms00386
Then restart RabbitMQ on each node
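
One way to check the hosts file before retrying is to list which cluster hostnames are still missing from it. This is a rough sketch rather than part of the original fix; the function name missing_hosts is hypothetical, and it only inspects the given file (it does not query DNS).

```shell
#!/bin/sh
# missing_hosts HOSTS_FILE NAME...
# Prints every NAME that does not appear as a hostname in HOSTS_FILE.
# Matching is case-insensitive, comments are ignored, and the first field
# of each line (the IP address) is never treated as a hostname.
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        if ! awk -v n="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if (tolower($i) == tolower(n)) found = 1 }
            END { exit found ? 0 : 1 }' "$hosts_file"; then
            echo "$name"
        fi
    done
}
```

Running `missing_hosts /etc/hosts VMS00781 VMS00782 VMS00386` on each node prints nothing once all three entries are in place.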

7. Other issues
Error: mnesia_unexpectedly_running
Reason: stop_app was not run first
Fix: sudo rabbitmqctl stop_app

If the hostname cannot be resolved, or was changed, after rabbitmq-server was first started, startup will fail.
In that case, do the following:
sudo rm -rf /var/lib/rabbitmq/mnesia (the node's information is recorded in this database; deleting it discards the node's stored state)
Restart RabbitMQ Server

#####################################################
RabbitMQ Cluster Management
#####################################################
1. View cluster status
This can be run on any node in the cluster:
sudo rabbitmqctl cluster_status

2. Change the node type (RAM or disc)
sudo rabbitmqctl stop_app
sudo rabbitmqctl change_cluster_node_type disc
or
sudo rabbitmqctl change_cluster_node_type ram
sudo rabbitmqctl start_app

3. Restart nodes in the cluster
When a node is stopped or goes down, the remaining nodes are unaffected
[[email protected] ~]$ sudo rabbitmqctl stop
Stopping and halting node [email protected] ...

[[email protected] ~]$ sudo rabbitmqctl cluster_status
Cluster status of node [email protected] ...
[{nodes,[{disc,[[email protected],[email protected],[email protected]]}]},
{running_nodes,[[email protected],[email protected]]},
{cluster_name,<<"[email protected]">>},
{partitions,[]}]

[[email protected] ~]$ sudo rabbitmqctl cluster_status
Cluster status of node [email protected] ...
[{nodes,[{disc,[[email protected],[email protected],[email protected]]}]},
{running_nodes,[[email protected],[email protected]]},
{cluster_name,<<"[email protected]">>},
{partitions,[]}]

[[email protected] ~]$ sudo rabbitmqctl stop
Stopping and halting node [email protected] ...

[[email protected] ~]$ sudo rabbitmqctl cluster_status
Cluster status of node [email protected] ...
[{nodes,[{disc,[[email protected],[email protected],[email protected]]}]},
{running_nodes,[[email protected]]},
{cluster_name,<<"[email protected]">>},
{partitions,[]}]

After a node restarts, it automatically catches up with the other nodes
[[email protected] ~]$ sudo service rabbitmq-server start
Starting rabbitmq-server: SUCCESS
rabbitmq-server.

[[email protected] ~]$ sudo rabbitmqctl cluster_status
Cluster status of node [email protected] ...
[{nodes,[{disc,[[email protected],[email protected],[email protected]]}]},
{running_nodes,[[email protected],[email protected]]},
{cluster_name,<<"[email protected]">>},
{partitions,[]}]

[[email protected] ~]$ sudo service rabbitmq-server start
Starting rabbitmq-server: SUCCESS
rabbitmq-server.

[[email protected] ~]$ sudo rabbitmqctl cluster_status
Cluster status of node [email protected] ...
[{nodes,[{disc,[[email protected],[email protected],[email protected]]}]},
{running_nodes,[[email protected],[email protected],[email protected]]},
{cluster_name,<<"[email protected]">>},
{partitions,[]}]

[[email protected] ~]$ sudo rabbitmqctl cluster_status
Cluster status of node [email protected] ...
[{nodes,[{disc,[[email protected],[email protected],[email protected]]}]},
{running_nodes,[[email protected],[email protected],[email protected]]},
{cluster_name,<<"[email protected]">>},
{partitions,[]}]

[[email protected] ~]$ sudo rabbitmqctl cluster_status
Cluster status of node [email protected] ...
[{nodes,[{disc,[[email protected],[email protected],[email protected]]}]},
{running_nodes,[[email protected],[email protected],[email protected]]},
{cluster_name,<<"[email protected]">>},
{partitions,[]}]
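
When comparing status output across nodes, only the running_nodes entry usually matters. The sketch below pulls that list out of the Erlang-term output format shown in this article. It assumes the running_nodes tuple fits on a single line and that node names contain no commas; other RabbitMQ versions may print a different format. The function name running_nodes is hypothetical.

```shell
#!/bin/sh
# running_nodes
# Reads `rabbitmqctl cluster_status` output on stdin and prints the running
# node names, one per line. Lines without a running_nodes tuple are ignored.
running_nodes() {
    sed -n 's/.*{running_nodes,\[\(.*\)\]}.*/\1/p' | tr ',' '\n'
}
```

A typical use would be `sudo rabbitmqctl cluster_status | running_nodes | wc -l` to count how many nodes the local node currently sees as running.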

Some notes:
Make sure the cluster always has at least one disc node to prevent data loss, especially when changing node types.
If the entire cluster is stopped, the last node to go down should be the first one started. If that is not possible, use the forget_cluster_node command to remove it from the cluster.
If all nodes in the cluster go down almost simultaneously in an uncontrolled manner, use the force_boot command on one of the nodes to bring it back up.

4. Removing nodes from a cluster
[[email protected] ~]$ sudo rabbitmqctl stop_app
Stopping node [email protected] ...
[[email protected] ~]$ sudo rabbitmqctl reset
Resetting node [email protected] ...
[[email protected] ~]$ sudo rabbitmqctl start_app
Starting node [email protected] ...

[[email protected] ~]$ sudo rabbitmqctl cluster_status
Cluster status of node [email protected] ...
[{nodes,[{disc,[[email protected]]}]},
{running_nodes,[[email protected]]},
{cluster_name,<<"[email protected]">>},
{partitions,[]}]

[[email protected] ~]$ sudo rabbitmqctl cluster_status
Cluster status of node [email protected] ...
[{nodes,[{disc,[[email protected],[email protected]]}]},
{running_nodes,[[email protected],[email protected]]},
{cluster_name,<<"[email protected]">>},
{partitions,[]}]

[[email protected] ~]$ sudo rabbitmqctl cluster_status
Cluster status of node [email protected] ...
[{nodes,[{disc,[[email protected],[email protected]]}]},
{running_nodes,[[email protected],[email protected]]},
{cluster_name,<<"[email protected]">>},
{partitions,[]}]
As shown, [email protected] has become a standalone node, and the original cluster now contains only [email protected] and [email protected]

A node can also be removed from the cluster remotely, from another node
For example, to go on and remove [email protected] while working on [email protected]:
[[email protected] ~]$ sudo rabbitmqctl forget_cluster_node [email protected]
Removing node [email protected] from cluster ...

[[email protected] ~]$ sudo rabbitmqctl cluster_status
Cluster status of node [email protected] ...
[{nodes,[{disc,[[email protected]]}]},
{running_nodes,[[email protected]]},
{cluster_name,<<"[email protected]">>},
{partitions,[]}]

As shown, the cluster now contains only the single node [email protected]

There is a catch: a node that was removed remotely from another node still believes it is part of the cluster

[[email protected] ~]$ sudo rabbitmqctl start_app
Starting node [email protected] ...
BOOT FAILED
===========
Error description:
{error,{inconsistent_cluster,"Node [email protected] thinks it's clustered with node [email protected], but [email protected] disagrees"}}
Log files (may contain more information):
/var/log/rabbitmq/[email protected]
/var/log/rabbitmq/[email protected]
Stack trace:
[{rabbit_mnesia,check_cluster_consistency,0},
{rabbit,'-start/0-fun-0-',0},
{rabbit,start_it,1},
{rpc,'-handle_call_call/6-fun-0-',5}]
Error: {rabbit,failure_during_boot,
{error,
{inconsistent_cluster,
"Node [email protected] thinks it's clustered with node [email protected], but [email protected] disagrees"}}}
The node needs to be reset:
[[email protected] ~]$ sudo rabbitmqctl reset
Resetting node [email protected] ...
[[email protected] ~]$ sudo rabbitmqctl start_app
Starting node [email protected] ...

At this point all three nodes have become independent nodes
Of these, [email protected] and [email protected] have been reset to fresh RabbitMQ brokers, while [email protected] still retains residual state from the original cluster. It can be reset with the following steps:
[[email protected] ~]$ sudo rabbitmqctl stop_app
Stopping node [email protected] ...
[[email protected] ~]$ sudo rabbitmqctl reset
Resetting node [email protected] ...
[[email protected] ~]$ sudo rabbitmqctl start_app
Starting node [email protected] ...

5. Automatic cluster configuration
Obviously, this is done through the configuration file rather than the command-line tool.
First, reset each node
[[email protected] ~]$ sudo rabbitmqctl stop_app
Stopping node [email protected] ...
[[email protected] ~]$ sudo rabbitmqctl reset
Resetting node [email protected] ...
...
Second, adjust the configuration file
[{rabbit,
[{cluster_nodes, {['[email protected]', '[email protected]', '[email protected]'], disc}}]}].
...
Then start each node
[[email protected] ~]$ sudo service rabbitmq-server start
Starting rabbitmq-server: SUCCESS
rabbitmq-server.

View cluster status
[[email protected] ~]$ sudo rabbitmqctl cluster_status

Some notes:
Whether the cluster is built from the command line or from the configuration file, make sure the Erlang and RabbitMQ versions are the same on every node.
The configuration file only takes effect on fresh nodes, that is, nodes that have just been reset or are starting for the first time. Automatic clustering therefore does not happen again when a node is restarted. Cluster changes made with rabbitmqctl are assumed to take precedence over the automatic clustering configuration.
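
Because the brackets in the cluster_nodes term are easy to get wrong by hand, the term can be generated instead. The following sketch prints a configuration of the shape described above from a node type and a list of node names. The function name cluster_config and the node names in the usage note are hypothetical, and where the output should be written (the location of the RabbitMQ configuration file) is left to the reader.

```shell
#!/bin/sh
# cluster_config TYPE NODE...
# Prints an Erlang-term RabbitMQ configuration with a cluster_nodes entry.
# TYPE is disc or ram; each NODE is wrapped in single quotes.
cluster_config() {
    type=$1
    shift
    list=""
    for n in "$@"; do
        # Append with ", " as separator, but only once the list is non-empty
        list="${list:+$list, }'$n'"
    done
    printf "[{rabbit,\n  [{cluster_nodes, {[%s], %s}}]}].\n" "$list" "$type"
}
```

For instance, `cluster_config disc rabbit@node1 rabbit@node2` prints a two-node disc configuration that can be reviewed before it is copied into place on each node.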

Deploying a cluster on a single machine, generally used to test cluster features
The key is to start multiple rabbitmq-server instances, each with its own port and node name. The rest of the process is similar to deploying a cluster across multiple machines.
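
As a sketch of the single-machine setup, the helper below prints one start command per instance, giving each a distinct port through RABBITMQ_NODE_PORT and a distinct name through RABBITMQ_NODENAME. It is a dry run that executes nothing. The function name local_node_cmds and the node names are hypothetical, and any enabled plugins that listen on their own ports would likely need separate per-instance settings that are not covered here.

```shell
#!/bin/sh
# local_node_cmds BASE_PORT NAME...
# Prints a start command for each NAME, assigning consecutive ports
# beginning at BASE_PORT so that the instances do not collide.
local_node_cmds() {
    base_port=$1
    shift
    i=0
    for name in "$@"; do
        echo "RABBITMQ_NODE_PORT=$((base_port + i)) RABBITMQ_NODENAME=$name rabbitmq-server -detached"
        i=$((i + 1))
    done
}
```

Calling `local_node_cmds 5672 rabbit hare` prints two commands, one for a node named rabbit on port 5672 and one for a node named hare on port 5673.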

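Other considerations:
Firewall policies and similar settings must allow the nodes to reach each other (for example, epmd on port 4369, the port named in the error shown earlier).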

Reference:
http://www.rabbitmq.com/clustering.html
