Cassandra Tutorials (4)----Inter-node interaction (gossip)

Last Update:2016-04-04 Source: Internet

Author: User

Tags cassandra

Gossip is a peer-to-peer protocol that nodes in a Cassandra cluster use to exchange state information about themselves and about the other nodes they know of. The gossip process runs every second and exchanges state messages with up to three other nodes in the cluster. Because nodes pass on information about themselves and about the nodes they have already gossiped with, every node quickly learns about all the other nodes in the cluster (that is, news spreads like wildfire). Gossip messages are versioned, so during an exchange older information is overwritten with the most current state. To prevent problems in gossip communication, all nodes in the cluster should use the same list of seed nodes. This matters most when a node starts for the first time.
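The spread of versioned state can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not Cassandra's implementation: the names `merge` and `gossip_round`, the `(version, status)` tuple, and the two-way exchange are all hypothetical choices made for the example.

```python
import random

def merge(local, remote):
    """Copy entries from remote into local, keeping the higher version per node."""
    for node, (version, status) in remote.items():
        if node not in local or local[node][0] < version:
            local[node] = (version, status)

def gossip_round(states, rng, fanout=3):
    """One round: every node exchanges state with up to `fanout` random peers."""
    names = list(states)
    for name in names:
        peers = [p for p in names if p != name]
        for peer in rng.sample(peers, min(fanout, len(peers))):
            # Exchange in both directions so both sides end up with the newest state.
            merge(states[peer], states[name])
            merge(states[name], states[peer])

# Each node starts out knowing only its own state: {node: (version, status)}.
rng = random.Random(42)
states = {f"node{i}": {f"node{i}": (1, "NORMAL")} for i in range(8)}

rounds = 0
while any(len(view) < len(states) for view in states.values()):
    gossip_round(states, rng)
    rounds += 1
print(f"every node knows the full cluster after {rounds} round(s)")
```

Because `merge` only accepts an entry with a higher version, a stale message that arrives late cannot overwrite newer state.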

Note: In clusters that span multiple data centers, the seed list should include at least one node from each data center. Configuring more than one seed node per data center is recommended for fault tolerance; otherwise gossip has to communicate with another data center when a node starts. Marking every node as a seed node is not recommended, because it increases maintenance and reduces gossip performance. A small seed list is recommended (about three nodes per data center).
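The recommendations in this note can be expressed as a simple check. The helper below is hypothetical (`check_seeds` is not part of Cassandra or nodetool), and the addresses are made up; it only encodes the three rules stated above.

```python
def check_seeds(nodes_by_dc, seeds):
    """Return a list of warnings about a seed list.

    nodes_by_dc: {data center name: [node addresses]}
    seeds: the seed addresses configured on every node
    """
    warnings = []
    seed_set = set(seeds)
    for dc, nodes in nodes_by_dc.items():
        count = len(seed_set & set(nodes))
        if count == 0:
            warnings.append(f"{dc}: no seed node")
        elif count == 1 and len(nodes) > 1:
            warnings.append(f"{dc}: only one seed node, no fault tolerance")
        if len(nodes) > 1 and count == len(nodes):
            warnings.append(f"{dc}: every node is a seed")
    return warnings

# Hypothetical two-data-center cluster.
cluster = {
    "dc1": ["10.0.1.1", "10.0.1.2", "10.0.1.3", "10.0.1.4"],
    "dc2": ["10.0.2.1", "10.0.2.2", "10.0.2.3"],
}
print(check_seeds(cluster, ["10.0.1.1", "10.0.1.2"]))
```

With the seed list shown, the check flags `dc2`, because none of its nodes is a seed.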

Fault detection and recovery

Fault detection is a method by which a node decides locally, from gossip state and history, whether another node in the cluster is down. The gossip process tracks the state of other nodes both directly (from nodes that gossip with it) and indirectly (from information that other nodes relay). Rather than applying one fixed threshold, Cassandra uses an accrual detection mechanism that calculates a threshold for each node from network performance, load, and historical information. During gossip exchanges, each node maintains a sliding window of the intervals at which gossip messages arrive from the other nodes in the cluster. The phi_convict_threshold setting adjusts how sensitive the failure detector is when deciding that a node has failed. The default value can be used in most cases.
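The idea of an accrual detector built on a sliding window of arrival intervals can be sketched as follows. This is an illustration, not Cassandra's code: the class name `AccrualDetector`, the exponential model for arrival intervals, the window size, and the threshold value used in the example are all assumptions.

```python
import math
from collections import deque

class AccrualDetector:
    """Toy accrual failure detector for a single remote node."""

    def __init__(self, threshold, window_size=1000):
        self.threshold = threshold                  # plays the role of phi_convict_threshold
        self.intervals = deque(maxlen=window_size)  # sliding window of arrival intervals
        self.last_arrival = None

    def heartbeat(self, now):
        """Record the arrival time (in seconds) of a gossip message."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def phi(self, now):
        """Suspicion level: -log10 of the chance of a silence this long."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_arrival
        # Assumed exponential model: P(interval > t) = exp(-t / mean),
        # so -log10(P) = t / (mean * ln 10).
        return elapsed / (mean * math.log(10))

    def is_down(self, now):
        return self.phi(now) > self.threshold

detector = AccrualDetector(threshold=8.0)  # illustrative value, not taken from the article
for t in range(11):                        # one message per second, t = 0..10
    detector.heartbeat(t)
print(detector.is_down(11), detector.is_down(30))
```

A higher threshold makes the detector tolerate longer silences before marking the node as down, and a lower one makes it react sooner. Because the mean interval comes from observed history, a node on a slow network is given proportionally more time.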

A node can fail because of a hardware failure or a network outage. Node outages are usually short-lived, but they can also last a long time. Because an outage rarely means that a node is leaving the cluster permanently, it does not automatically remove the node from the cluster. The other nodes periodically try to reconnect to the failed node to see whether it is back online. To change a node's membership in the cluster permanently, the administrator must explicitly add or remove the node with the nodetool utility.

When a node comes back online after an outage, it may have missed writes for the replica data it is responsible for. Once the failure detector marks a node as down, the other replicas store the missed writes for a period of time, provided hinted handoff is enabled. If a node is down for longer than max_hint_window_in_ms (3 hours by default), hints are no longer saved. The failed node may itself have stored hints that were never delivered. After recovering a node that was down for a long time, run a repair. In addition, administrators should routinely run nodetool repair on every node to ensure data consistency.
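The effect of the hint window can be shown with a little arithmetic. The two functions below are hypothetical helpers written for this example, and treating the boundary as inclusive is an assumption; only the 3-hour default comes from the text.

```python
THREE_HOURS_MS = 3 * 60 * 60 * 1000  # default max_hint_window_in_ms named in the text

def should_store_hint(down_since_ms, now_ms, max_hint_window_in_ms=THREE_HOURS_MS):
    """True while the target node has been down no longer than the hint window."""
    return now_ms - down_since_ms <= max_hint_window_in_ms

def needs_repair(down_since_ms, back_up_ms, max_hint_window_in_ms=THREE_HOURS_MS):
    """True if the outage outlasted the window, so some writes have no hint."""
    return back_up_ms - down_since_ms > max_hint_window_in_ms

HOUR_MS = 60 * 60 * 1000
print(should_store_hint(0, 1 * HOUR_MS))  # node down for one hour
print(should_store_hint(0, 4 * HOUR_MS))  # node down for four hours
print(needs_repair(0, 4 * HOUR_MS))       # node returns after four hours
```

Writes that arrive after the window has passed leave no hint behind, which is why a node that was down for a long time needs a repair rather than relying on hint replay alone.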

This article is from the "Java Architect's Road" blog; please keep this source link: http://eric100.blog.51cto.com/2535573/1759950

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
