Gossip in Cassandra

Source: Internet
Author: User
Tags cassandra random seed

(i) The role of gossip
A Cassandra cluster has no central node. Every node has the same status, and the nodes maintain the state of the cluster through a protocol called gossip.
Through gossip, each node learns which nodes are in the cluster and what state they are in. This lets any node in the cluster route a request for any key, and the failure of any single node does not have disastrous consequences.

(ii) Introduction to the gossip protocol
Gossip is known more formally as anti-entropy. It is well suited to synchronizing information in scenarios that do not require strong consistency. Information reaches all nodes in roughly log(n) gossip rounds, where n is the number of nodes.
There are two forms of gossip: anti-entropy and rumor-mongering.
Each node in a gossip protocol maintains a set of states. Each state can be represented as a key/value pair with a version number, and a larger version number means a more recent state.
There are three ways to exchange messages: push, pull, and push-pull. Cassandra uses the third, push-pull gossip.
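
The versioned key/value model and the push-pull exchange described above can be sketched in a few lines of Python. This is a minimal illustration, not Cassandra's implementation: the function names (`merge`, `push_pull`, `rounds_until_converged`) and the dictionary layout `{key: (value, version)}` are assumptions made for this example.

```python
import random

# Assumed layout for this sketch: state is {key: (value, version)},
# and the entry with the larger version number wins.

def merge(local, remote):
    """Copy into `local` every entry of `remote` that carries a newer version."""
    for key, (value, version) in remote.items():
        if key not in local or local[key][1] < version:
            local[key] = (value, version)

def push_pull(a, b):
    """One push-pull exchange: afterwards both sides hold the newest entries."""
    merge(b, a)  # push: a's newer entries reach b
    merge(a, b)  # pull: b's newer entries reach a

def rounds_until_converged(n, rng, max_rounds=1000):
    """Node 0 starts with one update; each node gossips with one random peer per round.

    Requires n >= 2. Returns the number of rounds until every node has the
    update, or None if max_rounds is exhausted.
    """
    states = [dict() for _ in range(n)]
    states[0]["load"] = (1.2, 20)
    for round_no in range(1, max_rounds + 1):
        for i in range(n):
            peer = rng.choice([j for j in range(n) if j != i])
            push_pull(states[i], states[peer])
        if all(s.get("load") == (1.2, 20) for s in states):
            return round_no
    return None
```

Calling `rounds_until_converged(32, random.Random(1))` gives a feel for how quickly one update spreads. The exact count depends on the random choices, so no specific figure is claimed here.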

(iii) How gossip messages are sent
When a node starts, it reads the seeds setting in the configuration file (cassandra.yaml) to learn all the seed nodes in the cluster.
Cassandra has a Gossiper that runs once per second (see the start method in Gossiper.java) and sends sync messages to other nodes according to the following rules:
1. Pick a random node that is currently alive and send it a sync request.
2. Send a sync request to a random node that is currently unreachable.
3. If the node chosen in step 1 is not a seed, or if the number of live nodes is smaller than the number of seeds, send a sync request to a random seed.
The second condition in rule 3 matters. Without it, consider four machines {A, B, C, D} that are all seeds and start at the same time. The following may happen:
1. A comes up, finds no live node, goes to rule 3, and syncs with a random seed, say B.
2. B completes the sync with A and now considers A alive, so it syncs with A. Because A is a seed, B no longer syncs with any other seed.
3. C comes up, finds no live node, also goes to rule 3, and syncs with a random seed, this time D.
4. C completes the sync with D and now considers D alive, so it syncs with D. Because D is also a seed, C no longer syncs with any other seed.
This forms two islands: A and B sync with each other, and C and D sync with each other, but {A, B} and {C, D} never sync and do not know that the other side exists.
With the second condition in place, once A and B have synced, each sees only one live node while there are four seeds, so it also contacts a random seed, which breaks up the islands.
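
The three selection rules can be sketched as a single function. This is a hedged Python illustration of the rules as stated in this article, not a port of Gossiper.java: the function name, its parameters, the `check_seed_count` switch, and the choice to exclude the node itself from the seed candidates are all assumptions made for the example.

```python
import random

def choose_gossip_targets(live, unreachable, seeds, me, rng, check_seed_count=True):
    """Return the endpoints one node contacts in a single gossip round.

    live, unreachable, seeds: sets of endpoint names (hypothetical model).
    check_seed_count: hypothetical switch that turns the second condition
    of rule 3 on or off, to show the island problem.
    """
    targets = []

    # Rule 1: one random live node, if any.
    live_target = rng.choice(sorted(live)) if live else None
    if live_target is not None:
        targets.append(live_target)

    # Rule 2: one random unreachable node, if any.
    if unreachable:
        targets.append(rng.choice(sorted(unreachable)))

    # Rule 3: a random seed when the live target is not a seed (or there was
    # none), or when fewer nodes are alive than there are seeds.
    other_seeds = sorted(s for s in seeds if s != me)  # assumption: skip self
    too_few_live = check_seed_count and len(live) < len(seeds)
    if other_seeds and (live_target not in seeds or too_few_live):
        targets.append(rng.choice(other_seeds))

    return targets
```

In the island scenario, node A knows only B as alive while all four nodes are seeds. With `check_seed_count=False` the function returns just `["B"]`, so A never reaches C or D. With the check enabled, a second target drawn from the other seeds is added on every round.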

(iv) Gossip data structures in Cassandra
Gossip exchanges three main kinds of state information:
1. EndPointState
2. HeartBeatState
3. ApplicationState
HeartBeatState consists of a generation and a version. The generation changes every time the node starts, which distinguishes the state before a restart from the state after it. The version only grows and is incremented before each heartbeat.
ApplicationState represents a piece of system state and consists of a state and a version. The state holds the value, and the version is incremented. Each object represents one aspect of the node, such as its current load: (1.2, 20) means the load of this node was 1.2 at version 20.
EndPointState wraps a node's HeartBeatState together with a map of its ApplicationState objects (Map<String, ApplicationState> applicationState_).
A node's own state can only be modified by that node. The state it holds for other nodes can only be updated through synchronization.
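
To make the relationship between the three structures concrete, here is a small Python model. It mirrors the fields described above but is only a sketch: the Python class layout, the `beat` method, and the `is_newer` helper are assumptions for illustration and do not reproduce Cassandra's Java classes.

```python
from dataclasses import dataclass, field, replace
from typing import Dict

@dataclass(frozen=True)
class HeartBeatState:
    generation: int  # changes on every start of the node
    version: int     # only grows while the node is running

    def beat(self) -> "HeartBeatState":
        """Return the state for the next heartbeat (version incremented)."""
        return replace(self, version=self.version + 1)

@dataclass(frozen=True)
class ApplicationState:
    state: str    # the value, for example a load figure or a token
    version: int  # version at which this value was set

@dataclass
class EndPointState:
    heartbeat: HeartBeatState
    application_states: Dict[str, ApplicationState] = field(default_factory=dict)

def is_newer(a: HeartBeatState, b: HeartBeatState) -> bool:
    """True if `a` is more recent than `b`.

    The generation is compared first, so any state from after a restart
    beats any state from before it, whatever the version numbers are.
    """
    return (a.generation, a.version) > (b.generation, b.version)
```

Comparing the generation before the version is what lets a restarted node, whose version counter starts low again, still override the stale state that other nodes hold for it.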

(vi) What status information gossip carries
Load information (load-information)
Migration information (migration)
Node state information (move):
  Boot: the node is starting up.
  Normal: the node has joined the token ring and can serve reads.
  Leaving: the node is preparing to leave the ring.
  Left: the node has been removed from the cluster, or its token information has been changed manually.

(vii) Gossip message synchronization process

(viii) Gossip message synchronization example
(1) Node 10.0.0.1 (endpoint state map):

EndPointState 10.0.0.1
  HeartBeatState: generation 1259909635, version 325
  ApplicationState "load-information": 5.2, generation 1259909635, version 45
  ApplicationState "bootstrapping": bxlpassf3xd8kyks, generation 1259909635, version 56
  ApplicationState "normal": bxlpassf3xd8kyks, generation 1259909635, version 87
EndPointState 10.0.0.2
  HeartBeatState: generation 1259911052, version 61
  ApplicationState "load-information": 2.7, generation 1259911052, version 2
  ApplicationState "bootstrapping": aujdmftpyuvebtnn, generation 1259911052, version 31
EndPointState 10.0.0.3
  HeartBeatState: generation 1259912238, version 5
  ApplicationState "load-information": 12.0, generation 1259912238, version 3
EndPointState 10.0.0.4
  HeartBeatState: generation 1259912942, version 18
  ApplicationState "load-information": 6.7, generation 1259912942, version 3
  ApplicationState "normal": bj05IVc0lvRXw2xH, generation 1259912942, version 7

(2) Node 10.0.0.2 (endpoint state map):

EndPointState 10.0.0.1
  HeartBeatState: generation 1259909635, version 324
  ApplicationState "load-information": 5.2, generation 1259909635, version 45
  ApplicationState "bootstrapping": bxlpassf3xd8kyks, generation 1259909635, version 56
  ApplicationState "normal": bxlpassf3xd8kyks, generation 1259909635, version 87
EndPointState 10.0.0.2
  HeartBeatState: generation 1259911052, version 63
  ApplicationState "load-information": 2.7, generation 1259911052, version 2
  ApplicationState "bootstrapping": aujdmftpyuvebtnn, generation 1259911052, version 31
  ApplicationState "normal": aujdmftpyuvebtnn, generation 1259911052, version 62
EndPointState 10.0.0.3
  HeartBeatState: generation 1259812143, version 2142
  ApplicationState "load-information": 16.0, generation 1259812143, version 1803
  ApplicationState "normal": w2u1xyuc3wmppcy7, generation 1259812143, version 6

GossipDigestSynMessage (node 10.0.0.1):
10.0.0.1:1259909635:325
10.0.0.2:1259911052:61
10.0.0.3:1259912238:5
10.0.0.4:1259912942:18

GossipDigestAckMessage (node 10.0.0.2):
10.0.0.1:1259909635:324
10.0.0.3:1259912238:0
10.0.0.4:1259912942:0
10.0.0.2:
  [ApplicationState "normal": aujdmftpyuvebtnn, generation 1259911052, version 62],
  [HeartBeatState, generation 1259911052, version 63]

GossipDigestAck2Message (node 10.0.0.1):
10.0.0.1:
  HeartBeatState: generation 1259909635, version 325
  ApplicationState "load-information": 5.2, generation 1259909635, version 45
  ApplicationState "bootstrapping": bxlpassf3xd8kyks, generation 1259909635, version 56
  ApplicationState "normal": bxlpassf3xd8kyks, generation 1259909635, version 87
10.0.0.3:
  HeartBeatState: generation 1259912238, version 5
  ApplicationState "load-information": 12.0, generation 1259912238, version 3
10.0.0.4:
  HeartBeatState: generation 1259912942, version 18
  ApplicationState "load-information": 6.7, generation 1259912942, version 3
  ApplicationState "normal": bj05IVc0lvRXw2xH, generation 1259912942, version 7
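
The first two messages of this example can be reproduced with a short Python sketch. The comparison rules used here are inferred from the numbers in the example, not taken from Cassandra's source. The assumed rules are: request everything when the endpoint is unknown or the local generation is older, request newer data when the local maximum version is lower, and send the entries above the sender's maximum version when the local copy is ahead. The dictionary layout and the function names `make_syn` and `handle_syn` are hypothetical. Only the versions are modeled, with the heartbeat stored like any other entry.

```python
# endpoint -> (generation, {state name: version})
NODE1 = {
    "10.0.0.1": (1259909635, {"heartbeat": 325, "load-information": 45,
                              "bootstrapping": 56, "normal": 87}),
    "10.0.0.2": (1259911052, {"heartbeat": 61, "load-information": 2,
                              "bootstrapping": 31}),
    "10.0.0.3": (1259912238, {"heartbeat": 5, "load-information": 3}),
    "10.0.0.4": (1259912942, {"heartbeat": 18, "load-information": 3, "normal": 7}),
}
NODE2 = {
    "10.0.0.1": (1259909635, {"heartbeat": 324, "load-information": 45,
                              "bootstrapping": 56, "normal": 87}),
    "10.0.0.2": (1259911052, {"heartbeat": 63, "load-information": 2,
                              "bootstrapping": 31, "normal": 62}),
    "10.0.0.3": (1259812143, {"heartbeat": 2142, "load-information": 1803,
                              "normal": 6}),
}

def make_syn(state_map):
    """Build the SYN digests: (endpoint, generation, highest known version)."""
    return [(ep, gen, max(versions.values()))
            for ep, (gen, versions) in state_map.items()]

def handle_syn(local, syn):
    """Build the ACK: digests to request, plus newer entries to send back."""
    requests, updates = [], {}
    for ep, remote_gen, remote_max in syn:
        if ep not in local or local[ep][0] < remote_gen:
            # Unknown endpoint or older generation: ask for everything.
            requests.append((ep, remote_gen, 0))
            continue
        local_gen, versions = local[ep]
        local_max = max(versions.values())
        if local_gen > remote_gen:
            updates[ep] = dict(versions)  # sender is a restart behind
        elif local_max < remote_max:
            requests.append((ep, remote_gen, local_max))
        elif local_max > remote_max:
            updates[ep] = {name: v for name, v in versions.items()
                           if v > remote_max}
    return requests, updates

syn = make_syn(NODE1)
requests, updates = handle_syn(NODE2, syn)
```

Here `syn` holds the four digests of the GossipDigestSynMessage above. `requests` holds the three digests of the GossipDigestAckMessage, and `updates` holds the "heartbeat" and "normal" entries for 10.0.0.2 at versions 63 and 62.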
