Anti-entropy protocols

Source: Internet
Author: User

http://highlyscalable.wordpress.com/2012/09/18/distributed-algorithms-in-nosql-databases/, Distributed Algorithms in NoSQL Databases

http://www.cnblogs.com/chen77716/archive/2011/03/24/2130798.html, Gossip Algorithm

Gossip, "Efficient reconciliation and Flow Control for anti-entropy protocols"

 

Anti-entropy protocols, gossips

Anti-entropy, or gossip, is an attractive way of replicating state that does not have strong consistency requirements.
It is a highly fault-tolerant, distributed synchronization protocol that offers eventual consistency rather than strong consistency.
Why is it called anti-entropy? For a long time I did not understand the term, so I quote the explanation from the reference above.

The gossip algorithm is also called anti-entropy. Entropy is a concept from physics that represents disorder, while anti-entropy seeks consistency within disorder. This captures the character of gossip: in a bounded network, each node communicates with other nodes at random, and after some seemingly disorganized communication the state of all nodes eventually reaches agreement. Each node may know all other nodes or only a few neighbors; as long as the nodes are connected through the network, their states will eventually be consistent. This is also how an epidemic spreads.

 

Let us start our study with the following problem statement:

There is a set of nodes and each data item is replicated to a subset of nodes. Each node serves update requests even if there is no network connection to other nodes. Each node periodically synchronizes its state with other nodes in such a way that if no updates take place for a long time, all replicas will gradually become consistent. How should this synchronization be organized: when is synchronization triggered, how is a peer to synchronize with chosen, and what is the data exchange protocol? Let us assume that two nodes can always merge their versions of data, either selecting the newest version or preserving both versions for further application-side resolution.
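The merge assumption in the problem statement can be sketched in Python as follows. This is only an illustration: the Version type, its timestamp field, and both function names are hypothetical and do not come from the source article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: int  # hypothetical logical timestamp; larger means newer


def merge_newest(a: Version, b: Version) -> Version:
    """Keep the newer version. Ties are broken by value so that both
    replicas make the same choice regardless of argument order."""
    return max(a, b, key=lambda v: (v.timestamp, v.value))


def merge_keep_both(a: frozenset, b: frozenset) -> frozenset:
    """Preserve every distinct version for application-side resolution."""
    return a | b
```

Either strategy gives the same result no matter which replica initiates the merge, which is what allows repeated pairwise synchronization to converge.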

What is the problem?

This problem appears both in data consistency maintenance and in synchronization of a cluster state (propagation of the cluster membership information and so on). Although the problem above can be solved by means of a global coordinator that monitors a database and builds a global synchronization plan or schedule, decentralized databases take a more fault-tolerant approach. The main idea is to use well-studied epidemic protocols [7] that are relatively simple, provide a pretty good convergence time, and can tolerate almost any failures or network partitions. Although there are different classes of epidemic algorithms, we focus on anti-entropy protocols because of their intensive usage in NoSQL databases.

Anti-entropy protocols assume that synchronization is performed on a fixed schedule: every node regularly chooses another node, at random or by some rule, and exchanges database contents with it, resolving differences. There are three flavors of anti-entropy protocols: push, pull, and push-pull. The idea of the push protocol is simply to select a random peer and push the current state of the data to it. In practice, it is quite wasteful to push the entire database, so nodes typically work in accordance with the protocol depicted in the figure below.

This problem can be solved with a global coordinator, but a decentralized design is more fault-tolerant.

In fact, the algorithm is very simple: it is an epidemic protocol, and the variant widely used in NoSQL databases is the anti-entropy protocol.

Push: I ask B what differs from my data; B tells me, and I push the differing parts to B.

Pull: I tell B what I have; B sends me what I do not have.

Push-pull: both of the above happen in the same exchange; in the figure, the two lower arrows point in the opposite direction.
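The three exchanges can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the protocol of any particular database: each store is a plain dict mapping a key to a (version, value) pair, versions are non-negative integers where larger means newer, and the names digest, newer_keys, push, pull, and push_pull are hypothetical.

```python
def digest(store):
    """Compact fingerprint of a store: key -> version, without values."""
    return {key: version for key, (version, _value) in store.items()}


def newer_keys(local_digest, remote_digest):
    """Keys for which the local side is ahead of (or unknown to) the remote.
    Assumes versions are non-negative, so -1 stands for 'missing'."""
    return [k for k, v in local_digest.items() if v > remote_digest.get(k, -1)]


def push(a, b):
    """A sends its digest to B, B reports where it is behind,
    and A sends only those entries."""
    behind_on_b = newer_keys(digest(a), digest(b))
    for key in behind_on_b:
        b[key] = a[key]


def pull(a, b):
    """A sends its digest to B, and B replies with the entries A lacks
    or holds in an older version."""
    behind_on_a = newer_keys(digest(b), digest(a))
    for key in behind_on_a:
        a[key] = b[key]


def push_pull(a, b):
    """Both directions in one exchange; afterwards A and B are identical."""
    push(a, b)
    pull(a, b)
```

For example, with a = {"x": (2, "a2")} and b = {"z": (3, "b3")}, calling push_pull(a, b) leaves both stores holding "x" and "z", while only the differing entries cross the wire.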

 

Anti-entropy protocols provide reasonably good convergence time and scalability. The following figure shows simulation results for propagation of an update in a cluster of 100 nodes. On each iteration, each node contacts one randomly selected peer.

One can see that the pull style provides better convergence than push, and this can be proven theoretically [7]. Also, push has a problem with a "convergence tail": a small percentage of nodes remains unaffected for many iterations, although almost all nodes have already been touched. The push-pull approach greatly improves efficiency in comparison with the plain push or pull techniques, so it is typically used in practice. Anti-entropy is scalable because the average convergence time grows as a logarithmic function of the cluster size.
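A simulation along these lines can be sketched in Python as follows. It is an illustration under assumptions, not a reproduction of the article's figure: rounds are synchronous (every node acts on the state from the start of the round), a single node holds the update initially, and the function name simulate and its parameters are hypothetical.

```python
import random


def simulate(mode, n=100, seed=0, max_rounds=1000):
    """Return the number of informed nodes after each round.

    mode is "push", "pull", or "push-pull". Each round, every node
    contacts one uniformly random peer other than itself.
    """
    rng = random.Random(seed)
    informed = [False] * n
    informed[0] = True  # one node starts with the update
    history = [1]
    while history[-1] < n and len(history) <= max_rounds:
        snapshot = informed[:]  # state at the start of the round
        for node in range(n):
            peer = rng.randrange(n - 1)
            if peer >= node:
                peer += 1  # skip over the node itself
            if mode in ("push", "push-pull") and snapshot[node]:
                informed[peer] = True  # informed caller pushes to its peer
            if mode in ("pull", "push-pull") and snapshot[peer]:
                informed[node] = True  # caller pulls from an informed peer
        history.append(sum(informed))
    return history


if __name__ == "__main__":
    for mode in ("push", "pull", "push-pull"):
        print(mode, "rounds to converge:", len(simulate(mode)) - 1)
```

Running it with several seeds and larger values of n is a simple way to observe the push tail and the slow growth of convergence time with cluster size.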

Although these techniques look pretty simple, there are many studies [5] regarding the performance of anti-entropy protocols under different constraints. One can leverage knowledge of the network topology to replace random peer selection with a more efficient scheme [10], or adjust transmit rates or use advanced rules to select the data to be synchronized if the network bandwidth is limited [9]. Computation of a digest can also be challenging, so a database can maintain a journal of recent updates to facilitate digest computation.
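One way to picture the journal idea is the Python sketch below. It is an assumption-laden illustration rather than a description of any real database: the class JournaledStore, its bounded journal, and the use of a local sequence number as the version are all hypothetical simplifications.

```python
from collections import deque


class JournaledStore:
    """Key-value store that remembers which keys changed recently."""

    def __init__(self, journal_size=1000):
        self.data = {}  # key -> (sequence number, value)
        self.seq = 0    # local counter, bumped on every update
        self.journal = deque(maxlen=journal_size)  # recent (seq, key) pairs

    def put(self, key, value):
        self.seq += 1
        self.data[key] = (self.seq, value)
        self.journal.append((self.seq, key))

    def full_digest(self):
        """Scan everything; cost grows with the size of the store."""
        return {key: seq for key, (seq, _value) in self.data.items()}

    def digest_since(self, last_seq):
        """Digest of keys changed after last_seq, built from the journal.

        Returns None when the journal no longer reaches back far enough,
        in which case the caller should fall back to full_digest().
        """
        if last_seq >= self.seq:
            return {}
        if not self.journal or self.journal[0][0] > last_seq + 1:
            return None
        return {key: self.data[key][0]
                for seq, key in self.journal if seq > last_seq}
```

A peer that remembers the last sequence number it saw can ask for digest_since(last_seq) and receive a digest proportional to the number of recent updates rather than to the size of the database.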

How are anti-entropy protocols measured? By convergence time, of course, and by that measure push-pull is the most efficient.
