Understanding the vector clock algorithm

Source: Internet
Author: User

Reprinted: http://www.kongch.com/2011/08/vector-clock-understanding/

A vector clock is the mechanism Amazon's Dynamo uses to capture causality between different versions of an object. According to the Dynamo paper, a vector clock is effectively a list of (node, counter) pairs. One vector clock is associated with every version of every object. By examining their vector clocks, we can determine whether two versions of an object are on parallel branches or have a causal ordering. If every counter in the first version's clock is less than or equal to the corresponding node's counter in the second version's clock, the first is an ancestor of the second and can be discarded. Otherwise, the two changes are considered to be in conflict and require reconciliation.
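The comparison rule can be sketched in a few lines of Python. This is only an illustration, not Dynamo's code: the clock is modeled as a plain dict from node name to counter, a missing node counts as 0, and the names `descends` and `compare` are made up for this example.

```python
def descends(a, b):
    """True if clock `a` has seen everything clock `b` has seen."""
    # A node missing from `a` is treated as counter 0 (an assumption of this sketch).
    return all(a.get(node, 0) >= counter for node, counter in b.items())


def compare(a, b):
    """Classify the relationship of clock `a` to clock `b`."""
    a_ge_b = descends(a, b)
    b_ge_a = descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if b_ge_a:
        return "ancestor"      # a happened before b, so a can be discarded
    if a_ge_b:
        return "descendant"    # a supersedes b
    return "concurrent"        # parallel branches, reconciliation needed


print(compare({"A": 2}, {"A": 2, "B": 1}))          # ancestor
print(compare({"A": 2, "B": 1}, {"A": 2, "C": 1}))  # concurrent
```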

Feeling a bit dizzy? To make this easier to understand, here is an example:

Imagine an online mobile phone mall where the price of the iPhone keeps changing. Several editors around the country constantly update the iPhone price, and of course users constantly ask for the current price.

Assume that the mall has three nodes, A, B, and C, so N = 3.

Suppose a write only has to reach one copy, that is, W = 1. Then W + R > N requires R = 3. Consider the following sequence of events:
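Why W + R > N matters can be checked by brute force: it is the condition under which every possible read set shares at least one node with every possible write set. The sketch below assumes strict quorums over a fixed set of N nodes, and the function name `quorums_always_overlap` is made up for this example.

```python
from itertools import combinations


def quorums_always_overlap(n, w, r):
    """True if every set of w written nodes intersects every set of r read nodes."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


print(quorums_always_overlap(3, 1, 3))  # True:  W + R = 4 > 3
print(quorums_always_overlap(3, 1, 2))  # False: W + R = 3, a read can miss the write
print(quorums_always_overlap(3, 2, 2))  # True:  W + R = 4 > 3
```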

  1. First, A receives a request setting the iPhone price to 4000. We now have 4000 [A: 1] on A.
  2. Before the data is copied to B and C, someone tells A that the price has gone up to 4500. A now has 4500 [A: 2], which overwrites the previous [A: 1].
  3. The price is then copied to B and C, so B and C also have 4500 [A: 2].
  4. At this point, someone tells B that the iPhone has gone up again, to 5000, so B has 5000 [A: 2, B: 1].
  5. Before B's price can be copied to A and C, another request arrives at C saying that the iPhone has dropped to 3000 RMB!

After all these ups and downs, C has 3000 [A: 2, C: 1], A has 4500 [A: 2], and B has 5000 [A: 2, B: 1].
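The five steps above can be replayed with a toy model to confirm the clocks. Everything here is an assumption for illustration: `store` maps a node name to a (price, clock) pair, `write` bumps the receiving node's own counter, and `replicate` simply copies a version to another node.

```python
def write(store, node, value):
    """A client write handled by `node`: bump that node's counter."""
    _, clock = store.get(node, (None, {}))
    new_clock = dict(clock)
    new_clock[node] = new_clock.get(node, 0) + 1
    store[node] = (value, new_clock)


def replicate(store, src, dst):
    """Copy the version held by `src` to `dst`, clock included."""
    value, clock = store[src]
    store[dst] = (value, dict(clock))


store = {}
write(store, "A", 4000)      # step 1: 4000 [A: 1]
write(store, "A", 4500)      # step 2: 4500 [A: 2]
replicate(store, "A", "B")   # step 3
replicate(store, "A", "C")
write(store, "B", 5000)      # step 4: 5000 [A: 2, B: 1]
write(store, "C", 3000)      # step 5: 3000 [A: 2, C: 1], before B replicates

for node in "ABC":
    print(node, store[node])
```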

The data on all three nodes is inconsistent! A bit of a mess.

By Murphy's Law (the thing you least want to happen is the thing that happens), someone asks for the iPhone price at exactly this moment.

Let's see what the vector clock can do for us here.

Since R = 3, the data on all three nodes is read. Which of 4500, 5000, and 3000 should be returned? Obviously the version on A is the oldest and should be discarded, because [A: 2] is an ancestor of both other clocks. What about B and C?

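The client obtains 3000 [A: 2, C: 1] and 5000 [A: 2, B: 1]. Neither clock descends from the other, so the vector clocks alone cannot decide between these parallel branches. We can, however, give the client another basis for judgment, for example a timestamp. Suppose the timestamps show that the data on B is the latest; the conclusion is then 5000.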
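A read with R = 3 could be resolved roughly as follows. This is a sketch under stated assumptions: the `ts` values are invented so that B's write is the most recent, as in the text, and picking the newest timestamp is just one possible tie-break among concurrent versions, not a rule taken from the Dynamo paper.

```python
def descends(a, b):
    """True if clock `a` has seen everything clock `b` has seen."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())


def drop_ancestors(versions):
    """Keep only versions that no other version strictly supersedes."""
    return [
        v for v in versions
        if not any(
            descends(o["clock"], v["clock"]) and not descends(v["clock"], o["clock"])
            for o in versions
        )
    ]


# Replies from A, B and C; the timestamps are hypothetical.
replies = [
    {"value": 4500, "clock": {"A": 2}, "ts": 100},
    {"value": 5000, "clock": {"A": 2, "B": 1}, "ts": 300},
    {"value": 3000, "clock": {"A": 2, "C": 1}, "ts": 200},
]

siblings = drop_ancestors(replies)              # A's version is an ancestor and drops out
winner = max(siblings, key=lambda v: v["ts"])   # tie-break between the concurrent versions

print([v["value"] for v in siblings])  # [5000, 3000]
print(winner["value"])                 # 5000
```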

Although a conclusion has been reached, later clients should not have to struggle with the same conflict. The next step is to bring the nodes back into agreement by merging the vector clocks. To do this, the client tells node A that the current iPhone price is 5000 and passes along the vector clocks that this value was based on. The merged clock takes the larger counter for each node, and A increments its own counter for the new write, so the data on A becomes 5000 [A: 3, C: 1, B: 1]. From then on, read requests can select the data on A without hesitation.
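The merge step can be sketched as follows, assuming the usual vector clock merge (element-wise maximum) followed by the coordinating node bumping its own counter. The function names `merge` and `reconcile` are made up for this example.

```python
def merge(a, b):
    """Element-wise maximum of two clocks; a missing node counts as 0."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def reconcile(coordinator, clocks):
    """Merge all sibling clocks, then record the reconciling write at `coordinator`."""
    merged = {}
    for clock in clocks:
        merged = merge(merged, clock)
    merged[coordinator] = merged.get(coordinator, 0) + 1
    return merged


# The two conflicting versions from B and C, written back through A.
new_clock = reconcile("A", [{"A": 2, "B": 1}, {"A": 2, "C": 1}])
print(new_clock == {"A": 3, "B": 1, "C": 1})  # True
```

The resulting clock descends from both conflicting versions, which is why later reads can treat A's copy as superseding them.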

Now let's look at the case W = 2, R = 2:

  1. A receives 4000, and the write only completes once the data has also reached B. So we have 4000 [A: 1] on A and 4000 [A: 1] on B.
  2. Before the data is copied to C, someone tells A that the price has gone up to 4500. In the same way, A and B both end up with 4500 [A: 2].
  3. The data is copied to C, so C also has 4500 [A: 2].
  4. At this point, someone tells B that the iPhone has gone up again, to 5000, so B has 5000 [A: 2, B: 1]. As in step 1, the write must also reach a second node, here C, so C has 5000 [A: 2, B: 1] as well.
  5. Another request arrives at C saying that the iPhone has dropped to 3000! C now has 3000 [A: 2, B: 1, C: 1]. In the same way, the new data is written to A. A holds 4500 [A: 2], sees that the incoming 3000 [A: 2, B: 1, C: 1] descends from its own version, and accepts the overwrite unconditionally, so A also becomes 3000 [A: 2, B: 1, C: 1].

After all these ups and downs, C has 3000 [A: 2, B: 1, C: 1], A has 3000 [A: 2, B: 1, C: 1], and B has 5000 [A: 2, B: 1].
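The W = 2 sequence can be replayed in the same toy style. The model is an assumption for illustration only: each write goes to the coordinating node plus one named replica, and a node accepts an incoming version only when its clock descends from the clock the node already holds.

```python
def descends(a, b):
    """True if clock `a` has seen everything clock `b` has seen."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())


def write(store, coordinator, value, replicas):
    """Write at `coordinator` and synchronously push to `replicas` (W = 1 + len(replicas))."""
    _, clock = store.get(coordinator, (None, {}))
    new_clock = {**clock, coordinator: clock.get(coordinator, 0) + 1}
    for node in [coordinator, *replicas]:
        _, current = store.get(node, (None, {}))
        # Overwrite only when the incoming version supersedes the local one.
        if descends(new_clock, current):
            store[node] = (value, dict(new_clock))


def replicate(store, src, dst):
    """Background copy of the version held by `src` to `dst`."""
    value, clock = store[src]
    store[dst] = (value, dict(clock))


store = {}
write(store, "A", 4000, ["B"])   # step 1
write(store, "A", 4500, ["B"])   # step 2
replicate(store, "A", "C")       # step 3
write(store, "B", 5000, ["C"])   # step 4
write(store, "C", 3000, ["A"])   # step 5: A holds [A: 2], so it accepts the overwrite

for node in "ABC":
    print(node, store[node])
```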

In this case, do read requests still give us trouble? Although R = 2, no matter which two nodes we read, we get the price 3000: any two nodes include at least one of A and C, and [A: 2, B: 1, C: 1] is clearly fresher than [A: 2, B: 1]. With W = 2, no reconciliation is required.
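This can be checked by reading every possible pair of nodes from the final state and discarding any version that another version supersedes. The sketch assumes the final state listed above; `state` and `read` are names made up for this example.

```python
from itertools import combinations

# Final state of the W = 2 scenario: node -> (price, clock).
state = {
    "A": (3000, {"A": 2, "B": 1, "C": 1}),
    "B": (5000, {"A": 2, "B": 1}),
    "C": (3000, {"A": 2, "B": 1, "C": 1}),
}


def descends(a, b):
    """True if clock `a` has seen everything clock `b` has seen."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())


def read(nodes):
    """Return the distinct prices that survive after dropping superseded versions."""
    versions = [state[node] for node in nodes]
    survivors = [
        (value, clock)
        for value, clock in versions
        if not any(
            descends(other, clock) and not descends(clock, other)
            for _, other in versions
        )
    ]
    return sorted({value for value, _ in survivors})


for pair in combinations("ABC", 2):
    print(pair, read(pair))   # every pair resolves to a single price
```

Each pair yields exactly one surviving price, the one carrying [A: 2, B: 1, C: 1], so the client never has to reconcile.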

We can also see that increasing W reduces conflicts and improves consistency. The cost is just as clear: writing two copies is slower than writing one, and the probability that a write succeeds is lower, which means availability is reduced. This is consistent with the CAP theorem.

A possible problem with vector clocks is that the clock may grow large if many servers coordinate writes to the same object. In practice this is unlikely, because writes are usually handled by one of the first N nodes in the preference list. During a network partition or when multiple servers fail, however, write requests may be handled by nodes that are not among the first N nodes in the preference list, which causes the vector clock to grow. In that case it is worth limiting the size of the vector clock. Dynamo therefore uses the following clock truncation scheme: along with each (node, counter) pair, Dynamo stores a timestamp indicating the last time that node updated the item. When the number of (node, counter) pairs in the vector clock reaches a threshold (such as 10), the oldest pair is removed from the clock. Obviously, this truncation scheme can make reconciliation less efficient, because descendant relationships can no longer be derived accurately. However, this problem has not surfaced in production, so it has not been thoroughly investigated.
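A minimal sketch of this truncation scheme, and of the information it loses, might look like the following. The data layout (node mapped to a (counter, timestamp) pair), the function names, and the tiny threshold of 3 are all assumptions chosen to keep the example short.

```python
def truncate(clock, threshold=10):
    """Drop the pair with the oldest timestamp once the clock reaches `threshold` pairs."""
    clock = dict(clock)
    if len(clock) >= threshold:
        oldest = min(clock, key=lambda node: clock[node][1])  # smallest timestamp
        del clock[oldest]
    return clock


def counters(clock):
    """Strip the timestamps, leaving a plain node -> counter clock."""
    return {node: counter for node, (counter, _) in clock.items()}


def descends(a, b):
    """True if clock `a` has seen everything clock `b` has seen."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())


ancestor = {"A": (1, 10)}
full = {"A": (1, 10), "B": (1, 20), "C": (1, 30)}
cut = truncate(full, threshold=3)   # A has the oldest timestamp and is dropped

print(descends(counters(full), counters(ancestor)))  # True: really a descendant
print(descends(counters(cut), counters(ancestor)))   # False: now looks like a conflict
```

After truncation the newer version no longer appears to descend from its own ancestor, so a read would treat the two as conflicting and reconcile them unnecessarily.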
