Theoretical foundations of distributed systems: time, clocks, and the ordering of events


The 16th... April 16th. One minute before three o'clock in the afternoon on April 16, 1960, you were with me. Because of you, I will remember this minute. From now on we are one-minute friends. That is a fact you cannot change, because it has already passed. I'll come back tomorrow.

-- Days of Being Wild

Time is a very important concept in real life: it records when things happen and lets us compare the order in which they happen. Some distributed-system scenarios also need to record and compare the order of events that occur on different nodes. Unlike everyday life, which relies on physical clocks, distributed systems use logical clocks to record the order of events. This article looks at several logical clocks commonly used in distributed systems.

Physical Clock vs. Logical Clock

One might ask why a distributed system does not simply use a physical clock to record events. Each event would carry a timestamp, and comparing the order of two events would only require comparing their timestamps.

The reason is that physical time in real life has a unified standard, whereas the nodes of a distributed system do not all record the same time: even with NTP time synchronization, clocks on different nodes still deviate by milliseconds [1][2]. A distributed system therefore needs another way to record the order of events, which is the logical clock.
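As a small worked example of why millisecond deviations matter, the sketch below stamps a send event and the matching receive event with each node's own physical clock. All numbers (the 5 ms skew, the 2 ms network delay) are hypothetical values chosen for illustration.

```python
# Hypothetical numbers: node Q's clock runs 5 ms behind node P's clock.
TRUE_SEND_MS = 1000     # true time at which P sends a message
NETWORK_DELAY_MS = 2    # assumed one-way network delay
P_OFFSET_MS = 0         # error of P's clock
Q_OFFSET_MS = -5        # error of Q's clock (5 ms slow)

# Each node stamps its event with its own (imperfect) physical clock.
send_stamp = TRUE_SEND_MS + P_OFFSET_MS
recv_stamp = TRUE_SEND_MS + NETWORK_DELAY_MS + Q_OFFSET_MS

# The receive really happens after the send, yet its timestamp is smaller.
misordered = recv_stamp < send_stamp
print(send_stamp, recv_stamp, misordered)  # 1000 997 True
```

Comparing these two timestamps would place the receive before the send, which is causally impossible.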

Lamport timestamps

Leslie Lamport introduced the concept of the logical clock in 1978 and described a way to represent it, known as Lamport timestamps [3].

A distributed system has three kinds of events: events internal to a node, send events, and receive events. Lamport timestamps work as follows:

Figure 1: Lamport timestamps space-time diagram (image source: Wikipedia)

    1. Each event corresponds to a Lamport timestamp, with an initial value of 0
    2. If the event occurs inside a node, the timestamp is incremented by 1
    3. If the event is a send event, the timestamp is incremented by 1 and carried in the message
    4. If the event is a receive event, timestamp = max(local timestamp, timestamp in message) + 1
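The four rules above can be sketched as a small class. This is a minimal illustration, not a reference implementation; the class name `LamportClock` and its method names are hypothetical.

```python
class LamportClock:
    """Logical clock for a single node, following rules 1-4 above."""

    def __init__(self):
        self.time = 0  # rule 1: the initial value is 0

    def local_event(self):
        self.time += 1  # rule 2: an internal event adds 1
        return self.time

    def send(self):
        self.time += 1  # rule 3: add 1, then carry the value in the message
        return self.time

    def receive(self, msg_time):
        # rule 4: max(local timestamp, timestamp in message) + 1
        self.time = max(self.time, msg_time) + 1
        return self.time


p, q = LamportClock(), LamportClock()
p.local_event()          # P's clock is now 1
stamp = p.send()         # P's clock is now 2; the message carries 2
q.local_event()          # Q's clock is now 1
print(q.receive(stamp))  # max(1, 2) + 1 = 3
```

The receive event always ends up with a larger timestamp than the send event that caused it, which is the property the ordering rules rely on.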

Suppose a and b are events, and C(a) and C(b) are their Lamport timestamps. If C(a) < C(b), then a is ordered before b (happened before), denoted a → b; in Figure 1, for example, C1 → B1. With this definition, the events that carry Lamport timestamps can be compared, and we obtain a partial order of the events. Strictly speaking, the guarantee runs in one direction: whenever a causally precedes b, C(a) < C(b) holds, but a smaller timestamp alone does not prove a causal link.

If C(a) = C(b), what is the order of events a and b? Suppose a and b occur on nodes P and Q respectively, and Pi and Qj are the numbers we assign to P and Q. If C(a) = C(b) and Pi < Qj, a is likewise defined to occur before b, denoted a ⇒ b. In Figure 1, if we number the nodes Ai = 1, Bj = 2, Ck = 3, then because C(B4) = C(C3) and Bj < Ck, we have B4 ⇒ C3.

With these definitions, we can sort all events and obtain a total order of events. In Figure 1, for example, the events can be sorted from C1 to A4.
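The tie-breaking rule amounts to sorting events by the pair (timestamp, node number). The sketch below uses the node numbering from the text (A = 1, B = 2, C = 3); the event names and timestamp values are made up for illustration and are not read from Figure 1.

```python
# Node numbers as in the text: A = 1, B = 2, C = 3.
NODE_NUMBER = {"A": 1, "B": 2, "C": 3}

# (event name, node, Lamport timestamp); the timestamp values are made up.
events = [
    ("c1", "C", 1),
    ("b1", "B", 2),
    ("a1", "A", 2),
    ("c2", "C", 3),
    ("b2", "B", 3),
]


def total_order_key(event):
    """Order by timestamp first; break ties with the node number."""
    _name, node, timestamp = event
    return (timestamp, NODE_NUMBER[node])


ordered = [name for name, _node, _ts in sorted(events, key=total_order_key)]
print(ordered)  # ['c1', 'a1', 'b1', 'b2', 'c2']
```

Because no two events share both a timestamp and a node, the key is unique and the sort yields a total order.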

Vector Clock

Lamport timestamps give us an ordering of events, but there is one relationship they cannot represent well: concurrency [4]. In Figure 1, events B4 and C3 have no causal relationship and are concurrent, yet Lamport timestamps still assign them an order.

The vector clock is another logical clock that evolved from Lamport timestamps. Each node records not only its own Lamport timestamp but also the Lamport timestamps of the other nodes [5][6]. The principle is similar to that of Lamport timestamps, as illustrated in the following figure:

Figure 2: Vector clock space-time diagram (image source: Wikipedia)
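The text does not spell out the update rules, so the sketch below assumes the commonly used ones: a node increments its own entry on every local or send event, and on receive it takes the entry-wise maximum with the incoming vector before incrementing its own entry. The class name `VectorClock` and its methods are hypothetical.

```python
class VectorClock:
    """Per-node vector clock; entries for unseen nodes are treated as 0."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.clock = {node_id: 0}

    def tick(self):
        """Local or send event: advance this node's own entry."""
        self.clock[self.node_id] = self.clock.get(self.node_id, 0) + 1
        return dict(self.clock)  # a copy, suitable for attaching to a message

    def receive(self, msg_clock):
        """Receive event: entry-wise max, then advance this node's own entry."""
        for node, count in msg_clock.items():
            self.clock[node] = max(self.clock.get(node, 0), count)
        return self.tick()


c, b = VectorClock("C"), VectorClock("B")
msg = c.tick()         # C sends a message carrying {'C': 1}
print(b.receive(msg))  # {'B': 1, 'C': 1}
```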

Suppose events a and b occur on nodes P and Q respectively, with vector clocks Ta and Tb. If Tb[Q] > Ta[Q] and Tb[P] >= Ta[P], then a occurs before b, denoted a → b. So far this differs little from Lamport timestamps. How, then, does a vector clock identify concurrency?

If Tb[Q] > Ta[Q] and Tb[P] < Ta[P], then a and b are considered concurrent, denoted a <-> b. In Figure 2, the 4th event on node B (A:2, B:4, C:1) has no causal relationship with the 2nd event on node C (B:3, C:2), so the two events are concurrent.
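The comparison can be sketched as follows. The function generalizes the two-entry rule in the text to an entry-wise comparison over all nodes, which is an assumption on my part; the name `compare` is hypothetical, and missing entries are treated as 0. The first pair of clocks is taken from the Figure 2 example above; the clock `earlier` is a made-up value for contrast.

```python
def compare(ta, tb):
    """Return 'before', 'after', 'equal', or 'concurrent' for clocks ta, tb."""
    nodes = set(ta) | set(tb)
    a_le_b = all(ta.get(n, 0) <= tb.get(n, 0) for n in nodes)
    b_le_a = all(tb.get(n, 0) <= ta.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # ta happened before tb
    if b_le_a:
        return "after"       # tb happened before ta
    return "concurrent"      # each clock is ahead in some entry


b4 = {"A": 2, "B": 4, "C": 1}   # 4th event on node B in Figure 2
c2 = {"B": 3, "C": 2}           # 2nd event on node C in Figure 2
print(compare(b4, c2))          # concurrent

earlier = {"B": 3, "C": 1}      # made-up clock that c2 has fully seen
print(compare(earlier, c2))     # before
```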

Version Vector

With vector clocks we can determine the relationship between any two events: either one precedes the other, or they are concurrent. Identifying event order has an important extended application in engineering practice, the most common being the detection of data conflicts.

Data in a distributed system usually has multiple replicas, and several replicas may be updated at the same time, leaving the data inconsistent across replicas [7]. Version vectors, whose implementation closely resembles vector clocks [8], are used to detect such data conflicts [9]. The following example illustrates their use [10]:

Figure 3: Version vector

    • The client writes data; the request is handled by Sx, which creates the corresponding vector ([Sx, 1]) and records the data as D1
    • The 2nd request is also handled by Sx; the data is modified to D2 and the vector to ([Sx, 2])
    • The 3rd and 4th requests are handled by Sy and Sz respectively; the client first reads D2, then writes D3 and D4 to Sy and Sz
    • On the 5th update, the client reads three versions of the data: D2, D3, and D4. Using a concurrency test like the one for vector clocks, it determines that D3 and D4 conflict, resolves the conflict by some method, and writes D5
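The conflict check in this walkthrough can be sketched as below. The vector values for D3, D4, and D5, and the choice of Sx as the server handling the final write, are not stated in the text; they follow the corresponding example in the Dynamo paper [10]. The helper names are hypothetical.

```python
def descends(a, b):
    """True if version vector a has seen every update recorded in b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def in_conflict(a, b):
    """Two versions conflict when neither descends from the other."""
    return not descends(a, b) and not descends(b, a)


def merge(a, b):
    """Entry-wise maximum of two version vectors."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


d2 = {"Sx": 2}
d3 = {"Sx": 2, "Sy": 1}  # written through Sy after reading D2
d4 = {"Sx": 2, "Sz": 1}  # written through Sz after reading D2

print(in_conflict(d2, d3))  # False: D3 descends from D2
print(in_conflict(d3, d4))  # True: concurrent updates

# Reconcile D3 and D4, then write the result through Sx.
d5 = merge(d3, d4)
d5["Sx"] += 1
print(sorted(d5.items()))   # [('Sx', 3), ('Sy', 1), ('Sz', 1)]
```

D2 is an ancestor of both D3 and D4, so it conflicts with neither; only the pair D3, D4 needs reconciliation.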

A vector clock only detects data conflicts; it cannot resolve them. How to resolve a conflict depends on the scenario: take the most recent update (last write wins), hand the conflicting data to the client and let the client decide how to handle it, or use quorum decisions to prevent conflicts from arising in the first place [11].
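As an illustration of the first strategy, the sketch below resolves a set of conflicting versions by last write wins. It assumes each version carries a wall-clock timestamp supplied by the writer; the function name and the sample values are hypothetical.

```python
def resolve_last_write_wins(siblings):
    """siblings: list of (value, wall_clock_timestamp); keep the newest value."""
    return max(siblings, key=lambda s: s[1])[0]


# Two conflicting versions of the same key, with made-up timestamps.
siblings = [("cart=[book]", 1700000000.0), ("cart=[pen]", 1700000005.0)]
print(resolve_last_write_wins(siblings))  # cart=[pen]
```

This approach is simple but silently discards the losing update, and it depends on the same physical clocks whose deviations motivated logical clocks. Handing all conflicting versions back to the client avoids the data loss at the cost of pushing the merge logic into the application.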

Because they record logical clock information for all data across all nodes, vector clocks and version vectors may face one problem in practice: the vector grows too large, to the point where the metadata used to manage the data is larger than the data itself [12].

One way to address this is to build the vector from server IDs rather than client IDs, because the number of servers is relatively stable compared with the number of clients. Another is to set a maximum vector size and discard the oldest vector entries once that size is exceeded [10][13].
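The size limit can be sketched as follows, assuming each vector entry also stores the time of its last update so that the oldest entry can be identified, as the Dynamo paper [10] describes. The threshold of 3 and all sample values are hypothetical.

```python
MAX_ENTRIES = 3  # hypothetical threshold


def prune(vector, max_entries=MAX_ENTRIES):
    """vector maps server id -> (counter, last_update_time).

    Keep only the max_entries most recently updated entries.
    """
    if len(vector) <= max_entries:
        return dict(vector)
    newest_first = sorted(vector.items(), key=lambda kv: kv[1][1], reverse=True)
    return dict(newest_first[:max_entries])


vector = {"Sx": (3, 100), "Sy": (1, 140), "Sz": (1, 150), "Sw": (2, 90)}
print(sorted(prune(vector)))  # ['Sx', 'Sy', 'Sz']
```

Dropping an entry loses part of the causal history, so two versions that are in fact ordered may later appear to conflict; the limit trades metadata size for occasional unnecessary reconciliation.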

Summary

This article described how logical clocks are represented in distributed systems. Lamport timestamps establish a total order of events. Vector clocks can compare the order of any two events and can also represent events that have no causal relationship. Applying the vector clock method to the detection of data version conflicts yields the version vector.

[1] Time is an Illusion, George Neville-Neil, 2016

[2] There is No Now, Justin Sheehy, 2015

[3] Time, Clocks, and the Ordering of Events in a Distributed System, Leslie Lamport, 1978

[4] Timestamps in Message-Passing Systems That Preserve the Partial Ordering, Colin J. Fidge, 1988

[5] Virtual Time and Global States of Distributed Systems, Friedemann Mattern, 1988

[6] Why Vector Clocks are Easy, Bryan Fink, 2010

[7] Conflict Management, CouchDB

[8] Version Vectors are not Vector Clocks, Carlos Baquero, 2011

[9] Detection of Mutual Inconsistency in Distributed Systems, IEEE Transactions on Software Engineering, 1983

[10] Dynamo: Amazon's Highly Available Key-value Store, Amazon, 2007

[11] Conflict Resolution, Jeff Darcy, 2010

[12] Why Vector Clocks are Hard, Justin Sheehy, 2010

[13] Causality Is Expensive (and What To Do About It), Peter Bailis, 2014
