A study of the lease mechanism in distributed systems

Source: Internet
Author: User

I. Introduction to the lease mechanism

In distributed systems there is often a central server node that is responsible for storing and maintaining the system's metadata. If the system's operations all depend on the metadata held by the central server, that server can easily become a performance bottleneck and a single point of failure. With a lease mechanism, some of the central server's "power" can be delegated to other machines, which reduces the load on the central server. The lease mechanism has many other uses as well: for example, determining the state of nodes in a cluster, and implementing read and write locks under distributed ...

For example, the GFS master issues a lease to a chunk server, making it the primary replica. When multiple clients update a data block concurrently, the primary replica determines the order in which those updates are applied. In this way the GFS master delegates authority to the chunk server, which relieves some of the pressure on the master.

Of course, the central server (the metadata server) can also be designed as a cluster, which likewise keeps it from becoming a bottleneck. One example is the messaging server RocketMQ: both producers and consumers need the nameserver to determine subscription relationships, and the nameserver is deployed as a stateless cluster, so no lease mechanism is required there.

A lease involves two parties: the issuer and the recipient. The terms of a lease can vary (which is why leases have so many application scenarios). The issuer is generally the central server mentioned above, and it guarantees that the content specified in the lease remains unchanged within the lease's validity period.

For example, in a distributed cache system, the metadata server (the central server) issues a lease to each client, promising not to change the metadata during the lease's validity period. Each client can then read the cached metadata locally, as long as it checks that its lease has not expired, rather than accessing the metadata server every time.
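The client side of this cache example can be sketched as follows. This is a minimal illustration, not code from any real system: the names MetadataServer, LeasedCacheClient, and read_with_lease are hypothetical, and the sketch assumes the client and the server read the same clock, which real deployments cannot rely on.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Lease:
    value: str         # metadata value covered by the lease
    expires_at: float  # time after which the server's promise no longer holds


class MetadataServer:
    """Hypothetical server that promises not to change a key while a lease is valid."""

    def __init__(self, clock: Callable[[], float] = time.monotonic, term: float = 10.0):
        self.clock = clock
        self.term = term
        self.data: Dict[str, str] = {}
        self.requests = 0  # counts round trips, for illustration only

    def read_with_lease(self, key: str) -> Lease:
        self.requests += 1
        return Lease(self.data[key], self.clock() + self.term)


class LeasedCacheClient:
    """Hypothetical client that serves reads locally while its lease is valid."""

    def __init__(self, server: MetadataServer, clock: Callable[[], float] = time.monotonic):
        self.server = server
        self.clock = clock
        self.cache: Dict[str, Lease] = {}

    def get(self, key: str) -> str:
        lease = self.cache.get(key)
        if lease is None or self.clock() >= lease.expires_at:
            # No lease, or the lease has expired: go back to the server.
            lease = self.server.read_with_lease(key)
            self.cache[key] = lease
        return lease.value
```

Every call to get within the lease term is answered from the local cache; only the first call and the calls made after expiry reach the server.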

II. Analysis of the lease mechanism

① The lease mechanism guarantees cache consistency

When the server issues a lease, it guarantees that the data will not change during the lease's validity period. The client that receives the lease can therefore safely use the data within that period. During the validity period, the data cached by the client is consistent with the data on the server.

Problems that exist:

1) While the server is modifying the metadata, it must block all read requests and cannot issue new leases. Otherwise a newly issued lease could cover data that is inconsistent with what the server has just modified.

Workaround: when a read request arrives, return the data directly without issuing a lease.

2) The server must wait until all leases held by clients have expired before it can issue leases for the modified data. In other words, once the data on the server is to be modified and a new version of the lease is produced, the new lease cannot be distributed to clients until every old lease on the clients has expired.

Workaround: the server proactively notifies the clients that hold leases to discard their current lease and request a new one.
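The two problems and their workarounds can be sketched on the server side as follows. This is a minimal single-value illustration under stated assumptions: LeaseServer and its methods are hypothetical names, the clock is shared with the clients, and revoke is assumed to be called only after a client has acknowledged that it discarded its lease. If the notification is lost, the server still has to wait for expiry.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class LeaseServer:
    """Hypothetical issuer that guards one value with leases."""

    def __init__(self, value: str, clock: Callable[[], float] = time.monotonic,
                 term: float = 10.0):
        self.value = value
        self.clock = clock
        self.term = term
        self.leases: Dict[str, float] = {}  # client id -> lease expiry time
        self.write_pending = False

    def read(self, client_id: str) -> Tuple[str, Optional[float]]:
        """Return (value, lease expiry). The expiry is None while a write is pending."""
        if self.write_pending:
            # Workaround 1: answer the read, but issue no lease.
            return self.value, None
        expiry = self.clock() + self.term
        self.leases[client_id] = expiry
        return self.value, expiry

    def begin_write(self) -> None:
        # From now on no new lease is issued, so the set of leases can only shrink.
        self.write_pending = True

    def revoke(self, client_id: str) -> None:
        # Workaround 2: the client acknowledged that it discarded its lease.
        self.leases.pop(client_id, None)

    def commit_write(self, new_value: str) -> bool:
        """Apply the write only when no client can still hold a valid lease."""
        now = self.clock()
        if not self.write_pending or any(e > now for e in self.leases.values()):
            return False
        self.value = new_value
        self.leases.clear()
        self.write_pending = False
        return True
```

A caller would invoke begin_write, optionally ask clients to give up their leases, and retry commit_write until it returns True.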

② The lease mechanism tolerates network faults well

1) The lease issuance process relies only on one-way network communication

After the server issues a lease, even if the client never receives it (the client is down or the network fails), the server only needs to wait until the lease times out to be sure that the client no longer caches the data. It can then safely modify the data without breaking cache consistency.

2) Once the client has received a lease, the rest of the lease mechanism no longer depends on network communication.

3) Good fault tolerance for node failures

If the issuer of a lease goes down, the crashed issuer cannot change the promises it has already issued, so the correctness of the outstanding leases is not affected.

If the recipient of a lease goes down, the issuer does not need any special fault-tolerance handling. It only has to wait for the lease to expire, after which it can take back its promise and proceed to the next step.

③ Using the lease mechanism to determine the state of a node

How can the state of a node in a network be determined? Because network failures (network partitions) exist, judging node state with a "heartbeat" mechanism alone has shortcomings.

For example, three nodes A, B, and C are replicas of one another, with A as the primary, and a node Q is responsible for judging the status of A, B, and C. If A is working normally but the network between A and Q fails, Q will conclude that A has a problem and will select B as the new primary. This leads to the "double-master" problem.

The essence of the problem is that Q thinks A is abnormal, while A does not consider itself abnormal. That is, the network partition leaves different parts of the system with inconsistent views of the node's state.
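The double-master scenario can be reproduced with a small simulation. This is only an illustrative sketch: Node, HeartbeatMonitor, the timeout value, and the rule of picking the first live replica by name are all assumptions made for the example.

```python
from typing import Dict, List


class Node:
    def __init__(self, name: str):
        self.name = name
        self.believes_primary = False  # the node's own view of its role


class HeartbeatMonitor:
    """Hypothetical monitor Q that relies on heartbeats alone."""

    def __init__(self, nodes: List[Node], primary: str, timeout: float = 3.0):
        self.nodes: Dict[str, Node] = {n.name: n for n in nodes}
        self.primary = primary
        self.timeout = timeout
        self.last_seen: Dict[str, float] = {n.name: 0.0 for n in nodes}
        self.nodes[primary].believes_primary = True

    def heartbeat(self, name: str, now: float) -> None:
        self.last_seen[name] = now

    def check(self, now: float) -> None:
        if now - self.last_seen[self.primary] <= self.timeout:
            return
        # Q cannot tell "the primary crashed" from "the link to the primary broke".
        alive = sorted(n for n in self.nodes
                       if n != self.primary and now - self.last_seen[n] <= self.timeout)
        if alive:
            self.primary = alive[0]
            self.nodes[self.primary].believes_primary = True
            # The old primary is never told: the message would have to
            # cross the same broken link, so it keeps acting as primary.


a, b, c = Node("A"), Node("B"), Node("C")
q = HeartbeatMonitor([a, b, c], primary="A", timeout=3.0)
q.heartbeat("B", 5.0)  # heartbeats from A are lost on the broken link
q.heartbeat("C", 5.0)
q.check(5.0)
primaries = sorted(n.name for n in (a, b, c) if n.believes_primary)
print(primaries)  # ['A', 'B']
```

Both A and B now believe they are primary, which is exactly the inconsistent view described above.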

There are two solutions:

1) Have the nodes negotiate among themselves to decide who is primary (the Paxos algorithm), which is a decentralized protocol

2) Adopt the lease mechanism

After receiving heartbeats from A, B, and C, Q issues each of them a lease, indicating that it has acknowledged their status, so that A, B, and C can work normally within the validity period. In addition, Q can issue A a special lease indicating that A may act as primary. When the primary needs to be switched, Q simply waits until A's lease expires and then issues a primary lease to another node.
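The primary lease can be sketched as follows. This is a minimal illustration under stated assumptions: LeaseArbiter, Replica, and grant_or_renew are hypothetical names, only the special primary lease is modeled, and all parties read the same clock. With real clocks, the holder would have to treat its lease as expiring somewhat earlier than the issuer does, to allow for drift.

```python
import time
from typing import Callable, Optional


class LeaseArbiter:
    """Hypothetical node Q that issues the primary lease."""

    def __init__(self, clock: Callable[[], float] = time.monotonic, term: float = 5.0):
        self.clock = clock
        self.term = term
        self.holder: Optional[str] = None
        self.expires_at = 0.0

    def grant_or_renew(self, node: str) -> Optional[float]:
        """Called when a heartbeat from `node` arrives; returns the expiry or None."""
        now = self.clock()
        if self.holder in (None, node) or now >= self.expires_at:
            self.holder = node
            self.expires_at = now + self.term
            return self.expires_at
        return None  # another node's primary lease is still valid


class Replica:
    def __init__(self, name: str, clock: Callable[[], float]):
        self.name = name
        self.clock = clock
        self.primary_until = 0.0

    def on_lease(self, expires_at: Optional[float]) -> None:
        if expires_at is not None:
            self.primary_until = expires_at

    def is_primary(self) -> bool:
        # A replica acts as primary only while its own lease is unexpired,
        # so it steps down by itself even if it cannot reach Q.
        return self.clock() < self.primary_until
```

Because Q refuses to issue a second primary lease until the first one has expired, and the holder stops acting as primary at that same moment, two replicas never consider themselves primary at once in this sketch, even when the link between A and Q is broken.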


