Analysis of Distributed Lock Implementation in Redis (2)

Source: Internet
Author: User


Abstract: The previous article mentioned three popular approaches to implementing distributed locks: a database, Redis, and ZooKeeper. This article focuses on Redis-based distributed locks. Distributed architectures are now widely used in enterprises. When different distributed nodes work together, the ordering, result correctness, and execution cost of node services become important factors to consider. If different service nodes process the same task at the same time, the result may be incorrect and system resources are wasted. How can this problem be solved? One option is a distributed lock. This article describes how to implement distributed locks with Redis.

Basic application scenarios and design principles of distributed locks

Let's look at a simple case. There are three services: an order service (orderService), a report service (reportService), and a push service (pushService). Each service is deployed horizontally on two nodes. Every morning, the report service pulls order data from the order service and generates reports, and then sends the new reports to users through the push service. How should this process be designed?

To begin, we need to understand the two key points of this process. First, only one of the two report service nodes should generate the reports; otherwise, system resources are wasted. This point does not have a high reliability requirement, because generating a report twice simply overwrites it and does not produce an incorrect result. Second, only one node may push a given report to a given user; otherwise, the user receives two identical reports. This point has a high reliability requirement.

The two key points have something in common: in both cases a lock must be set, and only the node that obtains the lock executes the task. They also differ: the two scenarios depend on the reliability of lock acquisition to different degrees. Let's build a simple model of this process:


 

A simple and reliable lock mechanism can be implemented with this process, provided that certain prerequisites are met.

First, the lock service must be stable enough: if the lock cannot be obtained, the competing task cannot be executed. Second, the task that runs under the lock must not deadlock or wait indefinitely; otherwise, the lock is never released and other nodes can never run the task. Therefore, two factors must be taken into account when designing the lock: the lock must have an expiration time, and failures of the lock service or errors while obtaining and releasing the lock must be handled.

To summarize, the following elements should be taken into account in the design of distributed locks:

1. A distributed lock must guarantee mutual exclusion when multiple clients compete for a critical resource.

2. A distributed lock should have a timeout, so that the lock is not held forever when the service holding it blocks or crashes.

3. A distributed lock should be designed to degrade gracefully for the business scenario, so that a failure of the lock does not make the critical resource inaccessible.

The second element deserves further attention. Assume that report service node A goes through a long full GC after obtaining the lock, and its process is paused. During this period the lock times out, node B obtains the lock and sends the report to the user. When the full GC on node A ends, node A also executes the report sending task, so the report is sent twice and the result is incorrect.


 

This type of scenario usually requires case-by-case handling. Most distributed locks in the industry have this problem, because a pause can occur at any time and the resulting lock expiry is hard to avoid. In general, we need to estimate the time needed to access the contended resource, choose the timeout accordingly, and, after the access is completed, compare the data and apply any necessary compensation.
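One way to picture the timeout estimate is a check of the remaining lock lifetime right before the side effect. The following is a minimal sketch under stated assumptions: the class name LeaseCheck and its method are hypothetical, and time is passed in as a plain number so that the example is deterministic.

```java
// Hypothetical helper: remembers when a lock was acquired and how long it is valid.
class LeaseCheck {
    private final long acquiredAtMillis;
    private final long ttlMillis;

    LeaseCheck(long acquiredAtMillis, long ttlMillis) {
        this.acquiredAtMillis = acquiredAtMillis;
        this.ttlMillis = ttlMillis;
    }

    // True while the TTL of the lock has not yet elapsed at the given time.
    boolean stillValid(long nowMillis) {
        return nowMillis - acquiredAtMillis < ttlMillis;
    }
}
```

With a lock acquired at time 0 and a TTL of 30000 ms, a node that wakes up from a pause at 45000 ms would see stillValid(45000) return false and could skip the push and trigger compensation instead. A pause can still occur between the check and the push, so this only narrows the window; it does not close it.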

Implement distributed locks in Redis

The Redis command set includes a command called SETNX. Its format is: SETNX key value

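If the key already exists, the command does nothing and returns 0. If the key does not exist, the command sets the key to value and returns 1. The command is atomic, so we can use it to implement distributed locks. SETNX itself cannot set an expiration time, so the pseudocode in this article uses the equivalent SET key value NX PX milliseconds form (available since Redis 2.6.12), which sets the value and the TTL in a single atomic command.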
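The return-value contract of SETNX can be modeled in a few lines of plain Java. This is only an in-memory sketch for illustration, not a Redis client: the class name SetnxModel and its methods are hypothetical, and ConcurrentHashMap.putIfAbsent stands in for the atomicity that Redis provides on the server.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a Redis keyspace; not a Redis client.
class SetnxModel {
    private final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();

    // Mirrors SETNX: returns 1 if the key was absent and is now set, 0 otherwise.
    int setnx(String key, String value) {
        // putIfAbsent is atomic, so only one concurrent caller sees null.
        return store.putIfAbsent(key, value) == null ? 1 : 0;
    }

    String get(String key) {
        return store.get(key);
    }
}
```

The first caller for a given key receives 1 and becomes the lock holder; every later caller receives 0 and the stored value is left unchanged.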

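Acquire the lock: use the name of the lock as the key and a value that is unique to this client (for example, the client ID plus the current timestamp) as the value. Call SET with the NX and PX options so that the lock TTL is set at the same time, and handle any exception raised while acquiring the lock.

Confirm the lock status: if the lock was obtained, access the critical resource. Otherwise, give up or retry after an interval that suits the business scenario.

Access critical resources.

Release the lock: delete the key, but only if its value is still the one this client set, so that a client never deletes a lock that has expired and been acquired by another client.

    // Acquire the lock: the key is the lock name, the value is unique to this client
    token = CLIENT_ID + ":" + getCurrentTimeStamp();
    try {
        // NX and PX set the value and the TTL in one atomic command
        lock = SET LOCK_KEY token NX PX timeout;   // OK on success, nil if the key exists
    } catch (Exception e) {
        // Handle the exception of lock acquisition
        return;
    }
    if (lock == nil) {
        // Another client holds the lock; nothing to release
        return;
    }
    try {
        // Access critical resources
        doWork();
    } finally {
        // Release the lock only if this client still owns it.
        // In practice the check and the DEL must run atomically, for example in a Lua script.
        if (GET LOCK_KEY == token) {
            DEL LOCK_KEY;
        }
    }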
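An unconditional DEL can remove a lock that has already expired and been re-acquired by another client. The sketch below models a safer release on a single node in memory. It is an illustration under stated assumptions, not a Redis client: the class LockNodeModel and its methods are hypothetical, and time is passed in explicitly to keep the example deterministic. The RELEASE_SCRIPT constant shows the compare-and-delete Lua script that is commonly sent to a real Redis server with EVAL for the same purpose.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of one Redis node holding locks with a TTL.
class LockNodeModel {
    // Lua script for real Redis: KEYS[1] is the lock key, ARGV[1] is the client token.
    static final String RELEASE_SCRIPT =
        "if redis.call('get', KEYS[1]) == ARGV[1] then "
      + "return redis.call('del', KEYS[1]) else return 0 end";

    private static final class Entry {
        final String token;
        final long expiresAt;

        Entry(String token, long expiresAt) {
            this.token = token;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> store = new HashMap<>();

    // Models SET key token NX PX ttlMillis: succeeds only if no live entry exists.
    synchronized boolean tryAcquire(String key, String token, long ttlMillis, long now) {
        Entry e = store.get(key);
        if (e != null && e.expiresAt > now) {
            return false; // the lock is held and has not expired
        }
        store.put(key, new Entry(token, now + ttlMillis));
        return true;
    }

    // Models the compare-and-delete release: removes the key only for its owner.
    synchronized boolean release(String key, String token, long now) {
        Entry e = store.get(key);
        if (e == null || e.expiresAt <= now || !e.token.equals(token)) {
            return false; // expired or owned by another client: leave it alone
        }
        store.remove(key);
        return true;
    }
}
```

If client A acquires the lock at time 0 with a TTL of 100 ms and client B acquires it at time 150 after it expired, a late release by A at time 160 returns false and leaves the lock held by B untouched.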

Many developers prefer this kind of distributed lock. But how can we ensure the availability of Redis? If we use a single Redis node and it goes down, the lock mechanism becomes unavailable. Some may suggest master-slave replication: if the master fails, a slave takes over. However, this does not fully solve the problem, because Redis master-slave replication is asynchronous: when the master goes down, there is no guarantee that the lock data has already reached the slave.
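The failover problem can be reproduced with a small in-memory model. This is a sketch under stated assumptions rather than a description of Redis internals: the class FailoverModel is hypothetical, and asynchronous replication is reduced to an explicit replicate() step that may or may not have run before the master fails.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of one master and one slave with asynchronous replication.
class FailoverModel {
    private Map<String, String> master = new HashMap<>();
    private Map<String, String> replica = new HashMap<>();

    // SET key value NX against the current master only.
    boolean setNx(String key, String value) {
        return master.putIfAbsent(key, value) == null;
    }

    // Asynchronous replication, modeled as a separate step that may lag behind.
    void replicate() {
        replica = new HashMap<>(master);
    }

    // The master crashes and the slave is promoted with whatever it has received.
    void failover() {
        master = replica;
        replica = new HashMap<>();
    }
}
```

If client A acquires the lock and failover() happens before replicate(), a later setNx call by client B also succeeds, so both clients believe they hold the lock.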

The Redis official website introduces the Redlock algorithm, which drops the single Redis node and uses N independent Redis nodes (the official website suggests 5) as the lock service. To access the critical resource, the client must obtain the lock from at least N/2 + 1 nodes (a majority).


 

However, the process of obtaining the lock in this algorithm is more complicated, and the time it takes is harder to control. Let SPACETIME be the time that elapses between sending the first request to node redis1 and receiving the successful reply from the (N/2 + 1)-th node. The effective validity time of the lock is then no longer the TTL of the key, but:

REMAIN_TIME = TTL - SPACETIME

When SPACETIME is large, the client may well obtain a lock that has already expired. Therefore, after obtaining the lock, the Redlock algorithm needs to verify again whether the lock is still valid.
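The two quantities above are simple arithmetic and can be written down directly. The following is a minimal sketch; the class RedlockMath and its method names are hypothetical and exist only for this illustration.

```java
// Hypothetical helper for the majority size and the remaining validity time.
class RedlockMath {
    // Number of nodes that must grant the lock: N/2 + 1 with integer division.
    static int quorum(int nodeCount) {
        return nodeCount / 2 + 1;
    }

    // REMAIN_TIME = TTL - SPACETIME, where SPACETIME = end - start.
    static long remainTime(long ttlMillis, long startMillis, long endMillis) {
        return ttlMillis - (endMillis - startMillis);
    }
}
```

With N = 5 the quorum is 3. With a TTL of 10000 ms and an acquisition that runs from 1000 ms to 1800 ms, 9200 ms of validity remain. If the acquisition had taken 10500 ms, the result would be -500 ms, which means the lock has expired before the client can use it.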

    // Acquire the lock
    timeStamp = getCurrentTimeStamp();
    token = CLIENT_ID + ":" + timeStamp;   // value unique to this client
    // Apply for the lock on the N nodes; N/2 + 1 successes are required
    int successLockNum = 0;
    boolean lockSuccess = false;
    for (int i = 1; i <= N; i++) {
        try {
            lock = SET LOCK_KEY token NX PX TTL on redis(i);
            if (lock == OK && ++successLockNum == N / 2 + 1) {
                lockSuccess = true;
                break;
            }
        } catch (Exception e) {
            // This node is unavailable; continue with the next node
        }
    }
    // Verify that a majority granted the lock and that it has not already expired
    nowTimeStamp = getCurrentTimeStamp();
    if (!lockSuccess || nowTimeStamp - timeStamp >= TTL) {
        // Failed or invalid lock: remove the key from the nodes that granted it
        DEL LOCK_KEY on all nodes if its value == token;
        return;
    }
    try {
        // Access critical resources
        doWork();
    } finally {
        // Release the lock on all nodes
        DEL LOCK_KEY on all nodes if its value == token;
    }
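The majority rule can also be tried out without a Redis cluster. The sketch below is an in-memory illustration under stated assumptions, not an implementation of the algorithm published on the Redis website: the class MajorityLockModel is hypothetical, each node is a plain map, an unreachable node is modeled as null, and TTL handling is left out for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of locking on a majority of independent nodes.
class MajorityLockModel {
    // Each element models one independent node; null models an unreachable node.
    private final List<Map<String, String>> nodes = new ArrayList<>();

    MajorityLockModel(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    void takeDown(int index) {
        nodes.set(index, null);
    }

    boolean acquire(String key, String token) {
        int quorum = nodes.size() / 2 + 1;
        int granted = 0;
        for (Map<String, String> node : nodes) {
            // An unreachable node is skipped instead of aborting the whole attempt.
            if (node != null && node.putIfAbsent(key, token) == null) {
                granted++;
            }
        }
        if (granted >= quorum) {
            return true;
        }
        release(key, token); // roll back partial acquisitions
        return false;
    }

    void release(String key, String token) {
        for (Map<String, String> node : nodes) {
            if (node != null) {
                node.remove(key, token); // removes the key only if this token owns it
            }
        }
    }
}
```

With 5 nodes and 2 of them down, a client can still acquire the lock because 3 of 5 is a majority. With a third node down, no client can reach the quorum and the lock service is unavailable.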

Follow-up

Using Redis to implement distributed locks is very common in the industry. In practice, however, we must give the lock a timeout to avoid deadlocks, and we must deal with lock expiry caused by service pauses; the solution has to be tailored to each scenario. The Redlock algorithm improves the availability of the distributed lock service to a certain extent, but it adds system complexity. Some people have also questioned the algorithm; interested readers can search for the discussion. That concludes this article. If you find any errors, please point them out.
