Distributed locks are enough.

Last Update:2018-10-28 Source: Internet

Author: User


What is a lock?
In a single-process system, when multiple threads can modify a shared variable at the same time, access to that variable or code block must be synchronized so that modifications are applied one at a time and concurrent modification is eliminated.
Synchronization is ultimately implemented with locks. To ensure that only one thread at a time runs a given code block, a mark is placed somewhere that every thread can see. If the mark does not exist, a thread sets it and proceeds. If the mark already exists, the other threads wait until the owning thread leaves the synchronized block and clears the mark, and then try to set it themselves. This mark can be understood as the lock.
Locks are implemented in different ways; what matters is that all threads can see the mark. In Java, synchronized sets a mark in the object header, while implementations of the Lock interface are essentially built on a volatile int variable, which gives every thread visibility of the value and atomic modification of it. The Linux kernel likewise uses in-memory data such as mutexes and semaphores as marks.
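The idea of a mark that every thread can see and set atomically can be sketched in a few lines. This is only an illustration, not how synchronized or the JDK Lock classes are actually implemented; the names MarkLock and MarkLockDemo are made up for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// A minimal "mark" lock: 0 means no mark, 1 means the mark is set.
final class MarkLock {
    private final AtomicInteger mark = new AtomicInteger(0);

    // Set the mark only if nobody else has set it; the check and the set are one atomic step.
    boolean tryLock() {
        return mark.compareAndSet(0, 1);
    }

    // Keep trying until the mark can be set.
    void lock() {
        while (!tryLock()) {
            Thread.yield();
        }
    }

    // Clear the mark so that a waiting thread can set it.
    void unlock() {
        mark.set(0);
    }
}

final class MarkLockDemo {
    private static int counter;

    // Two threads increment a plain int 10,000 times each under the lock.
    static int runDemo() {
        counter = 0;
        MarkLock lock = new MarkLock();
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                lock.lock();
                try {
                    counter++;
                } finally {
                    lock.unlock();
                }
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }
}
```

Because the mark lives in the heap of one JVM, this only works between threads of the same process.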
Besides in-memory data, anything that provides mutual exclusion can serve as a lock (considering mutual exclusion only). For example, a unique check on serial number plus time in a transaction table can be regarded as a lock that is never released, and the existence of a file can also act as a lock. The only requirements are that modifying the mark is atomic and that the change is visible to everyone.
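As a rough sketch of the "file exists" lock, the following relies on java.nio.file.Files.createFile, which checks for the file and creates it in one atomic operation. The class name FileMarkLock and the helper inTempDir are assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// The existence of a file is the mark: whoever creates the file holds the lock.
final class FileMarkLock {
    private final Path lockFile;

    FileMarkLock(Path lockFile) {
        this.lockFile = lockFile;
    }

    // Convenience for the example: put the lock file into a fresh temporary directory.
    static FileMarkLock inTempDir(String fileName) {
        try {
            return new FileMarkLock(Files.createTempDirectory("lock-demo").resolve(fileName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Path path() {
        return lockFile;
    }

    // createFile fails with FileAlreadyExistsException if the mark is already there.
    boolean tryLock() {
        try {
            Files.createFile(lockFile);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deleting the file clears the mark.
    void unlock() {
        try {
            Files.deleteIfExists(lockFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Two FileMarkLock objects that point at the same path exclude each other even if they live in different processes on the same machine, which is the property that memory inside one process cannot give.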

What is distributed?
The CAP theorem for distributed systems tells us:

No distributed system can satisfy consistency, availability, and partition tolerance all at the same time.

Many large websites and applications are now deployed in a distributed manner, and data consistency in distributed scenarios has always been an important topic. Because of the CAP theorem, many systems have to make trade-offs among these three properties at design time. In the vast majority of Internet scenarios, strong consistency is sacrificed in exchange for high availability, and the system usually only needs to guarantee eventual consistency.

Distributed scenarios
In cluster mode, multiple instances of the same service run at the same time.

Many technical solutions are needed to guarantee data consistency in these scenarios, such as distributed transactions and distributed locks. Often we need to ensure that a method is executed by only one thread at a time. In a standalone environment this can be solved with the concurrency API provided by Java, but in a distributed environment it is not that simple.

The biggest difference between a distributed system and a standalone system is that concurrency happens between processes rather than between threads.
Threads share heap memory, so memory can simply be used as the place to store the mark. Processes may not even be on the same physical machine, so the mark has to be stored somewhere that all processes can see.

What is a distributed lock?
In a distributed model, when there is only one copy of a piece of data (or a limited number of copies), a lock is needed to control how many processes modify the data at a given moment.
Compared with standalone mode, a distributed lock must not only make the mark visible to every process, but also take into account the network between each process and the lock. (In my opinion, distributed scenarios get complicated mainly because of network latency and unreliability... a big pitfall.)
A distributed lock can still keep its mark in memory, but not in memory allocated by one process: it uses shared memory such as Redis or Memcached. Locks based on databases, files, and so on are implemented the same way as on a single host, as long as the mark is mutually exclusive.
What distributed locks do we need?
Within a distributed application cluster, the same method can be executed by only one thread on one machine at a time.
The lock should be reentrant (to avoid deadlock).
The lock should preferably be blocking (depending on business needs).
The lock should preferably be fair (depending on business needs).
Acquiring and releasing the lock must be highly available.
Acquiring and releasing the lock should perform well.
Database-based distributed locks
Optimistic lock

Distributed Lock Based on table primary key uniqueness
This approach uses the uniqueness of the primary key: if multiple requests insert the same key into the database at the same time, the database guarantees that only one insert succeeds, and the thread whose insert succeeded is considered to hold the lock for the method. When the method has finished, the lock is released by deleting the database record.

The preceding simple implementation has the following problems:

The lock depends heavily on the availability of the database. The database is a single point of failure: once it goes down, the business system becomes unavailable.
The lock has no expiration time. If the unlock operation fails, the lock record stays in the database and no other thread can obtain the lock.
The lock can only be non-blocking, because a failed insert reports an error immediately. Threads that did not get the lock are not queued; to get the lock they have to trigger the acquisition again.
The lock is not reentrant. The same thread cannot acquire the lock again before releasing it, because the record already exists.
The lock is not fair. All threads waiting for the lock compete for it by luck.
Relying on primary-key conflicts to reject duplicates in MySQL may cause lock contention inside the database under high concurrency.
Of course, these problems can be addressed in other ways.

Database is a single point of failure? Run two databases with two-way data synchronization, and switch quickly to the standby database when the primary fails.
No expiration time? Run a scheduled task that periodically clears timed-out lock records from the database.
Non-blocking? Retry the insert in a while loop and return success once the insert succeeds.
Non-reentrant? Add a field to the table that records the host and thread that currently hold the lock. On the next acquisition, query the database first; if the recorded host and thread match the current ones, grant the lock directly.
Unfair? Create an intermediate table that records all threads waiting for the lock, sorted by creation time. Only the earliest waiter is allowed to acquire the lock.
To avoid the contention caused by primary-key conflicts, a better solution is to generate the primary keys in the program so that duplicates are prevented before they reach the database.
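The following sketch models the behavior of such a lock table in memory, so that the rules can be seen without a database. It is not JDBC code: the map stands in for a table whose primary key is the method name, and the class name LockTableModel, the owner string (host plus thread), and the explicit nowMillis parameter are all assumptions made for this illustration. It covers the unique-key insert, the owner field for reentrancy, and an expiration time.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for a lock table with a unique key on the method name.
final class LockTableModel {
    // One "row": who holds the lock, how many times, and when the record expires.
    private static final class Row {
        final String owner;
        final int count;
        final long expiresAtMillis;

        Row(String owner, int count, long expiresAtMillis) {
            this.owner = owner;
            this.count = count;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<String, Row> rows = new ConcurrentHashMap<>();

    // Models the insert on the unique key, extended with the owner and expiry fixes.
    boolean tryLock(String methodName, String owner, long nowMillis, long ttlMillis) {
        Row result = rows.compute(methodName, (key, row) -> {
            if (row == null || row.expiresAtMillis <= nowMillis) {
                return new Row(owner, 1, nowMillis + ttlMillis); // no live record: insert succeeds
            }
            if (row.owner.equals(owner)) {
                return new Row(owner, row.count + 1, nowMillis + ttlMillis); // same host and thread: reentrant
            }
            return row; // duplicate key held by someone else: insert fails
        });
        return result.owner.equals(owner);
    }

    // Models deleting the record; only the owner may release, one level at a time.
    boolean unlock(String methodName, String owner) {
        boolean[] released = {false};
        rows.computeIfPresent(methodName, (key, row) -> {
            if (!row.owner.equals(owner)) {
                return row;
            }
            released[0] = true;
            return row.count > 1 ? new Row(owner, row.count - 1, row.expiresAtMillis) : null;
        });
        return released[0];
    }
}
```

In a real table the same rules would be expressed as an insert, an update guarded by the owner column, and a cleanup job for expired rows.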
Distributed Lock Based on table field version number
This strategy is derived from the MVCC mechanism of MySQL. The only problem with it is that it is highly intrusive to the data tables: a version number field has to be designed into every table, and every update needs an extra SQL check of the version, which increases the number of database operations. Under high concurrency, the overhead on database connections becomes intolerable.
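The version check can be sketched without a database as a compare-and-set on a (value, version) pair. The class VersionedRow and its stock field are made up for this example; the update method only models a statement of the shape "update ... set version = version + 1 where id = ? and version = ?".

```java
import java.util.concurrent.atomic.AtomicReference;

// In-memory model of one table row that carries a version number.
final class VersionedRow {
    static final class Snapshot {
        final int stock;
        final long version;

        Snapshot(int stock, long version) {
            this.stock = stock;
            this.version = version;
        }
    }

    private final AtomicReference<Snapshot> row;

    VersionedRow(int stock) {
        this.row = new AtomicReference<>(new Snapshot(stock, 0));
    }

    // Models reading the row together with its version.
    Snapshot read() {
        return row.get();
    }

    // Applies the write only if the version is still the one the caller read.
    boolean update(int newStock, long expectedVersion) {
        Snapshot current = row.get();
        if (current.version != expectedVersion) {
            return false; // someone else updated the row in the meantime
        }
        return row.compareAndSet(current, new Snapshot(newStock, expectedVersion + 1));
    }

    // Typical caller: re-read and retry until the update is applied.
    int decrementStock() {
        while (true) {
            Snapshot s = read();
            if (update(s.stock - 1, s.version)) {
                return s.stock - 1;
            }
        }
    }
}
```

Every failed update costs another read and another write, which is the extra database traffic mentioned above.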

Pessimistic lock

Distributed locks based on database exclusive locks
When for update is appended to the query statement, the database places an exclusive lock on the queried record during the query. (Note: InnoDB uses row-level locks only when the rows are retrieved through an index. Otherwise it locks every row it scans, which is effectively a table-level lock. To get row-level locks, an index has to be added on the field that holds the name of the method to be executed. This index must be a unique index, otherwise overloaded methods with the same name would contend for the same lock. If methods are overloaded, we recommend including the parameter types in the field.) Once a record carries an exclusive lock, other threads cannot place an exclusive lock on it.

The thread that obtains the exclusive lock can be considered to hold the distributed lock. After obtaining it, the thread executes the business logic of the method and then releases the lock with connection.commit().

This method effectively solves the problems mentioned above of locks that cannot be released and locks that do not block.

Blocking lock? The for update statement returns immediately when it succeeds and otherwise blocks until it succeeds.
Service goes down after locking and the lock is never released? With this approach, the database releases the lock itself once the service is down.
However, it still does not directly solve the single point of failure of the database or the reentrancy problem.

There may be another problem. Although a unique index is used on the method name field and for update is written explicitly to request a row-level lock, MySQL optimizes the query. Even if the indexed field appears in the condition, MySQL decides whether to use the index by comparing the cost of different execution plans. If it considers a full table scan more efficient, for example on a very small table, it will not use the index, and InnoDB will then use table locks instead of row locks. If that happens, it is a tragedy...

Another problem is that the distributed lock is held through an exclusive lock. If the transaction holding the exclusive lock is not committed for a long time, it occupies a database connection. Once there are many such connections, the database connection pool may be exhausted.

Advantages and disadvantages
Advantages: simple and easy to understand

Disadvantages: various problems arise (database operations have a certain overhead, row-level locking is not always reliable, and performance is unpredictable).

Distributed locks based on Redis
Distributed locks based on the setnx() and expire() methods of Redis
setnx()
setnx means "set if not exists". It takes two parameters: setnx(key, value). The method is atomic. If the key does not exist, the key is set and 1 is returned. If the key already exists, nothing is set and 0 is returned.

expire()
expire sets the expiration time. Note that the setnx command cannot set a timeout on the key; the timeout can only be set through expire.

Procedure
1. Call setnx(lockkey, 1). If 0 is returned, the placeholder was not set. If 1 is returned, the placeholder was set.

2. Call expire() to set a timeout on lockkey so that a deadlock is avoided.

3. After executing the business code, delete the key with the del command.

This solution covers everyday needs, but from a design point of view it can still be improved. For example, if setnx in step 1 succeeds and the process goes down before the expire() command is executed, the deadlock still occurs. To improve on this, the setnx(), get(), and getSet() methods of Redis can be used to implement the distributed lock.
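The crash window between setnx and expire can be made visible with a small in-memory model. FakeRedis below is a hypothetical stand-in written for this example, not a Redis client, and its manual clock nowMillis exists only to make the behavior deterministic. The method setNxEx models the single-command form SET key value NX EX seconds that Redis also offers, which sets the value and the timeout in one atomic step; it is shown here only as a point of comparison.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the few Redis commands used above.
final class FakeRedis {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>(); // absent means no timeout
    long nowMillis = 0; // manual clock, advanced by the caller

    // Drop the key if its timeout has passed.
    private void evictIfExpired(String key) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && deadline <= nowMillis) {
            values.remove(key);
            expiresAt.remove(key);
        }
    }

    // Returns 1 if the key was set, 0 if it already existed.
    synchronized long setnx(String key, String value) {
        evictIfExpired(key);
        if (values.containsKey(key)) {
            return 0;
        }
        values.put(key, value);
        return 1;
    }

    // Sets a timeout on an existing key.
    synchronized void expire(String key, int seconds) {
        evictIfExpired(key);
        if (values.containsKey(key)) {
            expiresAt.put(key, nowMillis + seconds * 1000L);
        }
    }

    synchronized void del(String key) {
        values.remove(key);
        expiresAt.remove(key);
    }

    // Models SET key value NX EX seconds: value and timeout are set in one step.
    synchronized boolean setNxEx(String key, String value, int seconds) {
        if (setnx(key, value) == 0) {
            return false;
        }
        expire(key, seconds);
        return true;
    }
}
```

If a client calls setnx and then dies, nothing ever calls expire, so the key in this model stays forever; with setNxEx there is no gap in which the client can die.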

Distributed locks based on the setnx(), get(), and getSet() methods of Redis
The background of this solution is to fix the possible deadlock of the setnx() and expire() solution.

getSet()
This command takes two parameters: getSet(key, newValue). The method is atomic. It sets newValue for the key and returns the old value of the key. If the key does not exist at first, executing the command several times has the following effect:

getSet(key, "value1") returns null. The key value is now value1.
getSet(key, "value2") returns value1. The key value is now value2.
And so on!
Procedure
1. Call setnx(lockkey, current time + expiration time). If 1 is returned, the lock is obtained. If 0 is returned, the lock is not obtained; go to step 2.
2. Call get(lockkey) to read the value oldExpireTime and compare it with the current system time. If it is earlier than the current system time, the lock has timed out and other requests are allowed to acquire it; go to step 3.
3. Calculate newExpireTime = current time + expiration time, then call getSet(lockkey, newExpireTime), which returns the current value of lockkey as currentExpireTime.
4. Check whether currentExpireTime equals oldExpireTime. If they are equal, this getSet took effect and the lock is obtained. If they are not equal, the lock was taken by another request, and the current request can return a failure directly or keep retrying.
5. After obtaining the lock, the current thread starts its own business processing. When the processing is complete, compare the processing time with the lock timeout. If it is less than the lock timeout, execute del to release the lock. If it is greater than the lock timeout, the lock has already expired and must not be deleted.
import cn.com.tpig.cache.redis.RedisService;
import cn.com.tpig.utils.SpringUtils;

// Redis distributed lock
public final class RedisLockUtil {

    private static final int defaultExpire = 60;

    private RedisLockUtil() {
        //
    }

    /**
     * Lock with setnx() + expire().
     * @param key redis key
     * @param expire expiration time, unit: seconds
     * @return true: lock successful, false: lock failed
     */
    public static boolean lock(String key, int expire) {
        RedisService redisService = SpringUtils.getBean(RedisService.class);
        long status = redisService.setnx(key, "1");
        if (status == 1) {
            redisService.expire(key, expire);
            return true;
        }
        return false;
    }

    public static boolean lock(String key) {
        return lock2(key, defaultExpire);
    }

    /**
     * Lock with setnx() + get() + getSet().
     * @param key redis key
     * @param expire expiration time, unit: seconds
     * @return true: lock successful, false: lock failed
     */
    public static boolean lock2(String key, int expire) {
        RedisService redisService = SpringUtils.getBean(RedisService.class);
        // expire is in seconds, the system clock is in milliseconds
        long value = System.currentTimeMillis() + expire * 1000L;
        long status = redisService.setnx(key, String.valueOf(value));
        if (status == 1) {
            return true;
        }
        long oldExpireTime = Long.parseLong(redisService.get(key, "0"));
        if (oldExpireTime < System.currentTimeMillis()) {
            // the existing lock has timed out
            long newExpireTime = System.currentTimeMillis() + expire * 1000L;
            long currentExpireTime = Long.parseLong(redisService.getSet(key, String.valueOf(newExpireTime)));
            if (currentExpireTime == oldExpireTime) {
                return true;
            }
        }
        return false;
    }

    // Unlock for lock(key, expire): simply delete the key
    public static void unlock1(String key) {
        RedisService redisService = SpringUtils.getBean(RedisService.class);
        redisService.del(key);
    }

    // Unlock for lock2: delete the key only if the lock has not expired yet
    public static void unlock2(String key) {
        RedisService redisService = SpringUtils.getBean(RedisService.class);
        long oldExpireTime = Long.parseLong(redisService.get(key, "0"));
        if (oldExpireTime > System.currentTimeMillis()) {
            redisService.del(key);
        }
    }
}

public void drawRedPacket(long userId) {
    String key = "draw.redpacket.userid:" + userId;

    boolean lock = RedisLockUtil.lock2(key, 60);
    if (lock) {
        try {
            // collection operation
        } finally {
            // release lock
            RedisLockUtil.unlock2(key);
        }
    } else {
        throw new RuntimeException("repeated rewards");
    }
}

Distributed locks based on Redlock
Redlock is the Redis distributed lock algorithm for multi-node deployments proposed by the Redis author antirez. It is based on N completely independent Redis nodes (N is typically set to 5).

The algorithm steps are as follows:

1. The client obtains the current time, in milliseconds.
2. The client tries to acquire the lock on the N nodes (on each node the lock is acquired the same way as the cache lock described above), using the same key and value on every node. The client must set a request timeout for each node, and this timeout must be much smaller than the lock timeout. For example, if the lock is released automatically after 10 s, the request timeout is set to 5-50 ms. That way, when a Redis node is down, the request to it times out as soon as possible and the impact on normal lock usage is reduced.
3. The client calculates the time spent acquiring the lock by subtracting the time obtained in step 1 from the current time. The client holds the distributed lock only if it acquired the lock on a majority of the nodes (at least 3 of 5) and the time spent is less than the lock timeout.
4. The time for which the client holds the lock is the configured lock timeout minus the acquisition time calculated in step 3.
5. If the client fails to obtain the lock, it deletes the lock on all nodes in turn.
With the Redlock algorithm, the distributed lock service keeps working when up to two of the five nodes are down, which greatly improves availability compared with the database lock and the single-node cache lock above. And because Redis is efficient, the performance of the distributed cache lock is no worse than that of the database lock.
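The decision in steps 3 and 4 is plain arithmetic and can be sketched as a pure function. The class and method names are made up for this example, and the sketch deliberately leaves out everything else (talking to the nodes, retries, clock drift).

```java
// Decides whether a Redlock-style acquisition succeeded, following steps 3 and 4 above.
final class RedlockDecision {
    // Majority of n nodes, for example 3 of 5.
    static int quorum(int nodes) {
        return nodes / 2 + 1;
    }

    // Returns the remaining lock validity in milliseconds, or -1 if the lock was not obtained.
    static long validityMillis(int nodes, int acquired, long startMillis, long endMillis, long ttlMillis) {
        long elapsed = endMillis - startMillis; // time spent acquiring (step 3)
        long validity = ttlMillis - elapsed;    // time left to use the lock (step 4)
        if (acquired >= quorum(nodes) && validity > 0) {
            return validity;
        }
        return -1; // caller should now delete the lock on all nodes (step 5)
    }
}
```

For example, with 5 nodes, 3 successful acquisitions, 40 ms spent, and a 10 s timeout, the client may use the lock for 9960 ms.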

Advantages and disadvantages

Advantages:

High Performance

Disadvantages:

How long should the expiration time be? If it is too short, the lock is released automatically before the method has finished, and a concurrency problem occurs. If it is too long, other threads waiting for the lock may have to wait longer than necessary.

Distributed locks Based on redisson
Redisson is a Java client for Redis that provides a distributed lock component. GitHub address: https://github.com/redisson/redisson

Redisson's answer to the question above (how long should the expiration time be?) is this: each time a lock is obtained, only a short timeout is set, and a thread is started that refreshes the lock timeout each time it is about to expire. That thread is stopped when the lock is released.
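The renewal idea can be sketched with a scheduled task. This is not Redisson's code: LeaseWatchdog is a made-up class, the renew action is passed in as a plain Runnable (in practice it would extend the timeout of the lock key), and renewing at one third of the lease time is simply the choice made for this example.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Periodically renews a short lease while the lock is held.
final class LeaseWatchdog {
    private final ScheduledExecutorService scheduler;

    // Start renewing as soon as the lock has been obtained.
    LeaseWatchdog(Runnable renew, long leaseMillis) {
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "lease-watchdog");
            t.setDaemon(true); // do not keep the JVM alive because of the watchdog
            return t;
        });
        long period = Math.max(1, leaseMillis / 3); // renew well before the lease runs out
        scheduler.scheduleAtFixedRate(renew, period, period, TimeUnit.MILLISECONDS);
    }

    // Call when the lock is released: no renewal runs after this returns.
    void stop() {
        scheduler.shutdownNow();
        try {
            scheduler.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

final class LeaseWatchdogDemo {
    // Holds a pretend lock for holdMillis; returns the renewal count at release and shortly after.
    static int[] run(long leaseMillis, long holdMillis) {
        AtomicInteger renewals = new AtomicInteger();
        LeaseWatchdog watchdog = new LeaseWatchdog(renewals::incrementAndGet, leaseMillis);
        sleep(holdMillis);
        watchdog.stop();
        int atRelease = renewals.get();
        sleep(leaseMillis);
        return new int[] {atRelease, renewals.get()};
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

If the process holding the lock dies, the watchdog dies with it, the short lease runs out, and the lock frees itself.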

Distributed Lock Based on zookeeper
Basic knowledge about zookeeper locks
ZK is generally deployed on multiple nodes (an odd number) and uses the Zab consistency protocol. A ZK cluster can therefore be treated as a single node from the outside: data changes are synchronized to all nodes internally before the query service is provided.
ZK data is organized as a directory tree. Each directory is called a znode. A znode can store data (generally no more than 1 MB) and can also have child nodes.
There are three types of child nodes. Sequential node: each time a node is added under the parent, an auto-incremented number is appended to the node name. Ephemeral node: once the client that created the znode loses contact with the server, the znode is deleted automatically. The last type is the ordinary node.
Watch mechanism: a client can monitor changes of a node, and when a change occurs, an event is generated for the client.
ZK basic lock
Principle: use an ephemeral node and the watch mechanism. Each lock corresponds to an ordinary node /lock. To acquire the lock, create an ephemeral node under the /lock directory. If the creation succeeds, the lock is acquired. If it fails, watch the /lock node and try to acquire the lock again after a delete operation occurs. The advantage of the ephemeral node is that when the process crashes, its lock node is deleted automatically and the lock is released.
Disadvantage: all processes that failed to acquire the lock watch the parent node, which easily leads to a herd effect: when the lock is released, all waiting processes try to create the node together, which produces a burst of concurrent requests.
ZK lock optimization
Principle: acquiring the lock is changed to creating an ephemeral sequential node. Every client can create its node successfully, but each gets a different sequence number. Only the node with the smallest sequence number holds the lock. A client whose node is not the smallest watches the node immediately before its own (a fair lock).
Steps:

Create an ephemeral sequential node (EPHEMERAL_SEQUENTIAL) under the /lock node.
Check whether the sequence number of the created node is the smallest. If it is, the lock is acquired. Otherwise the acquisition fails, and a watch is set on the node immediately before it.
After a failed acquisition and once the watch is set, wait for the watch event to arrive and check again whether the sequence number is the smallest.
Once the lock is acquired, execute the code and then release the lock (delete the node).
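The check in the second step needs no ZooKeeper connection: given the names of the children under the lock node, a client only has to find out whether its own node is the smallest and, if not, which node comes right before it. The sketch below assumes node names that end in the 10-digit zero-padded counter that ZooKeeper appends to sequential nodes; the class name and the example node names are made up.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Decides who holds the lock and who watches whom among sequential lock nodes.
final class SequentialLockOrder {
    // Parses the 10-digit sequence number at the end of a node name.
    static long sequenceOf(String node) {
        return Long.parseLong(node.substring(node.length() - 10));
    }

    // Returns null if myNode is the smallest (it holds the lock), otherwise the node to watch.
    static String nodeToWatch(List<String> children, String myNode) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingLong(SequentialLockOrder::sequenceOf));
        int index = sorted.indexOf(myNode);
        if (index < 0) {
            throw new IllegalArgumentException("myNode is not among the children");
        }
        return index == 0 ? null : sorted.get(index - 1);
    }
}
```

Because each waiter watches only its predecessor, releasing the lock wakes up exactly one client instead of the whole herd.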

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class DistributedLock implements Lock, Watcher {
    private ZooKeeper zk;
    private String root = "/locks"; // root
    private String lockName; // mark of the competing resource
    private String waitNode; // the previous lock node to wait for
    private String myZnode; // the current lock node
    private CountDownLatch latch; // counter
    private int sessionTimeout = 30000;
    private List<Exception> exception = new ArrayList<Exception>();

    /**
     * Create a distributed lock. Make sure that the zookeeper service configured in config is available.
     * @param config 127.0.0.1:2181
     * @param lockName mark of the competing resource; lockName cannot contain the string _lock_
     */
    public DistributedLock(String config, String lockName) {
        this.lockName = lockName;
        // create a connection to the server
        try {
            zk = new ZooKeeper(config, sessionTimeout, this);
            Stat stat = zk.exists(root, false);
            if (stat == null) {
                // create the root node
                zk.create(root, new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
            }
        } catch (IOException e) {
            exception.add(e);
        } catch (KeeperException e) {
            exception.add(e);
        } catch (InterruptedException e) {
            exception.add(e);
        }
    }

    /**
     * Monitor of the zookeeper node
     */
    public void process(WatchedEvent event) {
        if (this.latch != null) {
            this.latch.countDown();
        }
    }

    public void lock() {
        if (exception.size() > 0) {
            throw new LockException(exception.get(0));
        }
        try {
            if (this.tryLock()) {
                System.out.println("Thread " + Thread.currentThread().getId() + " " + myZnode + " get lock true");
                return;
            } else {
                // wait for the lock; note that the result of the wait is not checked here
                waitForLock(waitNode, sessionTimeout);
            }
        } catch (KeeperException e) {
            throw new LockException(e);
        } catch (InterruptedException e) {
            throw new LockException(e);
        }
    }

    public boolean tryLock() {
        try {
            String splitStr = "_lock_";
            if (lockName.contains(splitStr)) {
                throw new LockException("lockName can not contain " + splitStr);
            }
            // create an ephemeral sequential child node
            myZnode = zk.create(root + "/" + lockName + splitStr, new byte[0],
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
            System.out.println(myZnode + " is created");
            // retrieve all child nodes
            List<String> subNodes = zk.getChildren(root, false);
            // keep only the lock nodes that belong to lockName
            List<String> lockObjNodes = new ArrayList<String>();
            for (String node : subNodes) {
                String _node = node.split(splitStr)[0];
                if (_node.equals(lockName)) {
                    lockObjNodes.add(node);
                }
            }
            Collections.sort(lockObjNodes);
            System.out.println(myZnode + "=" + lockObjNodes.get(0));
            if (myZnode.equals(root + "/" + lockObjNodes.get(0))) {
                // if it is the smallest node, the lock is obtained
                return true;
            }
            // if it is not the smallest node, find the node just before it
            String subMyZnode = myZnode.substring(myZnode.lastIndexOf("/") + 1);
            waitNode = lockObjNodes.get(Collections.binarySearch(lockObjNodes, subMyZnode) - 1);
        } catch (KeeperException e) {
            throw new LockException(e);
        } catch (InterruptedException e) {
            throw new LockException(e);
        }
        return false;
    }

    public boolean tryLock(long time, TimeUnit unit) {
        try {
            if (this.tryLock()) {
                return true;
            }
            // waitForLock expects milliseconds
            return waitForLock(waitNode, unit.toMillis(time));
        } catch (Exception e) {
            e.printStackTrace();
        }
        return false;
    }

    private boolean waitForLock(String lower, long waitTime) throws InterruptedException, KeeperException {
        // check whether the node with the smaller number still exists and register the watch;
        // if it does not exist, there is no need to wait for the lock
        Stat stat = zk.exists(root + "/" + lower, true);
        if (stat != null) {
            System.out.println("Thread " + Thread.currentThread().getId() + " waiting for " + root + "/" + lower);
            this.latch = new CountDownLatch(1);
            // false means the wait timed out before the watch event arrived
            boolean notified = this.latch.await(waitTime, TimeUnit.MILLISECONDS);
            this.latch = null;
            return notified;
        }
        return true;
    }

    public void unlock() {
        try {
            System.out.println("unlock " + myZnode);
            zk.delete(myZnode, -1);
            myZnode = null;
            zk.close();
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (KeeperException e) {
            e.printStackTrace();
        }
    }

    public void lockInterruptibly() throws InterruptedException {
        this.lock();
    }

    public Condition newCondition() {
        return null;
    }

    public class LockException extends RuntimeException {
        private static final long serialVersionUID = 1L;

        public LockException(String e) {
            super(e);
        }

        public LockException(Exception e) {
            super(e);
        }
    }
}


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
