Distributed locks in detail: principles and three implementation methods

Last Update:2017-10-27 Source: Internet

Author: User


Currently, almost all large websites and applications are deployed in a distributed manner, and data consistency in distributed scenarios has always been an important topic. The CAP theorem tells us that "no distributed system can guarantee Consistency, Availability, and Partition tolerance all at the same time." Therefore, many systems have to make trade-offs among these three properties at the beginning of design. In the vast majority of Internet scenarios, we sacrifice strong consistency in exchange for high availability. The system often only needs to ensure "eventual consistency", as long as the time it takes to become consistent is within a range the user can accept.

In many scenarios, technical solutions such as distributed transactions and distributed locks are required to ensure data consistency. Sometimes we need to ensure that a method is executed by only one thread at a time. In a standalone environment, Java provides many APIs for concurrent processing, but these APIs are powerless in distributed scenarios. That is to say, the Java APIs alone cannot provide distributed locks, so several other solutions are used to implement them.

The following solutions are commonly used to implement distributed locks:

Distributed locks based on a database
Distributed locks based on a cache (Redis, memcached, or Tair)
Distributed locks based on Zookeeper

Before analyzing these solutions, let's first think about what we need from a distributed lock. (Here we use a method lock as an example. The same applies to a resource lock.)

In a distributed application cluster, the same method can be executed by only one thread on one machine at a time.
The lock should be reentrant (to avoid deadlock).
The lock should preferably be a blocking lock (depending on business needs).
Lock acquisition and lock release should be highly available.
Lock acquisition and lock release should perform well.

Implement distributed locks based on databases

Based on database tables

To implement distributed locks, the simplest way is to directly create a lock table and then perform operations on the data in the table.
When we want to lock a method or resource, we add a record to the table and delete the record when we want to release the lock.

Create a database table as follows:

CREATE TABLE `methodLock` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
  `method_name` varchar(64) NOT NULL DEFAULT '' COMMENT 'Name of the locked method',
  `desc` varchar(1024) NOT NULL DEFAULT 'note',
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Time the record was saved, generated automatically',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uidx_method_name` (`method_name`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Methods currently locked';

To lock a method, run the following SQL statement:

insert into methodLock(method_name, `desc`) values ('method_name', 'desc');

Because method_name has a uniqueness constraint, if multiple requests are submitted to the database at the same time, the database ensures that only one insert succeeds. We can then treat the thread whose insert succeeded as the holder of the method's lock, and it can execute the method body.

After the method is executed, release the lock with the following SQL statement:

delete from methodLock where method_name ='method_name'
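The insert-to-lock and delete-to-unlock behavior can be tried out without a database. The following is a minimal in-memory sketch, not JDBC code: the class TableLockSketch is hypothetical, and a ConcurrentHashMap stands in for the methodLock table, with the map's key uniqueness playing the role of the unique index on method_name.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for the methodLock table (hypothetical class, not JDBC).
class TableLockSketch {
    // Key = method_name. Key uniqueness models the unique index on the table.
    private final ConcurrentHashMap<String, String> rows = new ConcurrentHashMap<>();

    // Models: insert into methodLock(method_name, `desc`) values (?, ?)
    // Returns true if the "insert" succeeded, meaning the caller holds the lock.
    public boolean tryLock(String methodName, String desc) {
        return rows.putIfAbsent(methodName, desc) == null;
    }

    // Models: delete from methodLock where method_name = ?
    public void unlock(String methodName) {
        rows.remove(methodName);
    }

    public static void main(String[] args) {
        TableLockSketch table = new TableLockSketch();
        System.out.println(table.tryLock("orderService.pay", "first"));  // true
        System.out.println(table.tryLock("orderService.pay", "second")); // false
        table.unlock("orderService.pay");
        System.out.println(table.tryLock("orderService.pay", "third"));  // true
    }
}
```

With a real database, tryLock would run the insert statement and treat a duplicate-key error as "lock not acquired".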

The preceding simple implementation has the following problems:

1. The lock depends heavily on the availability of the database. The database is a single point of failure. Once the database fails, the business system may become unavailable.
2. The lock has no expiration time. Once the unlock operation fails, the lock record remains in the database, and other threads can no longer obtain the lock.
3. The lock can only be non-blocking, because the insert reports an error as soon as it fails. Threads that do not obtain the lock are not queued. To obtain the lock, they must trigger the lock acquisition operation again.
4. The lock is non-reentrant. The same thread cannot obtain the lock again before releasing it, because the record already exists.

Of course, we can also solve the above problems in other ways.

Is the database a single point of failure? Set up two databases with two-way data synchronization. Once the primary database fails, quickly switch to the standby database.
No expiration time? Run a scheduled task that periodically clears timed-out records from the table.
Non-blocking? Retry the insert in a while loop and return success once it succeeds.
Non-reentrant? Add a field to the table that records the host and thread information of the current lock holder. The next time a thread tries to obtain the lock, query the database first. If the host and thread information of the current thread is found in the record, grant the lock to it directly.
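The fixes for expiration, blocking, and reentrancy can be combined in one place. The sketch below is again an in-memory model rather than SQL, and every name in it (ImprovedTableLockSketch, the owner string, ttlMillis, the injected clock) is an assumption made for illustration. It also checks expiry lazily at acquisition time instead of using the scheduled cleanup task described above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// In-memory model of a lock table extended with owner, reentry-count, and
// expiry columns. Hypothetical class; a real version would use SQL statements.
class ImprovedTableLockSketch {
    private static final class Row {
        final String owner;   // host and thread information of the holder
        final int count;      // how many times the holder has re-entered
        final long expiresAt; // time in milliseconds after which the row is stale

        Row(String owner, int count, long expiresAt) {
            this.owner = owner;
            this.count = count;
            this.expiresAt = expiresAt;
        }
    }

    private final ConcurrentHashMap<String, Row> rows = new ConcurrentHashMap<>();
    private final LongSupplier clock; // injected so that expiry can be tested
    private final long ttlMillis;

    ImprovedTableLockSketch(LongSupplier clock, long ttlMillis) {
        this.clock = clock;
        this.ttlMillis = ttlMillis;
    }

    // Non-blocking attempt: succeeds if the row is free, stale, or already ours.
    public boolean tryLock(String method, String owner) {
        long now = clock.getAsLong();
        Row row = rows.compute(method, (key, old) -> {
            if (old == null || old.expiresAt <= now) {
                return new Row(owner, 1, now + ttlMillis);             // free or stale
            }
            if (old.owner.equals(owner)) {
                return new Row(owner, old.count + 1, now + ttlMillis); // reentry
            }
            return old;                                                // held by another owner
        });
        return row.owner.equals(owner);
    }

    // Blocking variant: retry in a loop until the lock is obtained.
    public void lock(String method, String owner) throws InterruptedException {
        while (!tryLock(method, owner)) {
            Thread.sleep(50);
        }
    }

    // Decrements the reentry count and deletes the row when it reaches zero.
    public void unlock(String method, String owner) {
        rows.computeIfPresent(method, (key, old) -> {
            if (!old.owner.equals(owner)) {
                return old; // not our lock, leave it alone
            }
            return old.count > 1 ? new Row(owner, old.count - 1, old.expiresAt) : null;
        });
    }
}
```

A caller might construct it as new ImprovedTableLockSketch(System::currentTimeMillis, 30_000) and pass a string such as the host name plus thread name as the owner.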

Database exclusive lock

In addition to adding or deleting records in a data table, you can also implement distributed locks by using the locks built into the database.

We reuse the database table we just created. Distributed locks can be implemented through the database's exclusive locks. With MySQL's InnoDB engine, the lock operation can be implemented as follows:

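public boolean lock() throws Exception {
    connection.setAutoCommit(false);
    while (true) {
        try {
            // Pseudocode: run this query on the same connection.
            result = select * from methodLock where method_name = xxx for update;
            if (result != null) {
                // The query returned, so this transaction now holds the exclusive lock.
                return true;
            }
        } catch (Exception e) {
            // For example, a lock wait timeout. Fall through and retry.
        }
        sleep(1000);
    }
}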

When for update is appended to the query statement, the database places an exclusive lock on the matching rows while the query runs. (One more note: InnoDB uses row-level locks only when the rows are retrieved through an index. Otherwise it uses table-level locks. To get row-level locks, we need an index on method_name. This index must be a unique index. Otherwise, overloaded methods that share a name cannot be accessed at the same time. If methods are overloaded, we recommend including the parameter types in the key as well.) Once an exclusive lock has been applied to a record, other threads cannot add an exclusive lock to that record.

We can think that the thread that obtains the exclusive lock can obtain the distributed lock. After obtaining the lock, we can execute the business logic of the method. After executing the method, we can unlock it using the following methods:

public void unlock() throws SQLException {
    connection.commit();
}

The connection.commit() operation releases the lock.

This method can effectively solve the problems mentioned above that the lock cannot be released or blocked.

Blocking lock? The for update statement returns immediately if it succeeds and otherwise blocks until it succeeds.
Service goes down after locking and cannot release the lock? With this approach, the database releases the lock itself after the service goes down.

However, it still does not directly solve the single point of failure and reentrancy problems.

There may be another problem here. Although we use a unique index on method_name and explicitly request row-level locks with for update, MySQL optimizes the query. Even if an indexed field is used in the condition, MySQL decides whether to use the index by comparing the cost of different execution plans. If MySQL considers a full table scan more efficient, for example on some small tables, it will not use the index. In this case, InnoDB uses table locks instead of row locks, and that would be a disaster.

Another problem is that using exclusive locks as distributed locks ties up database connections. An exclusive lock that is not committed for a long time occupies its connection for that whole period. Once there are too many such connections, the database connection pool may be exhausted.

Summary

To sum up, there are two ways to implement distributed locks with a database, and both depend on a database table. One determines whether a lock is held by whether a record exists in the table. The other uses the database's exclusive locks.

Advantages of implementing distributed locks in Databases

It relies directly on the database and is easy to understand.

Disadvantages of implementing distributed locks in Databases

There will be a variety of problems, which will make the entire solution more and more complex in the process of solving the problem.
Database operations require certain overhead and performance issues need to be considered.
The use of database row-level locks is not always reliable, especially when our lock table is not large.

Implement distributed locks based on cache

Compared with the database-based solution, a cache-based distributed lock performs better. In addition, many caches can be deployed as clusters, which solves the single point of failure problem.

There are many mature cache products, including Redis, memcached, and Tair, which is used in our company.

Tair is used as an example to analyze how to implement distributed locks using cache. There are many articles about Redis and memcached on the Internet, and some mature frameworks and algorithms can be used directly.

The implementation of distributed locks based on Tair is similar to that of Redis. The main implementation method is to use the TairManager.put method.

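public boolean trylock(String key) {
    // The last argument is the expiration time. It is 0 here, so the lock never expires.
    ResultCode code = ldbTairManager.put(NAMESPACE, key, "This is a Lock.", 2, 0);
    return ResultCode.SUCCESS.equals(code);
}

public boolean unlock(String key) {
    // Report whether the key was actually invalidated.
    ResultCode code = ldbTairManager.invalid(NAMESPACE, key);
    return ResultCode.SUCCESS.equals(code);
}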

The above implementation methods also have several problems:

1. The lock has no expiration time. Once the unlock operation fails, the lock record remains in Tair, and other threads can no longer obtain the lock.
2. The lock can only be non-blocking. The call returns immediately whether it succeeds or fails.
3. The lock is non-reentrant. After a thread acquires the lock, it cannot acquire it again before releasing it, because the key already exists in Tair and the put operation cannot be performed again.

Of course, there are also solutions.

No expiration time? Tair's put method accepts an expiration time, and the data is deleted automatically once that time is reached.
Non-blocking? Retry in a while loop.
Non-reentrant? After a thread obtains the lock, it saves the current host and thread information. Before obtaining the lock the next time, it checks whether it is already the owner of the lock.

However, how long should the expiration time be? If it is too short, the lock is released automatically before the method completes, and concurrency problems occur. If it is too long, other threads waiting for the lock may have to wait longer than necessary. The same problem exists when a database is used to implement distributed locks.
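The expiration trade-off is easier to see with a small experiment. The following sketch uses an in-memory map as a stand-in for a cache with a "put if absent, with expiry" operation. It does not use the Tair or Redis API, and the class name, the token parameter, and the injected clock are all assumptions. Storing a per-holder token and checking it on release is one common way to stop a holder whose lock has already expired from deleting a lock that now belongs to someone else.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// In-memory stand-in for a cache that supports "put if absent, with expiry".
// Hypothetical class, not the Tair or Redis API.
class ExpiringCacheLockSketch {
    private static final class Entry {
        final String token;   // identifies the holder
        final long expiresAt; // time in milliseconds when the entry expires

        Entry(String token, long expiresAt) {
            this.token = token;
            this.expiresAt = expiresAt;
        }
    }

    private final ConcurrentHashMap<String, Entry> cache = new ConcurrentHashMap<>();
    private final LongSupplier clock; // injected so that time can be controlled

    ExpiringCacheLockSketch(LongSupplier clock) {
        this.clock = clock;
    }

    // Succeeds only if the key is absent or its previous entry has expired.
    public boolean tryLock(String key, String token, long ttlMillis) {
        long now = clock.getAsLong();
        Entry candidate = new Entry(token, now + ttlMillis);
        Entry current = cache.compute(key, (k, old) ->
                (old == null || old.expiresAt <= now) ? candidate : old);
        return current == candidate;
    }

    // Deletes the key only if it still carries the caller's token.
    // Returns false when the lock has expired and been taken by someone else.
    public boolean unlock(String key, String token) {
        boolean[] released = {false};
        cache.computeIfPresent(key, (k, old) -> {
            if (old.token.equals(token)) {
                released[0] = true;
                return null; // remove the entry
            }
            return old;
        });
        return released[0];
    }

    public static void main(String[] args) {
        long[] now = {0L};
        ExpiringCacheLockSketch cache = new ExpiringCacheLockSketch(() -> now[0]);
        System.out.println(cache.tryLock("job", "A", 2000)); // true
        now[0] = 3000; // A is still working, but its lock has expired
        System.out.println(cache.tryLock("job", "B", 2000)); // true: A and B now overlap
        System.out.println(cache.unlock("job", "A"));        // false: B's lock is untouched
    }
}
```

The main method shows the too-short case: once the expiry passes, a second holder gets in while the first is still running, which is exactly the concurrency problem described above.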

Summary

You can use a cache instead of a database to implement distributed locks, which gives better performance. Many cache services are also deployed as clusters, which avoids a single point of failure. In addition, many cache services provide methods that can be used to implement distributed locks, such as the put method of Tair and the setnx method of Redis. These cache services also support automatic deletion of expired data, so you can set a timeout directly to control lock release.

Advantages of Using Cache to implement distributed locks

Good performance and convenient implementation.

Disadvantages of Using Cache to implement distributed locks

It is not very reliable to control the lock expiration time through the timeout time.

Implement distributed locks based on Zookeeper

Distributed locks can be implemented with Zookeeper's temporary (ephemeral) sequential nodes.

The general idea is as follows. When a client locks a method, it creates a unique temporary sequential node under the Zookeeper directory node that corresponds to that method. Deciding who holds the lock is simple: the client whose node has the smallest sequence number holds it. To release the lock, the client only needs to delete its temporary node. This also avoids deadlocks caused by a lock that cannot be released when a service goes down.
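The smallest-sequence-number rule can be modeled in a few lines. The sketch below is an in-memory simulation and not the ZooKeeper client API: the class SequentialNodeLockSketch, the session identifiers, and the method names are hypothetical. It only shows how node creation order decides the lock holder and how deleting a node, or expiring its session, hands the lock to the next node.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

// In-memory model of temporary sequential nodes under one lock directory.
// Hypothetical class, not the ZooKeeper API.
class SequentialNodeLockSketch {
    private final AtomicLong nextSequence = new AtomicLong();
    // Sequence number -> id of the session that created the node.
    private final ConcurrentSkipListMap<Long, String> nodes = new ConcurrentSkipListMap<>();

    // Creates a node for the session and returns its sequence number.
    public long createNode(String sessionId) {
        long sequence = nextSequence.getAndIncrement();
        nodes.put(sequence, sessionId);
        return sequence;
    }

    // A client holds the lock when its node has the smallest sequence number.
    public boolean holdsLock(long sequence) {
        Map.Entry<Long, String> first = nodes.firstEntry();
        return first != null && first.getKey().longValue() == sequence;
    }

    // Releasing the lock deletes the client's node.
    public void deleteNode(long sequence) {
        nodes.remove(sequence);
    }

    // Models a lost session: all temporary nodes of that session disappear.
    public void expireSession(String sessionId) {
        nodes.values().removeIf(sessionId::equals);
    }

    public static void main(String[] args) {
        SequentialNodeLockSketch dir = new SequentialNodeLockSketch();
        long a = dir.createNode("session-A");
        long b = dir.createNode("session-B");
        System.out.println(dir.holdsLock(a)); // true
        System.out.println(dir.holdsLock(b)); // false
        dir.expireSession("session-A");       // client A crashes
        System.out.println(dir.holdsLock(b)); // true
    }
}
```

A real client would also watch the directory (or the node just ahead of its own) so that it is notified when it becomes the smallest node, instead of polling holdsLock.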

Let's see if Zookeeper can solve the problem mentioned above.

Lock cannot be released? Zookeeper effectively solves this problem. When the lock is created, the client creates a temporary node in ZK. If the client obtains the lock and then suddenly fails (the session is disconnected), the temporary node is deleted automatically, and other clients can obtain the lock.
Non-blocking lock? Zookeeper can be used to implement blocking locks. The client creates a sequential node in ZK and binds a listener to it. When the nodes change, Zookeeper notifies the client, and the client checks whether the node it created has the smallest sequence number of all nodes. If so, the client has obtained the lock and can execute the business logic.

Non-reentrant? Zookeeper can also solve this problem. When the client creates a node, it writes its own host and thread information into the node. The next time it wants the lock, it compares its information with the data in the current smallest node. If the information matches, it obtains the lock directly. If not, it creates another temporary sequential node and joins the queue.

Single point of failure? Zookeeper effectively solves this problem. ZK is deployed as a cluster, and as long as more than half of the machines in the cluster are alive, it can continue to provide service.
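The "more than half" rule translates into simple arithmetic. The helper below is a hypothetical illustration, not part of any Zookeeper library. It shows how many failed machines a cluster of a given size can tolerate under that rule.

```java
// Arithmetic for the "more than half of the machines are alive" rule.
// Hypothetical helper class for illustration only.
class QuorumSketch {
    // True when strictly more than half of the cluster is alive.
    public static boolean hasQuorum(int alive, int total) {
        return 2 * alive > total;
    }

    // Largest number of failed machines that still leaves a majority alive.
    public static int tolerableFailures(int total) {
        return (total - 1) / 2;
    }

    public static void main(String[] args) {
        System.out.println(hasQuorum(3, 5));      // true
        System.out.println(hasQuorum(2, 4));      // false: exactly half is not enough
        System.out.println(tolerableFailures(5)); // 2
        System.out.println(tolerableFailures(4)); // 1
    }
}
```

One consequence is that a cluster of four machines tolerates no more failures than a cluster of three, which is why odd cluster sizes are the usual choice.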

You can directly use Curator, a third-party Zookeeper client library, which encapsulates a reentrant lock service.

public boolean tryLock(long timeout, TimeUnit unit) throws InterruptedException {
    try {
        return interProcessMutex.acquire(timeout, unit);
    } catch (Exception e) {
        e.printStackTrace();
    }
    // Acquisition failed with an exception, so the lock is not held.
    return false;
}

public boolean unlock() {
    try {
        interProcessMutex.release();
    } catch (Throwable e) {
        log.error(e.getMessage(), e);
    } finally {
        // Schedule a delayed cleanup of the lock path.
        executorService.schedule(new Cleaner(client, path), delayTimeForClean, TimeUnit.MILLISECONDS);
    }
    return true;
}

The InterProcessMutex provided by Curator is the implementation of distributed locks. The acquire method is used to obtain the lock, and the release method is used to release the lock.

The distributed lock implemented with ZK seems to meet every expectation for a distributed lock that we listed at the beginning of this article. In practice it does not, because it has one disadvantage: its performance may not be as high as that of a cache service. Each time a lock is created or released, a temporary node must be dynamically created or destroyed. In ZK, node creation and deletion can be executed only through the Leader server, and the data must then be synchronized to the Follower machines.

In fact, using Zookeeper may also lead to concurrency problems, although they are not common. Consider this case: because of network jitter, the client's session with the ZK cluster is disconnected. ZK then considers the client to have failed and deletes the temporary node. At this point other clients can obtain the distributed lock, and a concurrency problem may occur. This problem is not common because ZK has a retry mechanism. Once the ZK cluster cannot detect the client's heartbeat, it retries, and the Curator client supports multiple retry policies. The temporary node is deleted only after multiple retries fail. (It is therefore important to choose a suitable retry policy that balances lock granularity and concurrency.)

Summary

Advantages of using Zookeeper to implement distributed locks

Effectively solves the single point of failure, non-reentrant, and non-blocking problems, as well as the problem that the lock cannot be released. It is relatively easy to implement.

Disadvantages of using Zookeeper to implement distributed locks

Performance is not as good as that of a cache-based distributed lock. You also need to understand the principles of ZK.

Comparison of the three solutions

None of the above methods is perfect. As with CAP, no solution can satisfy complexity, reliability, and performance all at once. The best approach is therefore to choose the solution that suits your application scenario.

From the perspective of understanding difficulty (from low to high)
Database > Cache > Zookeeper

From the perspective of implementation complexity (from low to high)
Zookeeper >= Cache > Database

From the performance perspective (from high to low)
Cache > Zookeeper >= Database

From the reliability perspective (from high to low)
Zookeeper > Cache > Database

Summary

The above is the full description of the principles of distributed locks and the three implementation methods covered in this article. If you are interested, see the related topics on this site: mutex locks, semaphores, and the multi-thread wait mechanism in Java, detailed examples of using Apache ZooKeeper, and several important MySQL variables.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
