Java Distributed Lock Implementation in Detail

Last Update:2017-12-04 Source: Internet

Author: User

Tags zookeeper


When designing the technical architecture of a large-scale website and implementing its business logic, there are many situations in which a distributed lock is required. This raises a question: which kind of distributed lock suits our project best?

I have done some analysis on this question:

Distributed lock status:

At present, most large-scale websites and applications are deployed in a distributed fashion, and data consistency in distributed scenarios has always been an important topic.

The CAP theorem tells us that no distributed system can satisfy consistency, availability, and partition tolerance all at once; at most two of the three can be met at the same time. So many systems have to make a trade-off at the very beginning of their design. In the vast majority of Internet scenarios, strong consistency is sacrificed in exchange for high availability, and the system often only needs to guarantee "eventual consistency", as long as the time it takes to converge is acceptable to users.

In many scenarios, guaranteeing the eventual consistency of data requires supporting techniques such as distributed transactions and distributed locks. Sometimes we need to make sure that a method is executed by only one thread at any given time. In a single-machine environment, Java provides many concurrency APIs for this, but they only coordinate threads inside one JVM and are of no help in distributed scenarios. In other words, the plain Java APIs cannot provide a distributed lock, which is why there are currently many schemes for implementing one.
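
As a point of reference, here is a minimal sketch of the single-JVM case using `java.util.concurrent.locks.ReentrantLock`. The class name `LocalCounter` and its methods are made up for illustration. The lock object lives in one JVM's heap, so a second application instance on another machine would hold its own independent lock, and the two would not exclude each other.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class: a counter protected by an in-process lock.
class LocalCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private int value = 0;

    // Only one thread of THIS JVM can be inside the critical section at a time.
    void increment() {
        lock.lock();
        try {
            value++;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    int get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }

    // Starts `threads` workers that each increment `perThread` times,
    // waits for all of them, and returns the final counter value.
    static int runDemo(int threads, int perThread) {
        LocalCounter counter = new LocalCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }
}
```

Within one process no increment is lost, because every update goes through the same lock instance.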

Distributed lock Implementation scheme:

Three approaches are currently in common use for implementing distributed locks:

Distributed locks based on a database
Distributed locks based on a cache (Redis, Memcached, Tair)
Distributed locks based on ZooKeeper

When we actually put this into production, we chose to implement multiple engines (ZooKeeper plus Redis/Tair) so that different businesses can use whichever fits.

Distributed lock Definition:

A distributed lock is a way to control concurrent access to shared resources across distributed systems. Distributed systems often need to coordinate their actions. If different systems, or different hosts of the same system, share one resource or a group of resources, access to those resources often has to be mutually exclusive to prevent interference and keep the data consistent. This is where a distributed lock is needed.

Thinking about distributed locks:

Before analyzing these implementations, let's think about what the distributed lock we need should look like. (Here we use a method lock as the example; a resource lock works the same way.)

In an application cluster deployed across several machines, the same method can be executed by only one thread on one machine at any given time.

The lock is reentrant (to avoid deadlocks)
The lock is preferably a blocking lock (decide whether you need this according to your business requirements)
Acquiring and releasing the lock is highly available
Acquiring and releasing the lock performs well

Distributed lock Implementation:

Based on database

The simplest approach is to create a lock table and implement the lock by manipulating the rows in that table.

The lock is designed around the database's optimistic locking, which can handle basic transactional concurrency and transaction retries. Roughly, the implementation looks up the lock by its lock field. If a row exists, it checks the lock state to decide whether the lock can be acquired; if no row exists, it inserts one to take the lock.
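
The insert-based variant of this idea can be sketched as follows. To keep the example self-contained, an in-memory map stands in for the lock table: the map key plays the role of a column with a unique constraint, and `putIfAbsent` plays the role of an `INSERT` that fails on a duplicate key. The class and method names (`LockTableSketch`, `tryLock`, `unlock`) are illustrative assumptions, not taken from the original project; a real implementation would issue SQL against a shared database.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for a lock table with a UNIQUE key on the lock name.
// Assumption: in a real deployment each method would be one SQL statement.
class LockTableSketch {
    // key = lock name (the unique column), value = id of the current holder
    private final ConcurrentHashMap<String, String> rows = new ConcurrentHashMap<>();

    // Mirrors "INSERT a row for this lock name".
    // Returns false when the row already exists (unique-key violation).
    boolean tryLock(String lockName, String ownerId) {
        return rows.putIfAbsent(lockName, ownerId) == null;
    }

    // Mirrors "DELETE the row WHERE name = ? AND owner = ?".
    // Only the holder can release; returns false otherwise.
    boolean unlock(String lockName, String ownerId) {
        return rows.remove(lockName, ownerId);
    }

    // Mirrors "SELECT ... WHERE name = ?" used to inspect the lock state.
    boolean isLocked(String lockName) {
        return rows.containsKey(lockName);
    }
}
```

A caller that gets `false` from `tryLock` is not queued; it has to call `tryLock` again later if it still wants the lock.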

Of course, a lock that depends on the database in this way has the following flaws:

1. The lock depends heavily on database availability. The database is a single point of failure, and once it goes down the business system becomes unavailable.

2. The lock has no expiration time, and the lock data keeps growing.

3. The lock can only be non-blocking, because the insert either succeeds or fails immediately with an error. A thread that fails to acquire the lock is not placed in a wait queue, so it has to trigger the acquire operation again if it still wants the lock.

4. The lock is non-reentrant: the same thread cannot acquire it again before releasing it, because the row already exists in the table.

5. Database operations carry a certain amount of overhead, so performance needs to be considered.

In practice, we have addressed points 1 and 4 above:

1. The database uses master-slave synchronization, so it is no longer a single point.

4. The thread identifier is recorded with the lock, which makes the lock reentrant.
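
The reentrancy fix can be sketched like this. The sketch assumes that each lock row stores an owner id (for example host name plus thread id) and a hold count; both columns, and all names below, are hypothetical. As before, an in-memory map stands in for the table so that the example runs on its own.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for a lock table whose rows record who holds the lock
// and how many times that owner has entered it. All names are illustrative.
class ReentrantLockTableSketch {
    // One "row" of the hypothetical lock table.
    static final class Row {
        final String ownerId;
        final int holdCount;

        Row(String ownerId, int holdCount) {
            this.ownerId = ownerId;
            this.holdCount = holdCount;
        }
    }

    private final ConcurrentHashMap<String, Row> rows = new ConcurrentHashMap<>();

    // Builds an owner id from a host name and the calling thread's id.
    static String currentOwnerId(String hostName) {
        return hostName + ":" + Thread.currentThread().getId();
    }

    // Inserts the row if absent, bumps the count if the caller already holds
    // the lock, and leaves the row untouched if someone else holds it.
    boolean tryLock(String lockName, String ownerId) {
        Row result = rows.compute(lockName, (name, row) -> {
            if (row == null) {
                return new Row(ownerId, 1);
            }
            if (row.ownerId.equals(ownerId)) {
                return new Row(ownerId, row.holdCount + 1);
            }
            return row;
        });
        return result.ownerId.equals(ownerId);
    }

    // Decrements the count; the row is deleted when the count reaches zero.
    // Returns false if the caller does not hold the lock.
    boolean unlock(String lockName, String ownerId) {
        boolean[] released = {false};
        rows.computeIfPresent(lockName, (name, row) -> {
            if (!row.ownerId.equals(ownerId)) {
                return row;
            }
            released[0] = true;
            return row.holdCount > 1 ? new Row(ownerId, row.holdCount - 1) : null;
        });
        return released[0];
    }
}
```

The owner has to call `unlock` once for every successful `tryLock` before another owner can acquire the lock.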

Exclusive lock based on database

Besides adding and removing records in a table, you can also implement a distributed lock with the help of the database's own locks, for example an exclusive row lock taken with `SELECT ... FOR UPDATE` inside a transaction.

Implementing distributed locks based on cache

You can use a cache instead of a database to implement distributed locks. This gives better performance, and many cache services are deployed as clusters, which avoids a single point of failure. Many cache services also provide operations that can be used to implement a distributed lock, such as Tair's put method and Redis's SETNX command. In addition, these services support automatic deletion of expired data, so you can set a timeout directly to control when the lock is released.
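
The "set if absent, with a timeout" pattern can be sketched as below. To stay runnable without a cache server, an in-memory map stands in for the cache; `tryLock` is meant to behave like Redis's `SET key value NX PX ttl`, which writes the key only if it does not exist and attaches an expiry. The class name, the per-holder token, and the injected clock are assumptions made for this sketch. Checking the token on release is a common safeguard added here, not something the text above prescribes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// In-memory stand-in for a cache that supports "set if absent" with a TTL.
// Assumption: a real implementation would send these operations to a cache.
class ExpiringLockSketch {
    private static final class Entry {
        final String token;
        final long expiresAtMillis;

        Entry(String token, long expiresAtMillis) {
            this.token = token;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> store = new HashMap<>();
    private final LongSupplier clock; // injected so that expiry can be tested

    ExpiringLockSketch(LongSupplier clock) {
        this.clock = clock;
    }

    // Succeeds only if the key is absent or its previous value has expired.
    synchronized boolean tryLock(String key, String token, long ttlMillis) {
        long now = clock.getAsLong();
        Entry current = store.get(key);
        if (current != null && current.expiresAtMillis > now) {
            return false; // still held by someone
        }
        store.put(key, new Entry(token, now + ttlMillis));
        return true;
    }

    // Deletes the key only if it is unexpired and still carries the caller's
    // token, so a holder whose lock expired cannot delete a successor's lock.
    synchronized boolean unlock(String key, String token) {
        Entry current = store.get(key);
        if (current == null
                || current.expiresAtMillis <= clock.getAsLong()
                || !current.token.equals(token)) {
            return false;
        }
        store.remove(key);
        return true;
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`, and each holder would generate a random token per acquisition.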

Benefits of implementing distributed locks using caching

Good performance, and fairly easy to implement.

Disadvantages of implementing distributed locks using caching

Relying on a timeout to release the lock is not very reliable: if the holder needs longer than the timeout, the lock expires while the work is still in progress.

Implementation of distributed locks based on ZooKeeper

A distributed lock can be implemented with ZooKeeper's ephemeral sequential nodes.

The general idea is that when a client locks a method, it creates a unique ephemeral sequential node under the ZooKeeper node that corresponds to that method.

To decide whether it holds the lock, a client only needs to check whether its node has the smallest sequence number among those nodes.

To release the lock, the client simply deletes its ephemeral node. Because the node is ephemeral, this also avoids deadlocks caused by a lock that is never released when a service goes down.

Let's see whether ZooKeeper solves the problems mentioned earlier.
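
The "smallest sequence number wins" check can be sketched as a pair of pure functions over the list of child node names, leaving out the ZooKeeper client calls themselves. The sketch assumes node names of the form `lock-` followed by the zero-padded counter that ZooKeeper appends to sequential nodes; the `lock-` prefix and all method names are hypothetical. Returning the immediate predecessor as the node to watch is a common refinement added here so that a waiting client is only woken when its predecessor goes away; the text above only requires finding the smallest node.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Pure helper logic for a sequential-node lock; no ZooKeeper calls are made.
class SequenceLockSketch {
    // Extracts the numeric suffix, e.g. "lock-0000000007" -> 7.
    // Assumption: the name prefix ends with '-' and the suffix is a valid int.
    static int sequenceOf(String nodeName) {
        return Integer.parseInt(nodeName.substring(nodeName.lastIndexOf('-') + 1));
    }

    // Returns a copy of the children ordered by sequence number.
    static List<String> sorted(List<String> children) {
        List<String> copy = new ArrayList<>(children);
        copy.sort(Comparator.comparingInt(SequenceLockSketch::sequenceOf));
        return copy;
    }

    // A client holds the lock when its node has the smallest sequence number.
    static boolean holdsLock(List<String> children, String myNode) {
        List<String> ordered = sorted(children);
        return !ordered.isEmpty() && ordered.get(0).equals(myNode);
    }

    // The node a waiting client would watch: its immediate predecessor.
    // Returns null when myNode is first (it holds the lock) or is not present.
    static String nodeToWatch(List<String> children, String myNode) {
        List<String> ordered = sorted(children);
        int index = ordered.indexOf(myNode);
        return index > 0 ? ordered.get(index - 1) : null;
    }
}
```

A client would call `holdsLock` after creating its node and again each time it is notified of a change.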

Lock not released? ZooKeeper effectively solves the problem of a lock that cannot be released. When the lock is created, the client creates an ephemeral node in ZooKeeper; if the client acquires the lock and then suddenly crashes (its session is disconnected), the ephemeral node is deleted automatically and other clients can acquire the lock again.
Non-blocking lock? ZooKeeper makes a blocking lock possible. The client creates a sequential node in ZooKeeper and binds a listener to it. Once the nodes change, ZooKeeper notifies the client, which checks whether the node it created has the lowest sequence number among all current nodes. If so, it has acquired the lock and can execute the business logic.
No re-entry? ZooKeeper also solves the non-reentrancy problem. When creating the node, the client writes its host information and thread information into the node. The next time it wants the lock, it compares its own information with the data in the current smallest node. If they match, it already holds the lock and proceeds directly; if not, it creates an ephemeral sequential node and joins the queue.
Single point of failure? ZooKeeper effectively solves the single-point problem. It is deployed as a cluster, and as long as more than half of the machines in the cluster are alive, it can keep serving clients.
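
The "more than half" rule is simple arithmetic, shown here as a small sketch; the class and method names are invented for illustration. For example, a five-server cluster keeps serving with three servers alive and therefore tolerates two failures, while a four-server cluster tolerates only one.

```java
// Arithmetic behind the majority rule; names are illustrative only.
class QuorumSketch {
    // True when strictly more than half of the servers are alive.
    static boolean hasQuorum(int aliveServers, int clusterSize) {
        return aliveServers > clusterSize / 2;
    }

    // Largest number of servers that can fail while a majority remains.
    static int tolerableFailures(int clusterSize) {
        return (clusterSize - 1) / 2;
    }
}
```

This is why such clusters are usually sized with an odd number of servers: going from three to four machines does not increase the number of failures that can be tolerated.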

You can use Curator, a third-party ZooKeeper client library, directly; it provides a ready-made reentrant lock (InterProcessMutex).

Comparison of the three distributed locks

A distributed lock can be implemented with ZooKeeper, a database, or Redis. They are all distributed locks, so how do the three differ?

In terms of difficulty of understanding (low to high): database > cache > ZooKeeper

In terms of implementation complexity (low to high): ZooKeeper >= cache > database

In terms of performance (high to low): cache > ZooKeeper >= database

In terms of reliability (high to low): ZooKeeper > cache > database

This article only analyzes the principles of distributed locking and the ways of implementing it at a theoretical level. I will describe the code for the three implementations in detail separately. Corrections and suggestions about any shortcomings are welcome!

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
