This article is reproduced and collated from two earlier posts:
http://mp.weixin.qq.com/s/JTsJCDuasgIJ0j95K8Ay8w
and http://mp.weixin.qq.com/s/4CUe7OpM6y1kQRK8TOC_qQ
Three topics have been omitted: Martin's discussion of fencing tokens (from the first post), the exchange between a commenter and the Redis author (from the second), and the introduction to Chubby (also from the second). Readers interested in those three topics can refer to the originals.
Finally, to quote Martin: "Engineering discussions rarely have one right answer."

Distributed lock based on a single Redis node
First, to acquire the lock, the Redis client sends the following command to the Redis node:
SET resource_name my_random_value NX PX 30000
If the command succeeds, the client has acquired the lock and may access the shared resource;
if the command fails, the client has failed to acquire the lock.
Notice that in the SET command above:
my_random_value is a random string generated by the client. It must be unique across all lock-acquisition requests from all clients over a long enough period of time.
NX means the SET succeeds only if the key resource_name does not already exist. This guarantees that only the first requesting client acquires the lock; no other client can acquire it until it is released.
PX 30000 means the lock expires automatically after 30 seconds (30000 milliseconds). Of course, 30 seconds is only an example; the client should choose an appropriate expiration time.
Finally, after the client finishes operating on the shared resource, it executes the following Redis Lua script to release the lock:
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
When this Lua script is executed, the client passes the earlier my_random_value in as ARGV[1] and resource_name in as KEYS[1].
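As a concrete illustration, here is a minimal sketch of both operations in Python using the redis-py client. The client library, the local Redis address, and the helper names are assumptions for illustration, not part of the original posts:

import uuid
import redis  # redis-py, assumed installed; Redis assumed at localhost:6379

r = redis.Redis()

RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""

def acquire_lock(resource, validity_ms=30000):
    value = str(uuid.uuid4())  # my_random_value: unique per acquisition
    # One atomic command: SET resource value NX PX validity_ms
    if r.set(resource, value, nx=True, px=validity_ms):
        return value           # lock acquired; keep the token for release
    return None                # another client holds the lock

def release_lock(resource, value):
    # GET + compare + DEL must run atomically, hence the Lua script
    return r.eval(RELEASE_SCRIPT, 1, resource, value) == 1

A client would call token = acquire_lock("resource_name"), access the shared resource, and then call release_lock("resource_name", token).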
There are four easily overlooked details in the command

SET resource_name my_random_value NX PX 30000

1. The lock must have an expiration time (lock validity time). Otherwise, if a client acquires the lock and then crashes (the problem of a single machine), or can no longer communicate with the Redis node because of a network partition, it will hold the lock forever and no other client will ever acquire it. Choosing the validity time is a dilemma: if it is too short, the lock may expire before the client finishes accessing the shared resource, losing its protection; if it is too long, then whenever a holder fails to release the lock, every other client is locked out for a long time.

2. Acquiring the lock must be a single atomic command. Some implementations acquire the lock with two Redis commands (a sketch of this broken variant follows this list):

SETNX resource_name my_random_value
EXPIRE resource_name 30

Although these two commands have the same effect as the single SET command in the algorithm above, they are not atomic: if the client crashes after SETNX but before EXPIRE, it holds the lock forever.

3. A random string my_random_value must be set. It guarantees that the lock a client releases is the one it itself holds. If a fixed value were used instead, the following sequence could occur:

Client 1 acquires the lock.
Client 1 blocks on some operation for a long time.
The expiration time is reached and the lock is released automatically.
Client 2 acquires the lock on the same resource.
Client 1 recovers from blocking and releases the lock now held by client 2.

4. Releasing the lock must be implemented as a Lua script. Releasing actually consists of three steps: GET, compare, and DEL, and the Lua script guarantees their atomicity. If the three steps were put into client logic instead, a sequence similar to the third problem could occur:

Client 1 acquires the lock.
Client 1 finishes with the shared resource and passes the release check (GET and compare).
Client 1 then blocks for a long time for some reason.
The expiration time is reached and the lock is released automatically.
Client 2 acquires the lock on the same resource. (This can happen even with the random string, because client 1 has already passed the compare step; the random string and the atomic release are both needed.)
Client 1 recovers from blocking, executes DEL, and releases the lock held by client 2.

In fact, in problems 3 and 4 above, the same sequences can be produced not only by a blocked client but also by a large network delay.
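For contrast, a sketch of the broken two-command acquire from detail 2, reusing the redis-py client r from the earlier sketch (the helper name is illustrative only; do not use this in practice):

# BROKEN on purpose: SETNX and EXPIRE are two separate commands.
def acquire_lock_broken(resource, value, validity_s=30):
    if r.setnx(resource, value):        # SETNX resource_name my_random_value
        # If the process crashes right here, the lock never expires.
        r.expire(resource, validity_s)  # EXPIRE resource_name 30
        return True
    return False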
As long as these four details are handled with care, a lock based on a single Redis node works correctly. What such a lock cannot solve, however, is the problem caused by failover. It is this problem that gave rise to Redlock.

Distributed lock based on a Redis cluster
Redlock was designed by the Redis author to standardize the implementation of Redis-based distributed locks and to make that implementation more secure.
When a single Redis node goes down, no client can acquire the lock and the service becomes unavailable. To improve availability, we can attach a slave to the Redis node and automatically fail over to it when the master becomes unavailable. But because Redis master-slave replication is asynchronous, failover can break the safety of the lock. Consider the following sequence:

Client 1 acquires the lock from the master.
The master crashes before the key storing the lock has been replicated to the slave.
The slave is promoted to master.
Client 2 acquires the lock on the same resource from the new master.
As a result, client 1 and client 2 hold a lock on the same resource at the same time: the safety of the lock is broken. The Redis author designed the Redlock algorithm to address this possible loss of lock safety during master-slave replication.
A client running the Redlock algorithm performs the following steps to acquire the lock:

1. Get the current time (in milliseconds).
2. Attempt to acquire the lock on all N Redis nodes in sequence, using the same operation as in the single-node case (including the random string my_random_value and the expiration time, e.g. PX 30000, the lock validity time). So that the algorithm can keep running when a Redis node is unavailable, each per-node acquire has a timeout that is much smaller than the lock validity time (on the order of tens of milliseconds). After failing to acquire the lock on one node, the client should immediately try the next. "Failure" here includes any kind of failure, for example the node being unreachable, or the lock on that node already being held by another client. (Note: the Redlock source only mentions unavailable nodes, but other failure cases should be handled the same way.)
3. Compute the total time spent acquiring the lock, i.e. the current time minus the time recorded in step 1. The client considers the lock finally acquired only if it acquired the lock on a majority of nodes (>= N/2+1) and the total time spent is less than the lock validity time; otherwise it considers the acquisition to have finally failed.
4. If the lock was acquired, its effective validity time is the initial validity time minus the time spent acquiring it in step 3.
5. If the acquisition finally failed (either because fewer than N/2+1 nodes were locked, or because the total time exceeded the lock's initial validity time), the client should immediately initiate a release on all Redis nodes (the Redis Lua script described earlier).
The procedure above covers only acquiring the lock; releasing it is simple: the client initiates the release on all N Redis nodes, regardless of whether it appeared to acquire the lock on each node at the time. That is, even a node on which the acquire seemed to fail must not be skipped when releasing. The reason is this case: the client's acquire request reaches a Redis node and the node performs the SET successfully, but the response packet back to the client is lost. From the client's point of view the acquire timed out and failed, while from the Redis node's point of view the lock was set successfully. Therefore, when releasing, the client must also send requests to the nodes on which its acquire appeared to fail. This situation is entirely possible in an asynchronous communication model: communication from client to server succeeds, but the reverse direction fails.
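Below is a minimal sketch of both phases in Python with redis-py. It assumes nodes is a list of redis.Redis connections to N independent masters (per-node timeouts are assumed to be set via socket_timeout when the connections are created); it illustrates the steps above and is not a production implementation (the official Redlock implementations also subtract a clock-drift allowance):

import time
import uuid
import redis  # redis-py, assumed installed

RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""

def redlock_acquire(nodes, resource, validity_ms=30000):
    value = str(uuid.uuid4())         # my_random_value
    quorum = len(nodes) // 2 + 1      # majority: N/2 + 1
    start = time.monotonic()          # step 1: record the current time
    acquired = 0
    for node in nodes:                # step 2: try every node in sequence
        try:
            if node.set(resource, value, nx=True, px=validity_ms):
                acquired += 1
        except redis.RedisError:
            pass                      # an unreachable node counts as a failure
    elapsed_ms = (time.monotonic() - start) * 1000   # step 3: total time spent
    remaining_ms = validity_ms - elapsed_ms          # step 4: effective validity
    if acquired >= quorum and remaining_ms > 0:
        return value, remaining_ms
    redlock_release(nodes, resource, value)          # step 5: failed, release all
    return None, 0

def redlock_release(nodes, resource, value):
    # Release on ALL nodes, including those where the acquire seemed to
    # fail: the SET may have succeeded even though the response was lost.
    for node in nodes:
        try:
            node.eval(RELEASE_SCRIPT, 1, resource, value)
        except redis.RedisError:
            pass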
Because Redlock keeps working as long as a majority of the N Redis nodes are functioning correctly, it is in theory more available. The lock-invalidation-during-failover problem of a single Redis node, discussed earlier, does not exist in Redlock; however, if a node crashes and restarts, the safety of the lock can still be affected. The degree of impact depends on how Redis persists its data.
Suppose there are 5 Redis nodes in total: A, B, C, D, E, and imagine the following sequence of events:

Client 1 locks A, B and C and acquires the lock successfully (D and E are not locked).
Node C crashes and restarts; the lock client 1 set on C was not persisted and is lost.
After node C restarts, client 2 locks C, D and E and acquires the lock successfully.

In this way, client 1 and client 2 both hold a lock on the same resource.
By default, Redis's AOF persistence writes to disk (i.e. executes fsync) once per second, so up to one second of data can be lost in the worst case. To minimize data loss, Redis allows fsync to be configured to run on every modification, but this degrades performance. Of course, even an fsync on every write can still lose data (this depends on the implementation of the operating system rather than on Redis). So the possibility of a lock failure caused by a node crash and restart always remains. To deal with this problem, the Redis author proposed the concept of delayed restarts: after a node crashes, it is not restarted immediately, but only after a delay greater than the lock validity time. That way, every lock the node participated in before the crash has expired by the time it comes back, and the restart cannot affect existing locks.
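A hedged illustration of that durability/performance trade-off, again with the redis-py client r (this assumes AOF persistence is enabled; the same policy is set permanently via the appendfsync directive in redis.conf):

# "everysec" is the default: fsync once per second, so up to ~1s of
# acknowledged writes can be lost on a crash.
# "always": fsync on every write -- the safest and slowest policy.
r.config_set("appendfsync", "always")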
Martin (a distributed-systems expert) argues that Redlock depends too heavily on system timing. (The EX/PX expiration in Redis depends on the server clock; if the server time is manually advanced by more than the validity time, the key expires immediately.) He first gave the following example (again with five Redis nodes A, B, C, D, E):

Client 1 acquires the lock on Redis nodes A, B, C, a majority; communication with D and E fails because of network problems.
The clock on node C jumps forward, causing the lock held on it to expire quickly.
Client 2 acquires the lock on the same resource on Redis nodes C, D, E, a majority.
Both client 1 and client 2 now believe they hold the lock.
Redlock's safety property thus depends rather strongly on the system clock: once the clock becomes inaccurate, the algorithm's safety can no longer be guaranteed. Martin is pointing out a piece of common sense in distributed-algorithm research: a good distributed algorithm should be based on the asynchronous model, and its safety should not depend on any timing assumptions. In the asynchronous model, processes may pause for arbitrarily long, messages may be delayed arbitrarily long in the network or even lost, and the system clock may fail in arbitrary ways. A good distributed algorithm must not let these factors affect its safety property; they may only affect its liveness property. In other words, even in extreme cases (such as a badly wrong system clock), the worst the algorithm may do is fail to produce a result in bounded time; it must never produce a wrong result. Such algorithms do exist, the best known being Paxos and Raft. By this standard, Redlock's level of safety is clearly not reached.
Martin also made a very insightful point: the distinction between the purposes for which locks are used. He divides the uses of locks into two kinds:
For efficiency: the lock coordinates clients so they avoid duplicating work. Even if the lock occasionally fails, the worst outcome is that some operation is performed more than once, with no other harmful consequences, for example sending the same email twice.
For correctness: a lock failure must never be allowed under any circumstances, because once it happens it can mean inconsistent data, data loss, file corruption, or other serious problems.
Finally, Martin reached the following conclusions:
If a distributed lock is used for efficiency, where an occasional lock failure is tolerable, then a single-Redis-node locking scheme is sufficient, simple and efficient; Redlock is a heavyweight implementation.
If a distributed lock is used for correctness, in truly serious situations, do not use Redlock. It is not an algorithm built on the asynchronous model with sufficiently strong safety guarantees, and its system model contains many dangerous assumptions about timing. Consider instead a distributed lock scheme such as ZooKeeper (currently the popular choice in industry), or a database that supports transactions.
In short, Martin believes there are three main ways Redlock can fail: clock jumps, long GC pauses, and long network delays.
For the latter two, Redlock's original design already takes them into account and has some immunity to their consequences. Moreover, the effect of long delays on Redlock is the same as on any other distributed lock; this effect is not specific to Redlock. In this respect Redlock's implementation already makes it as safe as any other distributed lock.
The crux is therefore clock jumps. The Redis author believes that with proper operational practice clock jumps can be avoided entirely, and that Redlock's requirements on the clock can be fully satisfied in real systems. (In practice, however, clock skew does exist in the real world.)
Martin mentions two concrete ways a clock jump could be caused:
The system administrator manually modifies the clock.
A large clock update is received from the NTP service.
The Redis author retorts:
Manually modifying the clock is a human error: simply don't do that. Otherwise, if someone manually edits the persisted log of the Raft protocol, even Raft stops working correctly.
Use an ntpd that does not "jump" the clock (perhaps through proper configuration), so that clock changes are applied as a series of small adjustments.
Redlock does not require the clock to be perfectly accurate, only roughly accurate. For example, a timer meant to measure 5 seconds might actually measure 4.5 seconds one time and 5.5 seconds another; as long as the error stays within a certain bound, it has no effect on Redlock. antirez holds that such an undemanding requirement on clock accuracy is entirely reasonable in a real environment.

Distributed lock based on ZooKeeper
Flavio Junqueira is one of the authors of ZooKeeper. He gave a description of how to build a distributed lock on top of ZooKeeper (not the only way to do it, of course):
Each client tries to create a znode, e.g. /lock. The first client to do so succeeds and thereby holds the lock; the other clients' create operations fail (the znode already exists), so they fail to acquire the lock.
When the client holding the lock finishes accessing the shared resource, it deletes the znode so that the other clients can then acquire the lock.
The znode should be created as ephemeral. This znode feature guarantees that if the client that created it crashes, the znode is deleted automatically, which ensures that the lock is always eventually released.
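A minimal sketch of this scheme in Python, using the kazoo client library (the library, the ZooKeeper address, and the helper names are assumptions for illustration):

from kazoo.client import KazooClient
from kazoo.exceptions import NodeExistsError

zk = KazooClient(hosts="127.0.0.1:2181")  # assumed ZooKeeper address
zk.start()

def try_acquire_lock():
    try:
        # ephemeral=True: the znode is deleted automatically if the
        # creating client's session dies, so the lock is always released
        zk.create("/lock", ephemeral=True)
        return True
    except NodeExistsError:
        return False  # another client already holds the lock

def release_lock():
    zk.delete("/lock")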
This lock looks fairly perfect: it has none of Redlock's expiration-time problems, and it lets the lock be released automatically when needed. But on closer examination, not necessarily.
How does ZooKeeper detect that a client has crashed? Each client maintains a session with a ZooKeeper server, kept alive by periodic heartbeats. If ZooKeeper does not receive the client's heartbeat for a long time (longer than the session expiration time), it considers the session expired, and all ephemeral znodes created within that session are deleted automatically.
Consider the following sequence of events:

Client 1 creates the znode /lock and acquires the lock.
Client 1 enters a long GC pause.
The session client 1 holds with ZooKeeper expires.
The znode /lock is deleted automatically.
Client 2 creates the znode /lock and acquires the lock.
Client 1 recovers from the GC pause and still believes it holds the lock.
In the end, client 1 and client 2 both believe they hold the lock, and they conflict. This is the same kind of distributed-lock failure caused by a GC pause that Martin described earlier in the article.
It seems, then, that a distributed lock implemented with ZooKeeper is not necessarily safe either; it has its own problems. Nevertheless, ZooKeeper, as a framework built specifically to serve distributed applications, provides some excellent features that schemes like Redis lack. The automatic deletion of ephemeral znodes mentioned above is one example.
Another useful feature is ZooKeeper's watch mechanism. It can be used like this: when a client tries to create /lock and finds it already exists, its create fails, but the client has not necessarily failed to acquire the lock. It can enter a waiting state and wait for the /lock node to be deleted; ZooKeeper notifies it through the watch mechanism, and it can then retry the create operation (i.e. acquire the lock). This lets a client use the distributed lock just like a local lock: a failed acquisition blocks until the lock is obtained. Redlock cannot offer such a feature.
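A sketch of a blocking acquire built on the watch mechanism, continuing the previous kazoo sketch (it reuses zk and try_acquire_lock from there; the helper name is again illustrative):

import threading

def acquire_lock_blocking():
    while True:
        if try_acquire_lock():
            return
        released = threading.Event()
        # One-shot watch: the callback fires when /lock changes,
        # in particular when the current holder deletes it.
        stat = zk.exists("/lock", watch=lambda event: released.set())
        if stat is None:
            continue  # /lock vanished between create and exists: retry now
        released.wait()  # block until the watch fires, then try again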
To summarize, locks based on ZooKeeper and locks based on Redis differ in two implementation characteristics:
Under normal circumstances, the client can hold the lock for as long as it needs, which guarantees that it finishes all required operations on the shared resource before releasing it. This avoids the dilemma of choosing a validity time (lock validity time) for a Redis-based lock. A ZooKeeper-based lock in fact relies on the session (heartbeats) to maintain possession of the lock, and Redis has no support for sessions.
A ZooKeeper-based distributed lock supports waiting for the lock to be released after a failed acquisition attempt. This lets clients use the lock more flexibly.
Incidentally, the ZooKeeper-based distributed lock described above is not an optimal implementation: it causes a "herd effect" that degrades the performance of acquiring the lock. For a better implementation, see the following link:
http://zookeeper.apache.org/doc/r3.4.9/recipes.html#sc_recipes_Locks
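For reference, the kazoo library ships a ready-made recipe along the lines of that link: each contender creates an ephemeral sequential znode and watches only its immediate predecessor, which avoids the herd effect. A brief sketch of its use (library and address are assumptions, as above):

from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")  # assumed ZooKeeper address
zk.start()

lock = zk.Lock("/lockpath", "client-1")  # the identifier is informational
with lock:  # blocks until the lock is acquired, without a herd effect
    pass    # ... access the shared resource ...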