Understanding Clojure STM (Software Transactional Memory)

Last Update:2016-06-21 Source: Internet

Author: User

Translator's notes:

English original: http://java.ociweb.com/mark/stm/article.html

The original article also covers some non-STM material as well as the low-level implementation of the STM. Only the abstraction-level STM content is translated here, since that is the more important part.

This is a free translation based on my own understanding rather than a word-for-word one. The goal is to understand the STM and how to tune it. Perfectionists, please go easy on me!

I came across this article while reading the chapter on refs under stress in "Clojure Programming Fun" (the Chinese edition of "The Joy of Clojure"), where I could not make sense of min-history and max-history.

A ref is a reference type, similar to an atom or an agent, that holds a value shared between threads. A ref can only be modified inside an STM transaction, and the functions that modify it are ref-set, alter, and commute. Its value is read with the @ reader macro or with deref. A single read does not have to happen inside a transaction, but reading a consistent snapshot of several refs does.
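As a minimal sketch of these basics (the ref name balance and its values are made up for illustration):

```clojure
;; Hypothetical shared value: an account balance.
(def balance (ref 100))

;; Reading works anywhere; @balance is shorthand for (deref balance).
(def before (deref balance))

;; Every modification must run inside a transaction (dosync).
(dosync (ref-set balance 150))   ; replace the value
(dosync (alter balance + 25))    ; apply (+ current 25)
(dosync (commute balance - 5))   ; apply (- current 5), order-independent style
```

After these three transactions @balance is 170, while before still holds the value read earlier.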

Refs are the only reference type managed by the Clojure STM.

The first argument to alter and commute is a ref. The second is a function that returns the new value, referred to here as the argument function. When the argument function is invoked, it receives the current value of the ref followed by any additional arguments that were passed to alter or commute. The value it returns becomes the new value of the ref.
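The argument-passing convention can be illustrated as follows. The inventory ref and the add-stock helper are hypothetical names chosen for this sketch:

```clojure
;; Hypothetical ref holding a map of item counts.
(def inventory (ref {:apples 3}))

;; The argument function: receives the ref's current value first,
;; then the extra arguments given to alter or commute.
(defn add-stock [stock item n]
  (update stock item (fnil + 0) n))

(dosync (alter inventory add-stock :apples 2))    ; calls (add-stock {:apples 3} :apples 2)
(dosync (commute inventory add-stock :pears 4))   ; same calling convention
```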

Use alter in most cases. commute is for situations where a ref is modified concurrently and the order of the modifications does not matter, in the same sense as the commutative laws of addition and multiplication in mathematics, where the result does not depend on order. Using commute amounts to saying: I am starting a transaction that modifies a ref, and I am not worried that other transactions may change the ref before I commit, because at commit time I will fetch the latest value of the ref, apply my function to it again, and then commit. The end result will still be correct.

Typical uses of commute include adding elements to a collection and maintaining statistics over a data set, such as a maximum, minimum, or average. Suppose a ref holds a collection and two concurrent transactions both add an element to it. In most cases it does not matter which of them adds its element first. If transaction A starts before transaction B, but B commits its new element before A commits, A does not have to rerun its whole body; with commute, A simply fetches the latest value of the ref at commit time and adds its element to that. Similarly, suppose a ref holds the maximum value of a collection. If transaction A starts before transaction B, but B updates the maximum before A commits, there is no reason for A to retry the whole transaction; with commute, A fetches the latest maximum at commit time, repeats the comparison, and commits.
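Both scenarios can be sketched together. The refs seen and largest, the function record!, and the choice of 20 threads are assumptions made for this illustration:

```clojure
;; Hypothetical shared state: a set of observed numbers and their maximum.
(def seen (ref #{}))
(def largest (ref 0))

(defn record!
  "Adds n to the set and updates the maximum. Both updates are
  order-independent, so commute is used instead of alter."
  [n]
  (dosync
    (commute seen conj n)
    (commute largest max n)))

;; Run 20 concurrent transactions, one per thread, and wait for all of them.
(let [threads (mapv (fn [n] (Thread. #(record! n))) (range 1 21))]
  (doseq [t threads] (.start t))
  (doseq [t threads] (.join t)))
```

Whatever order the threads commit in, the set ends up containing 1 through 20 and the maximum is 20.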

With alter or ref-set, if another transaction has committed a change to the ref since the current transaction started, the current transaction is rolled back and retried. Provided the modification is order-independent, using commute instead can improve performance by avoiding such retries.

Within a transaction, the calls made to commute are recorded in an ordered map. The keys of the map are refs, and each value is a list of the functions and arguments passed to commute for that ref. The map is ordered by the creation time of the refs. When the transaction commits, it takes the write lock of every ref involved in the transaction, in map order, and then, for each ref modified with commute, calls the recorded functions again to compute the final value of the ref. Locking the refs in a fixed order ensures that no deadlock occurs. A consequence is that the value commute returns inside the transaction may differ from the value that is finally committed.
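The difference between the in-transaction result of commute and the committed result can be made visible with a small experiment. This is a sketch: counter and in-tx-value are hypothetical names, and starting a second thread from inside a transaction is done here only to force the interleaving deterministically.

```clojure
(def counter (ref 0))
(def in-tx-value (atom nil))

(dosync
  ;; commute returns the in-transaction result, computed from the value seen now (0).
  (reset! in-tx-value (commute counter inc))
  ;; Another thread commits a change to the same ref before this transaction commits.
  (doto (Thread. #(dosync (alter counter + 10))) .start .join))

;; At commit time inc was applied again, this time to the latest committed value (10).
```

The transaction saw 1 as the result of commute, but the committed value is 11, and the transaction did not have to retry.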

Within a transaction, a ref that has already been passed to commute cannot afterwards be modified with alter or ref-set, because a commuted ref will be recomputed by commute during the commit phase. If alter were allowed to read and modify the ref after commute, the STM could not determine during the commit phase whether the transaction should be rolled back. (Translator's note: I do not fully understand this paragraph and need to come back to it.)
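A short sketch of this restriction, using a hypothetical ref r:

```clojure
(def r (ref 0))

;; alter after commute on the same ref inside one transaction is rejected.
(def outcome
  (try
    (dosync
      (commute r inc)
      (alter r inc))
    (catch IllegalStateException _
      :rejected)))
```

The exception aborts the transaction, so r keeps its original value.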

Sometimes it is necessary to prevent other transactions from modifying a ref that the current transaction will read or modify; failing to do so can lead to what is called write skew. The ensure function addresses write skew. ensure guarantees that the ref cannot be modified by other transactions, but it does not guarantee that the current transaction can modify the ref, because other transactions may also have called ensure on it, which prevents the current transaction from modifying it.

Before delving into the Clojure STM, you need to understand validators and watchers. A validator is a function that is called whenever the value of a mutable reference is about to change; if it returns false or throws an exception, the change is rejected. Each reference has at most one validator, which is assigned with the set-validator! function.

There are two ways to be notified that a reference has changed: watch functions and watcher agents. A watch function must accept four arguments: a unique key, the reference, the old value, and the new value. The key can indicate the purpose of the watch function or carry any other data. For a given reference, each key corresponds to exactly one watch function. The add-watch function registers a watch function and takes three arguments: the reference, the key, and the watch function. The remove-watch function unregisters it and takes two arguments: the reference and the key. A watcher agent receives an action when the reference changes. The action is a function to which the current value of the agent and the reference are passed; note that the old value is not passed.
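Validators and watch functions can be tried out as follows. This sketch covers set-validator!, add-watch, and remove-watch only; watcher agents are left out. The ref age, the atom changes, and the key :log are hypothetical names.

```clojure
(def age (ref 30))

;; Validator: only non-negative integers are accepted.
(set-validator! age #(and (integer? %) (>= % 0)))

;; Watch function: records [key old new] for every successful change.
(def changes (atom []))
(add-watch age :log
           (fn [k reference old-val new-val]
             (swap! changes conj [k old-val new-val])))

(dosync (alter age inc))               ; accepted, the watch is notified

(def rejected?
  (try
    (dosync (ref-set age -1))          ; the validator returns false
    false
    (catch IllegalStateException _ true)))

(remove-watch age :log)
(dosync (alter age inc))               ; accepted, nobody is notified
```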

Next we look at the Clojure STM implementation at an abstract level. This applies to Clojure 1.0 only; later versions may be implemented differently, so what follows does not necessarily describe the Clojure STM in every version. Keep in mind that you do not need to understand the internal implementation of the STM in order to use it correctly. Understanding it is still useful, though.

In version 1.0, Clojure is implemented in a mix of Java and Clojure source code, and the STM itself is implemented entirely in Java. Work is quietly under way to move more of the implementation into Clojure code, and once that work is done what is discussed here may be out of date. Do not lose heart, however: the STM mechanisms are not expected to change much. I will update this article as the Clojure STM implementation changes, to help you understand the internals without having to read obscure source code.

The Clojure STM is based on MVCC (multi-version concurrency control) and snapshot isolation; that is, Clojure implements these two abstract concepts. The main difference between the standard definitions of these concepts and Clojure's implementation is that Clojure works on memory rather than on database tables. Their definitions follow, with the parts that belong to the standard database setting given in parentheses.

MVCC uses timestamps or increasing transaction IDs to achieve serializability. By maintaining multiple versions of an object (in a database), MVCC ensures that a transaction never has to wait for that object. Each version of the object carries a write timestamp and each transaction carries a transaction timestamp; when a transaction reads the object, it gets the most recent version whose write timestamp precedes the transaction timestamp. If transaction Ti wants to write an object and another transaction Tk also wants to write it, Ti's timestamp must precede Tk's for Ti's write to succeed. In other words, a write cannot complete while there is an outstanding transaction with an earlier timestamp. Each object also has a read timestamp. When transaction Ti wants to modify object P and Ti's timestamp precedes P's read timestamp, Ti is aborted and restarted; otherwise Ti creates a new version of P and sets both the write timestamp of that version and the read timestamp of the object to its own transaction timestamp. Note that the Clojure STM does not use read timestamps.

The obvious drawback of this approach is the cost of storing multiple versions of objects (in the database). The advantage is that reads are fast because they never block, which suits read-intensive workloads. MVCC is also well suited to implementing true snapshot isolation, which other concurrency control methods tend to implement either incompletely or at a high performance cost.

Under snapshot isolation, a transaction appears to operate on a private snapshot of the objects (the database) taken when the transaction starts. When the transaction reaches the commit phase, it commits successfully only if the values it updated have not been modified by another transaction since the snapshot was taken.
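The version-selection rule for reads can be sketched in a few lines. This is an illustration only, not Clojure's internal code: the names versions and read-version and the :point and :val keys are hypothetical, and the sketch treats a version written exactly at the read point as visible.

```clojure
;; Hypothetical version history of one object, oldest first.
;; :point plays the role of the write timestamp.
(def versions
  [{:point 1 :val :a}
   {:point 4 :val :b}
   {:point 9 :val :c}])

(defn read-version
  "Returns the value of the newest version written at or before read-point,
  or :fault when no such version exists."
  [versions read-point]
  (if-let [v (last (filter #(<= (:point %) read-point) versions))]
    (:val v)
    :fault))
```

A transaction whose timestamp is 5 reads :b even though :c has since been committed, and a transaction older than every stored version finds nothing to read.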

One drawback of snapshot isolation is that it permits write skew. Write skew occurs when concurrent transactions each read a common set of objects that are subject to a constraint, and each modifies a different object in that set based on the values of the others. For example, suppose a town strictly limits each family to at most three pets, and a pet can only be a cat or a dog. Li Lei has one dog and his wife Han Meimei has one cat. Li Lei adopts another dog and, at the same moment, Han Meimei adopts another cat, so the two transactions run concurrently. Remember that a transaction sees only changes that other transactions have already committed, never their uncommitted internal data. Li Lei's transaction changes the number of dogs and, from what it can see, does not violate the limit of three; the same holds for Han Meimei's transaction and the number of cats. Whether a transaction may commit is decided solely by whether another transaction has modified the objects that it modifies. Since Li Lei only modifies the number of dogs and Han Meimei only modifies the number of cats, neither touches the object the other modifies, so both transactions commit successfully regardless of the order in which they commit, and the family ends up with four pets. Clojure provides the ensure function to prevent write skew.
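The pet example might be written with ensure roughly as follows. The refs dogs and cats, the constant max-pets, and the function adopt! are hypothetical names for this sketch, which runs the two adoptions one after the other so that the outcome is deterministic.

```clojure
(def max-pets 3)
(def dogs (ref 1))   ; Li Lei's dogs
(def cats (ref 1))   ; Han Meimei's cats

(defn adopt!
  "Adds one pet to `mine` if the family total stays within max-pets.
  `theirs` is only read, so ensure is used to keep other transactions
  from changing it before this transaction commits."
  [mine theirs]
  (dosync
    (if (< (+ @mine (ensure theirs)) max-pets)
      (do (alter mine inc) :adopted)
      :refused)))

(def li-lei (adopt! dogs cats))       ; 1 dog + 1 cat, room for one more
(def han-meimei (adopt! cats dogs))   ; 2 dogs + 1 cat, limit reached
```

If the two calls ran in separate threads with a plain @theirs instead of (ensure theirs), both transactions could commit and leave the family with four pets, as described above.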

The Clojure STM uses both locks and a lock-free strategy. A lock is released soon after the transaction acquires it, rather than being held for the whole transaction. The lock-free strategy is used to mark whether a ref has been modified by a transaction. Writing concurrent code in Clojure is simpler than working with explicit locks: a transaction is created with the dosync macro, which is passed a group of expressions (the transaction body), and there is no need to declare which refs the transaction might modify. Developers still have to decide which code belongs inside dosync, because the refs read or modified within a transaction body are guaranteed a consistent state, whereas reads outside a transaction have no such guarantee, and refs cannot be modified outside a transaction body at all.
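A minimal sketch of a transaction body that touches two refs, using hypothetical account refs and helper names:

```clojure
(def checking (ref 100))
(def savings (ref 50))

(defn transfer!
  "Moves amount between two refs. Both changes commit together or not at all."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

(defn total
  "Reads both refs inside one transaction to get a consistent snapshot."
  []
  (dosync (+ @checking @savings)))

(transfer! checking savings 30)
```

No locks or lists of affected refs are declared anywhere; the refs used inside dosync are discovered as the body runs.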

The Java concurrency classes currently used by the Clojure STM are:
java.util.concurrent.atomic.AtomicInteger
java.util.concurrent.atomic.AtomicLong
java.util.concurrent.Callable
java.util.concurrent.CountDownLatch
java.util.concurrent.TimeUnit
java.util.concurrent.locks.ReentrantReadWriteLock
The Clojure classes used include:
clojure.lang.LockingTransaction
clojure.lang.Ref

The dosync macro wraps the transaction body. It first calls the sync macro, which in turn calls the static method runInTransaction of LockingTransaction, passing the transaction body as an anonymous function. Each thread has its own LockingTransaction object, stored in a ThreadLocal variable. The LockingTransaction object is the transaction: creating the object creates the transaction, and calling methods on the object calls methods within the transaction. The ThreadLocal ensures that each thread only ever accesses its own object. runInTransaction first checks whether the current thread already holds a LockingTransaction object. If not, it creates one and runs the anonymous function from sync in that transaction. If the current thread is already running a transaction and therefore already holds a LockingTransaction object, the function is run within that existing transaction.
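This call chain can be observed from the REPL. The sketch below assumes a Clojure version in which clojure.lang.LockingTransaction exposes the public static methods runInTransaction and isRunning; these are implementation classes, so calling them directly is for exploration only. The ref r is a hypothetical name.

```clojure
(import 'clojure.lang.LockingTransaction)

;; dosync expands into a call to the sync macro.
(def expansion (macroexpand-1 '(dosync (alter r inc))))

(def r (ref 0))

;; What sync ultimately does: hand the body to runInTransaction as a function.
(LockingTransaction/runInTransaction (fn [] (alter r inc)))

;; A nested dosync joins the transaction already running on this thread.
(def joined?
  (dosync
    (alter r inc)
    (dosync
      (alter r inc)
      (LockingTransaction/isRunning))))
```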

A transaction is in one of the following five states:
RUNNING: running
COMMITTING: committing
RETRY: about to retry
KILLED: killed
COMMITTED: committed successfully
In the RETRY state the transaction is going to retry but has not yet started doing so; once it starts retrying, the state becomes RUNNING again. Two things can cause a transaction to be killed. 1. The abort method is called on the transaction; this sets the state to KILLED and throws an AbortException, so the transaction stops and does not retry. No code in the current version of Clojure calls the abort method. 2. The barge method is called on the transaction; this sets the state to KILLED but allows the transaction to retry.

Each Ref object has a tvals field that holds a chain of historical committed values for the ref. The chain never gets shorter; it can only grow. Its length is governed by two other fields of the Ref object, minHistory and maxHistory, which default to 0 and 10. These two fields can differ from ref to ref and are changed with the ref-min-history and ref-max-history functions. Do not underestimate the importance of the length of tvals; see the discussion of faults below.

Each Ref object also has a ReentrantReadWriteLock. For a given ref, any number of concurrent transactions can hold the read lock, but only one transaction can hold the write lock. There is only one case in which a transaction holds a read lock across its lifetime: when ensure has been called on the ref. In that case the transaction holds the read lock until it modifies the ref or until it commits. A transaction never holds a write lock for its entire lifetime. It acquires the write lock at certain points and then releases it, acquires the write lock again when it commits, and releases it once the commit is complete. For more detail about locks, see the paragraph on the lock field in the implementation-level description of the Clojure STM in the original article.
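The history bounds can be inspected and changed per ref. A short sketch, with hypothetical ref names:

```clojure
;; A ref with the default history bounds.
(def default-ref (ref 0))

;; A ref created with explicit bounds.
(def tuned-ref (ref 0 :min-history 3 :max-history 6))

;; With one argument these functions read the bound; with two they set it.
(def default-min (ref-min-history default-ref))
(def default-max (ref-max-history default-ref))
(ref-max-history tuned-ref 20)
```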

When ref-set or alter modifies a ref, the ref gets an in-transaction value, which is invisible to other transactions and becomes visible externally only after a successful commit. The call also sets the tinfo field of the ref, which describes the transaction that is modifying the ref, including its ordering relative to other transactions and its current state. A transaction reads tinfo to find out whether the ref is being modified by another transaction. tinfo can be thought of as a ticket to the commit phase: each ref holds one ticket, and the ticket admits only one transaction to the commit phase. For more detail about tinfo, see the paragraph on the lock field in the implementation-level description of the Clojure STM in the original article.
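The invisibility of in-transaction values can be checked with a small experiment. The names r, seen-outside, and seen-inside are hypothetical, and the second thread is started from inside the transaction only to make the timing deterministic.

```clojure
(def r (ref 0))
(def seen-outside (atom nil))

(def seen-inside
  (dosync
    (alter r inc)                                            ; in-transaction value is now 1
    (doto (Thread. #(reset! seen-outside @r)) .start .join)  ; another thread reads the ref
    @r))                                                     ; this transaction sees its own write
```

The other thread still sees the last committed value, 0, while the transaction itself sees 1; after the commit everyone sees 1.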

Each LockingTransaction object has a vals field holding a map of in-transaction values. The keys of the map are Ref objects and the values are the refs' values, typed as java.lang.Object. If a transaction only ever reads a ref, each read fetches the value from the ref's tvals field, which is relatively inefficient when the ref is read many times. The first time the ref is modified within the transaction, the new value is stored in vals, and subsequent accesses within the transaction read it from vals.

A fault occurs when a transaction reads a ref that has no in-transaction value (the transaction's vals map has no entry for the ref) and the ref's tvals chain contains no value committed before the current transaction started. When a fault occurs, the transaction retries.
Suppose a ref has never experienced a fault and does not experience one later, and its minHistory is 3 and its maxHistory is 6. The length of tvals will grow to 3, stay at 3, and not grow any further.
Suppose a ref has never experienced a fault and does not experience one later, and its minHistory is 0. The length of tvals will not exceed 1.
Suppose a ref has experienced a fault and the length of its tvals is less than maxHistory. When a transaction then commits a modification to the ref, a node is added to tvals. The length of tvals may end up anywhere between minHistory and maxHistory.
Suppose a ref has experienced a fault and the length of its tvals is equal to maxHistory. When a transaction then commits a modification to the ref, a node is added to tvals and the oldest node is removed, so the length of tvals remains maxHistory.
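The first two cases can be observed with ref-history-count, which reports how many past values a ref currently retains. This sketch assumes a Clojure version that provides ref-history-count and assumes that the count excludes the current value; the ref names are hypothetical, and no faults occur because every transaction runs on a single thread.

```clojure
(def no-history (ref 0))                               ; default bounds: min 0, max 10
(def some-history (ref 0 :min-history 3 :max-history 6))

;; Commit five modifications to each ref, one transaction at a time.
(dotimes [_ 5]
  (dosync (alter no-history inc))
  (dosync (alter some-history inc)))

(def count-default (ref-history-count no-history))
(def count-min-3 (ref-history-count some-history))
```

Without faults, the ref with a minimum history of 3 retains three past values and stops growing there, while the default ref retains none.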

Barging determines whether another transaction should be made to retry so that the current transaction can continue running. When transaction A tries to barge transaction B, it succeeds only if all three of the following conditions hold: 1. A has been running for at least 10 milliseconds. 2. A started before B, that is, older transactions take priority over newer ones. 3. B is in the RUNNING state and can be successfully switched to the KILLED state; if B is already committing, for example, it cannot be barged.
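The three conditions can be written down as a predicate. This is a simplified model, not the actual implementation: transactions are represented as plain maps with hypothetical keys, and the atomic state change in condition 3 is reduced to a status check.

```clojure
(def barge-wait-ms 10)

(defn can-barge?
  "Simplified model of the barge rules for transaction a barging transaction b.
  a and b are maps with hypothetical keys :start-ms, :start-point, and :status."
  [a b now-ms]
  (and (>= (- now-ms (:start-ms a)) barge-wait-ms)   ; 1. a has run for at least 10 ms
       (< (:start-point a) (:start-point b))         ; 2. a is older than b
       (= :running (:status b))))                    ; 3. b is still running

(def older {:start-ms 0 :start-point 1 :status :running})
(def newer {:start-ms 2 :start-point 2 :status :running})
```

For example, (can-barge? older newer 15) is true, while the same call at time 5, or with the roles swapped, is false.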

Retrying means that the transaction discards its modifications to refs, goes back to the start of the transaction body, and executes it again. A retry occurs in four scenarios:
1. The transaction body modifies a ref with ref-set or alter, which requires acquiring the write lock of the ref, and one of the following holds:
a. Another transaction already holds the read lock or the write lock of the ref, so the current transaction cannot acquire the write lock.
b. Another transaction has committed a modification to the ref since the current transaction started.
c. Another transaction B has modified the ref but has not yet committed, and the current transaction's attempt to barge B fails.
2. Transaction A tries to read the value of a ref, but:
a. Another transaction B has barged A, so A is no longer in the RUNNING state.
b. The ref has no in-transaction value and no historical value committed before the transaction started, that is, a fault occurs.
3. The transaction body calls ref-set, alter, commute, or ensure on a ref after another transaction has successfully barged the current transaction, so the current transaction is no longer in the RUNNING state.
4. The current transaction is committing, another transaction has made an uncommitted in-transaction modification, and the current transaction's attempt to barge that transaction fails.
Transactions do not retry indefinitely. The LockingTransaction class has a RETRY_LIMIT constant, set to 10,000 in the current version of Clojure; if the number of retries exceeds this limit, an exception is thrown.

A retry is triggered by throwing RetryEx, a subclass of java.lang.Error that is defined in the LockingTransaction class. It is deliberately not a subclass of Exception, so that it cannot be intercepted by exception-handling blocks in user code that catch Exception. The retry code contains a try block that catches RetryEx and simply goes back to the start of the transaction. That try block catches nothing else, so if any other exception is thrown the transaction is aborted.
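A retry can be provoked deterministically and counted. In this sketch the names r, attempts, and result are hypothetical; the side effects inside the transaction body (the atom and the extra thread) exist only to force and observe the retry and are not something to do in real code.

```clojure
(def r (ref 0))
(def attempts (atom 0))

(def result
  (dosync
    (try
      ;; On the first attempt only, let another thread commit a change to r
      ;; after this transaction has started.
      (when (= 1 (swap! attempts inc))
        (doto (Thread. #(dosync (alter r + 10))) .start .join))
      ;; First attempt: r was changed since we started, so the STM retries.
      ;; Second attempt: succeeds against the new value 10.
      (alter r inc)
      ;; The retry signal is an Error, so this handler does not swallow it.
      (catch Exception _ :swallowed))))
```

The body runs twice, the catch clause never sees the retry signal, and the committed value is 11.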

In the Clojure STM implementation, IllegalStateException is thrown in a number of places. If it is thrown within a transaction, the transaction is not retried. The following conditions cause this exception:
1. The current thread tries to obtain its LockingTransaction but none exists, for example when ref-set, alter, commute, or ensure is called outside a transaction.
2. An attempt is made to read a ref that has no value, for example because the ref was never initialized.
3. ref-set or alter is used on a ref that has already been modified with commute within the same transaction.
4. The validator function of the ref returns false or throws an exception.
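The first condition is the one most often hit in practice. A minimal sketch, with a hypothetical ref r:

```clojure
(def r (ref 0))

;; Modifying a ref without a surrounding dosync is rejected.
(def outside-result
  (try
    (alter r inc)
    :ok
    (catch IllegalStateException _
      :no-transaction)))
```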
Deadlock, livelock, and race conditions do not occur in the Clojure STM.
