Lock, LockFree, MemoryBarrier, ConcurrentCollection


I have recently been reading some parallel programming books, and I have briefly recorded several concepts that anyone doing multithreaded and parallel programming must learn, so that I can understand them further.

.NET Framework 4 introduces a new namespace, System.Collections.Concurrent, to solve the thread-safety problems of the common collections under concurrency (ps: through this namespace you can also access the custom partitioners for parallel loops and PLINQ). All the thread-safe collections in this namespace use lock-free techniques to some extent. That is, these collections avoid the typical heavyweight mutex lock by using Compare-And-Swap (CAS) instructions and memory barriers, although in business systems with low performance requirements, plain locking is often the most economical development choice.

 

I. Lock

Locks are indispensable in any discussion of multithreading and concurrency. A lock is the simplest, but also one of the most expensive, ways to implement synchronization; even so, in some scenarios locking is a very cost-effective solution.

Locks are easy to use and the resulting code is easy to maintain, but it must be clear that they cannot be abused; otherwise serious performance problems may follow. Locking adds the overhead of switching between kernel mode and user mode, as well as thread scheduling overhead. Acquiring and releasing locks leads to context switches and scheduling delays: a thread waiting for a lock is suspended until the lock is released, and a kernel-mode lock requires the operating system to perform a context switch. During a context switch, the instructions and data previously cached by the CPU are invalidated, which has the greatest impact on performance.

1. Three basic principles for using locks

A. Do not use a lock unless you have to

B. Use small-granularity locks; common choices include mutexes, ReaderWriterLockSlim, and ReaderWriterLock

C. Hold locks for as short a time as possible
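Principles B and C can be sketched together. The following is a minimal, illustrative example (the ScoreCache class and its members are hypothetical names, not from the original text): a small cache guarded by ReaderWriterLockSlim, where many readers proceed concurrently, writers lock exclusively, and each lock is held only for the single dictionary operation.

```csharp
using System.Collections.Generic;
using System.Threading;

// A small cache guarded by ReaderWriterLockSlim: many readers can run
// concurrently, writers take the lock exclusively, and every lock is
// held only for the duration of the dictionary operation itself.
class ScoreCache
{
    private readonly ReaderWriterLockSlim _rw = new ReaderWriterLockSlim();
    private readonly Dictionary<string, int> _scores = new Dictionary<string, int>();

    public bool TryGet(string key, out int value)
    {
        _rw.EnterReadLock();
        try { return _scores.TryGetValue(key, out value); }
        finally { _rw.ExitReadLock(); }   // release as soon as possible
    }

    public void Set(string key, int value)
    {
        _rw.EnterWriteLock();
        try { _scores[key] = value; }
        finally { _rw.ExitWriteLock(); }
    }
}
```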

 

2. synchronization object

To encapsulate lock logic, a synchronization object is usually required. For example, in the common simple-and-crude synchronization code lock (something), the something is the synchronization object.

The synchronization object must be a reference type (a string is usually not suitable as a synchronization object; think about why), and it is usually private, typically an instance or static field.

To control the scope and granularity of the lock precisely, we usually create a dedicated field, such as locker or syncObj.

Avoid lock (this), lock (typeof (SomeType)), and lock (someString). These forms cannot encapsulate the lock logic and make deadlocks and excessive blocking hard to avoid; in the typeof and string cases, the lock can even leak across AppDomain boundaries within a single process.
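The rules above can be sketched in a few lines (the Counter class is a hypothetical example, not from the original text): a dedicated, private reference-type field keeps the lock logic encapsulated inside the class, unlike lock(this) or lock(typeof(Counter)), which outside code could also lock on.

```csharp
using System.Threading;

// A dedicated, private synchronization object: the lock logic stays
// inside the class, so no outside code can take the same lock and
// cause a deadlock or excessive blocking.
class Counter
{
    private readonly object _locker = new object(); // reference type, private
    private int _count;

    public void Increment()
    {
        lock (_locker)   // keep the critical section as short as possible
        {
            _count++;
        }
    }

    public int Count
    {
        get { lock (_locker) { return _count; } }
    }
}
```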

 

3. How to reduce locking?

If your design involves a lot of shared, reusable data, then in a multithreaded, highly concurrent environment it is inevitable to use locks or lock-free synchronization algorithms.

As a rule of thumb, when we need to access shared writable fields, we can usually synchronize them with a lock.

To reduce the lock, we need to reduce the use of shared data.

 

II. CAS

1. Basic Principles

CAS is simply compare-and-swap. The general logic is: if the value at memory location V equals the expected value A, assign the new value B to V.

A CAS operation involves three operands: a memory location (V), an expected original value (A), and a new value (B).

If the value at memory location V matches the expected value A, the processor atomically updates it to the new value B; otherwise the processor does nothing.

 

2. Pseudo code of the internal implementation

    bool CAS(T* ptr, T expected, T fresh)
    {
        if (*ptr != expected)
            return false;
        *ptr = fresh;
        return true;
    }
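In .NET, this primitive is exposed as Interlocked.CompareExchange. A minimal sketch (the LockFreeCounter class is a hypothetical name for illustration): a lock-free increment that, when the CAS fails under contention, re-reads the current value and retries in a spin loop, much as the concurrent collections do internally.

```csharp
using System.Threading;

// Lock-free increment built on Interlocked.CompareExchange, the CAS
// primitive in .NET. On contention the CAS fails, and the loop re-reads
// the current value and retries (spin).
static class LockFreeCounter
{
    private static int _value;

    public static int Increment()
    {
        int snapshot, updated;
        do
        {
            snapshot = _value;        // read the expected value (A)
            updated  = snapshot + 1;  // compute the new value (B)
        }
        // CompareExchange returns the original value; a mismatch means
        // another thread changed _value in between, so try again.
        while (Interlocked.CompareExchange(ref _value, updated, snapshot) != snapshot);
        return updated;
    }
}
```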

 

3. Advantages

CAS is a CPU-level operation; it appears as a single atomic instruction, which avoids asking the operating system to arbitrate a lock, so it is generally very fast. CAS operations are based on the assumption that the shared data will not be modified concurrently; when synchronization conflicts are rare, this assumption can greatly improve performance. The main advantages are as follows:

A. It avoids the heavy performance overhead of locking, reducing kernel/user-mode switching and thread scheduling overhead;

B. It enables finer-grained parallel control and improves system throughput; in some cases the performance of critical services can be doubled.

 

4. Disadvantages

Although CAS has obvious advantages, there is no free lunch, and lock-free code implemented with CAS has many problems of its own, for example:
A. It depends on the memory read/write model of the hardware architecture, so there are portability problems.
B. The implementation is complex, and its correctness is hard to prove:
(A) It is limited by the available CPU instructions
(B) Even a simple data structure must be implemented with a complicated algorithm
(C) The ABA problem
C. The code is difficult to maintain
D. Livelock

So-called livelock means that thread 1 could use the resource but politely lets other threads go first, and thread 2 could also use the resource but likewise lets others go first; as a result both keep deferring to each other and neither ever uses the resource.

 

Iii. MemoryBarrier

Why is MemoryBarrier (a memory barrier) needed? The MSDN explanation is:

MemoryBarrier is required only on multiprocessor systems with weak memory ordering (for example, a system employing multiple Intel Itanium processors).

The processor executing the current thread cannot reorder instructions in such a way that memory accesses prior to the call to MemoryBarrier execute after memory accesses that follow the call to MemoryBarrier.

Simply put, the processor may optimize the order of CPU instructions, and because of compiler optimizations or hardware structures such as caches and store buffers, a compiled program may not execute in the order in which it was written, leading to unexpected problems.

A memory barrier is the solution: it ensures that, at the lowest level, the statements around it execute in order. A memory access in code after a call to Thread.MemoryBarrier() cannot complete before the call; in other words, it limits instruction reordering and the caching of memory reads and writes.
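A minimal sketch of why this matters (the Publisher class and field names are hypothetical, chosen for illustration): the classic publish pattern, where without barriers the processor or JIT may reorder the two writes, letting a reader observe the flag before the data.

```csharp
using System.Threading;

// The classic publish pattern: without barriers, the writes to _data and
// _ready could be reordered, so a reader might see _ready == true while
// _data is still stale. Thread.MemoryBarrier() forbids moving memory
// accesses across the call in either direction.
class Publisher
{
    private int _data;
    private bool _ready;

    public void Publish()
    {
        _data = 42;
        Thread.MemoryBarrier();      // _data must be written before _ready
        _ready = true;
    }

    public int? TryRead()
    {
        if (_ready)
        {
            Thread.MemoryBarrier();  // _ready must be read before _data
            return _data;
        }
        return null;                 // nothing published yet
    }
}
```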

Refer:

http://stackoverflow.com/questions/3556351/why-we-need-thread-memorybarrier

There is also the Barrier class, which allows multiple tasks to synchronize their concurrent work across phases.

 

IV. Parallel Collections

The main thread-safe parallel collections in the System.Collections.Concurrent namespace are as follows:

1. ConcurrentQueue<T>

ConcurrentQueue is the concurrent version of System.Collections.Queue. It is a FIFO (First In, First Out) collection.

ConcurrentQueue is completely lock-free, but when a CAS operation fails under resource contention, it may spin and retry the operation.
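A minimal usage sketch: several tasks enqueue concurrently, and TryDequeue never blocks and never throws on an empty queue; it simply reports failure, which is the idiomatic pattern across these lock-free collections.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Many tasks enqueue concurrently without any explicit lock.
var queue = new ConcurrentQueue<int>();
Parallel.For(0, 1000, i => queue.Enqueue(i));

// TryDequeue reports success/failure instead of throwing when empty.
int total = 0;
while (queue.TryDequeue(out int item))
    total += item;
// total == 0 + 1 + ... + 999 == 499500
```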

 

2. ConcurrentStack<T>

ConcurrentStack is the concurrent version of System.Collections.Stack. It is a LIFO (Last In, First Out) collection.

ConcurrentStack is completely lock-free, but when a CAS operation fails under resource contention, it may spin and retry the operation.

 

3. ConcurrentBag<T>

ConcurrentBag provides an unordered collection of objects and allows duplicates. It is useful when ordering does not matter.

ConcurrentBag uses several different mechanisms to minimize the need for synchronization and its overhead.

ConcurrentBag maintains a local queue for each thread that accesses the collection and, when possible, accesses that local queue in a lock-free manner.

ConcurrentBag is highly efficient when the same thread both adds and removes (consumes) elements. However, ConcurrentBag sometimes uses locks, so it is quite inefficient when the producer and consumer threads are completely separate.
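A small sketch of the mixed add/take pattern where ConcurrentBag shines (the counting with Interlocked is an illustrative addition, not from the original text): each thread adds to and usually takes from its own local list without locking.

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Each thread adds one item and immediately tries to take one back;
// the item usually comes from the thread's own local list, lock-free.
var bag = new ConcurrentBag<int>();
int taken = 0;
Parallel.For(0, 4, _ =>
{
    bag.Add(1);                              // goes into this thread's local list
    if (bag.TryTake(out _))                  // usually satisfied locally
        Interlocked.Increment(ref taken);
});
// Invariant: items added minus items taken equals what is left in the bag.
```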

 

4. ConcurrentDictionary<TKey, TValue>

ConcurrentDictionary is similar to a classic key-value dictionary, providing concurrent key-value access. It is a concurrent version of System.Collections.IDictionary.

ConcurrentDictionary is completely lock-free for read operations; it is optimized for scenarios with frequent reads.

When many tasks or threads add or modify data in the dictionary, ConcurrentDictionary uses fine-grained locking.
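A minimal usage sketch: AddOrUpdate makes a read-modify-write on one key atomic without any explicit lock in the caller, while plain reads stay lock-free.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

// 100 tasks increment the same counter key; AddOrUpdate applies the
// update atomically (internally retrying with CAS on conflict), so no
// increments are lost.
var hits = new ConcurrentDictionary<string, int>();
Parallel.For(0, 100, _ =>
    hits.AddOrUpdate("home", 1, (key, current) => current + 1));
// hits["home"] == 100
```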

 

5. BlockingCollection<T>

BlockingCollection is similar in structure to a classic blocking queue. It wraps an IProducerConsumerCollection<T> instance and adds blocking and bounding capabilities.

BlockingCollection can be used when multiple producer and consumer tasks add and remove items concurrently.
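A minimal producer/consumer sketch: GetConsumingEnumerable blocks until items arrive and ends cleanly when CompleteAdding is called, which removes the hand-rolled wait/signal code a classic blocking queue needs.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

// One producer, one consumer, with a bound of 10 items.
using var jobs = new BlockingCollection<int>(boundedCapacity: 10);

var producer = Task.Run(() =>
{
    for (int i = 0; i < 5; i++)
        jobs.Add(i);            // blocks if the bound (10) is reached
    jobs.CompleteAdding();      // signals "no more items"
});

int sum = 0;
foreach (int job in jobs.GetConsumingEnumerable())
    sum += job;                 // blocks while the collection is empty

producer.Wait();
// sum == 0 + 1 + 2 + 3 + 4 == 10
```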

It is worth mentioning the IProducerConsumerCollection<T> interface, which inherits from IEnumerable<T>, ICollection, and IEnumerable. The abstraction captured by IProducerConsumerCollection<T> and these base interfaces deserves applause; Microsoft's collection design has proven very far-sighted and adaptable.

PS: I also summarized parallel collections and thread safety a while ago; you can refer to that earlier article analyzing the implementation of thread-safe containers.

 

Refer:

"C# Advanced Tutorial on Parallel Programming"

http://msdn.microsoft.com/zh-cn/library/system.collections.concurrent(v=vs.110).aspx

http://msdn.microsoft.com/zh-cn/library/dd267312(v=vs.110).aspx

http://blog.hesey.net/2011/09/resolve-aba-by-atomicstampedreference.html
