A bloody case caused by a lock-free message queue: how to be a real programmer? (2) -- Moon: spin locks
Series index
A bloody case caused by a lock-free message queue: how to be a real programmer? (1) -- Earth: the cause
A bloody case caused by a lock-free message queue: how to be a real programmer? (2) -- Moon: spin locks
Parallel Time and Space
After copying the title line above, I stopped. I knew roughly what I wanted to write, but my thoughts were still a mess.
Then it occurred to me: since everything is so tangled and chaotic, why not change the perspective? I remember one of the few novels I read in high school, "Muslim Funeral" by Huo Da (a female author). It tells two love tragedies that happen in different eras and have different content, yet are intertwined. One is the story of "Jade" and the other is the story of "Moon". Structurally the chapters alternate: one chapter for "Jade", the next for "Moon", each following the fate of one generation. The technique is novel and has a special flavor. In film terms, it is a parallel narrative. It fits well here, so that is the decision!
The reason I thought of this is that yesterday afternoon, before writing the second half of the first article, I searched for the keyword "spin lock" out of a sense of responsibility toward readers. The term is not new to me, but I had never carefully researched it with a search engine. The results contained little useful material (not useless, just not very helpful), but one article was good, and it provides a good approach and theoretical basis for explaining spin locks.
That article is "Practice of using spin locks to replace mutex locks". It is a translation; the original is titled "Practice of using spinlock instead of mutex". As an aside, when I studied Disruptor yesterday, several of the important articles I found were also on the Concurrent Programming Network (ifeve.com). When Disruptor is updated, the articles there are updated too, although they cannot keep up with Disruptor's pace :(
Let me define the two worlds of this parallel story: "Earth" and "Moon". Earth is noisy and full of competition. This line mainly tells the story and covers lock-free programming, and I occasionally rant at people there. Moon stands for peace and reason. There is no trace of people, and it is far from worldly disputes. This line mainly describes spin locks (hybrid spin locks, also called mixed spin locks). The moon is also a symbol of beauty (I originally wanted to choose Mars, but Mars has a bad reputation, and "Martian" is not a flattering word, you know).
Okay, let's get started. Bo luo bo luo mi!
Traveling through time and space
Hello! Welcome to the moon.
Let's talk about spin locks. Why spin locks? Weren't we talking about lock-free message queues? Right, but after analyzing q3.h, we found that in operation it is basically equivalent to two spin locks.
As for why it is equivalent to two spin locks, we will come back to that later. Let's first look at mutex locks and spin locks.
Mutex lock and spin lock
Mutex locks and spin locks are important concepts in multi-threaded programming. We use them to lock shared resources and prevent the data inconsistency that may occur when shared data is accessed concurrently. But how do they differ? When should we replace a mutex lock with a spin lock?
Theoretical Analysis
Suppose a mutex is already held (locked) by one thread. When another thread tries to lock it, the attempt fails and that thread goes to sleep, which immediately lets other threads run. It stays asleep until the thread holding the lock releases it, and only then is it woken up. Now consider the spin lock. If a thread fails to acquire a spin lock (because another thread already holds it), it keeps retrying until it succeeds, spinning in user mode. While it spins, it does not let any other thread run on its CPU core, because a core can only run one thread at a time. (Of course, the operating system can still forcibly switch to another thread after an interrupt or when the thread's time slice is used up.)
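To make the "keep retrying" behavior concrete, here is a minimal sketch of a pure spin lock in standard C++. The names SpinLock and count_with_spinlock are hypothetical and chosen only for illustration; this is not the code from q3.h or RingQueue.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical minimal spin lock, for illustration only.
class SpinLock {
public:
    void lock() {
        // test_and_set() returns the previous value. "true" means another
        // thread already holds the lock, so we retry (spin) without sleeping.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait: this thread keeps occupying its CPU core
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical demo: several threads increment one counter under the lock.
long count_with_spinlock(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // very short critical section
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

The critical section here is a single increment, which is the kind of very short hold time where spinning can be cheaper than sleeping and waking up.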
Problems
The problem with mutex locks is that putting a thread to sleep and waking it up are both expensive operations. They need a large number of CPU instructions, so they take time. If the mutex is only locked for a very short period, the time spent putting the thread to sleep and waking it up may far exceed the time it actually slept, and may even exceed the time it would have wasted by continuously polling a spin lock. The problem with spin locks is that if the lock is held for a long time, the other threads trying to acquire it keep polling its state and waste CPU time, and those polling cycles accomplish nothing. In that case, putting the thread to sleep is the better choice.
Solution
Using a spin lock on a single-core/single-CPU system makes no sense. While the spinning thread occupies the only available core, no other thread can run, and if no other thread runs, the lock cannot be unlocked. In other words, on single-core/single-CPU systems a spin lock only wastes time and brings no benefit. If the thread (call it thread A) is put to sleep instead, other threads can run and may release the lock, and once thread A is woken up again it can acquire the lock as expected.
On multi-core/multi-CPU systems, especially when many threads each hold the lock only for a short time, using a mutex wastes a lot of time on putting threads to sleep and waking them up, which may noticeably reduce runtime performance. With a spin lock, threads can make full use of the time slice given to them by the scheduler (they usually block only briefly, never sleep, and then immediately continue their work), which yields higher processing capacity and throughput.
Practice
Programmers often cannot know in advance which solution is better. For example, they may not know the number of CPU cores in the target environment, and they cannot predict how long the locked region will be held. Nor can the operating system know whether a given piece of code has been optimized for single-core or multi-core CPUs. Therefore, most operating systems do not strictly distinguish between mutex locks and spin locks. In fact, most modern operating systems use hybrid mutexes and hybrid spinlocks. What does that mean?
Hybrid mutex lock
On a multi-core system, a hybrid mutex initially behaves like a spin lock. If a thread cannot acquire the mutex, it is not put to sleep immediately, because the mutex may be unlocked very soon. Instead it spins just as a spin lock would. Only if the lock has still not been acquired after a certain amount of time (or a certain number of retries, or some other measure) is the thread really put to sleep. On a single-core/single-CPU system this mechanism does not spin, and should not spin, because, as explained above, spinning there brings no benefit at all.
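The spin-first, sleep-later idea can be sketched in standard C++ as follows. This is only an illustration under stated assumptions: HybridMutex, its spin_limit parameter, the default of 1000, and count_with_hybrid_mutex are all hypothetical names and values, and real operating system mutexes implement this internally in their own way.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical hybrid mutex: retry a bounded number of times, then block.
class HybridMutex {
public:
    // spin_limit is an assumed tuning knob, not a recommended value.
    // On a single core (or when the core count is unknown and reported as 0)
    // spinning is disabled, since it could not help there.
    explicit HybridMutex(unsigned spin_limit = 1000)
        : spin_limit_(std::thread::hardware_concurrency() > 1 ? spin_limit : 0) {}

    void lock() {
        for (unsigned i = 0; i < spin_limit_; ++i) {
            if (mutex_.try_lock()) return;  // acquired while spinning
        }
        mutex_.lock();  // spinning did not help: block until unlocked
    }
    void unlock() { mutex_.unlock(); }

private:
    std::mutex mutex_;
    unsigned spin_limit_;
};

// Hypothetical demo: several threads increment one counter under the lock.
long count_with_hybrid_mutex(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    HybridMutex mutex;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<HybridMutex> guard(mutex);
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Because HybridMutex provides lock() and unlock(), it works with std::lock_guard like any other lockable type.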
The best example is the critical section in Windows. There is an API called InitializeCriticalSectionAndSpinCount(mutex, dwSpinCount). For dwSpinCount, the value Windows recommends by default is 4000 (I believe many people familiar with Windows development know this), meaning the lock spins up to 4000 times. We do not know exactly what instructions each spin consists of. What is certain is that the thread only really goes to sleep if, after its first attempt fails, it still cannot acquire the lock after spinning dwSpinCount times.
Hybrid spin lock
A hybrid spin lock initially behaves like a normal spin lock, but it adds a back-off strategy to avoid wasting a lot of CPU time on useless work. This strategy either never puts the thread to sleep or does so as late as possible. First the thread spins for a while. How long it spins, and in what form, is one of the core design questions of this kind of spin lock. The other core question is when to give up the current thread's execution (immediately, or after waiting a while, with the wait determined by the spin policy) and hand the time slice to other threads, which raises the chance that another thread unlocks the spin lock. (Of course, a thread switch may cause a context switch or even migration to another CPU core, which invalidates caches, so this may not be cheaper than putting the thread to sleep and waking it up. It is a hard trade-off, but the benefit is that other threads get to run. Going to sleep also gives up the time slice, so in essence you run into the same problem.)
The RingQueue I wrote uses this kind of hybrid spin lock. The crux of a hybrid spin lock is its sleep policy: different policies lead to very different CPU usage and execution efficiency. For example, in Intel's well-known multi-threaded parallel library TBB (Intel Threading Building Blocks), spin_mutex (that is, its spin lock) uses spin_count *= 2, doubling the number of spins each round. Its sleep policy is SwitchToThread() on Windows (which has a big defect that will be discussed later) and sched_yield() on Linux (much more reasonable there, though still not good enough). In the details, the Windows and Linux approaches each have advantages and disadvantages, and both have defects, which we will discuss later. One advantage on the Linux side is that it offers only this one interface, whereas Windows offers several that differ in fine details and are each incomplete, which makes fatal mistakes easy.
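The doubling spin count followed by a yield, as described above, can be sketched like this. It is a simplified illustration and not TBB's actual code: BackoffSpinLock, kMaxSpin (with its value 16), and count_with_backoff_lock are hypothetical, and the portable std::this_thread::yield() stands in for the platform-specific SwitchToThread() and sched_yield() calls.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical hybrid spin lock: spin with a doubling budget, then yield.
class BackoffSpinLock {
public:
    void lock() {
        unsigned spin_count = 1;
        // exchange() returns the previous value; "true" means already locked.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            if (spin_count <= kMaxSpin) {
                // Spin phase: poll the flag read-only for spin_count rounds.
                for (unsigned i = 0; i < spin_count; ++i) {
                    if (!locked_.load(std::memory_order_relaxed)) break;
                }
                spin_count *= 2;  // double the spin budget each round
            } else {
                // Back-off phase: give the time slice to another thread.
                std::this_thread::yield();
            }
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    static constexpr unsigned kMaxSpin = 16;  // assumed threshold
    std::atomic<bool> locked_{false};
};

// Hypothetical demo: several threads increment one counter under the lock.
long count_with_backoff_lock(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    BackoffSpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

The threshold and the choice of yielding rather than sleeping are exactly the policy decisions that make different hybrid spin locks behave so differently.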
Another example is pthread_spin_lock() in pthread. However, this is an uncompromising spin lock with no sleep policy at all. Then there are the several spin locks provided in boost. The one under boost/smart_ptr has a fairly sound overall policy and a normal spin policy, but on Windows it uses Sleep(0) and Sleep(1) instead of SwitchToThread(). Another one, which seems to be under boost/log, uses only SwitchToThread() as its sleep policy, without Sleep(0) or Sleep(1). And of course there are SpinLock.h and SmallLocks.h in Facebook's folly. Their spin policy is fine, but the sleep policy uses something similar to Sleep(1).
The performance of a hybrid spin lock is determined by how long the locked resource is held and by the specific sleep policy of the lock. Performance may be hard to define precisely, but we can still use some fairly firm metrics, such as the overall running time and the total CPU usage. Although these two metrics pull against each other, a lock that achieves both a short running time and low CPU usage is definitely the better spin lock. As the saying goes, we want the cow to eat less grass and still give more milk. At a minimum, its efficiency should not be lower than that of a mutex lock, which is also a reference point for judging a hybrid spin lock.
Sleep
That's all I will write for today; it is time for me to sleep. I actually spent all of last night studying Disruptor, which delayed this article. Having read this far, you will surely wonder why I bother with spin locks or mutex locks at all. Wouldn't it be better to just use Disruptor? From what I have understood and studied, not really. Disruptor is good but limited. It does not get around the real problems of multi-threaded programming. It simplifies them by using a single thread wherever possible, so as to avoid contention for caches, the bus, and other resources as far as it can. That is why it encourages the single-producer mode, which effectively avoids contention among multiple threads. I tried Disruptor in multi-producer, multi-consumer mode, and the results were not much different from an ordinary spin lock; with more threads it may even be slower. So the problem Disruptor solves and our problem are basically two different propositions on different dimensions, even though they look similar. Each can still learn from the other. Disruptor also defines many sleep policies, but their actual effect is not ideal. I may devote some space to Disruptor in a later article.
RingQueue
RingQueue is hosted on GitHub. I dare say it is a good hybrid spin lock. You can download it and take a look. It builds with Makefile, CodeBlocks, Visual Studio 2008, 2010, and 2013, and CMake, and it supports Windows, MinGW, Cygwin, Linux, Mac OS X, and so on. ARM may not be supported, since I have no test environment for it.
(To be continued ...... Coming soon ......)