Driver Research diary-kernel Synchronization

Last Update:2018-12-06 Source: Internet

Author: User

Tags apc


Spin lock
------------------------------------------------------
A spin lock is a lock designed to prevent concurrent access on multiprocessor systems. In the kernel, it is widely used in interrupt handling and elsewhere. (On a single processor, you can prevent concurrency in interrupt handling simply by disabling interrupts, without needing a spin lock.)

If the protected shared resource is accessed only in process context, a semaphore is a good fit. If access to the shared resource takes very little time, a spin lock can also be used. However, if the protected shared resource must be accessed from interrupt context (including the top half, that is, the interrupt handler, and the bottom half, that is, the soft interrupt), you must use a spin lock.
A spin lock can be held by at most one kernel task at a time. If a kernel task requests a spin lock that is already held, the task busy-loops (spins) until the lock becomes available again. If the lock is not contended, the requesting task acquires it immediately and continues. A spin lock therefore prevents more than one kernel task from entering the critical section at the same time, which avoids races on shared resources among kernel tasks running concurrently on multiple processors.
The spin lock is intended for lightweight locking over short periods. A contended spin lock makes the requesting thread spin while it waits, which wastes processor time, so a spin lock should not be held for long. If you need to hold a lock for a long time, use a semaphore instead.
The basic form of the spin lock is as follows:
spin_lock(&mr_lock);
/* critical section */
spin_unlock(&mr_lock);
Because a spin lock can be held by at most one kernel task at a time, only one thread can be in the critical section at a time. This provides the locking that symmetric multiprocessing machines require. On a single processor, the spin lock serves only as a switch that disables and enables kernel preemption. If kernel preemption is not configured either, the spin lock is compiled out of the kernel entirely.
To put it simply, spin locks are used in the kernel mainly to prevent concurrent access to a critical section on multiple processors and to prevent races caused by kernel preemption. In addition, a task that holds a spin lock must not sleep. Sleeping while holding a spin lock can cause a self-inflicted deadlock, because sleeping may cause the task to be rescheduled and a lock that is already held to be requested again. Because the holder never sleeps, a spin lock can be used in interrupt context.
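The busy-wait behavior described above can be sketched in user-mode C11. This is only an illustration built on `atomic_flag`, not the kernel's implementation; the `demo_` names are hypothetical, and a real kernel spin lock also deals with preemption and interrupts.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-mode spin lock; the flag is set while a thread owns it. */
typedef struct {
    atomic_flag held;
} demo_spinlock_t;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Busy-wait until the flag was previously clear, which means we now own the lock. */
static void demo_spin_lock(demo_spinlock_t *lock)
{
    while (atomic_flag_test_and_set_explicit(&lock->held, memory_order_acquire)) {
        /* spin: another thread holds the lock */
    }
}

/* Single attempt: returns true if the lock was acquired, false if it was already held. */
static bool demo_spin_trylock(demo_spinlock_t *lock)
{
    return !atomic_flag_test_and_set_explicit(&lock->held, memory_order_acquire);
}

/* Release the lock so that one spinning thread can proceed. */
static void demo_spin_unlock(demo_spinlock_t *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

The critical section goes between `demo_spin_lock` and `demo_spin_unlock`, and it should stay short because every waiter burns CPU time while it spins.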

Fast mutex object

Compared with a kernel mutex, a fast mutex has strengths and weaknesses. On the positive side, a fast mutex can be acquired and released quickly when there is no actual contention. On the negative side, you cannot acquire a fast mutex recursively. In addition, while you own a fast mutex, no APCs can be delivered to your thread, which means that you run at APC_LEVEL or a higher IRQL. At that level thread priority has no effect, but your code executes without interference except from hardware interrupts.

Table. Comparison between kernel mutexes and fast mutexes

Kernel mutex	Fast mutex
Can be acquired recursively by a single thread (the system maintains a claim counter for it)	Cannot be acquired recursively
Relatively slow	Relatively fast
The owner receives only "special" kernel APCs	The owner receives no APCs
The owner thread cannot be swapped out of memory	The priority of a blocked thread is not boosted automatically (if it runs at or above APC_LEVEL), unless you use the XxxUnsafe functions and run at PASSIVE_LEVEL
Can be part of a multiple-object wait	Cannot be passed as an argument to KeWaitForMultipleObjects

Table. Fast mutex service functions

Service Function	Description
ExAcquireFastMutex	Acquires the fast mutex, waiting if necessary
ExAcquireFastMutexUnsafe	Acquires the fast mutex, waiting if necessary. The caller must already have disabled APC delivery
ExInitializeFastMutex	Initializes a fast mutex object
ExReleaseFastMutex	Releases the fast mutex
ExReleaseFastMutexUnsafe	Releases the fast mutex without re-enabling APC delivery
ExTryToAcquireFastMutex	Acquires the fast mutex if that can be done immediately, without waiting

ExAcquireFastMutex waits for the mutex to become available, assigns ownership to the calling thread, and then raises the processor's current IRQL to APC_LEVEL. Raising the IRQL blocks delivery of all APCs. ExAcquireFastMutexUnsafe does not change the IRQL. Before using this "unsafe" function to acquire a fast mutex, consider the potential for deadlock: you must prevent any APC routine running in the same thread context from acquiring the same mutex or any other object that cannot be locked recursively. Otherwise, you risk deadlocking that thread at any time.

If you do not want to wait when the mutex is not immediately available, use the "try to acquire" function:

ASSERT(KeGetCurrentIrql() < DISPATCH_LEVEL);
BOOLEAN acquired = ExTryToAcquireFastMutex(FastMutex);

If the return value is TRUE, you now own the mutex. If it is FALSE, another thread already owns the mutex and you did not acquire it.

To release a fast mutex and allow other threads to acquire it, call the release function that matches the acquire function you used:

ASSERT(KeGetCurrentIrql() < DISPATCH_LEVEL);
ExReleaseFastMutex(FastMutex);

ASSERT(KeGetCurrentIrql() < DISPATCH_LEVEL);
ExReleaseFastMutexUnsafe(FastMutex);

A fast mutex is fast because acquiring and releasing it are optimized for the uncontended case. The key step in acquiring the mutex is to atomically decrement and test an integer counter that indicates how many threads own or are waiting for the mutex. If the test shows that no other thread owns the mutex, no additional work is required. If the test shows that another thread owns the mutex, the current thread blocks on a synchronization event that is part of the FAST_MUTEX object. To release the mutex, the system atomically increments and tests the counter. If the test shows that no thread is waiting, there is no additional work to do. If other threads are waiting, the mutex owner calls KeSetEvent to release one waiting thread.
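The decrement-and-test and increment-and-test steps can be modeled in a few lines of user-mode C11. This is a simplified sketch, not the kernel's code: the counter convention (1 means free, 0 means owned, negative means owned with waiters) is an assumption made for illustration, the `demo_` names are hypothetical, and the functions only report whether the slow path would be needed instead of actually blocking on or signaling an event.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed convention: 1 = free, 0 = owned, negative = owned with waiters. */
typedef struct {
    atomic_long count;
} demo_fast_mutex_t;

static void demo_fast_mutex_init(demo_fast_mutex_t *m)
{
    atomic_init(&m->count, 1);
}

/* Acquire fast path: atomically decrement and test.
   Returns true if the caller would have to block on the event (slow path). */
static bool demo_acquire_must_wait(demo_fast_mutex_t *m)
{
    long after = atomic_fetch_sub(&m->count, 1) - 1;
    return after != 0;
}

/* Release fast path: atomically increment and test.
   Returns true if a waiter exists and the event would have to be signaled. */
static bool demo_release_must_wake(demo_fast_mutex_t *m)
{
    long after = atomic_fetch_add(&m->count, 1) + 1;
    return after <= 0;
}
```

In the uncontended case both functions return false, so acquire and release each cost a single atomic operation.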

Interlocked operations
Among the functions that a WDM driver can call are some that perform arithmetic in a thread-safe and multiprocessor-safe way. These routines come in two forms. The first form has names that start with Interlocked; these perform atomic operations that other threads or CPUs cannot interfere with. The second form has names that start with ExInterlocked and uses a spin lock.

Table. Interlocked operation service functions

Service Function	Description
InterlockedCompareExchange	Compares and conditionally exchanges two values
InterlockedDecrement	Subtracts 1 from an integer
InterlockedExchange	Exchanges two values
InterlockedExchangeAdd	Adds a value to an integer and returns the original value
InterlockedIncrement	Adds 1 to an integer
ExInterlockedAddLargeInteger	Adds a value to a 64-bit integer
ExInterlockedAddLargeStatistic	Adds a ULONG to a 64-bit integer
ExInterlockedAddUlong	Adds a value to a ULONG and returns the original value
ExInterlockedCompareExchange64	Compares and conditionally exchanges two 64-bit values

The InterlockedXxx functions can be called at any IRQL. Because they do not require a spin lock, they can also operate on pageable data at PASSIVE_LEVEL. The ExInterlockedXxx functions can also be called at any IRQL, but they operate on the target data at or above DISPATCH_LEVEL, so their arguments must be in nonpaged memory. The only reason to use an ExInterlockedXxx function is that you have a variable that you sometimes need to increment or decrement and sometimes need to access directly with other instruction sequences. You can explicitly acquire the spin lock around the code that makes multiple accesses to the variable, and simply use the ExInterlockedXxx function where you only need to increment or decrement it.

InterlockedXxx functions
InterlockedIncrement adds 1 to a long integer variable in memory and returns the value after the increment:

LONG result = InterlockedIncrement(pLong);

pLong is the address of a LONG variable. Conceptually, the function is equivalent to the C statement return ++*pLong; but unlike the plain C statement, it is thread-safe and multiprocessor-safe. InterlockedIncrement guarantees that the variable is incremented by 1 even if threads on other CPUs, or other threads on the same CPU, try to change the value at the same time. The operation by itself does not guarantee that the returned value is still the current value of the variable even one machine instruction later, because once the atomic increment completes, other threads or CPUs may immediately modify the variable.

InterlockedDecrement is the same except that it subtracts 1:

LONG result = InterlockedDecrement(pLong);
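For readers without a kernel build environment, the same return-the-new-value behavior can be sketched with C11 atomics in user mode. The `demo_` functions below are hypothetical stand-ins, not the Windows routines; note that `atomic_fetch_add` returns the value before the addition, so the sketch adjusts the result to match what the text describes.

```c
#include <stdatomic.h>

/* Atomically add 1 and return the value after the increment. */
static long demo_interlocked_increment(atomic_long *p)
{
    return atomic_fetch_add(p, 1) + 1;
}

/* Atomically subtract 1 and return the value after the decrement. */
static long demo_interlocked_decrement(atomic_long *p)
{
    return atomic_fetch_sub(p, 1) - 1;
}
```

A typical use is a reference count: the thread whose decrement returns 0 is the one that frees the object.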

The InterlockedCompareExchange function can be called as follows:

LONG target;
LONG result = InterlockedCompareExchange(&target, newval, oldval);

target is a LONG integer that serves as both input and output of the function. oldval is your guess about the current value of target. If the guess is correct, newval is stored into target. Internally the function behaves like the following C code, except that it performs the whole operation atomically, which makes it thread-safe and multiprocessor-safe:

LONG CompareExchange(PLONG ptarget, LONG newval, LONG oldval)
{
    LONG value = *ptarget;
    if (value == oldval)
        *ptarget = newval;
    return value;
}

In other words, the function always returns the previous value of the target variable. In addition, if the previous value equals oldval, it sets the target to newval. The comparison and the exchange happen as one atomic operation, and the exchange happens only when the guess about the previous value is correct.
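A user-mode approximation with C11 atomics may make the return-value convention easier to experiment with. The names below are hypothetical; `demo_compare_exchange` wraps `atomic_compare_exchange_strong` so that it returns the previous value, and `demo_store_max` is an illustrative retry loop built on it rather than anything from the original text.

```c
#include <stdatomic.h>

/* Returns the previous value of *target; stores newval only if that value equals oldval. */
static long demo_compare_exchange(atomic_long *target, long newval, long oldval)
{
    long expected = oldval;
    /* On failure, expected is overwritten with the value actually found. */
    atomic_compare_exchange_strong(target, &expected, newval);
    return expected;
}

/* Illustrative use: raise *target to candidate unless it is already larger. */
static void demo_store_max(atomic_long *target, long candidate)
{
    long seen = atomic_load(target);
    while (seen < candidate) {
        long prev = demo_compare_exchange(target, candidate, seen);
        if (prev == seen)
            break;          /* our guess was right, the exchange happened */
        seen = prev;        /* someone else changed it, retry with the new value */
    }
}
```

The loop in `demo_store_max` shows the usual pattern: read, compute, attempt the exchange, and retry if the guess turned out to be stale.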

You can also call InterlockedCompareExchangePointer to perform a similar compare-and-exchange operation on pointer values. This function may be implemented as a compiler intrinsic or as a real function call, depending on the pointer width of the target platform and on the compiler's ability to generate inline code. The following example uses the pointer version of compare-exchange to add a structure at the head of a singly linked list without using a spin lock or raising IRQL:

typedef struct _SOMESTRUCTURE {
    struct _SOMESTRUCTURE* next;
    ...
} SOMESTRUCTURE, *PSOMESTRUCTURE;
...
void InsertElement(PSOMESTRUCTURE p, PSOMESTRUCTURE* anchor)
{
    PSOMESTRUCTURE next, first;
    do
    {
        p->next = first = *anchor;
        next = (PSOMESTRUCTURE) InterlockedCompareExchangePointer((PVOID*) anchor, p, first);
    }
    while (next != first);
}

On each pass through the loop, we assume that the new element will be linked in front of the current head of the list, whose address we save in the variable first. We then call InterlockedCompareExchangePointer to check whether anchor still points to first, even though a little time has passed. If it does, InterlockedCompareExchangePointer sets anchor to point to the new element p, the return value matches our assumption, and the loop ends. If for some reason anchor no longer points to first (another concurrent thread or CPU may have modified it), we detect that fact and repeat the loop.
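The same retry loop can be tried out in user mode with C11 atomic pointers. This is a sketch under the assumption that a C11 compiler is available; `demo_node`, `demo_anchor_t`, and `demo_insert_element` are hypothetical names, and `atomic_compare_exchange_weak` stands in for InterlockedCompareExchangePointer.

```c
#include <stdatomic.h>
#include <stddef.h>

struct demo_node {
    struct demo_node *next;
    int value;
};

typedef _Atomic(struct demo_node *) demo_anchor_t;

/* Link p in front of the current head without taking any lock. */
static void demo_insert_element(struct demo_node *p, demo_anchor_t *anchor)
{
    struct demo_node *first = atomic_load(anchor);
    do {
        /* Assume the head is still "first" and point the new node at it. */
        p->next = first;
        /* If the anchor changed meanwhile, the call fails, reloads "first",
           and the loop tries again with the new head. */
    } while (!atomic_compare_exchange_weak(anchor, &first, p));
}
```

Only insertion at the head is safe with this simple pattern; removing elements without a lock needs extra care, which is one reason the S-list described later carries a sequence number.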

The last function is InterlockedExchange. It atomically replaces the value of an integer variable and returns the previous value:

LONG value;
LONG oldval = InterlockedExchange(&value, newval);

As you might guess, there is also an InterlockedExchangePointer function that exchanges pointer values (64-bit or 32-bit, depending on the platform).
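One common use of an atomic exchange is to take a value and reset it in a single step. The following user-mode C11 sketch illustrates that idea with `atomic_exchange`; the "pending work" scenario and the `demo_take_pending` name are hypothetical and are not taken from the original text.

```c
#include <stdatomic.h>

/* Atomically fetch the pending-work mask and reset it to 0.
   Whatever bits other threads set before this call are returned exactly once. */
static long demo_take_pending(atomic_long *pending)
{
    return atomic_exchange(pending, 0);
}
```

Because the read and the write happen as one operation, no bit that another thread sets can be lost between them.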

ExInterlockedXxx functions
Before calling any ExInterlockedXxx function, you must create and initialize a spin lock. Note that the operands of these functions must reside in nonpaged memory because the functions operate on the data at raised IRQL.

ExInterlockedAddLargeInteger adds a 64-bit increment to a 64-bit integer and returns the value of the integer before the addition:

LARGE_INTEGER value, increment;
KSPIN_LOCK spinlock;
LARGE_INTEGER prev = ExInterlockedAddLargeInteger(&value, increment, &spinlock);

value is the addend, that is, the number being added to. increment is the amount to add. spinlock is an initialized spin lock. The return value is the value of the addend before the addition. The function behaves like the following code, except that it runs under the protection of the spin lock:

__int64 AddLargeInteger(__int64* pvalue, __int64 increment)
{
    __int64 prev = *pvalue;
    *pvalue += increment;
    return prev;
}

Note that not all compilers support the __int64 integer type, and not all computers can perform a 64-bit addition with an atomic instruction.
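To show where the spin lock fits around the addition, here is a user-mode sketch that uses a C11 `atomic_flag` as a stand-in for KSPIN_LOCK. The `demo_add_large_integer` name is hypothetical, and the sketch ignores IRQL entirely, which the real function does not.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Add increment to *addend under a spin lock and return the previous value. */
static int64_t demo_add_large_integer(int64_t *addend, int64_t increment,
                                      atomic_flag *lock)
{
    /* Acquire: spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire)) {
        /* another thread is inside the protected region */
    }

    int64_t prev = *addend;
    *addend += increment;

    /* Release the lock so other callers can proceed. */
    atomic_flag_clear_explicit(lock, memory_order_release);
    return prev;
}
```

Other code that touches the same variable directly would have to take the same lock, which is exactly the situation the text gives as the reason for choosing an ExInterlockedXxx function.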

ExInterlockedAddUlong is similar to ExInterlockedAddLargeInteger, but its operands are 32-bit unsigned integers:

ULONG value, increment;
KSPIN_LOCK spinlock;
ULONG prev = ExInterlockedAddUlong(&value, increment, &spinlock);

This function also returns the value of the addend before the addition.

ExInterlockedAddLargeStatistic is similar to ExInterlockedAddUlong, but it adds a 32-bit value to a 64-bit value. As of the publication of this book, this function had not been documented in the DDK, so I give only its prototype here:

VOID ExInterlockedAddLargeStatistic(PLARGE_INTEGER Addend, ULONG Increment);

This function is faster than ExInterlockedAddUlong because it does not need to return the value before the addition, so it does not need a spin lock for synchronization. The operation is atomic, but only with respect to other callers of the same function. In other words, if you call ExInterlockedAddLargeStatistic on one CPU while code on another CPU is accessing the Addend variable, you may get inconsistent results. The following Intel x86 code (not the actual source code) shows why:

mov eax, Addend
mov ecx, Increment
lock add [eax], ecx
lock adc [eax+4], 0

This code works correctly when the addition to the low-order 32 bits does not produce a carry. If there is a carry, however, another CPU can intervene between the ADD and ADC instructions. If that CPU calls ExInterlockedCompareExchange64 and copies the 64-bit value at that moment, the value it obtains is wrong. Even though each addition instruction has a lock prefix that makes it atomic across CPUs, a block of several such instructions is not atomic as a whole.
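The window between the two instructions can be reproduced deterministically in a single-threaded C model. This sketch is only an illustration: the 64-bit value is split into two 32-bit halves, the "other CPU" is simulated by reading the value between the two steps, and all `demo_` names are hypothetical.

```c
#include <stdint.h>

/* A 64-bit value stored as two 32-bit halves, as the x86 code treats it. */
typedef struct {
    uint32_t low;
    uint32_t high;
} demo_split64_t;

/* What an observer sees if it copies both halves at some instant. */
static uint64_t demo_read64(const demo_split64_t *v)
{
    return ((uint64_t)v->high << 32) | v->low;
}

/* Step 1 (the ADD): add to the low half and report the carry (0 or 1). */
static uint32_t demo_add_low(demo_split64_t *v, uint32_t increment)
{
    uint32_t old = v->low;
    v->low = old + increment;      /* wraps modulo 2^32 */
    return v->low < old;
}

/* Step 2 (the ADC): propagate the carry into the high half. */
static void demo_add_carry(demo_split64_t *v, uint32_t carry)
{
    v->high += carry;
}
```

Starting from 0x00000000FFFFFFFF and adding 1, a read taken after `demo_add_low` but before `demo_add_carry` yields 0, which is neither the old value nor the new value 0x0000000100000000.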

Interlocked access to linked lists
The Windows NT executive provides three sets of special list-access functions that give thread-safe and multiprocessor-safe access to linked lists. These functions support doubly linked lists, singly linked lists, and a special singly linked list called an S-list. In the previous chapter, I discussed noninterlocked access to singly and doubly linked lists. Here, I explain interlocked access to these lists.

If you need a FIFO queue, use a doubly linked list. If you need a thread-safe and multiprocessor-safe pushdown stack, use an S-list. To use these lists in a thread-safe and multiprocessor-safe way, you must allocate and initialize a spin lock for each of them. The S-list, however, does not actually use the spin lock. An S-list has a sequence number that the kernel uses to make its compare-exchange operations atomic.

The functions for interlocked access to the various list objects are very similar, so I organize this section by function. I explain how to initialize the three kinds of list, how to insert elements into them, and how to remove elements from them.

Initialization
You can initialize these lists as follows:

LIST_ENTRY DoubleHead;
SINGLE_LIST_ENTRY SingleHead;
SLIST_HEADER SListHead;

InitializeListHead(&DoubleHead);

SingleHead.Next = NULL;

ExInitializeSListHead(&SListHead);

Do not forget to allocate and initialize a spin lock for each list. In addition, the storage for the list head and for every list element must come from nonpaged memory, because the support routines access these lists at raised IRQL. Note that you do not need the spin lock while initializing the list head, because there can be no contention at that point.

Inserting elements
You can insert elements at the head or the tail of a doubly linked list, but only at the head of a singly linked list or an S-list:

PLIST_ENTRY pdElement, pdPrevHead, pdPrevTail;
PSINGLE_LIST_ENTRY psElement, psPrevHead;
PKSPIN_LOCK spinlock;

pdPrevHead = ExInterlockedInsertHeadList(&DoubleHead, pdElement, spinlock);
pdPrevTail = ExInterlockedInsertTailList(&DoubleHead, pdElement, spinlock);

psPrevHead = ExInterlockedPushEntryList(&SingleHead, psElement, spinlock);

psPrevHead = ExInterlockedPushEntrySList(&SListHead, psElement, spinlock);

The return value is the address of the element that was at the head (or tail) of the list before the insertion. Note that the element address you pass is the address of a list-entry structure, which is usually embedded in a larger application structure. You can use the CONTAINING_RECORD macro to obtain the address of the enclosing application structure.

Removing elements
You can remove elements from the head of any of these lists:

pdElement = ExInterlockedRemoveHeadList(&DoubleHead, spinlock);

psElement = ExInterlockedPopEntryList(&SingleHead, spinlock);

psElement = ExInterlockedPopEntrySList(&SListHead, spinlock);

If the list is empty, the function returns NULL. Test the return value for NULL first, and only then use the CONTAINING_RECORD macro to obtain a pointer to the enclosing application structure.
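The embed-an-entry pattern and the NULL check can be illustrated in user-mode C. Everything below is a hypothetical sketch: `DEMO_CONTAINING_RECORD` imitates the idea of CONTAINING_RECORD with `offsetof`, and the push and pop helpers protect a singly linked list with a C11 `atomic_flag` in place of a kernel spin lock.

```c
#include <stdatomic.h>
#include <stddef.h>

struct demo_entry {
    struct demo_entry *next;
};

/* Application structure with the list entry embedded in it. */
struct demo_request {
    int id;
    struct demo_entry link;
};

/* Recover the enclosing structure from the address of its embedded field. */
#define DEMO_CONTAINING_RECORD(addr, type, field) \
    ((type *)((char *)(addr) - offsetof(type, field)))

static void demo_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire)) { }
}

static void demo_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Push at the head; returns the previous first entry (NULL if the list was empty). */
static struct demo_entry *demo_push_entry(struct demo_entry *head,
                                          struct demo_entry *e, atomic_flag *l)
{
    demo_lock(l);
    struct demo_entry *prev = head->next;
    e->next = prev;
    head->next = e;
    demo_unlock(l);
    return prev;
}

/* Pop from the head; returns NULL if the list is empty. */
static struct demo_entry *demo_pop_entry(struct demo_entry *head, atomic_flag *l)
{
    demo_lock(l);
    struct demo_entry *e = head->next;
    if (e != NULL)
        head->next = e->next;
    demo_unlock(l);
    return e;
}
```

A caller checks the popped pointer against NULL first and only then applies `DEMO_CONTAINING_RECORD(e, struct demo_request, link)`, because the macro would turn NULL into a bogus non-NULL address.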

IRQL restrictions
You can call the S-list functions only at or below DISPATCH_LEVEL. The ExInterlockedXxx functions that access doubly and singly linked lists can be called at any IRQL, provided that all references to the list use ExInterlockedXxx functions. These functions have no IRQL restriction because they run with interrupts disabled, which is equivalent to raising the IRQL to the highest possible level. Once interrupts are disabled, the functions acquire the spin lock you specified. At that point no other code can gain control on the same CPU, and code on other CPUs cannot acquire the spin lock, so your list is safe.

Note:
--------------------------------------------------------------------------------
The DDK documentation states this rule too strictly: it says that all callers must run at an IRQL at or below the DIRQL of your interrupt object. In fact, callers need not all be at the same IRQL, nor must the IRQL be at or below DIRQL.
You may use the ExInterlockedXxx functions to access a singly or doubly linked list (but not an S-list) in one part of your code and the noninterlocked functions (InsertHeadList and so on) in another part, provided that you follow one rule: before using a noninterlocked primitive, acquire the same spin lock that the interlocked calls use. In addition, all access to the list must then occur at or below DISPATCH_LEVEL, because a spin lock cannot be acquired recursively. For example:

// Access the list using noninterlocked calls:

VOID Function1()
{
    ASSERT(KeGetCurrentIrql() <= DISPATCH_LEVEL);
    KIRQL oldirql;
    KeAcquireSpinLock(spinlock, &oldirql);
    InsertHeadList(...);
    RemoveTailList(...);
    ...
    KeReleaseSpinLock(spinlock, oldirql);
}

// Access the list using interlocked calls:

VOID Function2()
{
    ASSERT(KeGetCurrentIrql() <= DISPATCH_LEVEL);
    ExInterlockedInsertTailList(..., spinlock);
}

The first function must run at or below DISPATCH_LEVEL because it calls KeAcquireSpinLock. The reason for the IRQL restriction on the second function is as follows. Suppose Function1 acquires the spin lock in order to access the list. Acquiring the spin lock temporarily raises the IRQL to DISPATCH_LEVEL. Now suppose that an interrupt at a higher IRQL occurs on the same CPU and Function2 gains control. Function2 calls an ExInterlockedXxx function, which tries to acquire the same spin lock, so the CPU deadlocks. The problem arises when code that uses the same spin lock is allowed to run at two different IRQLs: Function1 at DISPATCH_LEVEL and Function2 at, in the worst case, HIGH_LEVEL.

Noninterlocked access to shared data
If you only want to read an aligned data item, you can do so without interlocking, and the read interoperates correctly with any InterlockedXxx function. The CPUs that NT supports guarantee that you obtain a consistent value, either the one before or the one after an interlocked operation that occurs at about the same time as the read. If the data is not aligned, however, an ordinary read is not protected against a concurrent interlocked access, and you may obtain an inconsistent value. Imagine an integer that spans a cache line boundary in physical memory. CPU A wants to read this integer while CPU B performs an interlocked increment on it. The sequence of events might be: (a) CPU A fetches the cache line that contains the high part of the value, (b) CPU B performs the interlocked increment, which generates a carry into the high part of the value, and (c) CPU A then fetches the cache line that contains the low part of the value. You can avoid this problem by ensuring that the value does not cross a cache line boundary, and the easiest way to do that is to align the value on the natural boundary for its data type, for example, 4-byte alignment for a ULONG.
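The alignment condition is easy to check in code. The following C sketch tests whether an address is naturally aligned for a given size and whether an object straddles a line boundary; the helper names are hypothetical, and the 64-byte line size used in the usage note is an assumption, since the actual size depends on the processor.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if p is a multiple of size, e.g. size 4 for a 32-bit integer. */
static bool demo_is_naturally_aligned(const void *p, size_t size)
{
    return ((uintptr_t)p % size) == 0;
}

/* True if the object [p, p + size) occupies more than one line of line_size bytes. */
static bool demo_crosses_line(const void *p, size_t size, size_t line_size)
{
    uintptr_t first = (uintptr_t)p / line_size;
    uintptr_t last = ((uintptr_t)p + size - 1) / line_size;
    return first != last;
}
```

With a 64-byte line, a 4-byte value at offset 60 of a line-aligned buffer stays within one line, while the same value at offset 62 spans two lines and is exposed to the torn read described above. A naturally aligned 4-byte value can never cross a line whose size is a multiple of 4.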

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
