Deep Understanding of Locks in iOS Development


Source: Bole Online – Summer Then

Link: http://ios.jobbole.com/89474/


Summary

The purpose of this article is not to introduce how to use the various locks in iOS. For one thing, I do not have much hands-on experience; for another, there are already plenty of such articles, for example "Several Ways to Ensure Thread Safety in iOS and a Performance Comparison" and "Common iOS Knowledge Points (3): Locks". Nor will this article detail the implementation principles of each lock, which would involve too much background knowledge for me to do justice to.

What this article does is briefly analyze how several locks commonly used in iOS development are implemented, what their pros and cons are, and why there is a performance gap between them; at the end it briefly introduces the underlying principles of locks. My level is limited, so if there are careless mistakes, corrections are welcome. Readers are also advised to have a general idea of how to use the various locks in Objective-C before reading on.

In ibireme's article "OSSpinLock Is No Longer Safe" there is a chart that compares the lock/unlock performance of the various locks:

This article analyzes the implementation of each lock in that order, from top to bottom (fast to slow). Note that lock/unlock speed does not indicate a lock's overall efficiency; it only indicates the complexity of the lock/unlock operations themselves, as the concrete examples below will show.

OSSpinLock

The article above explains why OSSpinLock is no longer safe. The main reason is that when a low-priority thread holds the lock, a high-priority thread trying to acquire it enters a busy-wait state and consumes lots of CPU time, which can starve the low-priority thread of CPU time, so it never finishes its task and never releases the lock. This problem is called priority inversion.

Why does busy-waiting starve the low-priority thread of time slices? That takes us to the operating system's thread scheduling.

Modern operating systems typically use round-robin scheduling (RR) to manage ordinary threads. Each thread is allocated a time slice (quantum), usually around 10-100 milliseconds. When a thread uses up its time slice, it is suspended by the operating system and placed back in the ready queue until it is allocated its next time slice.

How spin locks are implemented

The purpose of a spin lock is to guarantee that only one thread at a time can access the critical section. Its use can be described with the following pseudo-code:

do {
    Acquire lock
        Critical section     // code that needs lock protection
    Release lock
        Remainder section    // code that does not need lock protection
}

In the Acquire lock step, we request the lock so that the code in the critical section cannot be executed by multiple threads at once.

The idea behind a spin lock is simple: in theory, a single global variable indicating whether the lock is taken is all we need, as in this pseudo-code:

bool lock = false;  // not locked at first; any thread can request the lock

do {
    while (lock);   // if lock is true, spin here forever, which amounts to requesting the lock
    lock = true;    // take the lock so that no other thread can get it
        Critical section     // critical section
    lock = false;   // release the lock so that other threads can enter the critical section
        Remainder section    // code that does not require lock protection
}

The comments make the code clear enough that there is no need to analyze it line by line. Unfortunately, there is a problem: if multiple threads pass the while loop at the same time, before any of them has set lock to true, they will all continue executing, so the lock is not reliable. The fix is also simple: make acquiring the lock an atomic operation.

Atomic operation

An atomic operation in the narrow sense is an uninterruptible operation: the thread cannot be suspended by the operating system in the middle of it; once started, it runs to completion. In a single-processor environment a single assembly instruction is obviously atomic, since interrupts themselves are delivered between instructions.

On multiprocessor systems, however, an operation that several processors can perform concurrently does not count as atomic. Truly atomic operations therefore need hardware support. On x86, for example, prefixing an instruction with lock locks the bus while the corresponding machine code executes, so other CPUs cannot perform the same operation; atomicity is guaranteed at the hardware level.

These very low-level details do not need to be fully mastered. It is enough to know that acquiring the lock can be done with an atomic operation called test_and_set, which can be expressed in pseudo-code:

bool test_and_set(bool *target) {
    bool rv = *target;
    *target = true;
    return rv;
}

This code sets *target to true and returns its original value. In a real implementation, of course, this is done by a single atomic instruction.

Summary of Spin Lock

At this point, the principle of the spin lock is very clear:

bool lock = false;  // not locked at first; any thread can request the lock

do {
    while (test_and_set(&lock));  // test_and_set is an atomic operation
        Critical section     // critical section
    lock = false;  // release the lock so that other threads can enter the critical section
        Remainder section    // code that does not require lock protection
}
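To make this concrete, here is a minimal, compilable sketch of a toy spin lock built on C11's atomic_flag, whose test-and-set is guaranteed atomic by the hardware. It only illustrates the principle above; it is not OSSpinLock's actual implementation:

#include <stdatomic.h>

// A toy spin lock (illustration only, not OSSpinLock's real code).
static atomic_flag flag = ATOMIC_FLAG_INIT;

void toy_spin_lock(void) {
    // atomic_flag_test_and_set atomically sets the flag and returns its
    // old value, just like the test_and_set pseudo-code above.
    while (atomic_flag_test_and_set(&flag)) {
        // busy-wait until the old value comes back false
    }
}

void toy_spin_unlock(void) {
    atomic_flag_clear(&flag);  // equivalent to lock = false
}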

If the critical section takes long to execute, a spin lock is a bad idea. Recall the round-robin scheduling introduced earlier: a thread gives up its time slice in several situations. One is using the slice up and being forcibly preempted by the operating system; a thread also yields its slice voluntarily when it performs I/O or goes to sleep. Inside the while loop, a thread is busy-waiting, wasting CPU time until the operating system finally preempts it when the slice expires. If the critical section runs for a long time, say during file I/O, this busy-waiting is pure waste.

Semaphores

In an earlier article on the underlying implementation of GCD, I briefly described how the semaphore dispatch_semaphore_t works: it eventually calls down to the sem_wait method, which is implemented in glibc as follows:

int sem_wait(sem_t *sem) {
    int *futex = (int *)sem;
    if (atomic_decrement_if_positive(futex) > 0)
        return 0;
    int err = lll_futex_wait(futex, 0);
    // (error handling omitted)
    return -1;
}

First the semaphore's value is decremented (if positive) and the old value is checked. If it was greater than zero, there is no need to wait, so the call returns immediately. The actual waiting is done in the lll_futex_wait function, where lll stands for low level lock. This function is implemented in assembly and invokes the sys_futex system call, putting the thread to sleep and voluntarily yielding its time slice. The same function may also be used in the implementation of mutexes.
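At the API level, this is what using dispatch_semaphore_t as a lock commonly looks like (a usage sketch added for illustration, not part of the original benchmark):

#include <dispatch/dispatch.h>

// Create a semaphore with value 1 so at most one thread is inside at a time.
dispatch_semaphore_t sem = dispatch_semaphore_create(1);

dispatch_semaphore_wait(sem, DISPATCH_TIME_FOREVER);  // decrement; the thread sleeps if the value goes negative
// critical section
dispatch_semaphore_signal(sem);  // increment; wakes a waiting thread if there is one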

Voluntarily yielding the time slice does not always mean higher efficiency. Yielding makes the operating system switch to another thread, and such a context switch typically costs about 10 microseconds; at least two switches are needed (one to sleep, one to wake). If the actual wait is very short, only a few microseconds say, busy-waiting is more efficient than putting the thread to sleep.

As you can see, both the spin lock and the semaphore are implemented very simply, which is why they take first and second place in lock/unlock speed. To repeat, though, this speed cannot accurately reflect a lock's efficiency in use (it ignores the cost of time-slice switching, for example); it only measures, to some extent, the complexity of a lock's implementation.

pthread_mutex

pthread stands for POSIX thread, which defines a set of cross-platform thread-related APIs; pthread_mutex is its mutex (mutual-exclusion lock). A mutex is implemented much like a semaphore: instead of busy-waiting, it blocks the thread and puts it to sleep, which requires a context switch.

Common uses for mutexes are as follows:

pthread_mutexattr_t attr;
pthread_mutexattr_init(&attr);
pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_NORMAL);  // define the lock's attributes

pthread_mutex_t mutex;
pthread_mutex_init(&mutex, &attr);  // create the lock

pthread_mutex_lock(&mutex);    // acquire the lock
// critical section
pthread_mutex_unlock(&mutex);  // release the lock

Using pthread_mutex is not much different from what we have seen before. What matters more is the lock's type, which can be PTHREAD_MUTEX_NORMAL, PTHREAD_MUTEX_ERRORCHECK, PTHREAD_MUTEX_RECURSIVE, and so on; their specific traits are not covered here, as plenty of material is available online.

In general, a thread may acquire a given lock only once at a time and may release it only while holding it; acquiring a lock twice or releasing a lock you do not hold can cause a crash. Suppose a thread re-acquires a lock it already holds: it goes to sleep waiting for the lock to be released, which now can never happen, so the result is a deadlock.

This happens easily, for example when a function acquires a lock and then recursively calls itself inside the critical section. Fortunately, pthread_mutex supports recursive locks, which let a single thread acquire the same lock recursively; just set the attr type to PTHREAD_MUTEX_RECURSIVE, as the sketch below shows.
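Here is a minimal sketch of that recursive case. The function count_down is a hypothetical example added for illustration; with PTHREAD_MUTEX_NORMAL its nested pthread_mutex_lock call would deadlock, while PTHREAD_MUTEX_RECURSIVE merely increments a counter:

#include <pthread.h>

static pthread_mutex_t mutex;

// Hypothetical function that locks around a recursive call to itself.
void count_down(int n) {
    pthread_mutex_lock(&mutex);    // same thread may re-enter: counter + 1
    if (n > 0) count_down(n - 1);
    pthread_mutex_unlock(&mutex);  // counter - 1; truly released at zero
}

int main(void) {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&mutex, &attr);

    count_down(3);  // safe: the lock is acquired recursively by one thread
    return 0;
}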

How mutexes are implemented

When a mutex acquires a lock it calls the pthread_mutex_lock method, whose implementation differs from system to system. Sometimes it is built internally on a semaphore; even where no semaphore is used, it ends up calling the lll_futex_wait function, which puts the thread to sleep.

As mentioned above, busy-waiting can be more efficient when the critical section is short. So in some implementations, pthread_mutex_lock first attempts test_and_set a bounded number of times (say 1,000) before sleeping, which improves performance when the lock is only held briefly. A rough sketch of that idea follows.
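In rough pseudo-C, the adaptive strategy looks something like this. It reuses the test_and_set pseudo-operation from earlier; futex_style_wait stands in for the real lll_futex_wait fallback and is not an actual glibc function:

// Sketch of an adaptive mutex: spin briefly, then sleep in the kernel.
int adaptive_lock(bool *lock) {
    for (int i = 0; i < 1000; i++) {    // bounded busy-wait
        if (!test_and_set(lock))        // old value was false: we got the lock
            return 0;
    }
    return futex_style_wait(lock);      // hypothetical: sleep until woken
}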

In addition, because pthread_mutex comes in many types and supports recursive locking, it has to check the lock's type when acquiring it. That is why its implementation resembles the semaphore's but is slightly slower.

NSLock

NSLock is a lock that Objective-C exposes to developers in object form. Its implementation is very simple: the lock method is defined through a macro:

#define MLOCK \
- (void) lock \
{ \
    int err = pthread_mutex_lock(&_mutex); \
    // error handling ...
}

NSLock simply wraps a pthread_mutex whose attribute is PTHREAD_MUTEX_ERRORCHECK; it gives up some performance in exchange for error reporting.

A macro is used here because several other locks in Objective-C have identical lock methods, differing only in the type of the internal pthread_mutex mutex. The macro keeps the method definitions simple.

NSLock is slightly slower than pthread_mutex because it adds a method call; but thanks to caching, repeated method calls do not affect performance much.
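For completeness, a minimal usage sketch; per the macro above, the calls forward straight to the wrapped pthread_mutex:

NSLock *lock = [[NSLock alloc] init];

[lock lock];      // forwards to pthread_mutex_lock(&_mutex)
// critical section
[lock unlock];    // forwards to pthread_mutex_unlock(&_mutex)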

NSCondition

Underneath, NSCondition is implemented with the condition variable pthread_cond_t. A condition variable is a bit like a semaphore: it provides a mechanism for blocking a thread and signaling it. It can therefore be used to block a thread until some data is ready and then wake the thread up, as in the common producer-consumer pattern.

How to use condition variables

Many articles introducing pthread_cond_t mention that it must be used together with a mutex:

void consumer() {  // consumer
    pthread_mutex_lock(&mutex);
    while (data == NULL) {
        pthread_cond_wait(&condition_variable_signal, &mutex);  // wait for data
    }
    // ↓↓↓ new data has arrived; the code below processes it
    // temp = data;
    // ↑↑↑ new data has arrived; the code above processes it
    pthread_mutex_unlock(&mutex);
}

void producer() {
    pthread_mutex_lock(&mutex);
    // produce data
    pthread_cond_signal(&condition_variable_signal);  // signal the consumer that new data is available
    pthread_mutex_unlock(&mutex);
}

A question arises naturally: what would go wrong if we used only the condition variable, without the mutex? The problem is that temp = data; is not thread-safe; another thread may modify data before you read it out. So we need the mutex to make sure the consumer gets the data safely.

Also, besides being woken by the signal method, the wait method can occasionally be woken spuriously, which is why the check is placed in a while loop for re-confirmation.

Why use condition variables

There are many articles about condition variables, but most keep silent on a basic question: why use condition variables at all? They only control the order in which threads run; couldn't we simulate the same effect with semaphores or mutexes?

Material on this is scarce online, so let me briefly offer a personal view. A semaphore can replace a condition variable to some extent, but a mutex cannot. In the producer-consumer code above, the essence of pthread_cond_wait is a lock handoff: the consumer gives up the lock, and the producer then obtains it. Likewise, pthread_cond_signal hands the lock back from producer to consumer.

If we insisted on using only mutexes, we would have to change the code to this:

void consumer() {  // consumer
    pthread_mutex_lock(&mutex);
    while (data == NULL) {
        pthread_mutex_unlock(&mutex);
        pthread_mutex_lock(&another_lock);  // amounts to waiting on another mutex
        pthread_mutex_lock(&mutex);
    }
    pthread_mutex_unlock(&mutex);
}

The problem is that the producer may run and release another_lock before the consumer has even started waiting on it. In other words, there is no way to make releasing the mutex and waiting on the other lock atomic, and therefore no way to guarantee the order "wait first, then another_lock is released".

A semaphore does not have this problem, because waiting and signaling on a semaphore do not need to happen in a particular order; the semaphore just records how many resources are available. Still, compared with the atomic lock handoff guaranteed by pthread_cond_wait, using a semaphore here seems to carry some risk (although I have not yet found a concrete problem caused by the non-atomicity).

A condition variable does have one clear advantage, though: we can call pthread_cond_broadcast to notify all waiting consumers at once, which a semaphore cannot do.

How NSCondition works

NSCondition in fact wraps a mutex and a condition variable together, unifying the former's lock method and the latter's wait/signal in one NSCondition object exposed to the user:

- (void) signal {
    pthread_cond_signal(&_condition);
}

The lock method, too, is actually defined through a macro; expanded, it looks like this:

- (void) lock {
    int err = pthread_mutex_lock(&_mutex);
}

Its lock and unlock path is thus almost identical to NSLock's and should in theory be just as fast (actual tests agree). It appears a bit slower in the chart; my guess is that the benchmark also created and destroyed the condition variable around each lock/unlock.
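Putting it together, the earlier producer-consumer example can be written with NSCondition like this (a sketch; the shared data array is a stand-in introduced for illustration):

NSCondition *condition = [[NSCondition alloc] init];
NSMutableArray *data = [NSMutableArray array];  // stand-in for shared data

// Consumer
[condition lock];
while (data.count == 0) {
    [condition wait];          // releases the lock atomically and sleeps
}
id item = data.lastObject;     // safe: we hold the lock again here
[data removeLastObject];
[condition unlock];

// Producer
[condition lock];
[data addObject:@"product"];   // produce data
[condition signal];            // wraps pthread_cond_signal, as shown above
[condition unlock];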

NSRecursiveLock

As mentioned above, recursive locks are also implemented via the pthread_mutex_lock function. The function checks the lock's type; if it turns out to be a recursive lock, the recursive call is allowed and a counter is simply incremented. Releasing the lock works the same way, decrementing the counter.

The only difference between NSRecursiveLock and NSLock is the type of the wrapped pthread_mutex_t object: the former uses PTHREAD_MUTEX_RECURSIVE.
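A minimal sketch of the difference in practice; recursiveTask is a hypothetical function that locks around a recursive call to itself:

NSRecursiveLock *lock;  // assume initialized as [[NSRecursiveLock alloc] init]

void recursiveTask(int depth) {
    [lock lock];    // with NSLock the second entry would deadlock here
    if (depth > 0) recursiveTask(depth - 1);
    [lock unlock];  // the lock is really released when the count hits zero
}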

NSConditionLock

NSConditionLock is built on NSCondition; in essence it is a producer-consumer model, where "the condition is met" can be read as "the producer has delivered new content". NSConditionLock holds an NSCondition object internally, plus a _condition_value property that is assigned at initialization:

A simplified version of the code:

- (id) initWithCondition: (NSInteger)value {
    if (nil != (self = [super init])) {
        _condition = [NSCondition new];
        _condition_value = value;
    }
    return self;
}

Its lockWhenCondition: method is in effect the consumer method:

- (void) lockWhenCondition: (NSInteger)value {
    [_condition lock];
    while (value != _condition_value) {
        [_condition wait];
    }
}

The corresponding unlockWithCondition: method is the producer; it uses the broadcast method to notify all consumers:

- (void) unlockWithCondition: (NSInteger)value {
    _condition_value = value;
    [_condition broadcast];
    [_condition unlock];
}
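Typical usage makes the producer-consumer nature explicit. The condition values below (NO_DATA / HAS_DATA) are hypothetical names chosen for illustration:

enum { NO_DATA = 0, HAS_DATA = 1 };  // hypothetical condition values

NSConditionLock *condLock = [[NSConditionLock alloc] initWithCondition:NO_DATA];

// Consumer thread
[condLock lockWhenCondition:HAS_DATA];  // blocks until the producer sets HAS_DATA
// consume the data ...
[condLock unlockWithCondition:NO_DATA];

// Producer thread
[condLock lock];
// produce the data ...
[condLock unlockWithCondition:HAS_DATA];  // broadcasts to waiting consumers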

@synchronized

This is a lock at the Objective-C language level; it mainly sacrifices performance for concise, readable syntax.

We know that @synchronized must be followed by an OC object, and that object is actually used as the lock. This is implemented with a hash table: under the hood, OC maintains an array of mutexes (think of it as a pool of locks) and finds the mutex corresponding to an object by hashing that object.
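Usage is the most concise of all the locks covered (a sketch):

NSObject *token = [[NSObject alloc] init];  // any OC object can serve as the lock

@synchronized (token) {
    // critical section: the mutex obtained by hashing `token` into the
    // lock pool is held here, and released even if an exception is thrown
}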

For the concrete implementation, see this article: More than you want to know about @synchronized.

Resources

    1. pthread_mutex_lock

    2. ThreadSafety

    3. Difference between binary semaphore and mutex

    4. More than you want to know about @synchronized

    5. pthread_mutex_lock.c source

    6. [pthread] Thread synchronization mechanisms in Linux (II) – in glibc

    7. Various synchronization mechanisms of pthread

    8. pthread_cond_wait

    9. Conditional Variable vs Semaphore

