Analysis of Inter-Thread Communication, Part Three: Barriers, Semaphores, and a Comparison of Synchronization Methods


The previous article discussed mutexes, condition variables, read-write locks, and spin locks for thread synchronization. This article first discusses the use of barriers and semaphores, giving the corresponding code and caveats (the code can also be downloaded from my GitHub), and then compares the various thread synchronization methods.

Barriers

A barrier is a synchronization mechanism different from the previous ones: it is mainly used to coordinate multiple threads working in parallel on a single task. A barrier object blocks each arriving thread until all the threads cooperating on the task have reached a specified point, after which all of them continue to execute. The pthread_join call used earlier can be seen as a simple barrier, in which the main thread waits for the other threads to exit before continuing. A barrier object is initialized with pthread_barrier_init and destroyed with pthread_barrier_destroy. They are declared as follows:

#include <pthread.h>

int pthread_barrier_init(pthread_barrier_t *restrict barrier,
                         const pthread_barrierattr_t *restrict attr,
                         unsigned int count);
int pthread_barrier_destroy(pthread_barrier_t *barrier);

When initializing a barrier object with pthread_barrier_init, the parameter count specifies how many threads must reach the specified point before all of them are allowed to continue. Each thread indicates that it has reached that point by calling pthread_barrier_wait, which blocks until the remaining threads have called pthread_barrier_wait as well. It is declared as follows:

int pthread_barrier_wait(pthread_barrier_t *barrier);

A return value of 0 indicates that the thread was one of the ordinary waiters; a return value of PTHREAD_BARRIER_SERIAL_THREAD indicates that all threads may now continue, and this value is returned to exactly one of the waiting threads, chosen arbitrarily by the implementation. The following program uses a barrier to sort a set of numbers with multiple threads; the code is as follows:
#include <stdlib.h>
#include <stdio.h>
#include <pthread.h>
#include <limits.h>
#include <sys/time.h>

#define NTHR    8               /* number of threads */
#define NUMNUM  800L            /* number of numbers to sort */
#define TNUM    (NUMNUM/NTHR)   /* number to sort per thread */

long nums[NUMNUM];
long snums[NUMNUM];

pthread_barrier_t b;

#define heapsort qsort          /* use qsort in place of BSD heapsort */

/*
 * Compare two long integers (helper function for heapsort).
 */
int
complong(const void *arg1, const void *arg2)
{
    long l1 = *(long *)arg1;
    long l2 = *(long *)arg2;

    if (l1 == l2)
        return 0;
    else if (l1 < l2)
        return -1;
    else
        return 1;
}

/*
 * Worker thread to sort a portion of the set of numbers.
 */
void *
thr_fn(void *arg)
{
    long idx = (long)arg;

    heapsort(&nums[idx], TNUM, sizeof(long), complong);
    pthread_barrier_wait(&b);

    /*
     * Go off and perform more work ...
     */
    return ((void *)0);
}

/*
 * Merge the results of the individual sorted ranges.
 */
void
merge()
{
    long idx[NTHR];
    long i, minidx, sidx, num;

    for (i = 0; i < NTHR; i++)
        idx[i] = i * TNUM;
    for (sidx = 0; sidx < NUMNUM; sidx++) {
        num = LONG_MAX;
        for (i = 0; i < NTHR; i++) {
            if ((idx[i] < (i+1)*TNUM) && (nums[idx[i]] < num)) {
                num = nums[idx[i]];
                minidx = i;
            }
        }
        snums[sidx] = nums[idx[minidx]];
        idx[minidx]++;
    }
}

int
main()
{
    unsigned long   i;
    struct timeval  start, end;
    long long       startusec, endusec;
    double          elapsed;
    int             err;
    pthread_t       tid;

    /*
     * Create the initial set of numbers to sort.
     */
    srandom(1);
    for (i = 0; i < NUMNUM; i++)
        nums[i] = random();

    /*
     * Create 8 threads to sort the numbers.
     */
    gettimeofday(&start, NULL);
    pthread_barrier_init(&b, NULL, NTHR+1);
    for (i = 0; i < NTHR; i++) {
        err = pthread_create(&tid, NULL, thr_fn, (void *)(i * TNUM));
        if (err != 0) {
            printf("can't create thread\n");
            return -1;
        }
    }
    pthread_barrier_wait(&b);
    merge();
    gettimeofday(&end, NULL);

    /*
     * Print the sorted list.
     */
    startusec = start.tv_sec * 1000000 + start.tv_usec;
    endusec = end.tv_sec * 1000000 + end.tv_usec;
    elapsed = (double)(endusec - startusec) / 1000000.0;
    printf("sort took %.4f seconds\n", elapsed);
    for (i = 0; i < NUMNUM; i++) {
        if ((i < (NUMNUM-1)) && (snums[i] > snums[i+1]))
            printf("sort failed!\n");
        printf("%ld, ", snums[i]);
    }
    printf("\n");
    exit(0);
}
A few points about the above program are worth noting:
I) When initializing the barrier object, the thread count specified is the number of worker threads plus 1; the extra 1 accounts for the main thread.
II) The worker threads do not check whether pthread_barrier_wait() returned 0 or PTHREAD_BARRIER_SERIAL_THREAD, because the program is written so that the main thread is the one that merges the results of the other threads.

Semaphores

Semaphores can be used for synchronization between processes or between threads of the same process; for now we discuss only thread synchronization. A semaphore object has an associated integer value that is always greater than or equal to 0, and two operations can be performed on an initialized semaphore. One is to decrement the value by 1 by calling sem_wait, the commonly named P operation, which blocks the calling thread if the current value is 0. The other is to increment the value by 1 by calling sem_post, the commonly named V operation; if threads are blocked on the semaphore, one of them is awakened, that is, it returns from its sem_wait call. The two operations are declared as follows:

#include <semaphore.h>

int sem_post(sem_t *sem);
int sem_wait(sem_t *sem);
Since Linux 2.6, NPTL has implemented the semaphore semantics required by POSIX. Here is a simple example of using semaphores; the code is as follows:

#include <unistd.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <semaphore.h>

#define BUFF_SIZE 5    /* total number of slots */
#define NP        3    /* total number of producers */
#define NC        3    /* total number of consumers */
#define NITERS    4    /* number of items produced/consumed */
#define NOITEM   -1    /* stands for no item */

typedef struct {
    int   buf[BUFF_SIZE];  /* shared buffer */
    int   in;              /* buf[in%BUFF_SIZE] is the first empty slot */
    int   out;             /* buf[out%BUFF_SIZE] is the first full slot */
    sem_t full;            /* keeps track of the number of full slots */
    sem_t empty;           /* keeps track of the number of empty slots */
    sem_t mutex;           /* enforces mutual exclusion to shared data */
} sbuf_t;

sbuf_t shared;

void *
producer(void *arg)
{
    int i, item, index;

    index = (int)(long)arg;
    for (i = 0; i < NITERS; i++) {
        /* produce item */
        item = i;

        /* prepare to write item to buf */
        /* if there are no empty slots, wait */
        sem_wait(&shared.empty);
        /* if another thread uses the buffer, wait */
        sem_wait(&shared.mutex);
        shared.buf[shared.in] = item;
        shared.in = (shared.in + 1) % BUFF_SIZE;
        printf("[P%d] producing item %d ...\n", index, item);
        fflush(stdout);
        /* release the buffer */
        sem_post(&shared.mutex);
        /* increment the number of full slots */
        sem_post(&shared.full);

        /* interleave producer and consumer execution */
        if (i % 2 == 1)
            sleep(1);
    }
    return NULL;
}

void *
consumer(void *arg)
{
    int i, item, index;

    index = (int)(long)arg;
    for (i = 0; i < NITERS; i++) {
        /* prepare to read item from buf */
        /* if there are no full slots, wait */
        sem_wait(&shared.full);
        /* if another thread uses the buffer, wait */
        sem_wait(&shared.mutex);
        /* consume item */
        item = shared.buf[shared.out];
        shared.buf[shared.out] = NOITEM;
        shared.out = (shared.out + 1) % BUFF_SIZE;
        printf("------> [C%d] consuming item %d ...\n", index, item);
        fflush(stdout);
        /* release the buffer */
        sem_post(&shared.mutex);
        /* increment the number of empty slots */
        sem_post(&shared.empty);

        /* interleave producer and consumer execution */
        if (i % 2 == 1)
            sleep(1);
    }
    return NULL;
}

int
main()
{
    pthread_t idp[NP], idc[NC];
    int       index;

    /* initialize the unnamed semaphores */
    sem_init(&shared.full, 0, 0);
    sem_init(&shared.empty, 0, BUFF_SIZE);
    /* initialize the binary semaphore used as a mutex */
    sem_init(&shared.mutex, 0, 1);

    /* create NP producers */
    for (index = 0; index < NP; index++)
        pthread_create(&idp[index], NULL, producer, (void *)(long)index);
    /* create NC consumers */
    for (index = 0; index < NC; index++)
        pthread_create(&idc[index], NULL, consumer, (void *)(long)index);

    /* wait for all producers and consumers */
    for (index = 0; index < NP; index++)
        pthread_join(idp[index], NULL);
    for (index = 0; index < NC; index++)
        pthread_join(idc[index], NULL);
    exit(0);
}
Compiling and running the program gives output like the following:
$ gcc -Wall -o sem_example sem_example.c -lpthread
$ ./sem_example
[P1] producing item 0 ...
[P0] producing item 0 ...
[P0] producing item 1 ...
[P2] producing item 0 ...
[P2] producing item 1 ...
------> [C1] consuming item 0 ...
------> [C1] consuming item 0 ...
------> [C2] consuming item 1 ...
------> [C2] consuming item 0 ...
------> [C0] consuming item 1 ...
[P1] producing item 1 ...
------> [C0] consuming item 1 ...
[P0] producing item 2 ...
[P0] producing item 3 ...
------> [C1] consuming item 2 ...
------> [C1] consuming item 3 ...
[P1] producing item 2 ...
[P1] producing item 3 ...
[P2] producing item 2 ...
[P2] producing item 3 ...
------> [C0] consuming item 2 ...
------> [C0] consuming item 3 ...
------> [C2] consuming item 2 ...
------> [C2] consuming item 3 ...

The above program uses semaphores to implement a simple multiple-producer, multiple-consumer problem. A few points are worth noting:

I) A semaphore is initialized with sem_init, which is declared as follows:

#include <semaphore.h>

int sem_init(sem_t *sem, int pshared, unsigned int value);
Here the parameter value is the initial value of the semaphore. The parameter pshared specifies whether the semaphore is used for synchronization between processes or between threads. If pshared is 0, the semaphore is used for synchronization between threads, and the address sem of the semaphore object must be visible to the threads being synchronized; typically the object is declared as a global or allocated dynamically on the heap. If pshared is nonzero, the semaphore is used for synchronization between processes, and sem must point into memory shared by those processes, for example an address returned by mmap or obtained through shm_open. Calling sem_init again on an already-initialized semaphore produces undefined results.

Also noteworthy is that the Linux implementation follows the POSIX rules: since Linux 2.6, glibc's NPTL fully implements the POSIX semaphore semantics. A named semaphore used for inter-process communication has kernel persistence: if sem_unlink is never called, the semaphore continues to consume resources even after the process that created it exits. On Linux, named semaphores are created in a virtual file system, usually mounted at /dev/shm, under names of the form sem.somename; this is why the maximum length of a semaphore name is NAME_MAX-4.

Comparison of various synchronization methods

A brief summary of the use cases, advantages, and disadvantages of the synchronization methods discussed so far:

I) When a piece of data or code may be accessed by only one thread at a time, a mutex can be used: all threads agree to acquire the mutex before operating on the data. A mutex has the defining property that only one thread can hold the lock at a time.
II) Condition variables are typically used when a thread can proceed only if some condition holds and must otherwise block, while another thread can make the condition true and then, through the related interfaces, wake the threads blocked on the condition variable. A condition variable must be used together with a mutex, because the mutex protects the condition being tested. Combining a condition variable with a mutex, rather than using a mutex alone, prevents the threads waiting for the condition from busy-waiting.
III) Read-write locks further subdivide lock operations by type, distinguishing reads from writes. A mutex is either locked or unlocked, only one thread can hold it at a time, and it makes no distinction between the reads and writes the holder will perform; a read-write lock distinguishes a thread's request to lock for reading from a request to lock for writing. As a result, when reads greatly outnumber writes, a program achieves higher concurrency with read-write locks than with mutexes alone.
IV) A thread that blocks while acquiring a mutex is put to sleep, whereas a thread acquiring a spin lock busy-waits: it does not yield the CPU, consumes CPU time, and repeatedly attempts to acquire the lock until it succeeds. Spin locks suit situations where locks are held for short periods and threads want to avoid the cost of being rescheduled.
V) A barrier is mainly used to coordinate multiple threads working in parallel on a single task, typically where the threads do similar work on different pieces of data, such as the sorting example above.
VI) A semaphore carries state (its counter). For a mutex the count is effectively 1, and a mutex must always be unlocked by the thread that locked it, whereas a semaphore's post operation need not be performed by the same thread that performs the wait. After a post, the counter increment is always remembered; by contrast, if a condition variable is signaled while no thread is waiting on it, the signal is simply lost. The POSIX.1 rationale states that semaphores were provided chiefly as an inter-process synchronization mechanism, complementing mutexes and condition variables.
Although all of these synchronization methods exist, the mutex and condition variable are by far the most commonly used; the others are needed only in rarer cases. In essence, higher-level primitives such as read-write locks and semaphores can be implemented using mutexes and condition variables.

Resources

http://man7.org/linux/man-pages/man7/sem_overview.7.html

http://www.csc.villanova.edu/~mdamian/threads/posixsem.html

http://www.jbox.dk/sanos/source/lib/pthread/

Advanced Programming in the UNIX Environment, §11.6 Thread Synchronization


