A simple tool for sharing memory
POSIX (Portable Operating System Interface) threads are a powerful way to improve the responsiveness and performance of your code. In this series, Daniel Robbins shows you exactly how to use threads in your programs, and covers a lot of background detail along the way. By the end of the series, you will be able to create multithreaded programs with POSIX threads.
Threads are interesting
Knowing how to use threads correctly is essential for every good programmer. A thread is similar to a process.
Like processes, threads are time-sliced by the kernel. On a single-processor system, the kernel uses time slicing to simulate concurrent execution of threads, just as it does for processes. On a multiprocessor system, threads can actually run concurrently, just like multiple processes.
So why is multithreading preferable to multiple independent processes for most cooperative tasks? Because threads share the same memory space. Different threads can access the same variable in memory, so every thread in the program can read or write the declared global variables. If you have ever written non-trivial code with fork(), you will recognize how important this is. Why? Although fork() lets you create multiple processes, it also raises a communication problem: how do you get multiple processes to talk to each other when each process has its own memory space? There is no simple answer to this question. Many kinds of local IPC (inter-process communication) exist, but they all suffer from two major drawbacks:
They impose some form of additional kernel overhead, which lowers performance.
In most cases, IPC is not a "natural" extension of your code. It often greatly increases the complexity of the program.
A double whammy: neither overhead nor complexity is a good thing.
If you have ever had to make massive changes to a program so that it supports IPC, you will truly appreciate the simple memory sharing that threads provide. Because all threads live in the same memory space, POSIX threads do not need to make expensive and complicated long-distance calls. With a little synchronization, all the threads in your program can read and modify its existing data structures. You do not have to pump data through a file descriptor or squeeze it into a tight shared memory region. This reason alone is enough to consider the single-process/multithreaded model rather than the multiprocess/single-threaded model.
Threads are fast
And that's not all. Threads are also very fast. Compared with a standard fork(), they carry very little overhead. The kernel does not need to make a separate copy of the process's memory space or file descriptors. That saves a lot of CPU time, making thread creation ten to a hundred times faster than creating a new process. Because of this, you can use a large number of threads without worrying much about running short of CPU or memory. The heavy CPU cost that comes with fork() simply isn't there. This means you can generally create a thread wherever it makes sense to do so in your program.
Of course, just like processes, threads can take advantage of multiple CPUs. This is a really great feature if your software is designed to run on a multiprocessor machine (and if the software is open source, it will probably end up running on quite a few platforms). The performance of certain kinds of threaded programs (CPU-intensive ones in particular) scales almost linearly with the number of processors in the system. If you are writing a very CPU-intensive program, it is well worth finding ways to use multiple threads in your code. Once you are comfortable writing threaded code, you will also be able to approach coding problems in new and creative ways, without a lot of IPC red tape and other complicated communication mechanisms.
All these features work together to make multi-threaded programming more interesting, fast, and flexible.
Threads are portable
If you are familiar with Linux programming, you may know the __clone() system call. __clone() is similar to fork() and also has many of the features of threads. For example, with __clone() a new child process can selectively share the execution environment of its parent (memory space, file descriptors, and so on). That is the good side. But __clone() also has a drawback. As the __clone() man page states:
"__clone is specific to the Linux platform and should not be used in programs intended to be portable. For programming threaded applications (multiple threads of control in the same memory space), it is better to use a library that implements the POSIX 1003.1c thread API, such as the Linux-threads library. See pthread_create(3thr)."
Although __clone() has many thread-like properties, it is not portable. That does not mean you cannot use it in your code, but you should weigh this fact when you consider using __clone() in your software. Fortunately, as the __clone() man page points out, there is a better alternative: POSIX threads. If you want to write portable multithreaded code that runs on Solaris, FreeBSD, Linux, and other platforms, POSIX threads are the way to go.
First thread
Here is a simple example program that uses POSIX threads:
thread1.c
#include <pthread.h>
#include <stdio.h>   /* printf() */
#include <stdlib.h>
#include <unistd.h>

/* Entry point of the new thread: print a greeting once a second, 20 times. */
void *thread_function(void *arg) {
    int i;
    for (i = 0; i < 20; i++) {
        printf("Thread says hi!\n");
        sleep(1);
    }
    return NULL;
}

int main(void) {
    pthread_t mythread;

    /* pthread_create() returns zero on success, non-zero on failure. */
    if (pthread_create(&mythread, NULL, thread_function, NULL)) {
        printf("error creating thread.");
        abort();
    }

    /* Wait for the new thread to finish before exiting. */
    if (pthread_join(mythread, NULL)) {
        printf("error joining thread.");
        abort();
    }

    exit(0);
}
To compile this program, save it as thread1.c and enter:
$ gcc thread1.c -o thread1 -lpthread
To run it, enter:
$ ./thread1
Understanding thread1.c
thread1.c is a very simple threaded program. Although it does not do anything useful, it helps you understand how threads work. Let's walk through what the program does, step by step.
main() declares a variable named mythread of type pthread_t. The pthread_t type is defined in pthread.h and is often called a "thread ID" (abbreviated "TID"). You can think of it as a handle for a thread.
After mythread is declared (remember that mythread is just a TID, a handle for the thread we are about to create), we call pthread_create() to create a real, live thread. Don't be confused by the fact that pthread_create() sits inside an "if" statement. Because pthread_create() returns zero on success and a non-zero value on failure, placing the call in an if() makes it easy to detect a failed call. Now let's look at the arguments to pthread_create(). The first, &mythread, is a pointer to mythread. The second, which we currently pass as NULL, can be used to define certain attributes of the thread. Because the default thread attributes suit us, we simply set it to NULL.
The third argument is the name of the function that the new thread runs when it starts. Here, that function is thread_function(). When thread_function() returns, the new thread terminates. In this example the thread function doesn't do much: it prints "Thread says hi!" 20 times and then exits. Note that thread_function() takes a void * as its argument and also returns a void *. This means that a void * can be used to pass any kind of data to the new thread, and that the new thread can return any kind of data when it finishes. So how do you pass an arbitrary argument to the thread? Simple: use the fourth argument of pthread_create(). In this example we set the fourth argument to NULL, because there is no need to pass any data to our trivial thread_function().
As you may have guessed, the program consists of two threads once pthread_create() returns successfully. Wait, two threads? Didn't we create only one? Yes, we did. But the main program counts as a thread too. Think of it this way: a program written without POSIX threads at all is single-threaded (that single thread is called the "main" thread). Once a new thread is created, the program has two threads in total.
You probably have at least two important questions at this point. The first is what the main thread does after the new thread has been created. The answer: it simply carries on and executes the next statement in the program (in this example, "if (pthread_join(...))").
The second question is what happens to the new thread when it finishes. The answer is that the new thread stops and then waits to be merged, or "joined", with another thread as part of its cleanup.
Now let's look at pthread_join(). Just as pthread_create() splits one thread into two, pthread_join() merges two threads into one. The first argument to pthread_join() is our TID, mythread. The second argument is a pointer to a void pointer. If it is not NULL, pthread_join() stores the thread's void * return value at the location it points to. Because we don't care about the return value of thread_function(), we pass NULL.
thread_function() takes 20 seconds to complete. Long before it finishes, the main thread has already called pthread_join(). When that happens, the main thread blocks (goes to sleep) and waits for thread_function() to complete. When thread_function() finishes, pthread_join() returns, and the program is back to a single main thread. By the time the program exits, every new thread has been joined with pthread_join(). This is exactly how you should treat every new thread you create in your programs. A thread that has not been joined still counts against the system's maximum thread limit. This means that if you don't clean up your threads properly, new pthread_create() calls will eventually fail.
No parent, no child
If you have ever used the fork() system call, you are probably familiar with the concept of parent and child processes. When fork() creates a new process, the new process is the child and the original process is the parent.
This creates a hierarchy that can be useful, especially when waiting for child processes to terminate. The waitpid() function, for example, can make the current process wait until any of its child processes terminates. waitpid() is used to implement a simple cleanup routine in the parent process.
POSIX threads are a bit more interesting. You may have noticed that I have deliberately avoided the terms "parent thread" and "child thread". That is because this hierarchy does not exist in POSIX threads. The main thread can create a new thread, and that new thread can create another new thread, but the POSIX threads standard treats them all as peers. So the concept of waiting for a child thread to exit has no meaning here. The POSIX threads standard records no "family" information at all. This lack of family information has one major implication: if you want to wait for a thread to terminate, you must pass that thread's TID to pthread_join(). The threads library cannot work out the TID for you.
For most developers this is not great news, because it can complicate programs that use many threads. But don't let it worry you. The POSIX threads standard provides all the tools you need to manage multiple threads gracefully. In fact, the absence of a parent/child relationship opens up more creative ways of using threads in your programs. For example, if thread 1 creates thread 2, thread 1 does not have to be the one that calls pthread_join() on thread 2; any other thread in the program can do it. This allows for some interesting possibilities when you write heavily threaded code. For example, you can keep a global "dead thread list" that contains all stopped threads, and have a dedicated cleanup thread that waits for entries to be added to the list. The cleanup thread calls pthread_join() to merge each newly stopped thread with itself. Now all cleanup is handled neatly and efficiently by a single thread.
Synchronized swimming
Now let's look at some code that does something a little unexpected. Here is thread2.c:
thread2.c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int myglobal;   /* shared by both threads, deliberately left unprotected */

void *thread_function(void *arg) {
    int i, j;
    for (i = 0; i < 20; i++) {
        j = myglobal;       /* read the shared value into a local copy */
        j = j + 1;
        printf(".");
        fflush(stdout);
        sleep(1);
        myglobal = j;       /* write the local copy back one second later */
    }
    return NULL;
}

int main(void) {
    pthread_t mythread;
    int i;

    if (pthread_create(&mythread, NULL, thread_function, NULL)) {
        printf("error creating thread.");
        abort();
    }

    for (i = 0; i < 20; i++) {
        myglobal = myglobal + 1;
        printf("o");
        fflush(stdout);
        sleep(1);
    }

    if (pthread_join(mythread, NULL)) {
        printf("error joining thread.");
        abort();
    }

    printf("\nmyglobal equals %d\n", myglobal);
    exit(0);
}
Understanding thread2.c
Like the first program, this one creates a new thread. Both the main thread and the new thread increment the global variable myglobal 20 times. But the program itself produces some unexpected results. To compile it, enter:
$ gcc thread2.c -o thread2 -lpthread
To run it, enter:
$ ./thread2
Output:
..o.o.o.o.oo.o.o.o.o.o.o.o.o.o..o.o.o.o.o
myglobal equals 21
Quite unexpected! Because myglobal starts at zero and the main thread and the new thread each increment it 20 times, myglobal should equal 40 at the end of the program. Instead it ends up at 21, which is clearly wrong. So what is going on?
Give up? OK, let me explain. First, look at thread_function(). Notice how it copies myglobal into the local variable j, then increments j, sleeps for one second, and only then copies the new value of j back into myglobal? That is the key. Imagine what happens if the main thread increments myglobal just after the new thread has copied myglobal into j. When thread_function() writes the value of j back to myglobal, it overwrites the change made by the main thread.
When you write threaded programs, you want to avoid useless side effects like this one; otherwise you are just wasting your time (unless, of course, you are writing an article about POSIX threads). So how can we eliminate the problem?
Because the trouble comes from copying myglobal into j and holding it there for a second before writing it back, you could try to avoid the temporary local variable and increment myglobal directly. Although this approach would probably work for this particular example, it is still not correct. And if we performed a relatively complex mathematical operation on myglobal rather than simply adding one, it would break down. But why?
To understand the problem, you have to remember that threads run concurrently. Even on a single-processor system (where the kernel uses time slicing to simulate multitasking), we can, from the programmer's point of view, imagine both threads executing at the same time. thread2.c is broken because thread_function() relies on the assumption that myglobal will not be modified during the roughly one second between reading it and writing the incremented value back. We need some way for a thread to tell the other threads to "keep away" while it is making changes to myglobal. I will explain how to do that in the next article.