Non-obvious bugs specific to multithreaded programming

Last Update:2016-12-02 Source: Internet

Author: User

Tags mutex syslog volatile

We all know that writing multithreaded programs means keeping many details in mind, such as locking and the use of thread-safe libraries. Here is a list of less obvious bugs that are specific to multithreaded programs. Many of them are not mentioned in beginner documentation or tutorials, but I think everyone who uses threads will eventually be bitten by them.

Using thread-safe system functions

Not all system or library functions can be used safely from multiple threads. One of the most obvious examples is strtok(3), which tokenizes a string. It returns the next token on each call and uses global state to keep track of the current position in the source string. If you read its man page, you will see a thread-safe version, strtok_r(3), which takes an additional parameter: a pointer to a state variable that replaces the global one. Other functions of this kind are:

mbstowcs(3): replace with mbsrtowcs(3)
localtime(3): replace with localtime_r(3)
gethostbyname(3): replace with gethostbyname_r(3) or, better, getaddrinfo(3)
rand(3): replace with random_r(3)
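
As an illustration of the strtok_r(3) variant, here is a minimal sketch. The helper name split_tokens is made up for this example; the only assumption is a POSIX system that provides strtok_r in <string.h>.

```cpp
#include <string.h>   // strtok_r (POSIX)
#include <string>
#include <vector>

// Hypothetical helper: split `input` on any character found in `delims`.
// strtok_r keeps its position in `state`, a local variable, so concurrent
// calls from different threads do not interfere with each other.
std::vector<std::string> split_tokens(const std::string& input, const char* delims) {
    // strtok_r modifies the string it scans, so work on a private copy.
    std::vector<char> buffer(input.begin(), input.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    char* state = nullptr;
    for (char* tok = strtok_r(buffer.data(), delims, &state);
         tok != nullptr;
         tok = strtok_r(nullptr, delims, &state)) {
        tokens.emplace_back(tok);
    }
    return tokens;
}
```

Because all state lives in local variables, two threads can call split_tokens at the same time without corrupting each other's position in the string.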

Using a variable that is not protected by a mutex; the volatile keyword is misleading

You might think that a shared "simple" variable, such as a Boolean flag, can be used without a mutex.

    bool stop = false;

    while (!stop) {
        sleep(1);
    }

With compiler optimization turned on, setting the stop variable to true from another thread may never interrupt the loop above. The compiler is free to apply optimizations: one reason is that when it finds that the variable is not modified inside the loop, it can omit the check in the while condition. Another reason is that, depending on the architecture of the system, the change in memory may not be noticed by other processors.

I ran into the first case while debugging a database application. A local variable was initialized in a procedure and then went through a long series of operations, yet the final test produced incorrect results. A second debugging pass showed that the variable still held its uninitialized default value. At first I could not believe that the assignment had simply not taken effect. Eventually I tried moving the assignment earlier in the code, and the result came out right. Only then did I realize that compiler optimization was probably the cause. Bugs caused by compiler optimization are very laborious to track down.

The volatile keyword is sometimes considered a solution, but it has nothing to do with threads. The keyword is intended for low-level code (such as device drivers), to make sure that a memory write really reaches the device, and so on. It does not do what we need in a multithreaded process: it cannot make changes in memory visible to other processors. It may happen to work on some architectures, but it should not be used this way.

The right solution is to use a mutex when accessing the stop variable, even for such a "simple" memory access.
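
A minimal sketch of the mutex-protected flag could look like this. The class name StopFlag and the function run_until_stopped are hypothetical, and the sketch assumes a C++11 compiler; it uses std::mutex rather than raw POSIX calls.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical wrapper: every read and write of the flag takes the mutex,
// so the compiler cannot cache the value and the write becomes visible
// to the other threads.
class StopFlag {
public:
    void request_stop() {
        std::lock_guard<std::mutex> guard(mutex_);
        stop_ = true;
    }
    bool stop_requested() {
        std::lock_guard<std::mutex> guard(mutex_);
        return stop_;
    }

private:
    std::mutex mutex_;
    bool stop_ = false;
};

// Worker loop corresponding to the `while (!stop)` example. Returns the
// number of iterations it ran before it saw the stop request.
int run_until_stopped(StopFlag& flag) {
    int iterations = 0;
    while (!flag.stop_requested()) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
        ++iterations;
    }
    return iterations;
}
```

In C++11 and later, std::atomic<bool> is another standard way to share such a flag; the mutex version is shown here because it matches the advice in the text.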

Double close and use of invalid file descriptors

Consider the following code snippet:

     1  fd = open("file", O_RDONLY);
     2  if (fd < 0) exit(1);
     3
     4  while ((res = read(fd, buf, sizeof(buf)))) {
     5      if (res < 0) {
     6          close(fd);
     7          fprintf(stderr, "Read error!\n");
     8          break;
     9      }
    10      else {
    11          printf("Read %zd bytes\n", res);
    12      }
    13  }
    14
    15  close(fd);

What's the problem? In a single-threaded program this works correctly even though it contains a bug: if read fails on line 4, the file descriptor is closed twice, and the second close(2), on line 15, simply returns an error that is ignored. In a multithreaded program, however, this code can get you into trouble, and usually nasty trouble. Why? Because the second close(2) on line 15 may not fail. There is a race condition: if another thread opens a file or creates a socket between the first and the second close(2) and receives the same fd number, this thread will close it.

Be aware that file descriptors are shared between all threads of a process. Closing another thread's fd is not even the worst possible scenario. Just think: if the code above tried to write to fd before the second close(), the data would be written into the other thread's file or TCP connection!

A double close is one of the hardest bugs to track down in multithreaded programs, because the race condition rarely reproduces and the resulting errors are usually very strange. As a countermeasure, always check the return value of every close(2). Programs usually skip this check, especially when the fd was only used for reading, on the assumption that closing a file that was only read cannot fail. If we log every failed close(2), we can find this bug before the race condition strikes. In most cases the second close(2) is far more likely to fail than to close another thread's fd.
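
One way to follow this advice is sketched below. The helpers close_once and count_bytes are hypothetical names, not part of any library: the descriptor is closed in exactly one place, the result of close(2) is checked and logged, and the variable is reset to -1 so that a second call cannot touch a descriptor number that another thread may have reused.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: close the descriptor at most once, report failures,
// and reset it to -1 so that a later call becomes a harmless no-op.
// Returns true if this call closed the descriptor without error.
bool close_once(int& fd) {
    if (fd < 0) return false;          // already closed, or never opened
    int rc = close(fd);
    int saved_errno = errno;
    fd = -1;                           // never retry: the number may be reused
    if (rc != 0) {
        std::fprintf(stderr, "close failed: %s\n", std::strerror(saved_errno));
        return false;
    }
    return true;
}

// Reads the whole file and returns the number of bytes read, or -1 on error.
// Every path out of the loop goes through the single close_once() call.
long count_bytes(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    char buf[4096];
    long total = 0;
    ssize_t res;
    while ((res = read(fd, buf, sizeof(buf))) != 0) {
        if (res < 0) {
            std::fprintf(stderr, "Read error!\n");
            total = -1;
            break;
        }
        total += res;
    }
    close_once(fd);
    return total;
}
```

The error branch no longer closes the descriptor itself; it only records the failure and leaves the loop, so the close happens once regardless of how the loop ends.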

Uncaught exceptions

An uncaught exception causes the process to exit with an error message. In a multi-process network daemon, such an error terminates only one process, and a correctly written program will respawn it. When such a daemon is converted to a multithreaded design, uncaught exceptions become more dangerous, because they kill the entire program, not just one thread. So you have to keep this in mind and catch all exceptions somewhere in the top-level code, even in the following way:

    try {
        // ...
    }
    catch (...) {
        log("Unknown exception");
    }

Using catch (...) without re-throwing the exception is bad practice, but at least the program can keep handling the remaining client requests. This may be the only situation where catch (...) is justified.
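
As a sketch of such a top-level guard for a thread, assuming C++11 std::thread (where an exception that escapes the thread function calls std::terminate and ends the whole process), something like the following could be used. The name run_guarded is hypothetical, and std::cerr stands in for whatever logging facility the daemon has.

```cpp
#include <exception>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <thread>

// Hypothetical top-level wrapper for a thread's work. Everything thrown by
// `work` is caught here so that it cannot escape the thread function.
// Returns true if `work` finished without throwing.
bool run_guarded(const std::function<void()>& work) {
    try {
        work();
        return true;
    } catch (const std::exception& e) {
        std::cerr << "Unhandled exception: " << e.what() << '\n';
    } catch (...) {
        std::cerr << "Unknown exception\n";
    }
    return false;
}
```

A worker thread would then be started as std::thread t([&] { run_guarded(handle_client); }), where handle_client is a placeholder for the real request handler.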

Using the fork() system call

I will summarize the use of fork() in multithreaded processes in a later article; see also the documentation of the O_CLOEXEC flag for open(2) and dup3(2). But basically: there is no safe way to use fork() in a multithreaded process for anything more than calling execve() in the child process. You cannot know what the other threads are doing at the moment fork() is called: some mutexes may be held by some threads, some threads may be in the middle of modifying complex data structures, and so on.
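
The fork-then-exec pattern can be sketched as follows. The helper name run_shell_command is hypothetical and the sketch assumes that /bin/sh exists; the point is that the child does nothing between fork() and execve() except call execve() and, if that fails, _exit().

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char** environ;

// Hypothetical helper: run `sh -c command` in a child process and return its
// exit status, or -1 on failure. Assumes /bin/sh exists. After fork() the
// child calls only execve() and _exit(): no malloc, no stdio, and no mutex
// that another thread might have been holding at the time of the fork.
int run_shell_command(const char* command) {
    // Build argv before fork() so the child has nothing left to prepare.
    char* const argv[] = {const_cast<char*>("sh"), const_cast<char*>("-c"),
                          const_cast<char*>(command), nullptr};

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execve("/bin/sh", argv, environ);
        _exit(127);                    // only reached if execve() failed
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Descriptors opened elsewhere in the process are still inherited by the new program unless they were opened with O_CLOEXEC, which is why that flag is mentioned above.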

Performing I/O operations while a mutex is locked

Here is a performance tip: avoid I/O operations while holding a mutex. At a minimum avoid I/O; preferably avoid any system call, or even any library call, while the mutex is locked.

Trust me: in a very busy network daemon that handles thousands of requests per second, you do not want threads to wait for another thread that happens to be writing an error message through a syslog(3) call while holding a mutex. A mutex should be used only to synchronize access to memory. Consider:

    pthread_mutex_lock(&mutex);
    if (freeslots == 0) {
        syslog(LOG_ERR, "no slots available, rejecting request");
    } else {
        freeslots--;
    }
    pthread_mutex_unlock(&mutex);

When syslog(3) is called, the mutex is still held. Depending on the configuration of the syslog daemon and the load on the machine, the call may take tens or even hundreds of milliseconds to complete, for example when fsync() is executed after every log line. So you just need to unlock the mutex before logging. This allows other threads to run without waiting for the I/O to complete.
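
A possible restructuring of the snippet is sketched below: the decision is made under the lock and remembered in a local variable, and the log call happens after the unlock. The function name reserve_slot, the variable name slots_mutex, and the initial value of freeslots are assumptions made for the example.

```cpp
#include <pthread.h>
#include <syslog.h>

static pthread_mutex_t slots_mutex = PTHREAD_MUTEX_INITIALIZER;
static int freeslots = 2;   // hypothetical initial value for this sketch

// Returns true if a slot was reserved. The mutex protects only the memory
// access; the potentially slow syslog() call runs after it is released.
bool reserve_slot() {
    pthread_mutex_lock(&slots_mutex);
    bool reserved = (freeslots > 0);
    if (reserved) {
        freeslots--;
    }
    pthread_mutex_unlock(&slots_mutex);

    if (!reserved) {
        syslog(LOG_ERR, "no slots available, rejecting request");
    }
    return reserved;
}
```

The critical section now contains only a comparison and a decrement, so a slow logging call delays just the thread whose request was rejected.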

Recommendation: wrap the mutex in a class

If you are using C++, do not call the POSIX mutex functions directly. It is much easier to create a mutex class that acquires the lock in its constructor and releases it in its destructor. With this approach you simply create an automatic variable of that class: the lock is taken in the constructor and is released automatically by the destructor at the end of the enclosing scope. An example of this kind is scoped_lock in the Boost library.
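
A minimal sketch of such a wrapper around a POSIX mutex is shown below. The class names Mutex and ScopedLock and the function increment are made up for the example, and error handling of the pthread calls is omitted for brevity.

```cpp
#include <pthread.h>

// Hypothetical minimal wrapper that owns a POSIX mutex.
class Mutex {
public:
    Mutex() { pthread_mutex_init(&handle_, nullptr); }
    ~Mutex() { pthread_mutex_destroy(&handle_); }
    Mutex(const Mutex&) = delete;
    Mutex& operator=(const Mutex&) = delete;

    void lock() { pthread_mutex_lock(&handle_); }
    void unlock() { pthread_mutex_unlock(&handle_); }
    bool try_lock() { return pthread_mutex_trylock(&handle_) == 0; }

private:
    pthread_mutex_t handle_;
};

// Acquires the lock in the constructor and releases it in the destructor.
class ScopedLock {
public:
    explicit ScopedLock(Mutex& m) : mutex_(m) { mutex_.lock(); }
    ~ScopedLock() { mutex_.unlock(); }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    Mutex& mutex_;
};

// Example use: the lock is released on every path out of the function,
// including early returns and exceptions.
int increment(Mutex& m, int& counter) {
    ScopedLock guard(m);
    return ++counter;
}
```

Since C++11 the standard library offers the same idea as std::mutex with std::lock_guard, so a hand-written wrapper like this is mainly useful when the code has to work with pthread_mutex_t directly.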
