Use reentrant functions for safer signal handling

Source: Internet
Author: User
Tags: signal handler

If you want concurrent access to a function, whether by threads or by processes, you may run into problems caused by the function's non-reentrancy. In this article, we use sample code to show what anomalies can occur if reentrancy is not guaranteed, paying special attention to signals. Five recommended programming practices are introduced, and a proposed compiler model is discussed in which reentrancy is handled by the compiler front end.

In the early days of programming, non-reentrancy was not a threat to programmers; functions had no concurrent access and were not interrupted. Many older C implementations assumed that functions ran in a single-threaded process environment.

Concurrent programming is now common practice, however, and you need to be aware of the pitfalls. This article describes some potential problems caused by non-reentrant functions in parallel and concurrent programming. Signal generation and handling add extra complexity. Because signals are asynchronous in nature, it is difficult to track down the bugs that occur when a signal handler calls a non-reentrant function.

This article:

  • Defines reentrancy and points to the POSIX list of reentrant functions.
  • Gives examples that illustrate the problems caused by non-reentrancy.
  • Suggests ways to ensure the reentrancy of underlying functions.
  • Discusses dealing with reentrancy at the compiler level.

What is reentrancy?
A reentrant function is one that can be used by more than one task concurrently without fear of data corruption. Conversely, a non-reentrant function is one that cannot be shared by more than one task unless mutual exclusion is ensured, either by using a semaphore or by disabling interrupts during critical sections of the code. A reentrant function can be interrupted at any time and resumed later without loss of data. Reentrant functions either use local variables or protect their data when they use global variables.

A reentrant function:

  • Does not hold static data over successive calls.
  • Does not return a pointer to static data; all data is provided by the caller of the function.
  • Uses local data, or protects global data by making a local copy of it.
  • Never calls any non-reentrant function.

Do not confuse reentrancy with thread safety. From the programmer's perspective, these are two separate concepts: a function can be reentrant, thread-safe, both, or neither. Non-reentrant functions cannot safely be used by multiple threads. Moreover, it may be impossible to make a non-reentrant function thread-safe.

IEEE Std 1003.1 lists 118 reentrant UNIX functions, which are not reproduced here. The list itself is published on unix.org.

The remaining functions are non-reentrant for any of the following reasons:

  • They call malloc or free.
  • They are known to use static data structures.
  • They are part of the standard I/O library.

Signals and non-reentrant functions
A signal is a software interrupt. It lets a programmer handle an asynchronous event. To send a signal to a process, the kernel sets a bit in the signal field of the process table entry, corresponding to the type of signal received. The ANSI C prototype of the signal function is:

void (*signal (int sigNum, void (*sigHandler)(int))) (int);

Or, in another representation:

typedef void sigHandler(int);
sigHandler *signal(int, sigHandler *);

When a process catches a signal, the normal sequence of instructions being executed is temporarily interrupted by the signal handler. The process continues executing, but the instructions in the signal handler are now executed. If the signal handler returns, the process resumes the normal sequence of instructions it was executing when the signal was caught.

Inside the signal handler, you cannot tell what the process was executing when the signal was caught. What if the process was in the middle of allocating additional memory on its heap with malloc, and you call malloc from the signal handler? Or what if the process was in the middle of a function that manipulates a global data structure, and you call the same function from the signal handler? In the malloc case, havoc can result for the process, because malloc usually maintains a linked list of all the areas it has allocated, and it may have been in the middle of changing that list.
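One common way to sidestep this uncertainty is to keep the handler trivial: it only records that the signal arrived, and the interrupted code does the real work later, when it is safe to call anything. The following is a minimal sketch of that idea, assuming POSIX sigaction is available; the names got_signal, on_signal, install_flag_handler, and consume_signal are hypothetical and do not come from the article.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Flag written by the handler and read by the main flow.
   volatile sig_atomic_t is the object type that the C standard
   allows a signal handler to assign to. */
static volatile sig_atomic_t got_signal = 0;

/* The handler calls no library functions at all. */
static void on_signal(int signum)
{
    (void)signum;
    got_signal = 1;
}

/* Installs the handler for signum; returns 0 on success, -1 on failure. */
int install_flag_handler(int signum)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signum, &sa, NULL);
}

/* Called from the main flow. Returns 1 and clears the flag if a signal
   arrived since the last call, 0 otherwise. */
int consume_signal(void)
{
    if (got_signal) {
        got_signal = 0;
        /* Non-reentrant work (malloc, printf, ...) can be done here,
           because this code runs outside the handler. */
        return 1;
    }
    return 0;
}
```

A main loop would call consume_signal() at a convenient point on each iteration. Several signals arriving between two checks are reported as one, which is usually acceptable for this pattern.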

A signal can even be delivered between the beginning and the end of a C operator that requires multiple instructions. At the programmer's level, a statement may appear atomic (that is, indivisible into smaller operations), yet it may take more than one processor instruction to complete. For example, take this piece of C code:

temp += 1;

On an x86 processor, that statement might compile to:

mov ax,[temp]
inc ax
mov [temp],ax

This is clearly not an atomic operation.
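When a counter really must be updated both by a handler and by the interrupted code, one option is a C11 atomic type whose operations are lock-free. The sketch below assumes a C11 compiler with <stdatomic.h> and POSIX sigaction; the names tick_count, count_tick, install_tick_counter, add_ticks, and read_ticks are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdatomic.h>
#include <string.h>

/* Shared counter; objects with static storage start at zero. */
static atomic_int tick_count;

/* The read-modify-write happens as one indivisible operation. */
static void count_tick(int signum)
{
    (void)signum;
    atomic_fetch_add(&tick_count, 1);
}

/* Installs the counting handler for signum. Returns 0 on success,
   -1 if atomic_int is not always lock-free on this platform or if
   sigaction fails. */
int install_tick_counter(int signum)
{
    struct sigaction sa;

    /* Only lock-free atomics are safe to touch from a handler. */
    if (ATOMIC_INT_LOCK_FREE != 2)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_tick;
    sigemptyset(&sa.sa_mask);
    return sigaction(signum, &sa, NULL);
}

/* Adds n from ordinary code and returns the new total. */
int add_ticks(int n)
{
    return atomic_fetch_add(&tick_count, n) + n;
}

/* Returns the current total. */
int read_ticks(void)
{
    return atomic_load(&tick_count);
}
```

Unlike temp += 1 on a plain int, a signal arriving during add_ticks cannot cause an increment to be lost.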

This example shows what can happen if a signal handler runs while a variable is being modified:

Listing 1. Running a signal handler while modifying a variable

#include <signal.h>
#include <stdio.h>
#include <unistd.h>   /* alarm() */

struct two_int { int a, b; } data;

void signal_handler(int signum)
{
    printf("%d, %d\n", data.a, data.b);
    alarm(1);
}

int main(void)
{
    static struct two_int zeros = { 0, 0 }, ones = { 1, 1 };

    signal(SIGALRM, signal_handler);
    data = zeros;
    alarm(1);
    while (1) {
        data = zeros;
        data = ones;
    }
}

This program fills data with zeros, then ones, then zeros, alternating forever. Meanwhile, once per second, the alarm signal handler prints the current contents. (Calling printf in the handler is safe in this program, because printf is certainly not being called outside the handler when the signal arrives.) What output do you expect from this program? It should print either 0, 0 or 1, 1. However, the actual output is as follows:

0, 0
1, 1
(Skipping some output...)
0, 1
1, 1
1, 0
1, 0
...

On most machines, storing a new value in data takes several instructions, each of which stores one word. If the signal is delivered between these instructions, the handler may find that data.a is 0 and data.b is 1, or vice versa. On the other hand, if the code runs on a machine that can store the value of an object in a single uninterruptible instruction, the handler will always print 0, 0 or 1, 1.

Another complication with signals is that running test cases alone cannot establish that the code is free of signal bugs, because signal generation is asynchronous by nature.

Non-reentrant functions and static variables
Suppose that the signal handler uses gethostbyname, which is non-reentrant. This function returns its value in a static object:

static struct hostent host; /* result stored here*/

It reuses the same object every time. In the following example, if the signal arrives during a call to gethostbyname in main, or even after the call while the program is still using the value, it clobbers the value that the program asked for.

Listing 2. Dangerous usage of gethostbyname

int main(void)
{
    struct hostent *hostPtr;
    ...
    signal(SIGALRM, sig_handler);
    ...
    hostPtr = gethostbyname(hostNameOne);
    ...
}

void sig_handler(int signum)
{
    struct hostent *hostPtr;
    ...
    /* call to gethostbyname may clobber the value stored during the call
       inside main() */
    hostPtr = gethostbyname(hostNameTwo);
    ...
}

However, if the program does not use gethostbyname or any other function that returns information in the same object, or if it always blocks signals around each use, you are safe.

Many library functions return values in a fixed object, always reusing the same object, and they can all cause the same problem. A function that uses and modifies an object you supply is also potentially non-reentrant: two calls interfere with each other if they use the same object.
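The standard time functions illustrate the difference. gmtime returns a pointer to a fixed object that later calls may overwrite, whereas the POSIX variant gmtime_r writes its result into storage supplied by the caller. The sketch below uses gmtime_r; the helper name year_of is hypothetical. Note that gmtime_r is not on the POSIX list of functions that are safe to call from a signal handler, so this addresses the shared-object problem only.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Returns the calendar year (UTC) of timestamp t, or -1 on failure.
   The broken-down time lives in a local object, so two calls never
   share a result buffer. */
int year_of(time_t t)
{
    struct tm result;   /* storage owned by this call */

    if (gmtime_r(&t, &result) == NULL)
        return -1;
    return result.tm_year + 1900;   /* tm_year counts years since 1900 */
}
```

Each call owns its struct tm, so one call cannot clobber the value another caller is still using.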

A similar situation arises when you do I/O with streams. Suppose that the signal handler prints a message with fprintf, and the program was in the middle of an fprintf call on the same stream when the signal was delivered. Both the handler's message and the program's data can be corrupted, because the two calls operate on the same data structure: the stream itself.
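One way to avoid touching a stream from a handler is to emit a fixed message with write, which POSIX lists as safe to call from signal handlers and which bypasses the stdio buffers entirely. This is a sketch under that assumption; alarm_msg, handler_fd, report_alarm, and install_alarm_reporter are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Message prepared in advance, so the handler does no formatting. */
static const char alarm_msg[] = "alarm received\n";

/* Descriptor the handler writes to; standard error by default. */
static volatile sig_atomic_t handler_fd = STDERR_FILENO;

static void report_alarm(int signum)
{
    ssize_t written;

    (void)signum;
    /* write(2) operates on the descriptor directly and shares no
       buffer with fprintf calls in the interrupted code. */
    written = write(handler_fd, alarm_msg, sizeof alarm_msg - 1);
    (void)written;   /* nothing useful can be done on failure here */
}

/* Installs the reporting handler for SIGALRM, writing to fd.
   Returns 0 on success, -1 on failure. */
int install_alarm_reporter(int fd)
{
    struct sigaction sa;

    handler_fd = fd;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = report_alarm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}
```

The output of the handler and of the program can still interleave on the terminal, but neither corrupts the other's stream state.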

Third-party libraries complicate matters further, because you never know which parts of a library are reentrant and which are not. As with the standard library, many of their functions return values in fixed objects that are reused on every call, which makes those functions non-reentrant.

The good news is that many vendors have begun to provide reentrant versions of the standard C library. For any given library, read its documentation to find out whether the prototypes or the usage of the standard library functions have changed.

Practices for ensuring reentrancy
Sticking to these five practices will help you keep your programs reentrant.

Practice 1
Returning a pointer to static data may make a function non-reentrant. For example, a strToUpper function that converts a string to uppercase could be implemented as follows:

Listing 3. Non-reentrant version of strToUpper

char *strToUpper(char *str)
{
    /* Returning pointer to static data makes it non-reentrant */
    static char buffer[STRING_SIZE_LIMIT];
    int index;

    for (index = 0; str[index]; index++)
        buffer[index] = toupper(str[index]);
    buffer[index] = '\0';
    return buffer;
}

You can implement a reentrant version of this function by changing its prototype. In the following listing, the caller provides the storage for the output:

Listing 4. Reentrant version of strToUpper

char *strToUpper_r(char *in_str, char *out_str)
{
    int index;

    for (index = 0; in_str[index] != '\0'; index++)
        out_str[index] = toupper(in_str[index]);
    out_str[index] = '\0';
    return out_str;
}

Having the caller supply the output storage ensures the reentrancy of the function. Note that this follows a common convention: a reentrant function is named by adding the "_r" suffix to the name of the original function.
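Because the caller now owns the output storage, the function has no way of knowing how large that storage is. A possible refinement, not part of the original listing, is to pass the size along as well. The variant below is a sketch; the name strToUpper_rn and its truncation behavior are assumptions made for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical bounded variant of strToUpper_r.
   Writes at most out_size - 1 characters plus a terminating '\0'
   into out_str, truncating the result if necessary.
   Returns out_str, or NULL if out_size is 0. */
char *strToUpper_rn(const char *in_str, char *out_str, size_t out_size)
{
    size_t index;

    if (out_size == 0)
        return NULL;

    for (index = 0; index + 1 < out_size && in_str[index] != '\0'; index++) {
        /* toupper expects a value representable as unsigned char */
        out_str[index] = (char)toupper((unsigned char)in_str[index]);
    }
    out_str[index] = '\0';
    return out_str;
}
```

Two callers that pass different buffers, for example a main program and a handler, cannot overwrite each other's results.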

Practice 2
Remembering the state of data makes a function non-reentrant. Different threads can call the function in succession and modify the data without informing the other threads that are using it. If a function needs to maintain the state of some data over a series of calls, such as a working buffer or a pointer, the caller should provide that data.

In the following example, a function returns the successive lowercase characters of a string. The string is provided only on the first call, as with the strtok subroutine. The function returns \0 when it reaches the end of the string. It could be implemented as follows:

Listing 5. Non-reentrant version of getLowercaseChar

char getLowercaseChar(char *str)
{
    static char *buffer;
    static int index;
    char c = '\0';

    /* stores the working string on first call only */
    if (str != NULL) {
        buffer = str;
        index = 0;
    }

    /* searches for a lowercase character */
    while ((c = buffer[index]) != '\0') {
        if (islower(c)) {
            index++;
            break;
        }
        index++;
    }
    return c;
}

This function is not reentrant, because it stores the state of its variables. To make it reentrant, the static data, here the index variable, must be maintained by the caller, which also passes the string on every call. A reentrant version of the function could look like this:

Listing 6. Reentrant version of getLowercaseChar

char getLowercaseChar_r(char *str, int *pIndex)
{
    char c = '\0';

    /* no initialization - the caller should have done it */

    /* searches for a lowercase character */
    while ((c = str[*pIndex]) != '\0') {
        if (islower(c)) {
            (*pIndex)++;
            break;
        }
        (*pIndex)++;
    }
    return c;
}
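With the index owned by the caller, two scans can be interleaved without disturbing each other, which the static-state version cannot do. The sketch below repeats a compact form of the Listing 6 function so that it compiles on its own; the driver interleaveScans is hypothetical, and it assumes the output buffers are large enough for the results.

```c
#include <ctype.h>
#include <stddef.h>

/* Compact restatement of the reentrant scanner from Listing 6. */
char getLowercaseChar_r(const char *str, int *pIndex)
{
    char c;

    while ((c = str[*pIndex]) != '\0') {
        (*pIndex)++;
        if (islower((unsigned char)c))
            break;
    }
    return c;
}

/* Hypothetical driver: alternates between two scans, collecting the
   lowercase letters of s1 into out1 and those of s2 into out2. */
void interleaveScans(const char *s1, const char *s2, char *out1, char *out2)
{
    int i1 = 0, i2 = 0;     /* each scan keeps its own position */
    size_t n1 = 0, n2 = 0;
    char c1 = 1, c2 = 1;    /* nonzero while a scan is still active */

    while (c1 != '\0' || c2 != '\0') {
        if (c1 != '\0' && (c1 = getLowercaseChar_r(s1, &i1)) != '\0')
            out1[n1++] = c1;
        if (c2 != '\0' && (c2 = getLowercaseChar_r(s2, &i2)) != '\0')
            out2[n2++] = c2;
    }
    out1[n1] = '\0';
    out2[n2] = '\0';
}
```

With the version from Listing 5, starting the second scan would reset the shared buffer and index and silently abandon the first scan.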

Practice 3
On most systems, malloc and free are not reentrant, because they use a static data structure that records which memory blocks are free. As a result, no library function that allocates or frees memory is reentrant. This includes functions that allocate space to store a result.

The best way to avoid allocating memory in a handler is to allocate, in advance, the space that the signal handler will use. The best way to avoid freeing memory in a handler is to flag or record the objects to be freed, and have the program check from time to time whether anything is waiting to be freed. This must be done with care, because adding an object to a chain is not an atomic operation; if it is interrupted by another signal handler that does the same thing, one of the objects can be lost. The same caution applies to stream I/O in a handler: if you know that the program cannot be using the stream that the handler uses at a time when signals can arrive, you are safe, and there is no problem if the program uses some other stream.
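The following sketch shows one simple form of this practice: the buffer is allocated before the handler is installed, and the handler only raises a flag asking the main program to free it. A single flag is used instead of a chain of objects, which avoids the non-atomic list update mentioned above. The names work_buffer, release_requested, request_release, setup_deferred_release, and release_if_requested are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

static char *work_buffer;                           /* owned by the main flow */
static volatile sig_atomic_t release_requested = 0; /* set by the handler */

/* The handler never calls free; it only asks for the release. */
static void request_release(int signum)
{
    (void)signum;
    release_requested = 1;
}

/* Allocates the buffer first, then installs the handler for signum.
   Returns 0 on success, -1 on failure. */
int setup_deferred_release(int signum, size_t size)
{
    struct sigaction sa;

    work_buffer = malloc(size);
    if (work_buffer == NULL)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = request_release;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signum, &sa, NULL) < 0) {
        free(work_buffer);
        work_buffer = NULL;
        return -1;
    }
    return 0;
}

/* Called periodically from the main flow. Frees the buffer if the
   handler asked for it; returns 1 if it was freed, 0 otherwise. */
int release_if_requested(void)
{
    if (release_requested && work_buffer != NULL) {
        free(work_buffer);
        work_buffer = NULL;
        release_requested = 0;
        return 1;
    }
    return 0;
}
```

Both malloc and free run only in the main flow, so neither can be interrupted by a second call to itself from the handler.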

Practice 4
To write bug-free code, take special care with process-wide global variables such as errno and h_errno. Consider the following code:

Listing 7. Dangerous usage of errno

if (close(fd) < 0) {
    fprintf(stderr, "Error in close, errno: %d", errno);
    exit(1);
}

Suppose that a signal is generated in the very small window between the moment the close system call sets errno and the moment it returns. The handler for that signal can change the value of errno, and the program then behaves unexpectedly.

Saving and restoring the value of errno in the signal handler, as follows, resolves the problem:

Listing 8. Saving and restoring the value of errno

void signalHandler(int signo)
{
    int errno_saved;

    /* Save the error no. */
    errno_saved = errno;

    /* Let the signal handler complete its job */
    ...
    ...

    /* Restore the errno */
    errno = errno_saved;
}
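To see the effect, the sketch below fills in the elided body of such a handler with a call that is known to fail: closing an invalid descriptor sets errno to EBADF. The handler and installer names are hypothetical, and the failing close call is there only to disturb errno on purpose.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static void errno_preserving_handler(int signum)
{
    int errno_saved = errno;   /* save the interrupted code's errno */

    (void)signum;
    /* Deliberately failing call: -1 is never a valid descriptor,
       so close sets errno to EBADF. */
    (void)close(-1);

    errno = errno_saved;       /* restore before returning */
}

/* Installs the handler for signum; returns 0 on success, -1 on failure. */
int install_errno_preserving_handler(int signum)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = errno_preserving_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(signum, &sa, NULL);
}
```

Without the save and restore, code like Listing 7 could report EBADF from the handler instead of the error that close actually returned.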

Practice 5
If the underlying function is in the middle of a critical section when a signal is generated and handled, the function can become non-reentrant. By using signal sets and a signal mask, the critical region of the code can be protected from a specific set of signals, as follows:

  1. Save the current signal mask.
  2. Add the unwanted signals to the mask so that they are blocked.
  3. Let the critical section of the code complete its work.
  4. Finally, restore the saved signal mask.

Here is an outline of this practice:

Listing 9. Using signal sets and signal masks

sigset_t newmask, oldmask, zeromask;
...
/* Register the signal handler */
signal(SIGALRM, sig_handler);

/* Initialize the signal sets */
sigemptyset(&newmask);
sigemptyset(&zeromask);

/* Add the signal to the set */
sigaddset(&newmask, SIGALRM);

/* Block SIGALRM and save current signal mask in set variable 'oldmask' */
sigprocmask(SIG_BLOCK, &newmask, &oldmask);

/* The protected code goes here
...
...
*/

/* Now allow all signals and pause */
sigsuspend(&zeromask);

/* Resume to the original signal mask */
sigprocmask(SIG_SETMASK, &oldmask, NULL);

/* Continue with other parts of the code */

Omitting sigsuspend(&zeromask); can cause a problem. There is necessarily a gap of some clock cycles between the unblocking of the signals and the next instruction executed by the process, and any signal that occurs in this window is lost. The sigsuspend call solves this problem by resetting the signal mask and putting the process to sleep in a single atomic operation. If you are sure that a signal generated in this window will have no adverse effect, you can omit sigsuspend and go directly to resetting the signal mask.
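The blocking part of this practice can be observed directly. In the sketch below, a two-field update is wrapped in a blocked region, and a SIGALRM raised in the middle of the update stays pending until the mask is restored, so the handler never sees a half-written pair. It assumes SIGALRM is not already blocked by the caller and leaves out the sigsuspend step; all names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static struct { int a, b; } pair;               /* data updated in two steps */
static volatile sig_atomic_t alarms_seen = 0;
static volatile sig_atomic_t saw_consistent = 0;

/* Records whether the pair looked consistent when the signal arrived. */
static void on_alarm(int signum)
{
    (void)signum;
    alarms_seen++;
    saw_consistent = (pair.a == pair.b);
}

/* Installs the handler for SIGALRM; returns 0 on success, -1 on failure. */
int install_alarm_counter(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/* Returns 1 if a SIGALRM raised inside the critical section was held
   back until the mask was restored and the handler then saw a
   consistent pair, 0 if not, -1 on error. */
int demo_deferred_delivery(void)
{
    sigset_t newmask, oldmask;
    int seen_inside;

    sigemptyset(&newmask);
    sigaddset(&newmask, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &newmask, &oldmask) < 0)
        return -1;

    alarms_seen = 0;
    pair.a = 1;
    raise(SIGALRM);             /* stands in for an asynchronous alarm */
    pair.b = 1;
    seen_inside = alarms_seen;  /* still 0: the signal is only pending */

    /* Restoring the mask lets the pending signal be delivered. */
    if (sigprocmask(SIG_SETMASK, &oldmask, NULL) < 0)
        return -1;

    return seen_inside == 0 && alarms_seen == 1 && saw_consistent == 1;
}
```

Compare this with Listing 1, where the same kind of two-word update is left unprotected and the handler can observe the values 0, 1 or 1, 0.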

Dealing with reentrancy at the compiler level
I would like to propose a model for dealing with reentrant functions at the compiler level. A new keyword, reentrant, could be introduced into the high-level language. Functions could then be given a reentrant specifier that ensures they are reentrant, for example:

reentrant int foo();

This specifier instructs the compiler to give special treatment to that particular function. The compiler can store the specifier in its symbol table and use it during the intermediate code generation phase. To accomplish this, some changes are required in the design of the compiler's front end. A function marked with the reentrant specifier must follow these guidelines:

  1. It does not hold static data over successive calls.
  2. It protects global data by making a local copy of it.
  3. It does not call non-reentrant functions.
  4. It does not return a reference to static data; all data is provided by the caller of the function.

Guideline 1 can be enforced by type checking: if the function contains any static storage declaration, the compiler reports an error. This can be done during the syntax analysis phase of compilation.

Guideline 2, protection of global data, can be enforced in two ways. The primitive way is to report an error if the function modifies global data. A more sophisticated technique is to generate the intermediate code in such a way that the global data is not mangled. An approach similar to saving and restoring errno in a signal handler, described earlier, can be implemented at the compiler level. On entering the function, the compiler can store the global data to be manipulated under a compiler-generated temporary name, and then restore the data on exiting the function. Storing data under a compiler-generated temporary name is normal practice for a compiler.

Enforcing guideline 3 requires the compiler to know in advance which functions are reentrant, including those in the libraries used by the application. This additional information about each function can be stored in the symbol table.

Finally, guideline 4 is already guaranteed by guideline 2. If the function has no static data, there is no question of returning a reference to it.

The proposed model would make it easier for programmers to follow the guidelines for reentrant functions, and could prevent code from containing unintentional reentrancy bugs.
