A deep look at the system() function under Linux (collected notes)

Source: Internet
Author: User
Tags define function signal handler

I have recently been debugging a program on embedded Linux and found that it sometimes dies for no obvious reason. Each time, the crash is located at a different system() call in the program, yet the commands passed to system() all run normally when typed directly into the shell. I could not pin the bug down and suspected that other code was interfering, or that the kernel, a driver, or the file system was misbehaving. When the problem showed up again yesterday I searched Baidu and found that many people advise using system() sparingly, or not at all if it can be avoided. Is system() unstable?

First, a brief introduction to the system() function.

Header file

    #include <stdlib.h>

Function definition

    int system(const char *string);

Function description

system() calls fork() to create a child process, and the child runs /bin/sh -c string to execute the command given by the argument string. When the command finishes, control returns to the calling process. While system() is running, SIGCHLD is blocked and SIGINT and SIGQUIT are ignored.

Return value

If system() cannot create the child process or cannot collect its status, it returns -1. If /bin/sh cannot be executed, the result is the same as if the shell had exited with code 127. If the argument string is a null pointer (NULL), a nonzero value is returned when a shell is available. If the call succeeds, system() returns the termination status of the shell command. That status may also carry the code 127 produced when /bin/sh could not be executed, so it is best to check errno as well to confirm that execution succeeded.

Additional notes

Do not use system() in programs that run with SUID/SGID permissions. system() inherits the environment variables, and those variables can be used to create security problems.
The system() function is part of the standard C library and can be called directly. Basic usage looks like this:

    #include <stdlib.h>

    int main()
    {
        system("mkdir $HOME/.smartplatform/");
        system("mkdir $HOME/.smartplatform/files/");
        system("cp mainnew.cpp $HOME/.smartplatform/files/");
        return 0;
    }

Now let's look at the source of the system() function:

    #include <sys/types.h>
    #include <sys/wait.h>
    #include <errno.h>
    #include <unistd.h>

    int system(const char *cmdstring)
    {
        pid_t pid;
        int status;

        if (cmdstring == NULL) {
            return (1);
        }
        if ((pid = fork()) < 0) {
            status = -1;
        } else if (pid == 0) {
            execl("/bin/sh", "sh", "-c", cmdstring, (char *)0);
            _exit(127); // not reached if the child process executes normally
        } else {
            while (waitpid(pid, &status, 0) < 0) {
                if (errno != EINTR) {
                    status = -1;
                    break;
                }
            }
        }
        return status;
    }

After two days of careful study I found an excellent blog post online with a very detailed explanation. Thanks to the blogger; I reproduce it here directly. The original text follows: http://my.oschina.net/renhc/blog/53580

Be careful when using the system() function in C/C++ on Linux

I was once tormented by the system() function because I did not understand it well. Knowing that it executes a system command is far from enough. You also need to understand its return value, the return value of the command it executes, and the reasons a command can fail. That is the point. Because this function carries so many risks, I originally gave it up and used another method. I will not go into what I use instead. Understanding system() still matters, because plenty of people use it and sometimes you have to deal with it.
First, a brief introduction to the system() function:

    #include <stdlib.h>
    int system(const char *command);

system() executes a command specified in command by calling /bin/sh -c command, and returns after the command has been completed. During execution of the command, SIGCHLD will be blocked, and SIGINT and SIGQUIT will be ignored.

The system() function calls /bin/sh to execute the command given by the argument. /bin/sh is usually a symbolic link to a specific shell such as bash, and the -c option tells the shell to read the command from the string command. While the command runs, SIGCHLD is blocked, as if to say: hi kernel, do not deliver SIGCHLD to me for now, wait until I have finished. SIGINT and SIGQUIT are ignored during that time, meaning the process takes no action when it receives either signal.

Next, the return value of system():

The value returned is -1 on error (e.g. fork(2) failed), and the return status of the command otherwise. This latter return status is in the format specified in wait(2). Thus, the exit code of the command will be WEXITSTATUS(status). In case /bin/sh could not be executed, the exit status will be that of a command that does exit(127). If the value of command is NULL, system() returns nonzero if the shell is available, and zero if not.

To understand the return value better, you need to understand how system() executes. It actually performs three steps:

1. fork a child process;
2. call an exec function in the child process to execute command;
3. call wait in the parent process to wait for the child process to end.

If fork fails, system() returns -1. If exec succeeds, that is, command runs to completion, system() returns the status that carries the value command returned via exit or return.
(Note that a command running to completion does not mean it succeeded. Take the command "rm debuglog.txt": whether or not the file exists, the command itself was executed.) If exec fails, that is, command could not be run at all, for example because the command does not exist, the exit code carried in the status is 127. A command that is terminated by a signal is reported differently, through WIFSIGNALED(status). If command is NULL, system() returns a nonzero value, typically 1.

A look at the source code of system()

Having read this far, I am sure some readers are still unclear about the return value of system(). Reading the source is the clearest way, so here is one implementation of system():

    int system(const char *cmdstring)
    {
        pid_t pid;
        int status;

        if (cmdstring == NULL)
        {
            return (1); // if cmdstring is NULL, return a nonzero value, typically 1
        }

        if ((pid = fork()) < 0)
        {
            status = -1; // fork failed, return -1
        }
        else if (pid == 0)
        {
            execl("/bin/sh", "sh", "-c", cmdstring, (char *)0);
            _exit(127); // exec failed, exit with 127. exec only returns to the
                        // current process on failure; on success the current
                        // process image no longer exists
        }
        else // parent process
        {
            while (waitpid(pid, &status, 0) < 0)
            {
                if (errno != EINTR)
                {
                    status = -1; // waitpid failed for a reason other than
                                 // being interrupted by a signal, return -1
                    break;
                }
            }
        }

        return status; // if waitpid succeeded, return the child's status
    }

After reading this simple implementation carefully, the return value of the function should be clear. So when does system() return 0? Only when command itself returns 0.
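To make these cases concrete, here is a minimal sketch that reduces the result of system() to a single number. The helper name run_and_classify and its -1/-2 conventions are made up for illustration and are not part of the original article; it assumes a POSIX shell is available at /bin/sh.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper (not from the original article): run cmd through
 * system() and reduce the result to one int:
 *   >= 0  exit code of the command (127 usually means the shell
 *         could not find or run it)
 *   -1    system() itself failed (fork or waitpid error, see errno)
 *   -2    the command was terminated by a signal
 */
int run_and_classify(const char *cmd)
{
    int status = system(cmd);

    if (status == -1)
        return -1;                  /* no wait status to decode */
    if (WIFEXITED(status))
        return WEXITSTATUS(status); /* value passed to exit or return */
    return -2;                      /* killed by a signal */
}
```

Called as run_and_classify("exit 3"), this should yield 3, while a command the shell cannot find should yield 127. The point is that the raw return value of system() should be decoded with the wait macros before it is compared with anything other than -1 or 0.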
How to monitor the execution state of system()

Here is what I do:

    int status;

    if (NULL == cmdstring) // return early if cmdstring is NULL, even though system() can handle a null pointer
    {
        return XXX;
    }

    status = system(cmdstring);
    if (status < 0)
    {
        printf("cmd: %s\t error: %s", cmdstring, strerror(errno)); // it is important to print errno or write it to the log
        return XXX;
    }

    if (WIFEXITED(status))
    {
        printf("normal termination, exit status = %d\n", WEXITSTATUS(status)); // get the execution result of cmdstring
    }
    else if (WIFSIGNALED(status))
    {
        printf("abnormal termination, signal number = %d\n", WTERMSIG(status)); // if cmdstring was terminated by a signal, get the signal number
    }
    else if (WIFSTOPPED(status))
    {
        printf("process stopped, signal number = %d\n", WSTOPSIG(status)); // if cmdstring was stopped by a signal, get the signal number
    }

For an introduction to the return value of a child process, see another article: http://my.oschina.net/renhc/blog/35116

The system() function is error-prone: it has too many possible return values, and its own return value is easily confused with the return value of command. I recommend using popen() instead. Simple use of popen() is also covered at the link above.

The advantage of popen() over system() is simplicity. With popen() there are only two outcomes: on success, pclose() returns the status of the child process, and the WIFEXITED family of macros extracts the command's result; on failure the call returns -1, and perror() or strerror() gives useful error information.
This article covers only simple use of system() and does not discuss how SIGCHLD, SIGINT, and SIGQUIT affect it. In fact, I wrote it today because someone used system() in our project and caused a very serious incident: system() failed with the error "No child processes". Readers interested in the analysis of that error can see: http://my.oschina.net/renhc/blog/54582

2012-04-14 Please credit the source when reprinting.

Here is the second article, a detailed analysis of that system() error. Thanks again to the blogger.

An error caused by the system() function in C/C++ on Linux

Today, a program that had been running for nearly a year suddenly failed, and the problem was located at the system() function. Simple use of this function is described in my previous article: http://my.oschina.net/renhc/blog/53580

First, the problem itself. We simply wrap the system() function:

    int pox_system(const char *cmd_line)
    {
        return system(cmd_line);
    }

The call site:

    int ret = 0;
    ret = pox_system("gzip -c /var/opt/I00005.xml > /var/opt/I00005.z");
    if (0 != ret)
    {
        log("zip file failed\n");
    }

Symptom: every time execution reaches this point, the zip step fails. Yet the same command always succeeds when run directly in the shell, and this code had in fact been running for a long time without any problem.

Poor logging

From the log we can only see our own message, "zip file failed". There is no clue as to why it failed. So let's try to collect more clues:

    int ret = 0;
    ret = pox_system("gzip -c /var/opt/I00005.xml > /var/opt/I00005.z");
    if (0 != ret)
    {
        log("zip file failed: %s\n", strerror(errno)); // try to print the system error message
    }

With the extra logging, the errno set by system() gives us a very useful clue: system() failed because of "No child processes".
Let's keep looking for the root cause.

Who moved errno

From the clue above we know that system() set errno to ECHILD, but the man page for system() says nothing about ECHILD. We know that system() runs as fork() -> exec() -> waitpid(). Clearly waitpid() is the prime suspect, so let's check its man page to see whether it can set ECHILD:

ECHILD (for waitpid() or waitid()) The process specified by pid (waitpid()) or idtype and id (waitid()) does not exist or is not a child of the calling process. (This can happen for one's own child if the action for SIGCHLD is set to SIG_IGN. See also the Linux Notes section about threads.)

Sure enough, if the action for SIGCHLD is set to SIG_IGN, waitpid() may report ECHILD because the child process can no longer be found. It seems we have a fix: reset SIGCHLD to its default action before calling system(), that is, signal(SIGCHLD, SIG_DFL). We were so excited that we skipped the Linux Notes section for the moment and went straight to adding the code and testing it. And the problem was solved!

Is this how you would handle the problem?

Just as we were rushing to check in the code, a question came up: why had this error never happened before? How can a program that ran fine suddenly fail? Our code had not changed, so the cause had to be external. At the thought of external factors we started complaining: "It must be another team's program affecting us!" But complaining is useless. If you believe that, produce the evidence. A little static analysis shows that another program cannot be the cause, because other processes cannot change how our process handles signals.
system() did not fail before because it relies on a property of the system: when a process is initialized, the action for SIGCHLD is SIG_DFL. What does that mean? When the kernel detects that a child process has terminated, it sends SIGCHLD to the parent, and the parent handles the signal the SIG_DFL way. And what is the SIG_DFL way? SIG_DFL is a macro that stands for a signal handler function pointer, and for SIGCHLD this default handling does nothing, leaving the terminated child available to be waited for. That is exactly what system() needs: it first calls fork() to create a child process that executes command, and once the command finishes it uses waitpid() to collect the child.

From this analysis we can tell that the action for SIGCHLD must have been changed before system() ran and was no longer SIG_DFL. We do not yet know what it was changed to, and in fact we do not need to know. We only need to remember to explicitly set the action for SIGCHLD to SIG_DFL before calling system(), record the original action, and restore it after system() returns. This shields us from system upgrades and from changes to the signal action elsewhere.

Verification

Our company uses continuous integration plus agile development. A dedicated team runs the automated test cases every day, and each run is called a build. We compared the system version used by the failing build with the one used by the previous build and found that the system had indeed been upgraded. We contacted the team responsible and described the problem in detail. They soon replied. Here is the original message:

libgen newly added handling for SIGCHLD: it ignores it, to avoid creating zombie processes.

So our guess was right!
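The failure described above can be reproduced in isolation. The following is a minimal sketch, assuming Linux with glibc; the helper name errno_with_sigchld_ignored is invented for this illustration. It ignores SIGCHLD, calls system("true"), and reports the errno that system() leaves behind.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original article): call
 * system("true") while SIGCHLD is ignored and return the errno left
 * by system(), or 0 if system() did not return -1. */
int errno_with_sigchld_ignored(void)
{
    struct sigaction ign, old;
    int status, saved_errno;

    ign.sa_handler = SIG_IGN;
    sigemptyset(&ign.sa_mask);
    ign.sa_flags = 0;
    sigaction(SIGCHLD, &ign, &old);   /* what the upgraded library did */

    errno = 0;
    status = system("true");          /* the command itself succeeds */
    saved_errno = (status == -1) ? errno : 0;

    sigaction(SIGCHLD, &old, NULL);   /* put the previous action back */
    return saved_errno;
}
```

On such a system the helper is expected to return ECHILD: the command runs, but because SIGCHLD is ignored the kernel discards the child's status and the waitpid() inside system() finds nothing to collect.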
With the analysis done, the fix is clear, so we modified our pox_system() function:

    typedef void (*sighandler_t)(int);

    int pox_system(const char *cmd_line)
    {
        int ret = 0;
        sighandler_t old_handler;

        old_handler = signal(SIGCHLD, SIG_DFL);
        ret = system(cmd_line);
        signal(SIGCHLD, old_handler);

        return ret;
    }

I think this is the ideal way to call system(). Wrapping the call in pox_system() also pays off in maintainability: we only had to change this one function, and none of the call sites.

Later, when we looked at the other team's modified code, we did find the answer there:

    if (signal(SIGCHLD, SIG_IGN) == SIG_ERR)
    {
        return -1;
    }
    else
    {
        return 0;
    }

Other considerations

Our company manages its code with SVN and there are many branches by now. Gradually the problem showed up on almost every branch, so I fixed it on each branch one by one. That took almost a whole day, because some branches were locked, and to merge code into them I had to find the people responsible, explain how serious the problem was, and test in different environments. While doing all this I kept wondering: was it appropriate to upgrade the system this way?

First, the system upgrade made our code fail in testing, and we then had to rush out a fix, which put us on the back foot. I consider that one of their mistakes. Shouldn't an upgrade take its impact on other teams into account? And this was a system upgrade at that. Before upgrading you should do a risk assessment and tell everyone about the possible impact. That would be the professional way.
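As an aside on the pox_system() fix, the same save-and-restore idea can be written with sigaction(), which also preserves the flags and mask of the previous action. This is only a sketch under stated assumptions: the name system_default_sigchld is invented here, and changing a signal action is process-wide, so the pattern is not safe if other threads create or wait for children at the same time.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>   /* WIFEXITED etc., for callers decoding the result */

/* Hypothetical variant of pox_system() (not from the original article):
 * run cmd_line with SIGCHLD temporarily set to SIG_DFL, then restore
 * the previous action. Returns what system() returned. */
int system_default_sigchld(const char *cmd_line)
{
    struct sigaction dfl, old;
    int ret, saved_errno;

    dfl.sa_handler = SIG_DFL;
    sigemptyset(&dfl.sa_mask);
    dfl.sa_flags = 0;

    if (sigaction(SIGCHLD, &dfl, &old) == -1)
        return -1;

    ret = system(cmd_line);

    saved_errno = errno;              /* keep system()'s errno for the caller */
    sigaction(SIGCHLD, &old, NULL);   /* restore the original action */
    errno = saved_errno;

    return ret;
}
```

The errno value is saved across the restoring sigaction() call so that a caller who logs strerror(errno) after a failure still sees the error from system() and not one from the cleanup.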
Furthermore, according to them the signal action was changed to avoid zombie processes. The intention is good, but such an upgrade affects the use of several functions, such as system(), wait(), waitpid(), and fork(), all of which deal with child processes. If you want to use wait() or waitpid() to collect a child, you must apply the method described above: set SIGCHLD to SIG_DFL before the call (in fact, before fork()), and restore the previous action after the call (in fact, after wait()/waitpid()). Your system upgrade forces everyone to improve their code, and it does raise code quality, but I do not really agree with this kind of upgrade. Just think about it: how much fork() -> waitpid() code have you seen that sets SIGCHLD before and after?

Suggestions on using system()

The above gives a safer way to call system(), but system() is still error-prone. Where? In its return value, which was introduced in the previous article. system() is sometimes convenient, but it must not be abused.

1. Use system() only to execute shell commands, because in general a nonzero return value from system() indicates an error.
2. Check the value of errno after system() returns, and try to give more useful information when something goes wrong.
3. Consider popen() as a replacement for system(). Its usage is covered in another of my articles.

Please credit the source when reprinting.

Next comes the same excellent blogger's detailed article on popen(), the replacement for system() mentioned above.
Many thanks to the blogger:

IPC based on pipes: the popen and pclose functions

The standard I/O library provides the popen function, which starts another process to execute a shell command line. Here we call the process that calls popen the parent process and the process started by popen the child process. popen also creates a pipe for communication between parent and child. The parent either reads from the pipe or writes to it. Which of the two depends on the argument passed when the parent calls popen. The definitions of popen and pclose are:

    #include <stdio.h>

    FILE *popen(const char *command, const char *type);
    int pclose(FILE *stream);

The following example shows how to use popen. Suppose we want the number of files in the current directory. In the shell we can use:

    ls | wc -l

In a program we can write:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/wait.h>

    #define MAXLINE 1024

    int main()
    {
        char result_buf[MAXLINE], command[MAXLINE];
        int rc = 0; // receives the command's return value
        FILE *fp;

        snprintf(command, sizeof(command), "ls ./ | wc -l");

        fp = popen(command, "r");
        if (NULL == fp)
        {
            perror("popen failed");
            exit(1);
        }

        while (fgets(result_buf, sizeof(result_buf), fp) != NULL)
        {
            // strip the trailing newline
            if ('\n' == result_buf[strlen(result_buf) - 1])
            {
                result_buf[strlen(result_buf) - 1] = '\0';
            }
            printf("command [%s] output [%s]\r\n", command, result_buf);
        }

        rc = pclose(fp);
        if (-1 == rc)
        {
            perror("failed to close file pointer");
            exit(1);
        }
        else
        {
            printf("command [%s] child exit status [%d] command return value [%d]\r\n",
                   command, rc, WEXITSTATUS(rc));
        }

        return 0;
    }

Compile and run:

    $ gcc popen.c
    $ ./a.out
    command [ls ./ | wc -l] output [2]
    command [ls ./ | wc -l] child exit status [0] command return value [0]

The popen call above captures only the command's standard output. If the command fails, the child process prints its error message to standard error, and the parent cannot obtain it. For example, take the command "ls nofile.txt" when the file nofile.txt does not exist. The shell prints "ls: nofile.txt: No such file or directory", and that output goes to standard error, so the program above cannot capture it.

Note: if you set the command in the program above to "ls nofile.txt", then compile and run it, you will see the following:

    $ gcc popen.c
    $ ./a.out
    ls: nofile.txt: No such file or directory
    command [ls nofile.txt] child exit status [256] command return value [1]

Note that the first line of output is not printed by the parent process. It is the standard error output of the child process.

Sometimes the child's error message is useful, so how can the parent obtain it? We can redirect the child's standard error to standard output (2>&1), so that the parent can capture the child's error message.
For example, with the command "ls nofile.txt 2>&1" the output is:

    command [ls nofile.txt 2>&1] output [ls: nofile.txt: No such file or directory]
    command [ls nofile.txt 2>&1] child exit status [256] command return value [1]

Appendix: macros for examining the termination status of a child process, where status is that termination status.

WIFEXITED(status) is nonzero if the child process ended normally.
WEXITSTATUS(status) gets the exit code the child passed to exit(). WIFEXITED is normally used first to check for normal termination before this macro is used.
WIFSIGNALED(status) is true if the child process ended because of a signal.
WTERMSIG(status) gets the number of the signal that terminated the child. WIFSIGNALED is normally checked first before this macro is used.
WIFSTOPPED(status) is true if the child process is currently stopped. This normally happens only when WUNTRACED is used.
WSTOPSIG(status) gets the number of the signal that stopped the child. WIFSTOPPED is normally checked first before this macro is used.

2011-11-12 Ninhong. Please credit the source when reprinting.

However, when I followed the blogger's advice above (explicitly set the action for SIGCHLD to SIG_DFL before calling system(), record the original action, and restore it after system() returns), my program still died. I also could not see what system() returned, because the program had already hung by the time the system() call was executing. So for now I use popen(), the second solution the blogger mentions, to replace system().
The modified function is as follows:

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>

    int my_system(const char *cmd)
    {
        FILE *fp;
        int res;
        char buf[1024];

        if (cmd == NULL)
        {
            printf("my_system cmd is NULL!\n");
            return -1;
        }

        if ((fp = popen(cmd, "r")) == NULL)
        {
            perror("popen");
            printf("popen error: %s\n", strerror(errno));
            return -1;
        }
        else
        {
            // echo everything the command writes to standard output
            while (fgets(buf, sizeof(buf), fp))
            {
                printf("%s", buf);
            }

            if ((res = pclose(fp)) == -1)
            {
                printf("close popen file pointer fp error!\n");
                return res;
            }
            else if (res == 0)
            {
                return res;
            }
            else
            {
                printf("popen res is: %d\n", res);
                return res;
            }
        }
    }

I now call my_system() to do the job of system() (my_system() is implemented with popen()). After a day of testing, the sudden death of the program has not recurred. Before the change, a test that called system() in a continuous loop made the program hang at least once in roughly every 10 calls.

The above is my summary of this problem. I am recording it for now and will come back to study it carefully once the bug is fixed.
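my_system() prints the command's output and returns the raw status from pclose(). When the caller needs the output and the decoded exit code, a variant along the following lines may help. This is a sketch only: the name run_capture, its parameters, and its return convention are invented here and are not from the original posts.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/* Hypothetical helper (not from the original posts): run cmd through
 * popen(), copy at most out_size - 1 bytes of its standard output into
 * out, and return the command's exit code. Returns -1 if popen() or
 * pclose() fails or the command did not exit normally. */
int run_capture(const char *cmd, char *out, size_t out_size)
{
    FILE *fp;
    char chunk[256];
    size_t used = 0;
    int status;

    if (cmd == NULL || out == NULL || out_size == 0)
        return -1;
    out[0] = '\0';

    fp = popen(cmd, "r");
    if (fp == NULL)
        return -1;

    while (fgets(chunk, sizeof(chunk), fp) != NULL) {
        size_t n = strlen(chunk);
        if (n > out_size - 1 - used)
            n = out_size - 1 - used;  /* truncate, but keep draining the pipe */
        memcpy(out + used, chunk, n);
        used += n;
        out[used] = '\0';
    }

    status = pclose(fp);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);       /* decoded exit code of the command */
}
```

To capture error messages as well, append 2>&1 to the command, as described in the popen() article above. Keeping the read loop running after the buffer is full lets the child finish writing, so pclose() does not wait on a child that is blocked on a full pipe.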