1. Fork + Exec
fork creates a child process. When a program calls fork, the system prepares the three segments described above (code, data, and stack) for the new process. First, the new process shares the code segment with the old one, because both run the same program. The system then gives the new process its own copy of the data segment and the stack segment, so the child starts out with all of the parent's data. Although the child inherits that data, the two copies are separate from the moment the child starts running: a change made in one process has no effect on the other, and the processes no longer share any data. If the two processes need to share data, they have to use a separate mechanism such as shared memory (shmget, shmat, shmdt, and so on). After the call there are two processes. In the parent, fork returns the process ID of the child; in the child, fork returns zero. By checking the return value of fork, the program knows whether it is running in the parent or in the child.
In practice, most Unix implementations do not actually copy the data at fork time. Memory is generally managed in units of "pages" (on Intel CPUs a page is usually 4 KB), and both the data segment and the stack segment consist of many pages. fork copies these two segments logically, not physically: right after fork, the data and stack segments of the two processes still share the same physical pages. Only when one of the processes writes to a page, so that the contents of the two processes would differ, does the system physically separate that page (copy-on-write). This keeps the memory overhead to a minimum.
Like fork, vfork creates a child process, but it does not copy the parent's address space or page tables for the child. The child is expected to call exec right away, so it has no use for a copy of the address space. Until the child calls exec or exit, it runs in the parent's address space.
Why does vfork exist? Early implementations of fork created a new address space and copied the parent's resources into it, yet the child very often calls exec immediately, which makes that copy wasted work. vfork was invented for this case: the child initially shares the parent's address space (much like a thread). Because the child runs in the parent's address space, it must not write to it, and the parent is suspended while the child borrows its memory. As a metaphor: while the son occupies his father's house, the father has to wait outside (blocked). Once the son calls exec or exit, he has in effect bought his own house, and the two households are separate.
Another difference between vfork and fork is that vfork guarantees that the child runs first. The parent can be scheduled again only after the child calls exec or exit. If the child depends on some further action by the parent before calling one of these two functions, the result is a deadlock.
So this system call is intended for starting a new program. Note also that after vfork() returns, the child runs directly on the parent's stack and uses the parent's memory and data. The child can therefore corrupt the parent's data structures or stack and cause the parent to fail.
To avoid these problems, make sure that after vfork() the child never returns from the current stack frame, and that it does not call the exit function if it has changed any of the parent's data structures (use _exit instead, because exit flushes stdio buffers and runs exit handlers in the shared memory). The child must also avoid changing global data structures or global variables, because such changes may leave the parent unable to continue.
In general, if the application does not call exec() immediately after fork(), check the code carefully before replacing fork() with vfork().
After a child process is created with fork, the child usually calls one of the exec functions to run another program. When a process calls an exec function, the process is completely replaced by the new program, which starts executing at its main function. The process ID does not change across exec, because no new process is created: exec only replaces the text, data, heap, and stack segments of the current process with those of the new program.
Once a process calls an exec function, the old program is "dead": the system replaces the code segment with the code of the new program, discards the original data and stack segments, and allocates new data and stack segments for the new program. The only thing that stays the same is the process ID. In other words, as far as the system is concerned it is still the same process, but it is now running a different program. Some information, such as environment variables, can be carried over to the new program; several functions in the exec family accept it as a parameter.
2. System
system() can be thought of as fork + execl + waitpid. The function is powerful, but many people do not really understand how it works. Start with the source code of a simplified version of the Linux system function (signal handling omitted):
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

int system(const char *cmdstring)
{
    pid_t pid;
    int status;

    if (cmdstring == NULL) {
        return 1;               /* a command processor is available */
    }
    if ((pid = fork()) < 0) {
        status = -1;            /* fork failed */
    } else if (pid == 0) {
        /* child: replace itself with the shell */
        execl("/bin/sh", "sh", "-c", cmdstring, (char *)0);
        _exit(127);             /* reached only if execl fails */
    } else {
        /* parent: wait for the child, retrying if interrupted by a signal */
        while (waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR) {
                status = -1;
                break;
            }
        }
    }
    return status;
}
Let's go through the principle first; the code above is then easy to follow:
If the command passed to system is NULL, system returns immediately. Otherwise it calls fork to create a child process. fork returns in both processes, parent and child, so the returned PID is checked: fork returns 0 in the child and the child's PID in the parent. The parent uses waitpid to wait until the child terminates. The child calls execl to start a program that replaces it: execl("/bin/sh", "sh", "-c", cmdstring, (char *)0) invokes the shell, whose path is /bin/sh; the strings that follow are its arguments, and the child then becomes a shell process. The argument handed to the shell is cmdstring, the parameter that system received. On Windows the shell is command. You are surely familiar with what a shell does once it receives a command.
If the above is not clear, here is how fork works: when process A calls fork, the kernel creates a new process B and copies A's memory image into B's address space. Since A and B are identical, how do they know which one is the parent and which the child? They look at the return value of fork. As mentioned above, fork returns 0 in the child and the child's PID in the parent.
On Windows the situation is similar, except that execl is replaced by a function with a long, clumsy name, and the parameter names are just as bewildering. Here is the prototype I found in MSDN:
HINSTANCE ShellExecute(
    HWND hwnd,
    LPCTSTR lpVerb,
    LPCTSTR lpFile,
    LPCTSTR lpParameters,
    LPCTSTR lpDirectory,
    INT nShowCmd
);
Example usage:
ShellExecute(NULL, "open", "C:\\a.reg", NULL, NULL, SW_SHOWNORMAL);
You may wonder why ShellExecute has an lpDirectory parameter, which sets the working directory of the new program, while execl on Linux has no parameter for handing over the parent's context, such as its environment variables. The reason is that execl is a C library function (to some extent it hides the system-specific implementation); on Linux it ends up invoking the execve system call and passes the current environment along. Here is the prototype:
int execve(const char *filename, char *const argv[], char *const envp[]);
Seeing this, you will understand why a command run by system() receives the parent's environment variables. However, if that command changes environment variables, they are unchanged in the calling program once system returns. The reason is visible in the implementation: system runs the command in a new process, and as the analysis shows, there is no communication between parent and child, so a child process cannot change the environment variables of its parent.
As for the return value: if system() fails to run /bin/sh, the child exits with status 127; for other failures (for example, when fork fails) system() returns -1. If the string parameter is a null pointer, system() returns a non-zero value when a shell is available and 0 when it is not. If system() succeeds, it returns the termination status of the shell, which carries the exit value of the command. Because a command can itself exit with 127, the same value that results when /bin/sh cannot be run, it is best to check errno as well to confirm that the command ran successfully.
The exit value of the shell command can be extracted with WEXITSTATUS(stat). The macros for examining the status are defined in <sys/wait.h> (stat is the status value obtained from waitpid):
WIFEXITED(stat)      Non-zero if the child exited normally.
WEXITSTATUS(stat)    Exit code returned by the child.
WIFSIGNALED(stat)    Non-zero if the child was terminated by a signal.
WTERMSIG(stat)       Number of the signal that terminated the child.
WIFSTOPPED(stat)     Non-zero if the child is stopped.
WSTOPSIG(stat)       Number of the signal that stopped the child.
WIFCONTINUED(stat)   Non-zero if the status is for a child that was continued.
WCOREDUMP(stat)      If WIFSIGNALED(stat) is non-zero, this is non-zero if the process left behind a core dump.
3. popen
popen() is also commonly used to run a program.
FILE *popen(const char *command, const char *type);
int pclose(FILE *stream);
popen() starts a process by creating a pipe, forking, and invoking the shell. Because a pipe is one-way by definition, the type argument can specify only reading or only writing, not both, and the resulting stream is correspondingly read-only or write-only. The command argument is a pointer to a null-terminated string containing a shell command. This command is passed to /bin/sh with the -c option, that is, it is executed by the shell. The type argument is also a pointer to a null-terminated string, which must be either "r" for reading or "w" for writing.
The return value of popen() is a normal standard I/O stream, except that it must be closed with pclose() rather than fclose(). Writing to such a stream writes to the standard input of the command; the command's standard output is the same as that of the process that called popen, unless the command itself changes it. Conversely, reading from a stream opened by popen reads the command's standard output, and the command's standard input is the same as that of the process that called popen.
Note that the output stream of popen is fully buffered by default.
pclose waits for the associated process to terminate and returns the exit status of the command, in the same form as returned by wait4.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    char buf[128];
    FILE *pp;

    /* run "ls -l" through the shell and read its standard output */
    if ((pp = popen("ls -l", "r")) == NULL)
    {
        printf("popen() error!\n");
        exit(1);
    }
    while (fgets(buf, sizeof buf, pp))
    {
        printf("%s", buf);
    }
    pclose(pp);
    return 0;
}