Detailed analysis of multi-process programming in Linux

Author: wildwolf Source: CCID technical community
(1) Understanding the process structure in Linux

A process in Linux has three kinds of data in memory: the "code segment", the "stack segment", and the "data segment". Anyone who has learned assembly language will know that CPUs such as the i386 provide segment registers corresponding to these three segments, which makes things easier for the operating system. The code segment, as its name implies, stores the program's code. If several processes on the machine run the same program, they can share the same code segment. The stack segment stores subroutine return addresses, subroutine parameters, and the program's local variables. The data segment stores the program's global variables, constants, and dynamically allocated space (for example, space obtained with functions such as malloc). There are many details here that we will not go into. Note that if the system runs several copies of the same program at the same time, they cannot share the same stack segment or data segment.

(2) How to use fork

In Linux, the function that creates a new process is called fork. Why this name? When a running process calls fork, another process is created, so the original process "forks" into two, which makes the name a fitting one. The following shows the basic framework for using fork:


#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int i;
    if (fork() == 0) {
        /* Child process */
        for (i = 0; i < 1000; i++)
            printf("This is child process\n");
    } else {
        /* Parent process */
        for (i = 0; i < 1000; i++)
            printf("This is parent process\n");
    }
    return 0;
}

When the program runs, the screen shows a thousand lines printed by the child process and a thousand printed by the parent process. While the program is still running, the ps command shows two running copies of it in the system.

So what happens when fork is called? The system prepares the three segments described earlier for a new process. The new process and the old process can use the same code segment, because they run the same program. For the data segment and the stack segment, the system makes a copy for the new process. In this way the child process inherits all of the parent's data. Once the child starts running, however, its data is separate from the parent's: a change made by one process has no effect on the other, and the two no longer share any data. If the two processes want to share data, they must use another set of functions (shmget, shmat, shmdt, and so on).

There are now two processes. In the parent process, fork returns the process ID of the child; in the child process, fork returns zero. (A return value of -1 means that fork failed.) A program therefore only has to check the return value of fork to know whether it is running in the parent or in the child.

Readers may ask: if a large program with large data and stack segments calls fork, and every fork copies those segments, isn't fork very expensive? UNIX has a solution for this. Memory is generally managed in units of "pages"; on Intel CPUs a page is usually 4 KB, and both the data segment and the stack segment are made up of many pages. fork does copy these two segments, but the copy is logical, not physical.
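The two points above, the different return values of fork and the separation of data afterwards, can be observed with a minimal sketch. The variable counter and the function name are illustrative and do not come from the original article.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int counter = 10; /* a global variable, so it lives in the data segment */

/* Forks once. The child changes its own copy of counter and exits.
 * The parent waits for the child and returns the value it still sees. */
int value_seen_by_parent_after_fork(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* fork returned 0: this is the child */
        counter = 99; /* changes only the child's private copy */
        _exit(0);
    }
    /* fork returned the child's process ID: this is the parent */
    waitpid(pid, NULL, 0);
    return counter; /* unchanged by the child's assignment */
}
```

Called from the parent, the function returns 10 even though the child has assigned 99, because after fork each process works on its own copy of the data segment.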
Immediately after fork executes, the data and stack segments of the two processes still share the same physical pages. Only when one of the processes writes to a page, so that the contents seen by the two processes would differ, does the system physically give that process its own copy of the page. This technique (copy-on-write) keeps the memory overhead of fork to a minimum.

Now for a little humor: the following small program is enough to "screw up" Linux. Its source code is very simple:


#include <unistd.h>

int main(void)
{
    for (;;)
        fork();
}

This program does nothing except call fork in an endless loop. The result is that the program keeps producing processes, and those processes keep producing new processes in turn. Soon the system's process table is full and the system is overwhelmed by so many running processes. No root privileges are needed: anyone running this program can bring the system to its knees. This does not make Linux insecure, however. A competent system administrator can set in advance the maximum number of processes each user may run, so that any user other than root is limited to fewer processes than the system as a whole can run. In this way the administrator can defend against malicious programs like the one above.
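On Linux, one mechanism behind such a per-user limit is the RLIMIT_NPROC resource limit, which administrators commonly set through the shell's ulimit -u. The article does not say which mechanism it has in mind, so the following is only a sketch under that assumption. The function name lower_process_limit is hypothetical. It lowers the soft limit of the calling process and never raises it.

```c
#include <sys/resource.h>

/* Lowers the soft limit on the number of processes for the calling
 * user to at most max_procs. An existing lower limit is left alone.
 * Returns 0 on success, -1 on failure. */
int lower_process_limit(rlim_t max_procs)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NPROC, &rl) != 0)
        return -1;
    if (max_procs < rl.rlim_cur)
        rl.rlim_cur = max_procs; /* only ever lower the soft limit */
    return setrlimit(RLIMIT_NPROC, &rl);
}
```

Once the limit is reached, further calls to fork by that user fail with -1 instead of creating more processes, which is what stops a runaway loop of forks from filling the whole system.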

(3) How to start the execution of another program

Let's look at how a process starts the execution of another program. In Linux, the exec family of functions is used. There is more than one exec function, but they all do roughly the same thing: execl, execlp, execle, execv, execve, and execvp. Only execlp is used as an example here; for the differences between it and the other functions, see man exec. Once a process calls an exec function, the program it was running is "dead": the system replaces the code segment with the code of the new program, discards the original data and stack segments, and allocates new data and stack segments for the new program. The only thing that stays the same is the process ID. As far as the system is concerned it is still the same process, but it is now running a different program. (Some of the exec functions also let the new program inherit information such as environment variables.)

What if a program wants to start another program but also wants to keep running itself? The answer is to combine fork and exec. The following code starts other programs:


#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

char command[256];

int main(void)
{
    int rtn; /* wait status of the child process */
    while (1) {
        /* Read the command to be executed from the terminal */
        printf("> ");
        fflush(stdout);
        if (fgets(command, sizeof command, stdin) == NULL)
            break; /* end of input */
        command[strcspn(command, "\n")] = '\0'; /* strip the trailing newline */
        if (fork() == 0) {
            /* Child process: execute the command */
            execlp(command, command, (char *)NULL);
            /* If exec returns, the command could not be run: print the error */
            int err = errno; /* save errno, perror may change it */
            perror(command);
            exit(err);
        } else {
            /* Parent process: wait for the child to end and print its exit code */
            wait(&rtn);
            printf("child process return %d\n", WEXITSTATUS(rtn));
        }
    }
    return 0;
}
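The fork-plus-exec pattern in the listing above can also be wrapped in a small helper that returns the child's exit code. This is a sketch, not part of the original article: the function name run_program is hypothetical, and the exit code 127 for a failed exec is an assumption borrowed from shell convention. The value stored by wait or waitpid encodes more than the exit code, so the macros WIFEXITED and WEXITSTATUS from <sys/wait.h> are used to decode it.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs program (looked up in PATH, without arguments) in a child
 * process and waits for it. Returns the child's exit code, or -1 if
 * the child could not be created or did not exit normally. */
int run_program(const char *program)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1; /* fork failed */
    if (pid == 0) {
        /* Child: replace this process image with the new program */
        execlp(program, program, (char *)NULL);
        /* Only reached if exec failed */
        perror(program);
        _exit(127); /* assumed convention for "command could not be run" */
    }
    /* Parent: wait for this particular child */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1; /* for example, the child was killed by a signal */
}
```

For example, run_program("true") returns 0, while a program name that cannot be found makes the child exit with 127 after printing an error message.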


The command loop above reads commands from the terminal and executes them. After each command finishes, the parent process goes back to waiting for the next command from the terminal.

If you are familiar with DOS and WINDOWS system calls, you will know that DOS/WINDOWS also has exec functions, used in a similar way. DOS/WINDOWS also has the spawn family of functions. Because DOS is a single-task system, it can only keep the "parent process" resident in memory while it executes the "child process"; that is what the spawn functions do. WIN32 is a multi-task system, but it retains the spawn functions. The way they are implemented in WIN32 is similar to the UNIX approach above: after starting the child process, the parent waits for it to finish and then continues running. UNIX was a multi-task system from the start, so at the core level it has no need for spawn-like functions.

There is also a simpler function for executing another program: system. It is a higher-level function, equivalent to running a command in the shell, whereas the exec functions are low-level system calls.

(4) What is the difference between a Linux process and a Win32 process/thread?

Anyone familiar with WIN32 programming knows that process management in WIN32 is very different from that in UNIX. In UNIX there is only the concept of a process, but WIN32 also has the concept of a "thread". So what is the difference between UNIX and WIN32 here?

fork in UNIX was the result of long theoretical and practical exploration by developers in the early 1970s. On the one hand it lets the operating system pay a minimal price for process management; on the other it gives programmers a simple, clear way to write multi-process programs. The process/thread model in WIN32 is inherited from OS/2. In WIN32, a "process" refers to a program, and a "thread" is one thread of execution within a process.
At the core, multi-processing in WIN32 is not very different from UNIX. A WIN32 thread is the counterpart of a UNIX process: it is the entity that actually executes code. In WIN32, however, threads in the same process share the data segment, and this is the biggest difference from UNIX processes. The following shows how a WIN32 process starts a thread (note that this is a console-mode program with no graphical interface):


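#include <windows.h>
#include <stdio.h>

int g; /* shared by all threads in the process */

DWORD WINAPI ChildProcess(LPVOID lpParameter)
{
    int i;
    for (i = 0; i < 1000; i++) {
        g++;
        printf("This is Child Thread: %d\n", g);
    }
    return 0; /* returning from the thread function ends the thread */
}

int main(void)
{
    DWORD threadID;
    HANDLE thread;
    int i;
    g = 0;
    thread = CreateThread(NULL, 0, ChildProcess, NULL, 0, &threadID);
    if (thread == NULL)
        return 1;
    for (i = 0; i < 1000; i++) {
        g++;
        printf("This is Parent Thread: %d\n", g);
    }
    /* Wait for the child thread so the process does not exit before it finishes */
    WaitForSingleObject(thread, INFINITE);
    CloseHandle(thread);
    return 0;
}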

In WIN32, a thread is created with the CreateThread function. Unlike a process created by fork in UNIX, the new thread does not start running from the point of creation; it starts running from the function passed to CreateThread. Like the earlier UNIX program, this program has each of the two threads print 1000 lines. threadID receives the thread ID of the child thread. The global variable g is shared between the child thread and the parent thread, which is the biggest difference from UNIX.

As you can see, WIN32 processes/threads are more complex than UNIX processes. It is not difficult to implement something like WIN32 threads in UNIX: after fork, the child process calls the thread function (ChildProcess in the listing above), and a shared data area is set up for the global variables. A fork-like function, however, cannot be implemented in WIN32. So although the library functions supplied with C compilers for WIN32 are compatible with most UNIX library functions, fork is still not among them.

For a multi-task system, a shared data area is necessary, but it is also an easy source of confusion. In WIN32 a programmer can easily forget that data is shared between threads: one thread modifies a variable, another thread modifies it again, and the program misbehaves. In UNIX, because variables are not shared by default, the programmer explicitly specifies which data is to be shared, which makes programs clearer and safer.

Linux also has a function of its own called clone, which is not available on other UNIX systems. In early Linux kernels it was not enabled by default: to use it, you had to recompile the kernel with the CLONE_ACTUALLY_WORKS_OK option set. The clone function provides more options for creating new processes, including fully sharing the data segment.
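The idea of combining fork with an explicitly shared data area can be sketched with the shmget, shmat, and shmdt functions mentioned earlier. This is a minimal illustration under stated assumptions, not the author's code: the function name shared_counter_demo is hypothetical, and the parent deliberately waits for the child before touching the counter, so the two loops never run at the same time and no locking is needed.

```c
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child and parent each add n to a counter kept in shared memory.
 * Returns the final value of the counter, or -1 on error. */
int shared_counter_demo(int n)
{
    int i, result;
    int *g;
    pid_t pid;
    /* Create a private shared memory segment big enough for one int */
    int shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);

    if (shmid < 0)
        return -1;
    g = shmat(shmid, NULL, 0); /* map the segment into this process */
    if (g == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    *g = 0;

    pid = fork(); /* the attached segment is inherited by the child */
    if (pid < 0) {
        shmdt(g);
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    if (pid == 0) {
        for (i = 0; i < n; i++)
            (*g)++; /* visible to the parent, unlike an ordinary global */
        shmdt(g);
        _exit(0);
    }

    waitpid(pid, NULL, 0); /* wait first so the two loops do not race */
    for (i = 0; i < n; i++)
        (*g)++;
    result = *g;

    shmdt(g);
    shmctl(shmid, IPC_RMID, NULL); /* remove the segment */
    return result;
}
```

Here shared_counter_demo(1000) returns 2000: the child's increments are visible to the parent because only the counter lives in the shared segment, while every other variable remains private to each process. If both processes updated the counter at the same time, some form of synchronization would be needed, which is the confusion the text warns about.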

As for the WIN32 concept of a "process", it means an "application", which corresponds to exec in UNIX.
