Learning System Call

Source: Internet
Author: User
Learning System Calls (2)
System calls related to process management, part one

March 01, 2002

This article introduces the concept of a process in Linux and focuses on four important system calls related to process management: getpid, fork, exit, and _exit. Several examples illustrate their features and usage.
Some necessary knowledge about processes

Let's start with the standard textbook definition of a process: "A process is the execution of a concurrently executable program on a set of data." This definition is rigorous but hard to digest. If it means nothing to you, consider the author's own less rigorous explanation. As we all know, an executable file on the hard disk is usually called a program. In Linux, from the moment a program starts running until it finishes and exits, its running instance in memory is called a process.

Of course, this explanation is incomplete, but it is easy to understand. The rest of this article and the following ones will build a more complete picture of the process.

Introduction to Linux Processes

Linux is a multitasking operating system: multiple processes can run at the same time. If you know a little about computer hardware, you know that the single-CPU computers we commonly use can execute only one instruction at any given moment. So how does Linux run several processes at once? It uses a technique called process scheduling. Each process is first assigned a certain amount of running time, usually very short, on the order of milliseconds. Then, according to certain rules, one process is selected to run while the others wait. When the running process uses up its time, exits after finishing, or pauses for some reason, Linux reschedules and selects the next process to run. Because each time slice is so short, from the user's point of view it looks as if multiple processes are running simultaneously.

In Linux, each process is assigned a data structure when it is created, called the process control block (PCB). The PCB contains a lot of important information used for scheduling and for running the process. The most important item is the process ID (PID), also called the process identifier: a non-negative integer that uniquely identifies a process in the system. On the i386 architecture we use most often (that is, the PC architecture), in the kernel version discussed here this integer ranges from 0 to 32767, which is also the set of all possible process IDs. As the name suggests, a process ID is the process's identity number: just as no two people share the same identity number, no two processes share the same process ID at the same time.

One or more processes can be combined into a process group, and one or more process groups can be combined into a session. This lets us operate on processes in batches; for example, sending a signal to a process group delivers it to every process in the group.

Finally, let's use the ps command to see how many processes are running in our system:

$ ps -aux
(The following is the output on my computer; yours will most likely be different.)
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root         1  0.1  0.4  1412  520 ?        S    May15   0:04 init [3]
root         2  0.0  0.0     0    0 ?        SW   May15   0:00 [keventd]
root         3  0.0  0.0     0    0 ?        SW   May15   0:00 [kapm-idled]
root         4  0.0  0.0     0    0 ?        SWN  May15   0:00 [ksoftirqd_CPU0]
root         5  0.0  0.0     0    0 ?        SW   May15   0:00 [kswapd]
root         6  0.0  0.0     0    0 ?        SW   May15   0:00 [kreclaimd]
root         7  0.0  0.0     0    0 ?        SW   May15   0:00 [bdflush]
root         8  0.0  0.0     0    0 ?        SW   May15   0:00 [kupdated]
root         9  0.0  0.0     0    0 ?        SW<  May15   0:00 [mdrecoveryd]
root        13  0.0  0.0     0    0 ?        SW   May15   0:00 [kjournald]
root       132  0.0  0.0     0    0 ?        SW   May15   0:00 [kjournald]
root       673  0.0  0.4  1472  592 ?        S    May15   0:00 syslogd -m 0
root       678  0.0  0.8  2084 1116 ?        S    May15   0:00 klogd -2
rpc        698  0.0  0.4  1552  588 ?        S    May15   0:00 portmap
rpcuser    726  0.0  0.6  1596  764 ?        S    May15   0:00 rpc.statd
root       839  0.0  0.4  1396  524 ?        S    May15   0:00 /usr/sbin/apmd -p
root       908  0.0  0.7  2264 1000 ?        S    May15   0:00 xinetd -stayalive
root       948  0.0  1.5  5296 1984 ?        S    May15   0:00 sendmail: accepti
root       967  0.0  0.3  1440  484 ?        S    May15   0:00 gpm -t ps/2 -m /d
wnn        987  0.0  2.7  4732 3440 ?        S    May15   0:00 /usr/bin/cserver
root      1005  0.0  0.5  1584  660 ?        S    May15   0:00 crond
wnn       1025  0.0  1.9  3720 2488 ?        S    May15   0:00 /usr/bin/tserver
xfs       1079  0.0  2.5  4592 3216 ?        S    May15   0:00 xfs -droppriv -da
daemon    1115  0.0  0.4  1444  568 ?        S    May15   0:00 /usr/sbin/atd
root      1130  0.0  0.3  1384  448 tty1     S    May15        /sbin/mingetty tt
root      1131  0.0  0.3  1384  448 tty2     S    May15        /sbin/mingetty tt
root      1132  0.0  0.3  1384  448 tty3     S    May15        /sbin/mingetty tt
root      1133  0.0  0.3  1384  448 tty4     S    May15        /sbin/mingetty tt
root      1134  0.0  0.3  1384  448 tty5     S    May15        /sbin/mingetty tt
root      1135  0.0  0.3  1384  448 tty6     S    May15        /sbin/mingetty tt
root      8769  0.0  0.6  1744  812 ?        S                 in.telnetd: 192.1
root      8770  0.0  0.9  2336 1184 pts/0    S                 login -- lei
lei       8771  0.1  0.9  2432 1264 pts/0    S                 -bash
lei       8809  0.0  0.6  2764  808 pts/0    R                 ps -aux

Apart from the header line, each row represents one process. The PID column shows each process's ID, and the COMMAND column shows the process's name or the command line used to start it from the shell. I will not explain the other columns here; interested readers can consult the relevant books.


getpid

In the 2.4.4 kernel, getpid is system call number 20. Its prototype in the Linux function library is:

#include <sys/types.h> /* defines the pid_t type */
#include <unistd.h>    /* declares the function */
pid_t getpid(void);

The function of getpid is to return the process ID of the current process. See the following example:

/* getpid_test.c */
#include <stdio.h>   /* printf */
#include <unistd.h>  /* getpid */
int main(void)
{
    printf("The current process ID is %d\n", getpid());
    return 0;
}

Careful readers may notice that the program does not include the header file sys/types.h. This is because the program never uses the pid_t type itself. pid_t is the type of a process ID. In fact, on the i386 architecture (that is, the architecture of ordinary PCs), pid_t is fully compatible with int, so we can treat pid_t values as integers, for example by printing them with "%d".

Compile and run the program getpid_test.c:

$ gcc getpid_test.c -o getpid_test
$ ./getpid_test
The current process ID is 1980
(Your result may show a different number, which is normal.)

Run it again:

$ ./getpid_test
The current process ID is 1981

As we can see, even though it is the same application, the process identifiers assigned at each run are different.


fork

In the 2.4.4 kernel, fork is system call number 2. Its prototype in the Linux function library is:

#include <sys/types.h> /* defines the pid_t type */
#include <unistd.h>    /* declares the function */
pid_t fork(void);

Looking at the name fork alone, few people could guess what it does. The fork system call duplicates a process: when a process calls it, a second, almost identical process appears, and so we get a new process. The name is said to come from the fact that the resulting flow of execution looks somewhat like a fork.

In Linux, the basic way to create a new process is the fork call we are introducing here (the kernel also provides the closely related vfork and clone). Other library functions, such as system(), also appear to create new processes, but if you read their source code you will find that internally they rely on fork or one of its relatives. The same goes for running a program from the command line: the new process is created by the shell calling fork. fork has some interesting properties, so let's use a small program to get to know it better.

/* fork_test.c */
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
int main(void)
{
    pid_t pid;

    /* There is only one process at this point */
    pid = fork();
    /* From here on, two processes are running */
    if (pid < 0)
        printf("error in fork!\n");
    else if (pid == 0)
        printf("I am the child process, my process ID is %d\n", getpid());
    else
        printf("I am the parent process, my process ID is %d\n", getpid());
    return 0;
}

Compile and run:

$ gcc fork_test.c -o fork_test
$ ./fork_test
I am the parent process, my process ID is 1991
I am the child process, my process ID is 1992
(The two lines may appear in either order.)

To understand this program, first get one concept straight: before the statement pid = fork(), only one process is executing this code; after it, two processes are executing it, and their code is exactly the same. In both of them, the next statement to run is the if test on pid.

Of the two processes, the original one is called the "parent process" and the new one is called the "child process". Parent and child differ not only in their process IDs but also in the value of the variable pid, which stores the return value of fork. One of the wonders of fork is that it is called once but returns twice, and it has three possible kinds of return value:

In the parent process, fork returns the process ID of the newly created child process;
In the child process, fork returns 0;
If an error occurs, fork returns a negative value (-1).

There are two likely reasons for fork to fail: (1) the number of processes has reached the limit set by the system, in which case errno is set to EAGAIN; (2) the system is short of memory, in which case errno is set to ENOMEM. (For more information about errno, see the first article in this series.)

fork rarely fails, and when it does, it is usually for the first reason. The second means that the system has no memory left to allocate and is on the verge of collapse, which is rare on Linux.

By now, smart readers have probably worked out the rest of the code. If pid is less than 0, an error has occurred. If pid equals 0, fork returned 0, so the current process is the child and it runs the printf for the child. Otherwise (else), the current process is the parent and it runs the printf for the parent. Perfectionists may find this wasteful, because each of the two processes carries a statement it will never execute. There is no need to worry about it. After all, many years ago the founders of UNIX wrote their programs on machines with so little memory that we can hardly imagine it today. With the "massive" memory we have now, you can put those few bytes out of your mind.

At this point some readers may wonder: if the child produced by fork is almost identical to its parent, and fork is how new processes come into being, wouldn't all processes in the system be the same? What do we do when we want to run a new application? From our experience with Linux we know that this problem does not exist in practice. How it is solved is a question we will leave for a later article.


exit

In the 2.4.4 kernel, exit is system call number 1. Its prototype in the Linux function library is:

#include <stdlib.h>
void exit(int status);

exit is not as hard to understand as fork. As the name suggests, this system call terminates a process. No matter where in the program it appears, as soon as exit is executed the process abandons all remaining work, its data structures (including the PCB) are cleaned up, and the process stops running. See the following program:

/* exit_test1.c */
#include <stdlib.h>  /* exit */
#include <stdio.h>   /* printf */
int main(void)
{
    printf("this process will exit!\n");
    exit(0);
    printf("never be displayed!\n");  /* unreachable */
}

Compile and run:

$ gcc exit_test1.c -o exit_test1
$ ./exit_test1
this process will exit!

We can see that the program never prints "never be displayed!", because the process has already terminated by then, at the call to exit(0).

The exit system call takes an integer parameter, status, which we can use to report the state of the process when it ends, for example whether it ended normally or unexpectedly. By convention, 0 means the process ended normally, and any other value means an error occurred and the process ended abnormally. In real programs, the parent can use the wait system call to receive the child's exit status and handle the different cases accordingly. We will cover wait in detail later.

exit and _exit

As system calls, _exit and exit are twins. Just how alike they are can be seen in the Linux source code:

#define __NR__exit __NR_exit /* from include/asm-i386/unistd.h, line 334 */

"__NR_" is the prefix given to every system call number in the Linux source code. Note that there are two underscores before the first exit and only one before the second.

At this point, anyone who knows C and has a clear head will say that _exit and exit are the same thing. Still, there are differences between them, and they lie mainly in how the two are defined in the function library. The prototype of _exit in the Linux function library is:

#include <unistd.h>
void _exit(int status);

Unlike exit(), which is declared in stdlib.h, _exit() is declared in unistd.h. Judging by the names, stdlib.h sounds a little more high-level than unistd.h. So what is the difference between the two? Let's first look at a flowchart of the two calls, which gives a more intuitive picture of how each one proceeds.

As the flowchart shows, _exit() does the simplest possible job: it stops the process immediately, releases the memory it uses, and destroys its data structures in the kernel. exit() is a wrapper around this that performs several extra steps before the process finally terminates, which is also why some people feel exit is no longer a pure system call.

The biggest difference between exit() and _exit() is that, before invoking the exit system call, exit() checks which files are open and writes the contents of their buffers back to the files. This is the "clear I/O buffers" step in the flowchart.

The Linux standard library contains a set of functions known as "high-level I/O". The familiar printf(), fopen(), fread(), and fwrite() all belong to it. They are also called "buffered I/O", because every open file has a corresponding buffer in memory. Each read fetches more data than was asked for, so that the next read can be served directly from the buffer. Each write goes only into the buffer, and the buffer's contents are written to the file in one go when a certain condition is met, for example when enough data has accumulated, a newline \n is written, or the end of the file is reached. This greatly speeds up reading and writing, but it also makes programming a little trickier. Suppose we believe some data has been written to a file when, because the condition has not been met, it is still sitting in the buffer. If we then terminate the process with _exit(), the data in the buffer is lost. So if you want to guarantee data integrity, you must use exit().

See the following routine:

/* exit2.c */
#include <stdlib.h>  /* exit */
#include <stdio.h>   /* printf */
int main(void)
{
    printf("output begin\n");
    printf("content in buffer");
    exit(0);
}

Compile and run:

$ gcc exit2.c -o exit2
$ ./exit2
output begin
content in buffer

/* _exit1.c */
#include <unistd.h>  /* _exit */
#include <stdio.h>   /* printf */
int main(void)
{
    printf("output begin\n");
    printf("content in buffer");
    _exit(0);
}

Compile and run:

$ gcc _exit1.c -o _exit1
$ ./_exit1
output begin

In Linux, both standard input and standard output are processed as files. Although they are special files, from the programmer's perspective, they are no different from common files that store data on hard disks. Like all other files, they also have their own buffer after being opened.

Let's take a look at the previous descriptions to find out why these two programs have different results. I believe that if you understand what I mentioned above, you will easily draw a conclusion.

To be continued

In this article we got a first look at Linux process management and, on that basis, learned four system calls: getpid, fork, exit, and _exit. In the next article we will study other system calls related to Linux process management and discuss them in more depth.
