This article is based on the book "Unix/Linux Programming Practice tutorial", which explains programming for Unix systems. The book is practice-oriented and not an easy read, but I recommend working through it: it gives a deeper view of how UNIX systems work. When I went back to other Linux-related topics afterwards, they felt very different. It is a book that builds up your fundamentals. I have added some seasoning of my own to the explanations so that they are easier to digest. Corrections are welcome.
The shell is a program that manages processes and runs other programs. It is how a user interacts with the machine.
Common shells such as sh, bash, zsh, csh, and ksh have three main functions:
1. Running programs
date, ls, and who are all utilities written in C. The shell loads them into memory and runs them, so the shell can be seen as a program launcher.
2. Managing input and output
With the redirection symbols < and > and the pipe symbol |, you can tell the shell to connect a program's standard input and output to a file or to another process. Pipes are especially neat: many tasks can be done just by combining basic commands.
3. Programming
The shell has variables and control flow. A variable is really the smallest application of the buffer idea: save something in a place and use it later. Control flow (if, while, and so on) steers the order of execution. With variables and control flow, commands that would otherwise be typed one by one can be put into a file, the so-called script, so that many commands run at once and can be saved for future use. Other scripting languages work on similar principles.
This article first explains how the shell runs a program, and writes a shell without variables or control flow. As Laozi said, "a journey of a thousand miles begins with a single step". A shell works like this: after you open a terminal it prints a prompt, usually "$" or "#"; the stupid human enters a command; the command is executed; the prompt is printed again; and so on in an endless loop until you leave the terminal. You can leave by entering exit, by pressing Ctrl+D at the prompt to produce an end-of-file, or by clicking Close in a graphical terminal emulator, which is handled by the window manager. All three end the endless loop and make the shell itself exit.
The main body of the shell is as follows:
while (!end_of_input) {
    wait for the human to enter a command;
    execute the command;
    wait for the command to finish;
}
The end_of_input condition is produced by the three exit methods mentioned above. One special case: you can run another shell inside the shell, then another shell inside that one, and so on, like a set of Russian dolls. Generally a program exits once it has finished its own work (typical command-line programs behave this way, whereas graphical programs interact with the user and have to be closed by a human). But because the shell is the program that runs other programs, it never finishes on its own and needs outside intervention to exit.
To write a shell, you must know:
1. How to run a program from within a program (which amounts to creating a process);
2. How to wait, within the program, for the new program to exit.
About processes: a process is a running program, or more precisely the program in memory together with some bookkeeping such as its state, times, and process ID. In the output of the ps -x command, each line describes one process; the top command shows process information in real time. When we first learn programming we write single-process programs, such as one that prints "hello". To run it twice, you have to type the command again. A program can arrange this itself by using multiple processes. The idea is analogous to function calls in C: you could write everything in main, but when a job is repeated you usually create a function and call it several times instead of copying the code.
The execvp call: execvp(program, arglist). program is the name of the program to run and arglist is its argument list. The call runs a program from within a program. It searches the directories listed in the PATH environment variable to find the program, which is how ls, who, and so on are found.
The fork call: fork(). It creates a new process. Its job is to copy the calling program, so that there are two identical copies in memory. At this point they are no longer called programs but processes. The word fork literally means a fork in the road: one path splits into two, and after the fork each goes its own way.
The wait call: wait(&status). It waits until a child process ends. Waiting can be blocking or non-blocking. Think of making a pot of tea. You are the shell, and first you create a water-boiling process. You can block, that is, squat next to the kettle and watch the steam. Or you can be non-blocking: go and do something else, and when the water boils the kettle whistles, which is a signal. In addition, the kettle can save its state into status. The shell is the original parent process. Normally a program is run in blocking mode, but you do not notice because the machine is so fast. A background process, started by putting "&" after the command, is non-blocking.
Let's get to work!
1. A shell that can run only one program
A family of system calls, exec, does the job of "running another program from within a program". We will not explore how it works internally; that is another level of programming, and here we only want to write a small shell. You simply use the call as if you were calling a function that lives outside your own program.
execvp is used here. Below is the code of a "crippled" shell that can run only one program, because when the first program you entered finishes, the shell is gone too.
/* egg_sh.c
 * Which came first, the chicken or the egg? Nobody really knows, and the
 * question has plagued stupid humans for a long time. Let's just say the egg
 * came first, so this crippled shell is named egg_sh.
 * By the way, separating the words of a program name with capital letters,
 * as in EggSh, is ugly; real programmers use "_" to separate them.
 */
#include <stdio.h>
#include <stdlib.h>     /* malloc(), exit() */
#include <signal.h>
#include <string.h>
#include <unistd.h>     /* execvp() */

#define MAXARGS 20      /* maximum number of arguments */
#define ARGLEN  100     /* argument buffer length */

char *makestring(char *buf);
int execute(char *arglist[]);

int main()
{
    char *arglist[MAXARGS + 1];     /* argument array */
    int numargs = 0;                /* index into the argument array */
    char argbuf[ARGLEN];            /* buffer for the line just read */

    while (numargs < MAXARGS) {
        printf("Arg[%d]? ", numargs);       /* print the prompt */
        if (fgets(argbuf, ARGLEN, stdin) && *argbuf != '\n')
            arglist[numargs++] = makestring(argbuf);
        else {
            if (numargs > 0) {
                arglist[numargs] = NULL;    /* terminate the list */
                execute(arglist);
                numargs = 0;
            }
        }
    }
    return 0;
}

int execute(char *arglist[])
{
    /* Run a program from within the program: arglist[0] is the name of
     * the new program and arglist is its argument list */
    execvp(arglist[0], arglist);
    perror("execvp failed");
    exit(1);
}

char *makestring(char *buf)
/*
 * Remove the newline at the end of each argument, replacing it with '\0',
 * the C string terminator, and allocate memory to store the argument
 */
{
    char *cp;

    buf[strcspn(buf, "\n")] = '\0';     /* change '\n' to '\0' */
    cp = malloc(strlen(buf) + 1);
    if (cp == NULL) {
        /* so far I have never actually run out of memory =_=! */
        fprintf(stderr, "no memory\n");
        exit(1);
    }
    strcpy(cp, buf);    /* copy the argument buffer into the allocated space */
    return cp;          /* pointer to where the argument is stored */
}
wc -l egg_sh.c shows that it is only about 60 lines of code. That's right, a program that can grow into a shell is no more than this, although for now it is still an "egg". Compile it and run it like this:
$ ./a.out
Arg[0]? ls
Arg[1]? -l
Arg[2]? -a
Arg[3]?
total 32
drwxrwxrwt  4 root  root  4096 July 29 12:11 .
drwxr-xr-x 23 root  root  4096 July 10 02:39 ..
-rwxr-xr-x  1 hotea hotea 6251 July 29 12:05 a.out
-rw-r--r--  1 hotea hotea 1788 July 29 12:05 egg_sh.c
drwxrwxrwt  2 root  root  4096 July 29 08:36 .ICE-unix
-r--r--r--  1 root  root    11 August 2014 .X0-lock
drwxrwxrwt  2 root  root  4096 July 29 2014 .X11-unix
$
You can use it to run other programs. Pressing Enter on an empty line ends the argument input and runs the command. egg_sh exits because execvp overwrites the egg_sh program with the ls program; once ls ends, egg_sh is gone. To wait for the next command after running a program, as a real shell does, execvp has to be executed in a new process, so that the exit of the ls process does not affect the egg_sh process.
2. A shell that can run multiple programs
The egg shell above uses only exec, so it can run only one program. With the fork call added, it can run multiple programs: put exec on one branch of the fork, so that when the program exits only that branch ends and the shell itself keeps running. After fork is called, it returns 0 in the child process and the child's PID in the parent process.
The execution process is as follows:
1. prompt -> 2. get the command -> 3. create a new process -> 4. parent waits ........ gets the child's exit status -> back to the prompt
                                            |
                                            child process -> exec runs the new program -> program ends -> exit status
Only the execute function needs to change. The resulting shell can run multiple programs and does the most basic job, but it is still awkward to use: as with the egg shell, you have to enter one argument per line.
/* Needs #include <sys/wait.h> at the top of the file, for wait() */
int execute(char *arglist[])
/*
 * Use fork() and execvp() to run the program,
 * and wait() to wait for the child process
 */
{
    int pid, exitstatus;    /* child process ID and exit status */

    pid = fork();           /* create the child process */
    switch (pid) {
    case -1:
        perror("fork failed");
        exit(1);
    case 0:
        execvp(arglist[0], arglist);    /* run the program entered in the shell */
        perror("execvp failed");
        exit(1);
    default:
        while (wait(&exitstatus) != pid)
            ;
        printf("child exited with status %d, %d\n",
               exitstatus >> 8, exitstatus & 0377);     /* exit information */
    }
    return 0;
}
After fork, the parent and the child run the same code, but the different return values of fork send them down different branches. If fork succeeds, the child executes the part after case 0, because fork returned 0 there. The child ends either when the program started by execvp exits or, if execvp fails, when exit is called. The parent executes the part after default: it waits for the child and obtains its exit status, saved in exitstatus, which you can use or discard. Here we print it. exitstatus >> 8 is the exit value; the low bits, exitstatus & 0377, hold the number of the signal that ended the child, if any, which we do not need here.
The execution is similar to the following:
$ ./a.out
Arg[0]? ls
Arg[1]?
a.out  big_egg_sh.c  egg_sh.c
child exited with status 0, 0
Arg[0]? ps
Arg[1]?
  PID TTY          TIME CMD
 3708 pts/0    00:00:00 bash
 5266 pts/0    00:00:00 a.out
 5268 pts/0    00:00:00 ps
child exited with status 0, 0
Arg[0]? (press Ctrl+D) Arg[0]? Arg[0]? exit
Arg[1]?
execvp failed: No such file or directory
child exited with status 1, 0
Arg[0]? ^C
$
You can now run multiple programs, but ^D does not work and exit does not work either. The reason is this: the child process calls execvp with "exit" as the program name, treating exit as a new program to run. Running type exit in a real shell shows that exit is a shell builtin, so it cannot be found through the PATH environment variable. Programs such as ls and who live in directories like /bin and /usr/bin and can be found, whereas for builtins such as cd and exit execvp reports "No such file or directory". In addition, the only way to leave big_egg_sh is to kill it with the Ctrl+C signal, whereas the shell we normally use is not killed by Ctrl+C and exits on Ctrl+D. To keep big_egg_sh from being killed by ^C, add the following statement to the main function, which makes it ignore the signal generated by ^C.
signal(SIGINT, SIG_IGN);
So far, a rough shell is complete, but it is an egg after all. Next let's turn this egg into a chicken! (Source code at GIT)