I. Preface
Shell programming is widely used in the Unix/Linux world, and mastering it is essential for anyone who wants to become a skilled Unix/Linux developer or system administrator. The main task of script debugging is to find the cause of an error and locate the offending line in the script source. Common techniques include analyzing the error output, adding debugging statements to the script to print diagnostic information, and using debugging tools. Compared with other high-level languages, however, the shell interpreter offers little in the way of debugging mechanisms or tool support, and its error messages are often ambiguous. Beginners debugging a script often know no technique other than printing information with echo statements, and diagnosing errors with large numbers of echo statements is tedious, so it is common for beginners to complain that shell scripts are too hard to debug. This article systematically introduces some important shell script debugging techniques in the hope that shell beginners will benefit from them.
The target readers of this article are developers, testers, and system administrators working in Unix/Linux environments; basic shell programming knowledge is assumed. The examples in this article were tested with bash 3.1 on RedHat Enterprise Server 4.0, but the debugging techniques described here should also apply to other shells.
II. Output debugging information in shell scripts
The most common debugging method is to add debugging statements to a program so that it displays relevant information at key points or where errors occur. Shell programmers usually use echo statements (ksh programmers often use print) to output such information, but relying on echo alone for tracing is troublesome: the many echo statements added during debugging have to be removed one by one before the script is delivered. This section describes how to output debugging information conveniently and effectively.
1. Use the trap command
The trap command is used to capture specified signals and execute predefined commands.
The basic syntax is:
trap 'command' signal
Here signal is the signal to be captured, and command is the command to be executed when that signal is captured. You can use the kill -l command to list all signal names available on the system. The command executed when the signal is captured can be one or more valid shell statements or a function name.
While a shell script is executing, three so-called "pseudo-signals" are generated (they are called pseudo-signals because they are generated by the shell itself, whereas other signals are generated by the operating system). Capturing these three pseudo-signals with the trap command and printing relevant information is very helpful for debugging.
Table 1. Shell pseudo-signals
Signal name | When it is generated
EXIT | When the script finishes executing or exits (some shells, such as ksh, also raise it when a function returns)
ERR | When a command returns a non-zero status (indicating that the command failed)
DEBUG | Before each command in the script is executed
By capturing the EXIT signal, we can print the values of the variables we want to track when the shell script finishes or is aborted (or, in shells that raise the signal there, when a function returns), and from them determine the script's execution state and the cause of the error. The usage is as follows:
trap 'command' EXIT or trap 'command' 0
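As a minimal sketch of the EXIT trap, the example below writes a tiny script to a temporary file and runs it in a child bash process. The variable names step and count and the use of mktemp are assumptions made purely for illustration. The script aborts midway with exit 2, yet the trap still prints the last values of the tracked variables.

```shell
#!/bin/bash
# Write a small script to a temporary file; step and count are
# hypothetical variables that we want to inspect when the script ends.
script=$(mktemp)
cat > "$script" <<'EOF'
trap 'echo "[EXIT] step=$step count=$count"' EXIT
step=init;  count=0
step=loop;  count=3
exit 2          # simulate an abort in the middle of the script
step=done       # never reached
EOF

# Run it in a child shell so its EXIT trap does not affect this shell.
status=0
report=$(bash "$script") || status=$?
echo "$report"
echo "exit status: $status"
rm -f "$script"
```

Because the trap body is in single quotes, the variables are expanded when the trap runs, not when it is installed, so it reports the values at the moment of exit.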
By capturing the ERR signal, we can easily track unsuccessful commands or functions and print relevant debugging information. Below is a sample program that captures the ERR signal; $LINENO is a built-in shell variable that holds the current line number of the shell script.
$ cat -n exp1.sh
     1  ERRTRAP()
     2  {
     3      echo "[LINE: $1] Error: Command or function exited with status $?"
     4  }
     5  foo()
     6  {
     7      return 1;
     8  }
     9  trap 'ERRTRAP $LINENO' ERR
    10  abc
    11  foo
The output is as follows:
$ sh exp1.sh
exp1.sh: line 10: abc: command not found
[LINE: 10] Error: Command or function exited with status 127
[LINE: 11] Error: Command or function exited with status 1
In the process of debugging, in order to track the value of some variables, we often need to insert the same echo statement in many places of the shell script to print the value of the related variable, which is cumbersome and awkward. By capturing DEBUG signals, we only need a trap statement to complete the tracking of related variables.
The following is an example program that tracks variables by capturing DEBUG signals:
$ cat -n exp2.sh
     1  #!/bin/bash
     2  trap 'echo "before execute line: $LINENO, a=$a, b=$b, c=$c"' DEBUG
     3  a=1
     4  if [ "$a" -eq 1 ]
     5  then
     6      b=2
     7  else
     8      b=1
     9  fi
    10  c=3
    11  echo "end"
The output is as follows:
$ sh exp2.sh
before execute line: 3, a=, b=, c=
before execute line: 4, a=1, b=, c=
before execute line: 6, a=1, b=, c=
before execute line: 10, a=1, b=2, c=
before execute line: 11, a=1, b=2, c=3
end
The output clearly shows how the values of the tracked variables change as each command is executed. By analyzing the line numbers printed in the output, you can also follow the execution path of the entire script and determine which conditional branches were executed and which were not.
2. Use the tee command
Pipelines and input/output redirection are used heavily in shell scripts. In a pipeline, the output of one command becomes the input of the next. If the result of a chain of piped commands is not what we expect, we need to check the output of each command step by step to find where the problem is; but because of the pipes, the intermediate results are not displayed on the screen, which makes debugging difficult. This is where the tee command helps.
The tee command reads data from standard input, writes it to standard output, and also saves it to a file. For example, the following script fragment is meant to obtain the local IP address:
ipaddr=`/sbin/ifconfig | grep 'inet addr:' | grep -v '127.0.0.1' |
cut -d: -f3 | awk '{print $1}'`
# Note: everything after the = sign is enclosed in backticks (the key to the left of the number 1 key).
echo $ipaddr
When this script is run, the actual output is not the local IP address but the broadcast address. We can use the tee command to save an intermediate result by modifying the script fragment as follows:
ipaddr=`/sbin/ifconfig | grep 'inet addr:' | grep -v '127.0.0.1' |
tee temp.txt | cut -d: -f3 | awk '{print $1}'`
echo $ipaddr
After that, execute this script again, and then check the contents of the temp.txt file:
$ cat temp.txt
inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
We can see that, with the colon as the delimiter, the second field of the intermediate result contains the IP address, whereas the script above extracts the third field with the cut command. We only need to change cut -d: -f3 in the script to cut -d: -f2 to get the correct result.
For this particular example we may not really need tee; we could run each of the piped commands in turn and inspect its output to diagnose the error. In more complex shell scripts, however, the piped commands may depend on other variables defined in the script, and running each command step by step at the prompt becomes very troublesome. Simply inserting a tee command between the pipes to view the intermediate results is much more convenient.
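The same diagnosis can be tried without a network interface. The sketch below feeds one canned line of ifconfig-style text (modeled on the temp.txt contents shown above) through the pipeline. The function name extract_ip, its two parameters, and the mktemp capture file are assumptions made for illustration only.

```shell
#!/bin/bash
# Sample ifconfig-style line used in place of real /sbin/ifconfig output.
sample='          inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0'

# extract_ip FIELD CAPTURE_FILE
# Cuts the given colon-separated field and saves the intermediate
# pipeline data to CAPTURE_FILE with tee.
extract_ip() {
    printf '%s\n' "$sample" |
        grep 'inet addr:' |
        tee "$2" |
        cut -d: -f"$1" |
        awk '{print $1}'
}

capture=$(mktemp)
wrong=$(extract_ip 3 "$capture")   # field 3 starts with the broadcast address
right=$(extract_ip 2 "$capture")   # field 2 starts with the IP address
echo "field 3: $wrong"
echo "field 2: $right"
echo "intermediate data: $(cat "$capture")"
rm -f "$capture"
```

Looking at the captured intermediate data makes it easy to count the colon-separated fields by eye and see which one cut should select.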
3. Use "debug hook"
In C language programs, we often use the DEBUG macro to control whether to output debugging information. In a shell script, we can also use this mechanism, as shown in the following code:
if [ "$DEBUG" = "true" ]; then
    echo "debugging"  # debug information can be output here
fi
Such code blocks are often called "debug hooks" or "debug blocks". A debug hook can output any debugging information you like. Its advantage is that it is controlled by the DEBUG variable: during development and debugging, you can first run export DEBUG=true to turn on the debug hooks so that they print debugging information, and when the script is delivered for use, there is no need to remove the debugging statements from the script one by one.
Testing the value of the DEBUG variable with an if statement everywhere you want to output debugging information is still cumbersome. Defining a DEBUG function makes inserting debug hooks more concise and convenient, as shown in the following code:
$ cat exp3.sh
DEBUG()
{
    if [ "$DEBUG" = "true" ]; then
        "$@"
    fi
}
a=1
DEBUG echo "a=$a"
if [ "$a" -eq 1 ]
then
    b=2
else
    b=1
fi
DEBUG echo "b=$b"
c=3
DEBUG echo "c=$c"
The DEBUG function shown above executes whatever command is passed to it, and whether it does so is controlled by the value of the DEBUG variable. We can therefore pass every debugging-related command as arguments to the DEBUG function, which is very convenient.
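To see the hook switch on and off, here is a small sketch that runs the same code twice, once with DEBUG=false and once with DEBUG=true. The compute function and the quiet and verbose variables are hypothetical names introduced only for this demonstration.

```shell
#!/bin/bash
# Debug hook: run the given command only when DEBUG is set to "true".
DEBUG() {
    if [ "$DEBUG" = "true" ]; then
        "$@"
    fi
}

# compute is a hypothetical function used only to exercise the hook.
compute() {
    a=1
    DEBUG echo "a=$a"
    b=$((a + 1))
    DEBUG echo "b=$b"
    echo "result=$b"
}

quiet=$(DEBUG=false; compute)     # hook disabled: only the result is printed
verbose=$(DEBUG=true; compute)    # hook enabled: debug lines appear as well
printf '%s\n' "--- DEBUG=false ---" "$quiet" "--- DEBUG=true ---" "$verbose"
```

Quoting "$@" inside the hook keeps each argument intact, so a debug message containing spaces is passed to echo as a single argument.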
III. Use shell execution options
The debugging method described in the previous section is to modify the source code of the shell script to output relevant debugging information to locate the error. Is there a way to debug the shell script without modifying the source code? The answer is to use the shell's execution options. This section will introduce the usage of some common options:
-n  Read the script and check its syntax, but do not execute it
-x  Enter trace mode and display every command as it is executed
-c "string"  Read commands from the string
The "-n" option can be used to test a shell script for syntax errors without actually executing any commands. After a shell script is written and before it is actually run, it is good practice to check it for syntax errors with the "-n" option first. Some shell scripts affect the system environment when they run, for example by creating or moving files, so discovering a syntax error only during actual execution means you have to restore the system environment manually before you can continue testing.
The "-c" option causes the shell interpreter to read and execute shell commands from a string rather than from a file. When you need to temporarily test the execution results of a small script, you can use this option as follows:
sh -c 'a=1; b=2; let c=$a+$b; echo "c=$c"'
The "-x" option traces script execution and is a powerful tool for debugging shell scripts. It makes the shell display every command line it actually executes while running the script, prefixed with a "+" sign. What follows the "+" sign is the command line after variable substitution, which helps you see exactly what command is being executed. The "-x" option is simple and convenient to use, handles most shell debugging tasks easily, and should be regarded as the preferred debugging method.
If the trap 'command' DEBUG mechanism described earlier in this article is combined with the "-x" option, we can not only display each command actually executed but also track the values of the relevant variables line by line, which is very helpful for debugging.
Taking the exp2.sh described above as an example, now add the "-x" option to execute it:
$ sh -x exp2.sh
+ trap 'echo "before execute line: $LINENO, a=$a, b=$b, c=$c"' DEBUG
++ echo 'before execute line: 3, a=, b=, c='
before execute line: 3, a=, b=, c=
+ a=1
++ echo 'before execute line: 4, a=1, b=, c='
before execute line: 4, a=1, b=, c=
+ '[' 1 -eq 1 ']'
++ echo 'before execute line: 6, a=1, b=, c='
before execute line: 6, a=1, b=, c=
+ b=2
++ echo 'before execute line: 10, a=1, b=2, c='
before execute line: 10, a=1, b=2, c=
+ c=3
++ echo 'before execute line: 11, a=1, b=2, c=3'
before execute line: 11, a=1, b=2, c=3
+ echo end
end
In the above result, the line preceded by "+" is the command actually executed by the shell script, the line preceded by "++" is the command specified in the trap mechanism, and the other lines are output information.
Shell execution options can be specified not only when the shell is started but also inside the script with the set command. "set -option" enables an option and "set +option" disables it. Sometimes we do not want to trace every command line by passing the "-x" option at startup. In that case we can use the set command in the script, as shown in the following script fragment:
set -x    # enable the "-x" option
# block to be traced
set +x    # disable the "-x" option
The set command can also be invoked through the DEBUG function (the debug hook introduced in the previous section), which avoids the trouble of deleting these debugging statements when the script is delivered, as shown in the following script fragment:
DEBUG set -x    # enable the "-x" option
# block to be traced
DEBUG set +x    # disable the "-x" option
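The following sketch shows tracing limited to one block. The function trace_demo and the temporary trace file are hypothetical, chosen only for illustration. Since the trace is written to standard error, it is redirected to a file so that it can be inspected separately from the normal output.

```shell
#!/bin/bash
# trace_demo is a hypothetical function; only its middle block is traced.
trace_demo() {
    a=1
    set -x              # start tracing
    b=$((a + 1))
    set +x              # stop tracing
    c=$((b + 1))
    echo "c=$c"
}

# Trace output goes to standard error, so send it to a file and keep
# the normal output on standard output.
trace_file=$(mktemp)
result=$(trace_demo 2>"$trace_file")
trace=$(cat "$trace_file")
rm -f "$trace_file"
echo "$result"
echo "$trace"
```

The trace should mention the assignment to b and the set +x command itself, but not the assignments to a or c, which run while tracing is off.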
IV. Enhancing the "-x" option
The "-x" execution option is currently the most commonly used means of tracing and debugging shell scripts, but the debugging information it prints is limited to each executed command after variable substitution, prefixed with a "+" sign. Important information such as the line number is missing, which is very inconvenient when debugging complex shell scripts. Fortunately, we can use some of the shell's built-in variables to enhance the output of the "-x" option. Here are a few of them:
$LINENO
The current line number in the shell script, similar to the built-in macro __LINE__ in C.
$FUNCNAME
The function name, similar to the built-in macro __func__ in C. However, __func__ can only represent the name of the current function, whereas $FUNCNAME is more powerful: it is an array variable containing the names of all functions in the current call chain. ${FUNCNAME[0]} is the name of the function currently being executed, ${FUNCNAME[1]} is the name of the function that called ${FUNCNAME[0]}, and so on.
$PS4
The primary prompt variable $PS1 and the secondary prompt variable $PS2 are well known, but few people notice the role of the fourth-level prompt variable $PS4. As we know, the "-x" execution option displays every command actually executed in the shell script; the value of $PS4 is printed in front of each of those commands. In bash, the default value of $PS4 is "+ " (a plus sign followed by a space). (Now you know why there is a "+" sign in front of each command printed by the "-x" option.)
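To make the $FUNCNAME description concrete, here is a minimal bash sketch of a three-level call chain. The function names show_stack, inner, and outer are hypothetical and exist only for this illustration.

```shell
#!/bin/bash
# show_stack, inner and outer are hypothetical names for illustration.
show_stack() {
    # FUNCNAME[0] is show_stack itself, FUNCNAME[1] its caller, and so on.
    echo "current=${FUNCNAME[0]} caller=${FUNCNAME[1]} grandcaller=${FUNCNAME[2]}"
}
inner() { show_stack; }
outer() { inner; }

chain=$(outer)
echo "$chain"
```

A helper like show_stack can be dropped into any function to find out who called it.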
By redefining $PS4 with some of these built-in variables, we can enhance the output of the "-x" option. For example, if you first execute export PS4='+{$LINENO:${FUNCNAME[0]}} ' and then run the script with the "-x" option, the line number and the name of the enclosing function are displayed in front of each executed command.
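Here is a small self-contained sketch of a customized $PS4. As an assumption made to keep the example independent of the caller's environment, PS4 is assigned inside the generated script and tracing is enabled with set -x, rather than exporting PS4 and passing "-x" on the command line. The function name greet is hypothetical.

```shell
#!/bin/bash
# Write a small script with one function (greet is a hypothetical name).
script=$(mktemp)
cat > "$script" <<'EOF'
PS4='+{$LINENO:${FUNCNAME[0]}} '
greet()
{
    echo "hello"
}
set -x
greet
EOF

# The single quotes around the PS4 value delay expansion, so LINENO and
# FUNCNAME are evaluated each time a trace line is printed.
trace=$(bash "$script" 2>&1 >/dev/null)
echo "$trace"
rm -f "$script"
```

Each trace line should now carry the line number, plus the function name for commands that run inside greet.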
The following is an example of a shell script with bugs, which this article uses to demonstrate how to debug shell scripts with the "-n" option and the enhanced "-x" option. The script defines a function isRoot() that determines whether the current user is root; if not, the script aborts.
$ cat -n exp4.sh
     1  #!/bin/bash
     2  isRoot()
     3  {
     4      if [ "$UID" -ne 0 ]
     5          return 1
     6      else
     7          return 0
     8      fi
     9  }
    10  isRoot
    11  if ["$?" -ne 0 ]
    12  then
    13      echo "Must be root to run this script"
    14      exit 1
    15  else
    16      echo "welcome root user"
    17      #do something
    18  fi
First run sh -n exp4.sh to check the syntax. The output is as follows:
$ sh -n exp4.sh
exp4.sh: line 6: syntax error near unexpected token `else'
exp4.sh: line 6: `else'
A syntax error was found. Carefully checking the commands around line 6, we find that the if statement on line 4 is missing the then keyword (a mistake that people used to writing C programs make easily). We can change line 4 to if [ "$UID" -ne 0 ]; then to fix this error. Running sh -n exp4.sh again reports no errors. Next we can actually run the script; the result is as follows:
$ sh exp4.sh
exp4.sh: line 11: [1: command not found
welcome root user
Although the script has no syntax errors, it reports an error during execution, and the message is rather strange: "[1: command not found". Now we can customize the value of $PS4 and trace the script with the "-x" option:
$ export PS4='+{$LINENO:${FUNCNAME[0]}} '
$ sh -x exp4.sh
+{10:} isRoot
+{4:isRoot} '[' 503 -ne 0 ']'
+{5:isRoot} return 1
+{11:} '[1' -ne 0 ']'
exp4.sh: line 11: [1: command not found
+{16:} echo 'welcome root user'
welcome root user
From the output we can see each statement actually executed by the script, together with its line number and the name of the function it belongs to, which makes the execution path of the script and the internal execution of the called function clear. The error is reported on line 11, which is an if statement. Let's compare its trace with that of the if statement on line 4:
+{4:isRoot} '[' 503 -ne 0 ']'
+{11:} '[1' -ne 0 ']'
It can be seen that because the space after the [ on line 11 is missing, the shell interpreter treats the [ and the value of the variable $? next to it as a single word and tries to execute that word as a command, hence the error message "[1: command not found". Insert a space after the [ and everything works.
The shell has other built-in variables that are helpful for debugging. Bash, for example, provides many such variables, including BASH_SOURCE and BASH_SUBSHELL. You can read about them with man sh or man bash and then, according to your debugging needs, use them to customize $PS4 and enhance the output of the "-x" option.
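As one possible customization, the sketch below builds a $PS4 from BASH_SOURCE, LINENO, and BASH_SUBSHELL. The exact prompt layout is only an assumption chosen for illustration, and PS4 is set inside the generated script so that the example does not depend on the caller's environment.

```shell
#!/bin/bash
# Generate a script whose trace prefix shows the file name, the line
# number and the subshell nesting level.
script=$(mktemp)
cat > "$script" <<'EOF'
PS4='+[${BASH_SOURCE##*/}:$LINENO subshell=$BASH_SUBSHELL] '
set -x
x=1
( y=2 )
EOF

trace=$(bash "$script" 2>&1)
echo "$trace"
rm -f "$script"
```

The assignment inside the parentheses runs in a subshell, so its trace line should report a nesting level of 1, while the top-level assignment reports 0.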
V. Summary
Now let's summarize the process of debugging shell scripts:
First use the "-n" option to check for syntax errors, then use the "-x" option to trace the execution of the script. Before using the "-x" option, don't forget to customize the PS4 variable to enhance its output; at the very least it should print line numbers (first execute export PS4='+[$LINENO]'; for a more permanent solution, add this statement to the .bash_profile file in your home directory), which makes debugging much easier. You can also use traps, debug hooks, and other means to print key debugging information and quickly narrow down the location of an error, and use "set -x" and "set +x" in the script to focus on particular code blocks. With this variety of techniques, you should be able to catch the bugs in your shell scripts. If your script is complex enough that you need more debugging capabilities, you can use the shell debugger bashdb, a debugging tool similar to GDB that supports breakpoints, single-step execution, variable inspection, and similar functions for shell scripts; bashdb is also of great help in reading and understanding complex shell scripts. The installation and use of bashdb are beyond the scope of this article. You can refer to the documentation at http://bashdb.sourceforge.net/ and download it to try it out.