Write robust Bash scripts

Many people use shell scripts to get simple tasks done, and those scripts end up being part of their daily routine. Unfortunately, shell scripts cope badly with unexpected conditions, so it is worth minimizing such problems when writing them. In this article I will introduce some techniques that make bash scripts more robust.

Use set -u. How many times have you had a script blow up because a variable was never initialized? For me, many times:

    chroot=$1
    ...
    rm -rf $chroot/usr/share/doc

If the code above is run without an argument, $chroot is empty, and you will not just delete the documentation inside the chroot, you will delete all the documentation on the system. So what should you do? Fortunately, bash provides set -u, which makes bash exit automatically whenever an uninitialized variable is used. The more readable equivalent is set -o nounset.

    david% bash /tmp/shrink-chroot.sh
    $chroot=
    david% bash -u /tmp/shrink-chroot.sh
    /tmp/shrink-chroot.sh: line 3: $1: unbound variable
    david%

Use set -e. Every script you write should begin with set -e. This tells bash to exit as soon as any statement returns a non-true value. The advantage of -e is that it stops small errors from snowballing into serious ones, and catches errors as early as possible. The more readable version is set -o errexit.

Using -e frees you from checking for errors yourself: if you forget to check, bash does it for you. However, you can no longer use $? to inspect a command's exit status, because bash never lets a non-zero status through. Instead of:

    command
    if [ "$?" -ne 0 ]; then echo "command failed"; exit 1; fi

you can write:

    command || { echo "command failed"; exit 1; }

or:

    if ! command; then echo "command failed"; exit 1; fi

What if you have to run a command that returns a non-zero status, or you are simply not interested in its return value? You can use command || true, or, for a longer stretch of code, temporarily disable the error check, although I suggest you do so sparingly:

    set +e
    command1
    command2
    set -e

The documentation points out that bash returns the status of the last command in a pipeline by default, which may not be the one you want: for example, the pipeline false | true is considered to have succeeded. If you want such a pipeline to be treated as a failure, use set -o pipefail.

Program defensively against unexpected events. Your script may run in an "unexpected" environment, where a file is missing or a directory has not been created. You can guard against such errors. For example, mkdir returns an error if the parent directory does not exist; adding the -p option makes mkdir create any missing parent directories before creating the one you asked for. Another example is rm: deleting a file that does not exist makes it complain, and your script stops working (because you are using the -e option, right?). The -f option solves this, letting the script carry on when the file does not exist.
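Putting these options together: below is a minimal sketch of what a defensive preamble for the shrink-chroot script might look like. The argument check and its error message are assumptions added for illustration; they are not part of the original example.

    #!/bin/bash
    # Abort on uninitialized variables, on any command that fails,
    # and on failures anywhere in a pipeline.
    set -o nounset
    set -o errexit
    set -o pipefail

    # Hypothetical argument check: fail early with a clear message
    # instead of letting an empty $1 trip nounset deeper in the script.
    if [ $# -lt 1 ]; then
        echo "usage: $0 <chroot-dir>" >&2
        exit 1
    fi

    chroot=$1
    rm -rf "$chroot/usr/share/doc"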
Be prepared for spaces in file names. Some people put spaces in file names or command-line arguments. Remember this when writing scripts, and always enclose variables in quotation marks. The test

    if [ $filename = "foo" ];

breaks when $filename contains spaces. Write instead:

    if [ "$filename" = "foo" ];

You also need quotation marks when using the $@ variable, because otherwise two arguments separated by spaces are interpreted as two independent words:

    david% foo() { for i in $@; do echo $i; done }; foo bar "baz quux"
    bar
    baz
    quux
    david% foo() { for i in "$@"; do echo $i; done }; foo bar "baz quux"
    bar
    baz quux

I cannot think of a situation where "$@" is not what you want, so when in doubt, using the quotation marks is correct. If you use find and xargs together, you should use -print0 so that file names are separated by a null character rather than by newlines:

    david% touch "foo bar"
    david% find | xargs ls
    ls: ./foo: No such file or directory
    ls: bar: No such file or directory
    david% find -print0 | xargs -0 ls
    ./foo bar

Clean up when the script fails. The file system is in an unknown state when your script fails partway through: the script may have been holding a lock file, may have left temporary files behind, or may have updated one file before getting to the next. It is very good if you can resolve these problems, whether by deleting the lock file or by rolling back to a known state, whenever the script runs into trouble. Fortunately, bash provides a way to run a command or function when the script receives a UNIX signal: the trap command.

    trap command signal [signal ...]

You can list multiple signals (kill -l prints the full list), but to clean up after ourselves we only need three of them: INT, TERM, and EXIT. Passing "-" as the command restores a trap to its initial state.

    Signal  Description
    INT     Interrupt - triggered when someone terminates the script with Ctrl-C
    TERM    Terminate - triggered when someone kills the script process with kill
    EXIT    Exit - a pseudo-signal, triggered when the script exits normally or when set -e forces an exit after an error

When you use a lock file, you might write:

    if [ ! -e $lockfile ]; then
        touch $lockfile
        critical-section
        rm $lockfile
    else
        echo "critical-section is already running"
    fi

What happens if the script process is killed while the most important part, critical-section, is running? The lock file is left behind, and your script refuses to run again until it is deleted. The solution:

    if [ ! -e $lockfile ]; then
        trap "rm -f $lockfile; exit" INT TERM EXIT
        touch $lockfile
        critical-section
        rm $lockfile
        trap - INT TERM EXIT
    else
        echo "critical-section is already running"
    fi

Now the lock file is deleted even when the process is killed. Note that the trap command explicitly exits the script; otherwise the script would carry on with the command that follows the one that was interrupted.

A race condition in the lock file example above must be pointed out: it lies between testing for the lock file and creating it. One feasible solution is to use IO redirection together with bash's noclobber mode, which refuses to redirect output onto an existing file. We can write:

    if ( set -o noclobber; echo "$$" > "$lockfile" ) 2> /dev/null; then
        trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT
        critical-section
        rm -f "$lockfile"
        trap - INT TERM EXIT
    else
        echo "Failed to acquire lockfile: $lockfile"
        echo "held by $(cat $lockfile)"
    fi
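The same trap pattern protects resources other than lock files, such as the temporary files mentioned above. Here is a minimal sketch of cleaning up a scratch file, assuming mktemp is available; the sort and wc commands merely stand in for real work and are purely illustrative.

    #!/bin/bash
    set -e

    # mktemp creates a unique scratch file and prints its name.
    tmpfile=$(mktemp)

    # Delete the scratch file on Ctrl-C, on kill, and on normal exit.
    trap 'rm -f "$tmpfile"; exit' INT TERM EXIT

    # Illustrative workload: build some intermediate data.
    sort /etc/passwd > "$tmpfile"
    wc -l "$tmpfile"

    rm -f "$tmpfile"
    trap - INT TERM EXIT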
A more complicated problem arises when you need to update a bunch of files and want the script to fail gracefully if there is a problem halfway through the update: you want to be certain which updates were applied correctly and which were not changed at all. For example, you need a script to add users:

    add_to_passwd $user
    cp -a /etc/skel /home/$user
    chown $user /home/$user -R

This script gets into trouble when the disk runs out of space or the process is killed midway. In such a case, you probably want the user account not to exist, and his/her files to be deleted as well:

    rollback() {
        del_from_passwd $user
        if [ -e /home/$user ]; then
            rm -rf /home/$user
        fi
        exit
    }

    trap rollback INT TERM EXIT
    add_to_passwd $user
    cp -a /etc/skel /home/$user
    chown $user /home/$user -R
    trap - INT TERM EXIT

The rollback trap must be disabled with trap - at the end of the script; otherwise rollback would be called when the script exits normally, undoing everything so that the script ends up doing nothing at all.

Maintain atomicity. Sometimes you need to update a large number of files in a directory at once, for example rewriting URLs to point at another website's domain name. You might write:

    for file in $(find /var/www -type f -name "*.html"); do
        perl -pi -e 's/www.example.net/www.example.com/' $file
    done

If the script breaks halfway through the modification, part of the site uses www.example.com while the rest still uses www.example.net. You could use the backup-and-trap solution, but the website's URLs would still be inconsistent while the upgrade runs. The solution is to make the change an atomic operation: take a copy of the data, update the URLs in the copy, then replace the current working version with the copy. You must confirm that the copy and the working version sit on the same disk partition, so you can take advantage of the fact that Linux moves a directory within one partition simply by updating the directory entry that points to it:

    cp -a /var/www /var/www-tmp
    for file in $(find /var/www-tmp -type f -name "*.html"); do
        perl -pi -e 's/www.example.net/www.example.com/' $file
    done
    mv /var/www /var/www-old
    mv /var/www-tmp /var/www

This means that if anything goes wrong during the update, the live system is untouched. The time during which the live system is affected shrinks to the two mv operations, which complete very quickly because the file system only updates inodes rather than copying all the data. The disadvantages of this technique are that it needs twice the disk space, and that processes which hold files open for a long time keep the old versions until they reopen them, so it is recommended to restart such processes once the update is complete. This is not a problem for the Apache server, because it re-opens its files on every request. You can use the lsof command to see which files are currently open. A further advantage is that you keep a backup of the previous version, which is useful when you need to restore it.
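To tie the techniques together, here is a minimal sketch of the URL rewrite as a complete script. The paths are those of the example above; the set options, the trap that discards a half-built copy, and the find -print0 | xargs -0 form (following the earlier advice on spaces in file names) are my additions, not part of the original recipe.

    #!/bin/bash
    set -o nounset
    set -o errexit
    set -o pipefail

    src=/var/www
    tmp=/var/www-tmp
    old=/var/www-old

    # If anything fails while the copy is being built, remove it so
    # that a rerun starts from a clean state (assumed cleanup step).
    trap 'rm -rf "$tmp"; exit' INT TERM EXIT

    cp -a "$src" "$tmp"

    # Null-separated file names survive embedded spaces.
    find "$tmp" -type f -name "*.html" -print0 |
        xargs -0 perl -pi -e 's/www.example.net/www.example.com/'

    # The copy is complete; stop discarding it before swapping it in.
    trap - INT TERM EXIT

    # Two renames on the same partition: the only window in which
    # the live site is in flux.
    mv "$src" "$old"
    mv "$tmp" "$src"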