Best practices for writing shell scripts

Source: Internet
Author: User
Tags: readable, python, script, shebang, shellcheck


Objective


I recently picked up shell scripting again for work. Although I use most of the commands regularly on their own, they always become hard to read once written into a script, and I find other people's scripts just as hard to read. After all, shell is not really a rigorous programming language; it is more of a tool for gluing different programs together so that we can invoke them. As a result, many people write scripts as the ideas come to them, producing what is basically one long main function that is painful to look at. On top of that, for historical reasons there are many different shells, and many commands with overlapping functionality that force us to make trade-offs, so coding conventions are hard to unify.
With this in mind, I read some of the relevant material and found that many people have already thought about these problems and written good articles, but the advice is still rather scattered. So I am collecting it here as a technical specification for my own future scripts.


Code style specification


Start with a shebang


The so-called shebang is the comment beginning with "#!" that appears on the first line of many scripts. It specifies the default interpreter to use when we run the script without naming one explicitly. It typically looks like this:


    #!/bin/bash


Of course, there are many interpreters besides bash. We can list the shells registered on the local machine with the following command:


    cat /etc/shells


When we execute a script directly with ./a.sh, the interpreter named in the shebang is used; if there is no shebang, the script is typically run by the shell we are currently using (by default the one specified in $SHELL).
However, the form above hard-codes the interpreter's path and is not very portable, so we generally specify it as follows:


    #!/usr/bin/env bash


This is the form we recommend.


Code should have comments


Comments are obviously common sense, but they are still worth emphasizing, especially in shell scripts. Many one-line shell commands are not easy to understand, and without comments they are especially hard to maintain.
Comments are not only there to explain what the code does; they also tell us what to watch out for, like a README.
Specifically, for shell scripts, comments generally cover the following:


    1. Shebang
    2. Parameters of the script
    3. Purpose of the script
    4. Caveats when using the script
    5. Creation date, author, copyright, etc.
    6. Explanatory comments before each function
    7. Comments on the more complex one-line commands


Validate parameters


This is important: when a script accepts parameters, we must first check whether they conform to what the script expects and print an appropriate message, so that the user can understand how the parameters are used.
At the very least, we should check the number of parameters:


    if [[ $# -ne 2 ]]; then
        echo "Parameter incorrect." >&2
        exit 1
    fi
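As a slightly fuller sketch of the same idea, the check can be wrapped in a function together with a usage message. The function names and the two parameter names (source_dir, target_dir) are hypothetical and only serve the illustration.

```shell
#!/usr/bin/env bash

# usage: print a short help text to stderr (parameter names are hypothetical).
usage() {
    echo "Usage: $0 <source_dir> <target_dir>" >&2
}

# check_args: succeed only when exactly two parameters are given.
check_args() {
    if [[ $# -ne 2 ]]; then
        echo "Parameter incorrect: expected 2, got $#." >&2
        usage
        return 1
    fi
}

check_args "src_dir" "dst_dir" && echo "two parameters: ok"
check_args "only_one" || echo "one parameter: rejected"
```

In a real script the call would usually be check_args "$@" || exit 1, placed before any other work is done.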


Variables and Magic Numbers



In general, we define important environment variables at the beginning of the script to make sure they exist.
A very common use of this is when several Java versions are installed locally and we need to pick a particular one. In that case we redefine the JAVA_HOME and PATH variables at the beginning of the script to control which one is used.
At the same time, good code usually does not hard-code many "magic numbers". If such values are unavoidable, define them as variables at the beginning and refer to those variables wherever the values are needed, which makes future changes easier.
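A minimal sketch of both habits follows. The JDK location and the two constants are assumptions made up for the example; substitute whatever your own script needs.

```shell
#!/usr/bin/env bash

# Hypothetical install location; adjust to the JDK you actually want.
export JAVA_HOME="/usr/lib/jvm/java-11-openjdk"
export PATH="${JAVA_HOME}/bin:${PATH}"

# Named constants instead of magic numbers scattered through the script.
readonly MAX_RETRIES=3
readonly RETRY_DELAY_SECONDS=5

echo "Using JAVA_HOME=${JAVA_HOME}"
echo "Will retry up to ${MAX_RETRIES} times, waiting ${RETRY_DELAY_SECONDS}s between attempts"
```

Because the values live in one place at the top, changing the retry policy or the Java version later means editing a single line.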


Indent consistently


For shell scripts, indentation is a big problem. Because many constructs that should be indented (such as if and for statements) are short, many people are too lazy to indent them, and many people are not used to writing functions, so indentation ends up playing a smaller role than it should.
In fact, correct indentation is very important, especially when writing functions; otherwise it is easy to confuse a function body with commands that are executed directly.
The most common indentation methods are "soft tabs" and "hard tabs".


    • A soft tab indents with n spaces (n is usually 2 or 4)
    • A hard tab is a literal tab character ("\t")

Which one is better is not worth fighting over; each has its merits and drawbacks. I am used to hard tabs anyway.
For if and for statements, it is better not to put keywords such as then and do on a line of their own, because that looks ugly ...
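The layout described above can be sketched as follows, here with 4-space soft tabs. The function count_sh_files is a hypothetical helper written only to show the indentation.

```shell
#!/usr/bin/env bash

# "then" and "do" stay on the same line as "if" and "for";
# each nested body is indented one level.
count_sh_files() {
    local dir="$1"
    local count=0
    local file
    for file in "${dir}"/*.sh; do
        if [[ -f "${file}" ]]; then
            count=$((count + 1))
        fi
    done
    echo "${count}"
}

count_sh_files "."
```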


Use naming conventions


Naming conventions basically cover the following points:


    1. File names follow a convention and end with .sh for easy identification
    2. Variable names must be meaningful and spelled correctly
    3. Use one naming style throughout; shell scripts generally use lowercase letters joined by underscores


Use a consistent encoding


Try to use UTF-8 encoding when writing scripts, so that unusual characters such as Chinese are supported. Even though Chinese can be used, write comments and log messages in English as far as possible; after all, many machines still do not support Chinese out of the box, and the output may come out garbled.


Remember to add execute permission


This is a small point, but I often forget it: without execute permission the script cannot be run directly, which is a bit annoying ...


Log and Echo


The importance of logging goes without saying: logs let us go back and track down errors, which is essential in large projects.
If the script is meant to be run directly by a user at the command line, it is best to echo the progress of execution in real time so that the user can stay in control.
Sometimes, to improve the user experience, we add effects to the output, such as colors or blinking text. For details, see the article "ANSI/VT100 Control Sequences".
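A minimal sketch of a logging helper with optional color is shown below. The function name log, the level names, and the timestamp format are assumptions for illustration; the escape codes are the standard ANSI sequences for red, green, and reset.

```shell
#!/usr/bin/env bash

# ANSI escape sequences; colors are only used when stdout is a terminal.
if [[ -t 1 ]]; then
    readonly COLOR_RED=$'\033[31m'
    readonly COLOR_GREEN=$'\033[32m'
    readonly COLOR_RESET=$'\033[0m'
else
    readonly COLOR_RED="" COLOR_GREEN="" COLOR_RESET=""
fi

# log LEVEL MESSAGE...: print one timestamped line (hypothetical helper).
log() {
    local level="$1"
    shift
    local color="${COLOR_GREEN}"
    if [[ "${level}" == "ERROR" ]]; then
        color="${COLOR_RED}"
    fi
    printf '%s%s [%s] %s%s\n' \
        "${color}" "$(date '+%Y-%m-%d %H:%M:%S')" "${level}" "$*" "${COLOR_RESET}"
}

log INFO "backup started"
log ERROR "disk is full"
```

Checking for a terminal first keeps escape codes out of log files when the output is redirected.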


Remove passwords


Do not hard-code passwords in scripts. Do not hard-code passwords in scripts. Do not hard-code passwords in scripts.
Important things are worth saying three times, especially if the script is hosted on a platform like GitHub ...


Break lines that are too long


To keep the code readable, when a program is invoked with a very long list of parameters, we can use backslashes to split the command across lines:
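A minimal sketch, in which printf stands in for a program that takes many parameters; the variable name and the argument values are illustrative only:

```shell
#!/usr/bin/env bash

# One logical command split across several lines. Each continued line
# ends with a space and a backslash, and nothing may follow the backslash.
joined=$(printf '%s-%s-%s\n' \
    "first" \
    "second" \
    "third")

echo "${joined}"
```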




Note that there is a space before the backslash.


Coding detail specification


Code should be efficient


When using a command, understand exactly what it does, especially when processing large amounts of data, and always consider whether it will hurt efficiency.
For example, take the following two sed commands:
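As a sketch, assuming the usual sed idioms for this comparison, such a pair could look like this. The sample file and the variable names are made up so that the snippet can be run anywhere.

```shell
#!/usr/bin/env bash

# Sample input (hypothetical); in practice this would be a large file.
file=$(mktemp)
printf 'line one\nline two\nline three\n' > "${file}"

# Prints line 1 but keeps reading until the end of the file.
first_slow=$(sed -n '1p' "${file}")

# Prints line 1 and then quits immediately.
first_fast=$(sed -n '1p;1q' "${file}")

echo "${first_slow}"
echo "${first_fast}"
rm -f "${file}"
```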




Both print the first line of the file. But the first command reads the entire file, while the second reads only the first line. When the file is large, this small difference in the command leads to a huge difference in efficiency.
Of course, this is only an illustration; the proper way to do this particular job is the head -n1 file command ...


Use double quotation marks frequently


Almost every experienced shell user recommends wrapping a variable in double quotes when expanding it with "$".
Leaving out the double quotes causes trouble in many situations. For example:




Running it produces the following result:




Why is that? In effect, the script is equivalent to executing the following commands:




Whenever variables are used as arguments, pay close attention to this point and make sure you understand the difference. The above is only a tiny example; in practice, far too many problems are caused by this detail ...
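One common way to demonstrate the difference is with a glob pattern stored in a variable. The sketch below is an assumption about what such an example might look like: it works in a scratch directory containing a single file named a.sh, and all names are hypothetical.

```shell
#!/usr/bin/env bash

# Work in a scratch directory that contains exactly one file, a.sh.
workdir=$(mktemp -d)
cd "${workdir}" || exit 1
touch a.sh

var="*.sh"
unquoted=$(echo $var)     # the shell expands the pattern: same as echo *.sh
quoted=$(echo "$var")     # the pattern is kept literally: same as echo "*.sh"

echo "unquoted: ${unquoted}"
echo "quoted:   ${quoted}"

cd - > /dev/null || exit 1
rm -rf "${workdir}"
```

The unquoted expansion yields the matching file name a.sh, while the quoted one yields the literal string *.sh.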


Make good use of a main function


We know that compiled languages such as Java and C have a function entry point. This structure makes the code very readable: we know which code is executed directly and which parts are functions. Scripts are different. A script is interpreted and executes straight from the first line to the last, so when commands and function definitions are mixed together it becomes very hard to read.
Anyone who writes Python knows that a well-formed Python script generally looks at least like this:


    #!/usr/bin/env python

    def func1():
        pass

    def func2():
        pass

    if __name__ == '__main__':
        func1()
        func2()


It uses a neat trick to provide the main function we are used to, which makes the code more readable.
In shell we have a similar trick:


    #!/usr/bin/env bash

    func1() {
        : # do sth
    }

    func2() {
        : # do sth
    }

    main() {
        func1
        func2
    }

    main "$@"


Writing the script this way gives us a similar main function and makes the script more structured.


Consider scopes


Variables in the shell are global by default, as in the following script:


    #!/usr/bin/env bash

    var=1
    func() {
        var=2
    }
    func
    echo "$var"


The output is 2 rather than 1, which clearly does not match our usual coding habits and can easily cause problems.
Therefore, rather than using global variables directly, it is better to use commands such as local and readonly; declare can also be used to declare variables. All of these are better than defining plain globals.
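A minimal sketch of the same script rewritten with local and readonly; DEFAULT_VALUE is a hypothetical constant introduced for the example.

```shell
#!/usr/bin/env bash

readonly DEFAULT_VALUE=1    # constant: later assignments to it would fail

var="${DEFAULT_VALUE}"

func() {
    local var=2             # shadows the global only inside this function
    echo "inside func: ${var}"
}

func
echo "after func: ${var}"
```

With local, the assignment inside func no longer leaks out, so the global var still holds 1 after the call.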


Make good use of heredocs


A heredoc can be thought of as a way of supplying multi-line input: we put a delimiter after "<<" and then enter as many lines as we like until the delimiter appears again on a line of its own.
With heredocs we can easily generate template files:
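A minimal sketch of a template filled in through a heredoc. The configuration format and the two values are hypothetical and chosen only to show variable expansion inside the template.

```shell
#!/usr/bin/env bash

# Hypothetical values used to fill in the template.
server_name="example.com"
listen_port=8080

# Unquoted delimiter (EOF): variables inside the heredoc are expanded.
config=$(cat <<EOF
server {
    listen ${listen_port};
    server_name ${server_name};
}
EOF
)

echo "${config}"
```

Quoting the delimiter (<<'EOF') would instead keep the text literal, which is useful when the template itself contains dollar signs.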




Know how to find the script's path



In many cases we first obtain the path of the current script and then use it as a base for locating other paths. We usually just use pwd, expecting it to return the script's path.
Strictly speaking this is not correct: pwd returns the current working directory of the shell, not the directory where the script is located.
The right approach is one of the following two:
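Both approaches can be sketched as follows. The variable names are illustrative, and the -f option of readlink is assumed to be available (it is in GNU coreutils but may be missing on other platforms).

```shell
#!/usr/bin/env bash

# Option 1: cd into the script's directory inside a subshell, then pwd.
script_dir=$(cd -- "$(dirname -- "$0")" && pwd)

# Option 2: resolve the script's real path directly.
# Assumption: readlink supports -f on this platform.
script_dir_resolved=$(dirname -- "$(readlink -f -- "$0")")

echo "${script_dir}"
echo "${script_dir_resolved}"
```

The second form also follows symbolic links, so the two results can differ when the script is invoked through a link.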




That is, either cd into the script's directory first and then run pwd, or read the script's location directly.


Keep code brief


Brevity here refers not only to the length of the code but also to the number of commands used. In principle, if a problem can be solved with one command, never use two. This affects not only the readability of the code but also how efficiently it runs.
The most classic example is as follows:
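A sketch of the pattern, assuming the classic pairing of cat with grep; the sample file and its contents are made up so that the comparison can be run anywhere.

```shell
#!/usr/bin/env bash

# Sample data file (hypothetical content).
sample_file=$(mktemp)
printf 'root:x:0:0\nalice:x:1000:1000\n' > "${sample_file}"

# Needless pipe: cat only forwards the file to grep.
with_cat=$(cat "${sample_file}" | grep root)

# Same result with a single command: grep reads the file itself.
without_cat=$(grep root "${sample_file}")

echo "${with_cat}"
echo "${without_cat}"
rm -f "${sample_file}"
```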




This is the most despised use of cat: it serves no purpose. The job can obviously be done with one command, yet a pipe is added for no reason ...


Use newer syntax


"Newer" here does not mean anything dramatic; it means we should prefer some of the more recent syntax, which is mostly a matter of code style, for example:


    1. Prefer func(){} for defining functions instead of function func {}
    2. Prefer [[ ]] over [ ]
    3. Prefer $() over backticks when assigning a command's output to a variable
    4. In complex scenarios, prefer printf over echo for output


In fact, many of these newer forms are more capable than the old ones, as you will find once you use them.
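The four preferences above can be sketched in one short script. The function describe and the sample path are hypothetical.

```shell
#!/usr/bin/env bash

# func() {} form of function definition.
describe() {
    local name="$1"
    # [[ ]] supports pattern matching on the right-hand side.
    if [[ "${name}" == *.sh ]]; then
        # printf gives precise control over the output format.
        printf '%-10s %s\n' "${name}" "shell script"
    else
        printf '%-10s %s\n' "${name}" "other"
    fi
}

# $() nests cleanly, which is awkward to do with backticks.
parent_name=$(basename "$(dirname "/usr/local/bin/tool")")

describe "a.sh"
describe "notes.txt"
echo "${parent_name}"
```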


Other small tips


There are still many odds and ends that I will not expand on; here is a brief mention of each.


    • Prefer absolute paths, which are much less error-prone; if you must use a relative path, preferably prefix it with ./
    • Prefer bash's variable substitution over awk and sed; it is shorter
    • For a simple if, try && and || and write it on one line, for example [[ $x -gt 2 ]] && echo "$x"
    • When exporting variables, prefix them with the sub-script's namespace to make sure variable names do not conflict
    • Use trap to catch signals and do some cleanup when a termination signal is received
    • Use mktemp to generate temporary files or directories
    • Use /dev/null to discard unhelpful output
    • Use a command's return value to determine whether it succeeded
    • Check that a file exists before using it, and handle errors properly
    • Do not process the output of ls (for example ls -l | awk '{ print $8 }'); the output of ls is unpredictable and platform-dependent
    • When reading a file, do not use a for loop; use while read
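
Several of these tips (trap, mktemp, /dev/null, return values, while read) can be combined in one short sketch. The file contents and the names cleanup, tmp_file, and line_count are hypothetical.

```shell
#!/usr/bin/env bash

# mktemp creates a uniquely named temporary file.
tmp_file=$(mktemp)

# trap runs the cleanup when the script exits; INT and TERM are turned
# into a normal exit so that the same cleanup also runs after a signal.
cleanup() {
    rm -f "${tmp_file}"
}
trap cleanup EXIT
trap 'exit 1' INT TERM

printf 'first line\nsecond line\n' > "${tmp_file}"

# Read the file line by line with while read instead of a for loop.
line_count=0
while IFS= read -r line; do
    line_count=$((line_count + 1))
    echo "${line_count}: ${line}"
done < "${tmp_file}"

# Decide by the command's return value; error output goes to /dev/null.
if grep -q "second" "${tmp_file}" 2> /dev/null; then
    echo "found"
fi
```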


Static check tool: ShellCheck


Overview


To guarantee script quality systematically, the simplest idea is probably to use a static checking tool, which can make up for blind spots in the developer's knowledge.
There really are not many static checking tools for shell. After some searching I found a tool called ShellCheck, which is open source on GitHub and has more than 8K stars, so it looks quite reliable. Its home page has details on installation and usage.


Installation


The tool supports a wide range of platforms, with packages available through the mainstream package managers of at least Debian, Arch, Gentoo, EPEL, Fedora, OS X, and openSUSE, so it is easy to install. See the installation documentation for details.


Integration


Since it is a static checking tool, it can be integrated into a CI framework. ShellCheck integrates easily into Travis CI to statically check projects whose main language is shell.


Examples


The "Gallery of bad code" section of the documentation also provides a very detailed catalogue of bad code, which is a valuable reference. Reading it in idle moments, like a book such as "Java Puzzlers", is quite enjoyable.


The best part


However, I think the most valuable part of the project is not the features above but the very powerful wiki it provides. In this wiki we can find the rationale behind every check the tool makes. Each detected problem has a wiki page under its problem number, which tells us not only "this is bad" but also "why it is bad" and "how it should be written". It is well suited to anyone who wants to dig deeper.


Resources
    • 10 Best Practices for Shell Script Programming
    • Shell Scripting Specification
    • ShellCheck tool
    • Best Practices for Writing Bash Scripts
    • Good Coding Practices for Bash
    • Design Patterns or Best Practices for Shell Scripts
    • Bashstyle (GitHub)
    • BashGuide/Practices
    • Obsolete and Deprecated Syntax
    • ANSI/VT100 Control Sequences

