Shell Script Learning Guide [II] (Arnold Robbins & Nelson H. F. Beebe)


Time for chapter four. I just saw a post titled "I have real talent, but unfortunately I'm a girl." Sweat... Music has no borders, so surely it has no gender divide either?

Chapter 4: Text Processing Tools

The book first describes the rules of sort order. For numeric values this goes without saying: larger is larger and smaller is smaller. For character data, however, the order often depends on how case and accented letters are treated. Run locale on the command line to see your system's locale settings. The default comes from the system configuration, but you can set the collation locale yourself, for example:

$ LC_ALL=C sort french-english    # sort in traditional ASCII order
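To make the effect of the C locale concrete, here is a minimal sketch. The word list is made up for illustration; the point is only that with LC_ALL=C the comparison is by byte value, so uppercase ASCII letters come before lowercase ones.

```shell
# Hypothetical sample words (not from the book).
words='banana
Apple
cherry
apple
Banana'

# In the C locale, bytes are compared by code value: 'A'..'Z' sort before 'a'..'z'.
sorted_c=$(printf '%s\n' "$words" | LC_ALL=C sort)
printf '%s\n' "$sorted_c"
```

In other locales the same input may come out in a different, dictionary-like order, which is why scripts that need reproducible output often pin LC_ALL=C.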

The sort command is described as follows:
Syntax: sort [options] [file(s)]
Main options:
-b  Ignore leading whitespace.
-c  Check whether the input is already correctly sorted. If it is not, the exit code is nonzero; there is no output.
-d  Dictionary order: only letters, digits, and whitespace are significant.
-g  General numeric value: compare fields as floating-point numbers. Only the GNU version provides this option.
-f  Fold letters of different case together, that is, ignore case.
-i  Ignore nonprintable characters.
-k  Define the sort key field.
-m  Merge already-sorted input files into a sorted output stream.
-n  Compare fields as integer numbers.
-o outfile  Write output to the specified file.
-r  Reverse the sort order: largest to smallest instead of the default smallest to largest.
-t char  Use the single character char as the field delimiter instead of the default whitespace.
-u  Unique records only: discard all but the first of the records that have equal key values.
In addition, the sort key field types, that is, the modifier letters that may follow a field given to -k:
b  Ignore leading whitespace.
d  Dictionary order.
f  Fold letters case-insensitively.
g  Compare as general floating-point numbers; GNU version only.
i  Ignore nonprintable characters.
n  Compare as integer numbers.
r  Reverse the sort order.
Fields, and characters within fields, are numbered starting from 1. If you specify only one field number, the sort key starts at the beginning of that field and continues to the end of the record (not the end of the field).
If you give a pair of field numbers separated by a comma, the sort key starts at the beginning of the first field and ends at the end of the second. Multiple -k options may be given; the first is the primary key, and later ones are used to break ties.


$ sort -t: -k1,1 /etc/passwd       # sort by username
$ sort -t: -k3nr /etc/passwd       # sort by UID, in reverse
$ sort -t: -k4n -k3n /etc/passwd   # sort by GID, then by UID
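The key options can be tried without touching the real /etc/passwd. The sketch below uses three made-up records in passwd format (the names, UIDs, and GIDs are assumptions for illustration) and extracts just the username column so the resulting order is easy to see.

```shell
# Hypothetical passwd-style records: name:password:UID:GID:comment:home:shell
records='carol:x:1002:100:Carol:/home/carol:/bin/sh
alice:x:1000:200:Alice:/home/alice:/bin/sh
bob:x:1001:100:Bob:/home/bob:/bin/sh'

# Sort by username (field 1 only).
by_name=$(printf '%s\n' "$records" | sort -t: -k1,1 | cut -d: -f1)

# Sort numerically by GID (field 4); ties are broken numerically by UID (field 3).
by_gid_uid=$(printf '%s\n' "$records" | sort -t: -k4n -k3n | cut -d: -f1)

printf '%s\n' "$by_name" "$by_gid_uid"
```

bob and carol share GID 100, so the second key decides their order; alice, with GID 200, comes last.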

As for the efficiency of sort: readers who have studied algorithms will know how efficient current sorting algorithms are, and sort is nothing special in that respect. My guess is that, like the STL, it combines several sorting algorithms and is optimized as far as possible. Readers who have not studied algorithms need not worry about the details; it is safe to trust that it is efficient.

Sometimes we also care about whether the sort is stable. By default, sort is not guaranteed to be stable, but the GNU implementation in the coreutils package makes up for this: the --stable option solves the problem. (For those unfamiliar with stability, put simply: records whose sort keys are equal are output in the same order as they were input, so sorting does not disturb the input order among them.)
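A small sketch of what stability means in practice. It uses -s, the short form of --stable in GNU sort; this is an extension and not part of POSIX, so treat its availability as an assumption about your system. The data is made up so that the input order within each key is deliberately not the order a whole-line comparison would produce.

```shell
# Hypothetical records: the first field is the key, the second marks the input order.
data='b:3
a:4
b:1
a:2'

# Sort on field 1 only; with -s, records with equal keys keep their input order.
stable=$(printf '%s\n' "$data" | sort -s -t: -k1,1)
printf '%s\n' "$stable"
```

The two "a" records stay in the order 4, 2 and the two "b" records in the order 3, 1, exactly as they arrived.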

Sometimes we also need to deal with duplicates in the input. sort -u handles part of the problem, but it eliminates records based on matching keys, not on matching whole records. The uniq command offers another way to filter data: it is commonly used in a pipeline to remove duplicate records from data that has already been sorted with sort. uniq has three handy options: -c prefixes each output line with the number of times it occurred, -d shows only lines that are duplicated, and -u shows only lines that are not duplicated. One thing to note: uniq compares adjacent lines only, so the data must be sorted before uniq processes it!
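The three uniq options can be compared side by side on a short list. The fruit names below are invented sample data; the input is sorted first, as uniq requires.

```shell
# Hypothetical input with repeated lines.
fruits='pear
apple
pear
fig
apple
pear'

sorted=$(printf '%s\n' "$fruits" | sort)
dups=$(printf '%s\n' "$sorted" | uniq -d)      # lines that occur more than once
singles=$(printf '%s\n' "$sorted" | uniq -u)   # lines that occur exactly once
printf '%s\n' "$sorted" | uniq -c              # each distinct line preceded by its count

echo "duplicated: $dups"
echo "unique: $singles"
```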

In addition, when dealing with large amounts of such text, we may need to reformat paragraphs to make them easier to use or read. The fmt command does this. It has two common options: -s splits long lines only and does not join short ones; -w n sets the output line width to n characters (the default is around 75). fmt is not equally portable everywhere, so consult your system's documentation before relying on it.

Next is the wc command, which counts lines, words, and characters: -c counts bytes, -l counts lines, and -w counts words. By default, wc reports the line, word, and byte counts.
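A quick sketch of the three counts on a two-line sample (the text itself is made up). Some wc implementations pad their output with spaces, so the sketch strips them before using the numbers.

```shell
# Hypothetical two-line sample text.
text='the quick brown
fox jumps'

# printf adds the final newline; tr removes any padding that wc may emit.
lines=$(printf '%s\n' "$text" | wc -l | tr -d ' ')
words=$(printf '%s\n' "$text" | wc -w | tr -d ' ')
bytes=$(printf '%s\n' "$text" | wc -c | tr -d ' ')
echo "lines=$lines words=$words bytes=$bytes"
```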

With so much text, we may well want to print it. Printing support on UNIX comes as two different families of commands with equivalent functionality. Commercial UNIX systems and GNU/Linux typically support both, BSD systems support only the Berkeley style, and POSIX defines only the lp command.

Berkeley    System V    Purpose
lpr         lp          Send files to the print queue
lprm        cancel      Remove files from the print queue
lpq         lpstat      Report queue status

Examples of the two sets of commands. First, the Berkeley style:

$ lpr -Plcb102          # send a PostScript file to print queue lcb102
$ lpq -Plcb102          # check the print queue status
$ lprm -Plcb102 81352   # stop that job! Remove it from the queue

And then the System V style:

$ lp -d lcb102          # send a PostScript file to print queue lcb102
$ lpstat -t lcb102      # check the print queue
$ cancel lcb102-81355   # cancel that job

Sometimes you want page numbers or timestamps on what you print; you can use pr to preprocess the data before printing.
Syntax: pr [options] [file(s)]
Main options:
-cn  Produce n-column output; may be abbreviated to -n.
-f  Prefix the header of each page after the first with an ASCII formfeed character (this option is -F on some systems).
-h althdr  Use the string althdr in the page header in place of the filename.
-ln  Produce pages of n lines.
-on  Offset the output by n spaces.
-t  Do not print page headers.
-wn  Produce lines of at most n characters. For single-column output, long lines are wrapped onto additional lines as needed; for multicolumn output, long lines are truncated to fit.
Example:
pr -f -l60 -o10 -w65 file(s) | lp

There are other printing tools as well. The coverage here is fairly brief; look up further documentation if you need it.

Chapter 5: The Magic of Pipelines
In Linux, most administrative files are text files that can be edited directly, and most of them live in the standard directory /etc. Most of the shell scripts we write process text, and pipelines let us chain commands together in the form ... | ... | .... The book gives an example that processes the passwd file with a pipeline of five stages, then a script that converts text into an HTML file, then another script that helps with word puzzles by matching regular expressions, and then a pipeline that computes the frequency of the words in Shakespeare's works, and so on. I won't go on about the magic of pipelines.

Chapter 6: Variables, Making Decisions, and Repeating Actions
Two similar commands manage variables. readonly makes variables read-only, turning them into symbolic constants. export modifies or prints the environment. Both accept a -p option, which prints the command name together with the names and values of all exported (or read-only) variables, in a form that the shell can reread to re-create the environment (or the read-only settings).

export -p displays all current environment variables. To remove variables from a program's environment, or to change environment variable values temporarily, use the env command:
env -i PATH=$PATH HOME=$HOME LC_ALL=C ...
The -i option initializes the environment: it discards any inherited values and passes only the variables named on the command line to the program.
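The effect of env -i can be checked with a throwaway variable. In this sketch DEMO_SECRET is a hypothetical name invented for the demonstration; PATH is passed through explicitly so that the child sh can still be found.

```shell
# DEMO_SECRET is a hypothetical variable exported only for this illustration.
DEMO_SECRET=hunter2
export DEMO_SECRET

# A child process normally inherits the exported variable...
inherited=$(sh -c 'echo "${DEMO_SECRET:-unset}"')

# ...but env -i starts from an empty environment and passes only what is listed.
cleaned=$(env -i PATH="$PATH" sh -c 'echo "${DEMO_SECRET:-unset}"')

echo "inherited=$inherited cleaned=$cleaned"
```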

The unset command removes variables and functions from the running shell. By default it unsets variables, which can also be requested explicitly with -v:
unset full_name               # remove the full_name variable
unset -v first middle last    # remove several variables
unset -f full_function        # remove a function
I tried deleting a readonly variable with unset and found that it cannot be removed. After looking into it, I found that a read-only declaration cannot be changed, and that includes deletion; it lasts until the current shell exits.
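The experiment described above can be reproduced with a short sketch. All names here (full_name, greet, demo_const) are made up. The attempt to unset the read-only variable is run in a subshell, because in some shells a failing unset aborts a non-interactive script.

```shell
full_name='Ada Lovelace'        # hypothetical variable
greet () { echo hello; }        # hypothetical function

unset -v full_name              # remove the variable
unset -f greet                  # remove the function
after_unset=${full_name-gone}   # expands to "gone" because full_name no longer exists

readonly demo_const=42          # hypothetical read-only variable
# Try the unset in a subshell so that a failure cannot abort the current shell.
if ( unset -v demo_const ) 2>/dev/null
then ro_result=removed
else ro_result=refused
fi
echo "$after_unset $ro_result $demo_const"
```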

Sometimes you want to output a variable immediately followed by other characters. Put curly braces around the variable name, for example:
echo _${myvar}_    # prints the value of myvar with an underscore before and after
This is called parameter expansion. If the variable is undefined, it expands to the null (empty) string.
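Why the braces matter becomes clear when they are left out. In this sketch (the value report is arbitrary, and myvar_ is assumed to be unset) the shell reads the unbraced form as a reference to a different variable named myvar_.

```shell
myvar=report                  # hypothetical value

with_braces="_${myvar}_"      # braces end the name at "myvar"
without_braces="_$myvar_"     # parsed as $myvar_, an unset variable that expands to nothing

echo "$with_braces"           # _report_
echo "$without_braces"        # _
```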

There are also substitution operators:
${varname:-word}     # If varname exists and is not null, return its value; otherwise return word.
${varname:=word}     # If varname exists and is not null, return its value; otherwise set it to word and then return its value.
${varname:?message}  # If varname exists and is not null, return its value; otherwise print varname: message and abort the current command or script. If message is omitted, the default message "parameter null or not set" is used.
${varname:+word}     # If varname exists and is not null, return word; otherwise return null.

The colon (:) in each of these operators is optional. If the colon is omitted, "exists and is not null" in each definition becomes simply "exists"; that is, the operator tests only whether the variable exists.
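A minimal sketch that exercises each operator, including the difference the colon makes. The variable names are invented for the example. Because ${var:?message} aborts a non-interactive shell, it is tried inside a subshell and only its status and message are kept.

```shell
unset -v count                      # hypothetical variable, starts out undefined
a=${count:-0}                       # "0"; count itself stays unset
b=${count:=5}                       # "5"; count is now set to 5
c=${count:+defined}                 # "defined", because count is set and not null

empty=''
d=${empty:-fallback}                # "fallback": with the colon, null counts as missing
e=${empty-fallback}                 # "": without the colon, only existence is tested

# ${var:?message} aborts the shell, so run it in a subshell and capture the result.
status=0
msg=$( ( unset -v nosuch; : "${nosuch:?is required}" ) 2>&1 ) || status=$?
echo "a=$a b=$b c=$c d=$d e=$e status=$status"
```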

There are also pattern-matching operators:
${variable#pattern}   # If the pattern matches the beginning of the variable's value, delete the shortest matching part and return the rest.
${variable##pattern}  # If the pattern matches the beginning of the variable's value, delete the longest matching part and return the rest.
${variable%pattern}   # If the pattern matches the end of the variable's value, delete the shortest matching part and return the rest.
${variable%%pattern}  # If the pattern matches the end of the variable's value, delete the longest matching part and return the rest.
Finally, POSIX standardizes the string-length operator: ${#variable} returns the length in characters of the value of $variable.
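These operators are easiest to remember when applied to a file path. The path below is hypothetical; the sketch shows the shortest and longest matches from each end, plus the length operator.

```shell
p=/home/user/docs/report.tar.gz   # hypothetical path

first_dir_removed=${p#/*/}    # shortest match of /*/ at the start: user/docs/report.tar.gz
basename_only=${p##/*/}       # longest match at the start:         report.tar.gz
last_ext_removed=${p%.*}      # shortest match of .* at the end:    /home/user/docs/report.tar
all_ext_removed=${p%%.*}      # longest match at the end:           /home/user/docs/report
length=${#p}                  # number of characters in the value

printf '%s\n' "$first_dir_removed" "$basename_only" \
    "$last_ext_removed" "$all_ext_removed" "$length"
```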

With what we have learned so far, we can combine these operators with the positional parameters used earlier to make scripts more fault-tolerant, for example: filename=${1:-/dev/tty}    # use /dev/tty if argument 1 is missing or empty
We have not yet described how to get the total number of arguments passed; that is what $# is for. For example:

while [ $# != 0 ]
do
    case $1 in
    ...          # process the first argument
    esac
    shift        # remove the first argument
done
There are also $* and $@, which represent all the command-line arguments at once. They can be used to pass the command-line arguments on to a program run by a script or function.
"$*" treats all the command-line arguments as a single string, equivalent to "$1 $2 ...". The first character of $IFS is used as the separator between the values when the string is built.
"$@" treats each command-line argument as a separate string, equivalent to "$1" "$2" .... This is the best way to pass arguments on to another program, because it preserves any whitespace embedded in each argument.

The shift command "lops off" positional parameters from the list, starting at the left. When shift executes, the original value of $1 is lost forever and is replaced by the old value of $2; $2 takes the old value of $3, and so on, and the value of $# decreases by one each time. These are all worth experimenting with; I won't go into more detail.
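The difference between the two quoted forms shows up as soon as an argument contains a space. The helper names in this sketch (count_args, forward_star, forward_at, join_colon) are all invented for illustration.

```shell
# count_args is a hypothetical helper that reports how many arguments it received.
count_args () { echo $#; }

forward_star () { count_args "$*"; }    # everything arrives as one argument
forward_at   () { count_args "$@"; }    # one argument per original argument

# "$*" joins with the first character of IFS; the subshell body keeps the change local.
join_colon () ( IFS=:; echo "$*" )

forward_star 'hello world' bye    # prints 1
forward_at   'hello world' bye    # prints 2
join_colon a b c                  # prints a:b:c
```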

Similarly, there are many special variables (each is referenced by prefixing its name with $):
#            Number of arguments given to the current process.
@            Command-line arguments to the current process. Inside double quotes, expands to individual arguments.
*            Command-line arguments to the current process. Inside double quotes, expands to a single argument.
- (hyphen)   Options given to the shell on invocation.
?            Exit status of the previous command.
$            Process ID of the shell process.
0 (zero)     The name of the shell program.
!            Process ID of the most recent background command.
ENV          Used only by interactive shells at startup; the value of $ENV is parameter-expanded. The result should be the full pathname of a file to read and execute at startup.
HOME         Home (login) directory.
IFS          Internal field separator (think of awk's field separator).
LANG         Default name of the current locale; the other LC_* variables override it.
LC_ALL       Name of the current locale; overrides LANG and the other LC_* variables.
LC_COLLATE   Name of the current locale used for collating (sorting) characters.
LC_CTYPE     Name of the current locale used to determine character classes during pattern matching.
LC_MESSAGES  Name of the current language for output messages.
LINENO       Line number, in the script or function, of the line that just ran.
NLSPATH      Location of the message catalogs for messages in the language given by $LC_MESSAGES (XSI).
PATH         Search path for commands.
PPID         Process ID of the parent process.
PS1          Primary command prompt string. Default is "$ ".
PS2          Prompt string for line continuations. Default is "> ".
PS4          Prompt string for execution tracing with set -x. Default is "+ ".
PWD          Current working directory.

The shell's arithmetic operators are essentially the same as those of C. To try one directly on the command line, wrap the expression in $((...)): echo $((3&4)).
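A few more expressions in the same style, as a sketch of how closely the operators follow C. The variable i and its value are arbitrary.

```shell
and=$((3 & 4))              # bitwise AND: binary 011 & 100 is 0
or=$((3 | 4))               # bitwise OR: 7
quot=$((7 / 2))             # integer division truncates: 3
rem=$((7 % 2))              # remainder: 1
shl=$((1 << 4))             # left shift: 16

i=5
expr_val=$(( (i + 1) * 2 ))       # variables need no $ inside $((...)): 12
in_range=$((i > 3 && i < 10))     # comparisons yield 1 for true, 0 for false: 1

echo "$and $or $quot $rem $shl $expr_val $in_range"
```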

One thing to know: every command, whether a shell built-in, a shell function, or an external program, returns a small integer value to the program that invoked it when it exits. This is the familiar exit status. There are several ways to use a program's exit status when running it under the shell. By convention, an exit status of 0 means success and any other status means failure. Try making ls fail, for example by listing a file that does not exist, and look at the status it returns (use the special variable $? described above to see the exit status of the previous command).

POSIX exit statuses:
0      Command exited successfully.
>0     Failure during redirection or word expansion (tilde, variable, command, and arithmetic expansions, as well as word splitting).
1-125  Command exited unsuccessfully. The meanings of particular exit values are defined by each individual command.
126    Command found, but the file was not executable.
127    Command not found.
>128   Command died because it received a signal.

Curiously, POSIX leaves exit status 128 unspecified, apart from requiring that it represent some sort of failure. Because only the low-order 8 bits are returned to the parent process, an exit status greater than 255 is replaced by the remainder of that value divided by 256. The command for returning a value is: exit exit-value.
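Several rows of the table can be observed directly. In this sketch each case runs in a child sh so that its status can be captured without ending the current shell; no_such_command_xyz is a deliberately nonexistent name, and the wrap-around of 300 to 44 is what common shells such as bash and dash do.

```shell
ok=0 notfound=0 wrapped=0 killed=0

sh -c 'exit 0'                          || ok=$?        # success: stays 0
sh -c 'no_such_command_xyz' 2>/dev/null || notfound=$?  # 127: command not found
sh -c 'exit 300'                        || wrapped=$?   # only the low 8 bits survive: 300 % 256 = 44
sh -c 'kill -TERM $$'                   || killed=$?    # died from SIGTERM (15): 128 + 15 = 143

echo "$ok $notfound $wrapped $killed"
```

The parent shell may print a "Terminated" notice for the last case; that goes to standard error and does not affect the captured values.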

As for decisions, the if-then-elif-else-fi statement needs little explanation; here is the syntax:

if pipeline [ pipeline ... ]
then
    statements-if-true-1
[ elif pipeline [ pipeline ... ]
then
    statements-if-true-2
... ]
[ else
    statements-if-all-else-fails ]
fi

In an if condition you can use !, &&, and ||, the same logical operators as in C.
Next is the test command, which tests conditions in shell scripts and returns the result as its exit status. It has a second form, [ ... ]. Note that the square brackets are typed literally and must be separated from the enclosed expression by whitespace. For example, test "$str1" = "$str2" is equivalent to [ "$str1" = "$str2" ]. test has a great many options (it nearly uses up all 26 letters!); see its man page. Here is an improved version of the earlier finduser script:

#! /bin/sh
# finduser --- see whether the user named by the first argument is logged in
if [ $# -ne 1 ]
then
    echo Usage: finduser username >&2
    exit 1
fi
who | grep "$1"
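Since the output of who depends on the machine, here is a separate self-contained sketch of test in both of its forms, covering a string comparison, numeric comparisons, and a file test. The variable names and values are arbitrary.

```shell
str1=abc; str2=abc      # hypothetical values

# The two forms are the same command.
if test "$str1" = "$str2"; then same=yes; else same=no; fi
if [ "$str1" = "$str2" ]; then same_brackets=yes; else same_brackets=no; fi

# Numeric operators (-ne, -lt, ...) combined with the shell's && operator.
n=3
if [ "$n" -ne 1 ] && [ "$n" -lt 10 ]; then range=ok; else range=bad; fi

# File tests: / is a directory, not a regular file.
if [ -d / ] && ! [ -f / ]; then kind=directory; else kind=other; fi

echo "$same $same_brackets $range $kind"
```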

The case statement is very similar to the one in C, so an example will do:

case $1 in                 # test $1
-f)
    ...                    # code for the -f option
    ;;                     # similar to break
-d | --directory)          # long options are supported
    ...
    ;;
*)                         # default when nothing above matches; optional
    echo $1: unknown option >&2
    exit 1
    ;;                     # also optional
esac
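To see the pattern alternatives in action, here is a runnable variant wrapped in a function. describe_option is a hypothetical helper, and it uses return instead of exit so that trying it out does not end the shell.

```shell
# describe_option is a hypothetical helper that classifies a single argument.
describe_option () {
    case $1 in
    -f)
        echo file ;;
    -d | --directory)          # alternatives are separated by |
        echo directory ;;
    -*)                        # any other option-like argument
        echo "$1: unknown option" >&2
        return 1 ;;
    *)                         # default: everything else
        echo operand ;;
    esac
}

describe_option --directory    # prints directory
describe_option notes.txt      # prints operand
```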

For the for loop, here is an example:

for i in atlbrochure*.xml
do
    echo $i
    mv $i $i.old
    sed 's/Atlanta/&, the South/' < $i.old > $i
done

This loop backs up each original file under a name with the extension .old, and then uses sed to process that backup to create the new file. It also prints each filename as a progress indicator. In addition, the in list of a for loop is optional; if it is omitted, the loop iterates over all the command-line arguments, as if you had written for i in "$@".
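The form without an in list can be sketched as follows. show_args is a hypothetical helper; inside a function, the loop walks the function's own arguments, and each one stays intact even if it contains spaces.

```shell
# show_args is a hypothetical helper; "for arg" with no "in" list iterates over "$@".
show_args () {
    for arg
    do
        echo "[$arg]"
    done
}

show_args 'two words' single
```

The call prints [two words] and [single] on separate lines.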

The while and until loops are similar. Their syntax is:

while condition; do statements; done

until condition; do statements; done

The difference between the two is how they treat the exit status of condition: a while loop keeps running as long as condition succeeds, and an until loop keeps running as long as condition fails.
In all of these loops you can also use break and continue, which work the same way as in C.
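A short sketch that uses both loop types along with break and continue. The numbers are arbitrary: the first loop sums the odd numbers below 10, and the second counts down from 3.

```shell
# Sum the odd numbers below 10, using while, continue, and break.
i=0 sum=0
while true
do
    i=$((i + 1))
    [ "$i" -ge 10 ] && break            # leave the loop entirely
    [ $((i % 2)) -eq 0 ] && continue    # skip even numbers
    sum=$((sum + i))
done

# until keeps looping while its condition fails: count down to zero.
n=3 countdown=
until [ "$n" -eq 0 ]
do
    countdown="$countdown$n "
    n=$((n - 1))
done

echo "sum=$sum countdown=$countdown"
```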

The shift command mentioned earlier also accepts an optional argument: the number of positions to shift.

The getopts command simplifies option processing. It understands the POSIX conventions that allow several option letters to be grouped together, and it is used to walk through the command-line arguments one at a time. It strips the leading hyphen from each option and recognizes -- as the end of the options. If it encounters an invalid option letter, it sets the variable to a question mark (?).
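A minimal sketch of a getopts loop follows. parse_opts and its options are assumptions made for the example: -v is a plain flag and -o takes an argument, which is what the colon after o in the option string means.

```shell
# parse_opts is a hypothetical helper: -v is a flag, -o takes an argument.
parse_opts () {
    verbose=no outfile=
    OPTIND=1                       # reset so the function can be called repeatedly
    while getopts vo: opt
    do
        case $opt in
        v)  verbose=yes ;;
        o)  outfile=$OPTARG ;;     # the option's argument arrives in OPTARG
        '?') return 1 ;;           # invalid option; getopts has printed a diagnostic
        esac
    done
    shift $((OPTIND - 1))          # discard the options that were processed
    echo "verbose=$verbose outfile=$outfile rest=$*"
}

parse_opts -v -o out.txt file1 file2
```

After the loop, OPTIND is the index of the first non-option argument, which is why shifting by OPTIND - 1 leaves only the operands.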

Functions in a shell script are generally defined at the start of the program, or kept in a separate file and pulled in (sourced) with the dot (.) command. A simple example:

# wait_for_user --- wait until the named user logs in
# usage: wait_for_user user [ sleeptime ]
wait_for_user () {
    until who | grep "$1" > /dev/null
    do
        sleep ${2:-30}
    done
}

Call it directly as wait_for_user admin; it also accepts an optional second argument giving the wait time. Inside a shell function, return works the way exit does for a script, handing a value back to the caller. Note, however, that using exit inside a shell function terminates the entire shell script.
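The difference between return and exit can be checked safely by running the exit case in a subshell. Both functions in this sketch (is_even, bail_out) are hypothetical.

```shell
# is_even is a hypothetical helper: return sets the function's exit status.
is_even () {
    [ $(( $1 % 2 )) -eq 0 ] && return 0
    return 1
}

# bail_out calls exit, which ends the whole shell that runs it, not just the function.
bail_out () {
    exit 3
    echo "never reached"
}

if is_even 4; then echo "4 is even"; fi

status=0
( bail_out; echo "not printed" ) || status=$?   # the subshell absorbs the exit
echo "subshell exited with $status"
```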
