Shell programming: the awk command explained
awk programming
awk is a programming language. gawk is the GNU implementation, and many current Linux distributions use it.
On those systems, awk is a symbolic link to gawk.
How awk works
BEGIN # Runs once, before the first input record is read.
The main input loop runs once for each input record, until the input is exhausted.
END # Runs once, after the last input record has been read.
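The three phases can be seen in a minimal sketch. The three input lines piped in with printf are made up for illustration and stand in for a real file.

```shell
# Three hypothetical input lines stand in for a real file.
out=$(printf 'a\nb\nc\n' | awk '
BEGIN { print "start" }          # runs once, before any input is read
      { n++; print n ": " $0 }   # main loop: runs once per input record
END   { print "records: " n }    # runs once, after the last record
')
echo "$out"
```

The BEGIN line is printed first, then one numbered line per record, and the END line last.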
Three ways to call awk
1. Enter the awk program on the shell command line.
# awk [-F field-separator] 'awk cmd' file
2. Put the awk program in a script file, then call it with the -f option.
# awk -f awk.sh file
3. Call the awk script directly. The script must be executable and start with an interpreter line such as #!/bin/awk -f.
# ./awk.sh file
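As a rough sketch of the second method, the snippet below writes a one-line awk program to a temporary file and runs it with -f. The temporary file and the two input lines are assumptions made for the example.

```shell
prog=$(mktemp)                              # hypothetical temporary script file
printf '{ print NR ": " $0 }\n' > "$prog"   # the awk program, stored in the file
out=$(printf 'x\ny\n' | awk -f "$prog")     # -f reads the program from the file
rm -f "$prog"
echo "$out"
```

Keeping the program in a file avoids shell quoting problems once the program grows beyond one line.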
============================================================================
awk pattern matching
Every awk statement is made up of a pattern and an action.
A pattern is a rule that tests whether an input line should trigger an action (the pattern decides when the action runs).
An action is the procedure that is executed. It consists of statements, functions, and expressions.
For every blank line in the file test, print "This is an empty line !!!".
# awk '/^$/ {print "This is an empty line !!!"}' test
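Either half of a statement can be left out. The sketch below, on made-up input, shows a rule with both a pattern and an action next to a rule with a pattern only, where awk falls back to printing the record.

```shell
out=$(printf 'one\n\ntwo\n\n' | awk '
/^$/    { print "empty line at record " NR }   # pattern and action
NR == 1                                        # pattern only: the default action prints the record
')
echo "$out"
```

A rule with an action but no pattern, such as {print $0}, runs for every record.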
============================================================================
Records and fields
awk treats each line of an input file as a record and each string within the line as a field. The characters that separate fields are called field separators.
# vim student
Li Hao njue 025-83481010
Zhang Ju nju 025-83466534
Wang Bin seu 025-83494883
Zhu Lin njupt 025-83680010
Print fields 2, 1, 4, and 3 of the student file, in that order.
# awk '{print $2, $1, $4, $3}' student
Print all fields in student.
# awk '{print $0}' student
Select the third field with a computed expression.
# awk 'BEGIN {one = 1; two = 2} {print $(one + two)}' student
Print the third field of student using the tab character as the field separator (this assumes the fields in the file are separated by tabs).
# awk 'BEGIN {FS = "\t"} {print $3}' student
Now separate the fields in student with commas:
Li Hao,njue,025-83481010
Zhang Ju,nju,025-83466534
Wang Bin,seu,025-83494883
Zhu Lin,njupt,025-83680010
Print the whole record using the comma as the field separator.
# awk 'BEGIN {FS = ","} {print $0}' student
Print fields 1 and 3 using the comma as the field separator.
# awk 'BEGIN {FS = ","} {print $1, $3}' student
The file xx contains two lines:
abc		d # two tabs in the middle
abc	d # one tab in the middle
Print the first field and d (two methods).
# awk 'BEGIN {FS = "\t"} {print $1, $2}' xx    # d is $2 on the one-tab line
# awk 'BEGIN {FS = "\t"} {print $1, $3}' xx    # d is $3 on the two-tab line, because $2 is empty
Print the first field and d on both lines by treating a run of tabs as a single separator.
# awk 'BEGIN {FS = "\t+"} {print $1, $2}' xx
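The effect of consecutive tabs can be checked with a small sketch. The two input lines mirror the xx file above and are supplied with printf, so no file is needed.

```shell
out=$(printf 'abc\t\td\nabc\td\n' | awk '
BEGIN { FS = "\t" }    # every single tab separates two fields
{ print NF, $NF }      # NF is the field count, $NF the last field
')
echo "$out"
```

The first line has three fields because the two adjacent tabs enclose an empty field. Using $NF reaches d on both lines without knowing the field count in advance.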
============================================================================
Relational operators
<     # less than
>     # greater than
<=    # less than or equal to
>=    # greater than or equal to
==    # equal to
!=    # not equal to
~     # matches the regular expression
!~    # does not match the regular expression
Print the lines of /etc/passwd whose first field matches root.
# awk 'BEGIN {FS = ":"} $1 ~ /root/' /etc/passwd
Print the lines of /etc/passwd that match root anywhere in the record.
# awk 'BEGIN {FS = ":"} $0 ~ /root/' /etc/passwd
Print the lines of /etc/passwd that do not match nologin anywhere in the record.
# awk 'BEGIN {FS = ":"} $0 !~ /nologin/' /etc/passwd
Print the lines of /etc/passwd in which the UID is smaller than the GID.
# awk 'BEGIN {FS = ":"} {if ($3 < $4) print $0}' /etc/passwd
Print the lines of /etc/passwd in which the UID is greater than or equal to the GID.
# awk 'BEGIN {FS = ":"} {if ($3 >= $4) print $0}' /etc/passwd
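Because /etc/passwd differs from system to system, the following sketch runs the same kinds of tests on a small made-up sample in passwd format. The three accounts are assumptions for illustration only.

```shell
# Hypothetical sample in /etc/passwd format; the real file is not read.
sample='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin'

# Numeric comparison: user names whose UID is smaller than their GID.
lt=$(printf '%s\n' "$sample" | awk -F: '$3 < $4 { print $1 }')

# Negated regex match: user names whose shell does not contain nologin.
nm=$(printf '%s\n' "$sample" | awk -F: '$7 !~ /nologin/ { print $1 }')

echo "$lt"
echo "$nm"
```

Here the test is written as a bare pattern in front of the action, which is equivalent to the if statement used above.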
============================================================================
Boolean operators
||    # logical OR
&&    # logical AND
!     # logical NOT
Print the lines of /etc/passwd whose UID is 10 or whose GID is 10.
# awk 'BEGIN {FS = ":"} {if ($3 == 10 || $4 == 10) print $0}' /etc/passwd
Print the lines of /etc/passwd whose UID equals the GID and whose login shell is /sbin/nologin.
# awk 'BEGIN {FS = ":"; OFS = ":"} {if ($3 == $4 && $7 == "/sbin/nologin") print $0}' /etc/passwd
Print the lines of /etc/passwd whose UID or GID contains 10 (a regular expression match, so 100 and 210 also qualify).
# awk 'BEGIN {FS = ":"} {if ($3 ~ 10 || $4 ~ 10) print $0}' /etc/passwd
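The difference between == and ~ is easy to miss, so here is a small sketch on invented data. The four records and their field layout (name, password, UID, GID) are assumptions for the example.

```shell
# Hypothetical records: name:password:UID:GID
sample='a:x:10:5
b:x:100:7
c:x:3:10
d:x:3:4'

# Exact comparison: only a UID or GID that is exactly 10.
eq=$(printf '%s\n' "$sample" | awk -F: '$3 == 10 || $4 == 10 { print $1 }')

# Regex match: any UID or GID that contains the text 10.
re=$(printf '%s\n' "$sample" | awk -F: '$3 ~ /10/ || $4 ~ /10/ { print $1 }')

echo "$eq"
echo "$re"
```

Record b has UID 100, so it passes the regular expression test but not the equality test.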
============================================================================
Expressions
awk expressions are used to store, manipulate, and retrieve data. An expression can be built from numeric values, string constants, variables, operators, functions, and regular expressions.
+         # addition
-         # subtraction
*         # multiplication
/         # division
%         # modulo
^ or **   # exponentiation
++x       # add 1 to x, then return the value of x
x++       # return the value of x, then add 1 to x
Use x += 1 to number the empty lines in the file test.
# awk '/^$/ {print x += 1}' test
Number the empty lines in test starting from 0 and adding 1 for each further empty line.
# awk '/^$/ {print x++}' test
Number the empty lines in test starting from 1 and adding 1 for each further empty line.
# awk '/^$/ {print ++x}' test
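A side-by-side sketch makes the difference between the two increment forms visible. The input is three empty lines generated with printf, and x and y are unset variables that awk treats as 0.

```shell
# Three empty input lines; x and y start out unset, which awk treats as 0.
out=$(printf '\n\n\n' | awk '/^$/ { print x++, ++y }')
echo "$out"
```

The first column starts at 0 because x++ returns the old value. The second starts at 1 because ++y increments first.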
Example
Li hao,njue,025-83481010,85,92,78,94,88
Zhang Ju,nju,025-83796534,89,90,75,90,86
Wang Bin,seu,025-83494883,84,88,80,92,84
Zhu Lin,njupt,025-83790010,98,78,81,87,76
Calculate the average score of each student in the file student.
# awk 'BEGIN {FS = ","} {total = $4 + $5 + $6 + $7 + $8} {avg = total / 5} {print $1, avg}' student
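The same calculation can be tried without creating the file. This sketch pipes in the first two records from the example above and loops over fields 4 to NF instead of naming each score, which is a design choice of the sketch rather than part of the original command.

```shell
out=$(printf '%s\n' \
    'Li hao,njue,025-83481010,85,92,78,94,88' \
    'Zhang Ju,nju,025-83796534,89,90,75,90,86' |
awk 'BEGIN { FS = "," }
{
    total = 0
    for (i = 4; i <= NF; i++) total += $i   # sum the five score fields
    print $1, total / (NF - 3)              # NF - 3 is the number of scores
}')
echo "$out"
```

The loop keeps working if a score column is added or removed, as long as the scores start at field 4.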
============================================================================
System variables
awk defines many built-in variables that control its environment. We call them system variables. They fall into two groups:
1. Variables that change awk's default behavior, such as the field separator.
2. Variables that report system values, which can be read while a file is processed, such as the number of fields in the record, the current record number, and the current file name. awk updates the variables in this second group automatically.
awk built-in variables and their meanings
$n # nth field of the current record; fields are separated by FS
$0 # the entire current record
ARGC # number of command-line arguments
ARGIND # index in ARGV of the file currently being processed (gawk)
ARGV # array of command-line arguments
CONVFMT # format for converting numbers to strings
ENVIRON # associative array of environment variables
ERRNO # description of the last system error (gawk)
FIELDWIDTHS # list of field widths, separated by spaces (gawk)
FILENAME # name of the current file
FNR # record number within the current file
FS # field separator; the default is a space, which splits on runs of spaces and tabs
IGNORECASE # Boolean variable; if true, matching is case-insensitive (gawk)
NF # number of fields in the current record
NR # number of records read so far
OFMT # output format for numbers
OFS # output field separator; the default is a space
ORS # output record separator; the default is a newline
RLENGTH # length of the string matched by the match function
RS # record separator; the default is a newline
RSTART # start position of the string matched by the match function
SUBSEP # separator for array subscripts; the default is \034
Print the file student with the record number and the number of fields at the start of each line, and print the file name at the end.
# awk 'BEGIN {FS = ","} {print NR, NF, $0} END {print FILENAME}' student
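NR and FNR only differ when awk reads more than one file. The sketch below creates two small throwaway files (the names one and two and their contents are made up) and prints both counters for every record.

```shell
dir=$(mktemp -d)                      # hypothetical scratch directory
printf 'a b\nc\n' > "$dir/one"        # two records
printf 'd e f\n'  > "$dir/two"        # one record
out=$(cd "$dir" && awk '{ print FILENAME, NR, FNR, NF }' one two)
rm -rf "$dir"
echo "$out"
```

NR keeps counting across files, while FNR restarts at 1 when the second file begins.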
============================================================================
Formatted output
awk provides a printf statement modeled on the C language syntax, which lets you specify the output format.
printf(format, arguments)
The printf statement has two parts:
1. The format string, in which each format specification starts with the "%" symbol.
2. The argument list, for example a list of variable names. Each argument corresponds to one format specification and supplies the value to output.
printf modifiers and their meanings
-       # left alignment
width   # minimum width of the field
.prec   # number of digits to the right of the decimal point
printf formats and their meanings
%c # ASCII character
%d # integer
%e # floating-point number, scientific notation
%f # floating-point number
%o # octal number
%s # string
%x # hexadecimal number
Print $2 as a string and $8 as an integer from student, with a tab after $2 and a newline after $8.
# awk 'BEGIN {FS = ","} {printf("%s\t%d\n", $2, $8)}' student
Convert the number 65 to its ASCII character.
# awk 'BEGIN {printf("%c\n", 65)}'
Print 2014 as a floating-point number. By default, six digits follow the decimal point.
# awk 'BEGIN {printf("%f\n", 2014)}'
Print the first and third fields of student, with the first field left-aligned in a 15-character column followed by a tab.
# awk 'BEGIN {FS = ","} {printf("%-15s\t%s\n", $1, $3)}' student
Print the first and third fields of student under a header line that labels the first column Name and the second column Phonenumber.
# awk 'BEGIN {FS = ","; print "Name\t\tPhonenumber"} {printf("%-16s%s\n", $1, $3)}' student
Print a floating-point number in a field at least 10 characters wide with three digits after the decimal point.
# awk 'BEGIN {printf("%10.3f\n", 20141126)}' # output shorter than 10 characters is padded with spaces
20141126.000 is already wider than 10 characters, so it is printed in full without padding.
201.000 is shorter than 10 characters, so it is padded on the left with three spaces.
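Padding is hard to see in plain output, so this sketch wraps each value in square brackets. The brackets and the chosen numbers are only there for illustration.

```shell
out=$(awk 'BEGIN {
    printf("[%10.3f]\n", 201)        # shorter than 10: padded on the left
    printf("[%-10.3f]\n", 201)       # "-" pads on the right instead
    printf("[%10.3f]\n", 20141126)   # longer than 10: printed in full
    printf("[%c]\n", 65)             # character with code 65
}')
echo "$out"
```

The width is a minimum, not a limit: a value that needs more room is never truncated.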
============================================================================
Built-in string functions
awk provides powerful built-in string functions for replacing, searching, and splitting text.
awk string functions and their meanings
gsub(r, s) # replace every match of r with s in the current record
gsub(r, s, t) # replace every match of r with s in t
index(s, t) # return the position of the first occurrence of string t in s
length(s) # return the length of s
match(s, r) # test whether s contains a string matching r; returns the position of the match, or 0
split(r, s, t) # split string r into the array s, using t as the separator
sub(r, s, t) # replace the first match of r in t with s
substr(r, s) # return the suffix of string r that starts at position s
substr(r, s, t) # return the substring of r that starts at position s and is t characters long
In /etc/passwd, replace root in the first field with "My name is root" and print the changed lines.
# awk 'BEGIN {FS = ":"; OFS = ":"} gsub(/root/, "My name is root", $1) {print $0}' /etc/passwd
In /etc/passwd, replace root in all fields with "My name is root" and print the changed lines.
# awk 'BEGIN {FS = ":"; OFS = ":"} gsub(/root/, "My name is root") {print $0}' /etc/passwd
Print the position at which f first appears in the string abcdefg.
# awk 'BEGIN {print index("abcdefg", "f")}'
Print the length of the string "This is a httpd server script".
# awk 'BEGIN {print length("This is a httpd server script")}'
Find the first position of ! in the string "This is a httpd server script !!!".
# awk 'BEGIN {print match("This is a httpd server script !!!", "!")}'
Find the first position of C in the string "This is a httpd server script !!!", ignoring case (IGNORECASE is specific to gawk).
# awk 'BEGIN {IGNORECASE = 1; print match("This is a httpd server script !!!", /C/)}'
In the string "This script is a httpd server script !!!", replace the first occurrence of script with SCRIPT.
# awk 'BEGIN {file = "This script is a httpd server script !!!"; sub(/script/, "SCRIPT", file); printf("%s\n", file)}'
In the student file, for the records whose first field matches Li, replace the first occurrence of 10 with 99.
# awk 'BEGIN {FS = ","} $1 ~ /Li/ {sub(/10/, "99", $0); print $0}' student
Return the suffix of file that starts at the 6th character, where file = "This script is a httpd server script !!!".
# awk 'BEGIN {file = "This script is a httpd server script !!!"; print substr(file, 6)}'
Return the 9 characters of file that start at the 6th character.
# awk 'BEGIN {file = "This script is a httpd server script !!!"; print substr(file, 6, 9)}'
Print the record number and the number of fields in front of each record of the student file, using . as the output separator.
# awk 'BEGIN {FS = ","} {print NR, NF, $0}' OFS="." student
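Several of these functions can be compared on one string in a short sketch. The variable names s, t, u, and parts are chosen for the example. Note that sub and gsub modify their third argument in place and return the number of replacements.

```shell
out=$(awk 'BEGIN {
    s = "This script is a httpd server script !!!"
    print index(s, "httpd")                     # position of the first occurrence
    print substr(s, 6, 6)                       # 6 characters starting at position 6
    t = s; sub(/script/, "SCRIPT", t); print t  # first match only
    u = s; n = gsub(/script/, "SCRIPT", u)      # every match; n is the count
    print n, u
    n = split("a,b,c", parts, ",")              # n is the number of pieces
    print n, parts[2]
}')
echo "$out"
```

Copying s into t and u before substituting keeps the original string available for the later calls.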