Introduction to the awk command
Awk is a powerful text analysis tool. Where grep searches and sed edits, awk is particularly strong at data analysis and report generation. Put simply, awk reads a file line by line, splits each line into fields using whitespace as the default separator, and then analyzes and processes those fields.
Awk comes in three versions: awk, nawk, and gawk. Unless otherwise noted, awk generally refers to gawk, the GNU version of AWK.
Awk takes its name from the initials of the surnames of its creators: Alfred Aho, Peter Weinberger, and Brian Kernighan. AWK is in fact a language of its own, the AWK programming language, which its three creators formally defined as a "pattern scanning and processing language". It lets you write short programs that read input files, sort data, process data, perform calculations on the input, and generate reports, among many other things.
Pattern Processing
cat /etc/passwd|head -5
#result
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
After processing:
cat /etc/passwd|head -5|awk -F ':' 'BEGIN {print "name,shell"} {print $1","$7} END {print "blue,/bin/nosh"}'
#result
name,shell
root,/bin/bash
bin,/sbin/nologin
daemon,/sbin/nologin
adm,/sbin/nologin
lp,/sbin/nologin
blue,/bin/nosh
The awk workflow is as follows. First the BEGIN block is executed. Awk then reads the file one record at a time, where records are delimited by the newline character (\n), splits each record into fields using the specified field separator, and populates the field variables: $0 is the entire record, $1 is the first field, and $n is the nth field. It then runs the action associated with each matching pattern. The next record is then read, and so on until all records have been read, at which point the END block is executed.
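The order of execution described above can be seen in a minimal sketch. The three-line sample input is made up for illustration and stands in for a real file; the variable names sample and workflow_out are likewise only illustrative.

```shell
# Hypothetical sample input: three colon-separated records.
sample='root:x:0
bin:x:1
daemon:x:2'

workflow_out=$(printf '%s\n' "$sample" | awk -F ':' '
# BEGIN runs once, before any input is read.
BEGIN { print "start" }
# A block without a pattern runs once for every record.
      { print NR ": " $1 " (" NF " fields)" }
# END runs once, after the last record has been processed.
END   { print "done, " NR " records" }
')
printf '%s\n' "$workflow_out"
```

The first line printed comes from BEGIN, the next three from the per-record block, and the last from END.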
Pattern Matching
awk 'pattern {action}' filenames
Pattern is what awk looks for in the data, and action is the series of commands executed when a match is found. Curly braces ({}) need not always appear in a program, but when present they group a series of commands to run for a particular pattern. A pattern is typically a regular expression enclosed in slashes.
For example, search for all lines containing the keyword root in /etc/passwd:
awk -F: '/root/' /etc/passwd
#result
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
This is an example of using a pattern. Only lines that match the pattern (root here) trigger the action; since no action is specified, the default is to print each matching line. Regular expressions are supported in the search, for example: awk -F: '/^root/' /etc/passwd
Search for all lines containing the keyword root in /etc/passwd and display the corresponding shell:
awk -F: '/root/{print $7}' /etc/passwd
#result
/bin/bash
/sbin/nologin
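To show how anchoring a regular expression changes which lines match, here is a small sketch that runs on inline sample data rather than the real /etc/passwd. The sample lines and the variable names any_match and anchored are assumptions made for the example.

```shell
# Hypothetical sample data in /etc/passwd format.
sample='root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin'

# /root/ matches "root" anywhere in the line, including the home directory field.
any_match=$(printf '%s\n' "$sample" | awk -F ':' '/root/ { print $1 }')

# /^root/ matches only lines that begin with "root".
anchored=$(printf '%s\n' "$sample" | awk -F ':' '/^root/ { print $1 }')

printf 'unanchored:\n%s\nanchored:\n%s\n' "$any_match" "$anchored"
```

The unanchored pattern also selects operator, because its home directory is /root.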
Awk built-in Variables
Awk has many built-in variables used to set environment information. These variables can be changed. The following lists the most common variables.
ARGC      number of command-line arguments
ARGV      array of command-line arguments
ENVIRON   array of system environment variables
FILENAME  name of the file awk is currently reading
FNR       record number within the current file
FS        input field separator, equivalent to the -F command-line option
NF        number of fields in the current record
NR        number of records read so far
OFS       output field separator
ORS       output record separator
RS        input record separator
For /etc/passwd, print the file name, the line number, the number of columns, and the full content of each line:
awk -F ':' '{print "filename:" FILENAME ",linenumber:" NR ",columns:" NF ",linecontent:"$0}' /etc/passwd
Use printf instead of print to make the code more concise and easy to read
awk -F ':' '{printf("filename:%10s,linenumber:%s,columns:%s,linecontent:%s\n",FILENAME,NR,NF,$0)}' /etc/passwd
Awk provides both print and printf. The arguments to print can be variables, numbers, or strings. Strings must be enclosed in double quotation marks, and arguments must be separated by commas. Without commas, the arguments are concatenated and can no longer be told apart. The printf function is essentially the same as printf in C and formats strings. When the output is complex, printf is easier to use and the code is easier to understand.
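The difference between separating print arguments with commas and simply placing them next to each other can be checked with a short sketch. The input string 'a b' and the variable names are made up for illustration.

```shell
# With commas, print joins its arguments with OFS (a single space by default).
with_commas=$(echo 'a b' | awk '{ print $1, $2 }')

# Without commas, the two fields are concatenated into one string.
no_commas=$(echo 'a b' | awk '{ print $1 $2 }')

# Changing OFS changes what the comma inserts.
custom_ofs=$(echo 'a b' | awk 'BEGIN { OFS = "," } { print $1, $2 }')

# printf gives full control over the layout, as in C.
formatted=$(echo 'a b' | awk '{ printf("%s-%s\n", $1, $2) }')

printf '%s\n' "$with_commas" "$no_commas" "$custom_ofs" "$formatted"
```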
Awk Programming
Variables and assignments
In addition to the built-in variables, awk lets you define your own. The following example counts the accounts in /etc/passwd:
awk '{count++;print $0;} END{print "user count is ", count}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
...
user count is 40
The count is not initialized here. Although the default value is 0, it is recommended to initialize it as 0:
awk 'BEGIN {count=0;print "[start]user count is ", count} {count=count+1;print $0;} END{print "[end]user count is ", count}' /etc/passwd
[start]user count is 0
root:x:0:0:root:/root:/bin/bash
...
[end]user count is 40
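As a quick check of the default value mentioned above, the following sketch prints a variable that was never assigned. The variable name x is arbitrary; an unset awk variable behaves as 0 in numeric context and as the empty string in string context.

```shell
# x is never assigned: x + 0 is 0, x as a string is empty, and its length is 0.
uninit=$(awk 'BEGIN { print x + 0, "[" x "]", length(x) }')
printf '%s\n' "$uninit"
```

Explicit initialization in BEGIN is therefore a matter of readability rather than correctness.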
Count the number of bytes occupied by files in a folder
ls -l |awk 'BEGIN {size=0;} {size=size+$5;} END{print "[end]size is ", size}'
[end]size is 439289
To display the size in megabytes:
ls -l |awk 'BEGIN {size=0;} {size=size+$5;} END{print "[end]size is ", size/1024/1024,"M"}'
[end]size is 0.418939 M
Condition Statement
Conditional statements in awk are borrowed from C. They take the following forms:
if (expression) {
    statement;
    statement;
    ... ...
}

if (expression) {
    statement;
} else {
    statement2;
}

if (expression) {
    statement1;
} else if (expression1) {
    statement2;
} else {
    statement3;
}
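A concrete instance of the if / else if / else form might look like the sketch below. The name:uid input lines are invented, and the threshold of 1000 for regular accounts is only an assumption for the example, since the boundary differs between systems.

```shell
# Hypothetical name:uid pairs.
cond_out=$(printf '%s\n' 'root:0' 'daemon:2' 'alice:1000' | awk -F ':' '{
    if ($2 == 0) {
        kind = "superuser"
    } else if ($2 < 1000) {      # assumed cutoff for system accounts
        kind = "system"
    } else {
        kind = "regular"
    }
    print $1, kind
}')
printf '%s\n' "$cond_out"
```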
Loop statement
Loop statements in awk are also borrowed from C. Awk supports while, do/while, for, break, and continue, and these keywords have the same semantics as in C.
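As an illustration of these keywords, here is a small sketch that walks over the fields of one record with for, skips a value with continue, and then uses while. The input numbers are arbitrary.

```shell
loop_out=$(echo '3 5 8' | awk '{
    sum = 0
    for (i = 1; i <= NF; i++) {    # visit every field of the record
        if ($i == 5) {
            continue               # skip this value and go on to the next field
        }
        sum += $i
    }
    n = 0
    while (n * n < sum) {          # find the smallest n with n*n >= sum
        n++
    }
    print sum, n
}')
printf '%s\n' "$loop_out"
```

Here the fields 3 and 8 are summed to 11, and the while loop stops at n = 4.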
Array
Because array subscripts in awk can be either numbers or strings, a subscript is usually called a key. Keys and values are stored in an internal hash table. Because a hash table does not preserve insertion order, the array contents may not be displayed in the order you expect. Like variables, arrays are created automatically on first use, and awk decides automatically whether a value is stored as a number or a string. In general, arrays in awk are used to collect information from records: computing sums, counting words, tracking how many times a pattern matched, and so on.
Display the accounts in /etc/passwd:
awk -F ':' 'BEGIN {count=0;} {name[count] = $1;count++;}; END{for (i = 0; i < NR; i++) print i, name[i]}' /etc/passwd
0 root
1 bin
2 daemon
3 adm
4 lp
...
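Arrays keyed by strings are just as common as numerically indexed ones. The sketch below counts how many accounts use each shell; the name:shell sample lines are made up, and the output is piped through sort because the order in which for (key in array) visits keys is not defined.

```shell
# Hypothetical name:shell pairs.
sample='root:/bin/bash
bin:/sbin/nologin
daemon:/sbin/nologin'

shell_counts=$(printf '%s\n' "$sample" | awk -F ':' '
{ count[$2]++ }                            # the shell path is the array key
END { for (s in count) print s, count[s] } # traversal order is unspecified
' | sort)
printf '%s\n' "$shell_counts"
```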
Compute the difference of two files (the lines of file2 whose first field does not appear in file1):
awk '{if(NR==FNR){a[$1]=0}if(NR!=FNR && !($1 in a))print}' file1 file2
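The following sketch runs the same idea on two small temporary files so the result can be checked by hand. The file contents and the array name seen are made up for illustration, and the program is written in the shorter pattern-action form, which is one possible restatement and not the only way to write it. NR == FNR is true only while the first file is being read, because FNR restarts at 1 for each file while NR keeps counting.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' a b c   > "$tmpdir/file1"
printf '%s\n' b c d e > "$tmpdir/file2"

diff_out=$(awk '
NR == FNR { seen[$1] = 1; next }   # first file: remember each first field
!($1 in seen)                      # second file: print lines not seen before
' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$diff_out"
rm -r "$tmpdir"
```

Only d and e are printed, since b and c also occur in the first file.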
There is a great deal more to awk programming. Only simple, common usage is covered here. For more, refer to http://www.gnu.org/software/gawk/manual/gawk.html
In shell, what is $0 of the awk command?
Awk processes its input one line at a time, and the statements inside "{}" are executed for each line in 1.txt.
Two terms in awk:
Record (each row of text by default)
Field (by default, it is a string separated by spaces or tabs in each record)
$0 refers to the entire current record, and $1 refers to the first field in the record.
Generally, print $0 is used to print the entire line (no backslash is needed before $0), and print $1 prints only the first field of each line.
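A short sketch on a one-line input shows the difference between $0 and the numbered fields. The input text is arbitrary. The awk program is wrapped in single quotes, which is why the shell leaves $0 and $1 alone and no backslash is needed.

```shell
line='alpha beta gamma'

whole=$(echo "$line" | awk '{ print $0 }')    # the entire record
first=$(echo "$line" | awk '{ print $1 }')    # the first field
last=$(echo "$line"  | awk '{ print $NF }')   # the last field, since NF is the field count

printf '%s\n' "$whole" "$first" "$last"
```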
How to use the Linux awk command?
Awk splits a line into several "fields" for processing. It is suitable for processing small amounts of data.
Usage: awk 'condition1 {action1} condition2 {action2} ...' filename
# last | awk '{print $1 "\t" $3}' <= view login data; only the login name and IP address are displayed, separated by a tab.
Awk built-in Variables
Variable name   Meaning
NF              total number of fields in each line ($0)
NR              the number of the line awk is currently processing
FS              the current field separator; the default is the space character
Logical operators of awk
Operator   Meaning
>          greater than
<          less than
>=         greater than or equal to
<=         less than or equal to
==         equal to
!=         not equal to
Example:
cat /etc/passwd | awk 'BEGIN {FS = ":"} $3 < 10 {print $1 "\t" $3}' <= split /etc/passwd on ":", select the lines whose third column is less than 10, and display only the account name and the third column. FS is set in a BEGIN block so that it already applies to the first line.
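Where FS is assigned makes a difference. By the time an ordinary action runs, the current line has already been split, so a new FS value only takes effect from the next line onward. The sketch below compares the two placements on made-up sample data; the variable names late_fs and early_fs are illustrative.

```shell
# Hypothetical sample data in name:password:uid format.
sample='root:x:0
bin:x:1
alice:x:1000'

# FS assigned in an ordinary action: the first line is still split on whitespace.
late_fs=$(printf '%s\n' "$sample" | awk '{FS = ":"} $3 < 10 {print $1 "\t" $3}')

# FS assigned in BEGIN: every line, including the first, is split on ":".
early_fs=$(printf '%s\n' "$sample" | awk 'BEGIN {FS = ":"} $3 < 10 {print $1 "\t" $3}')

printf 'late:\n%s\nearly:\n%s\n' "$late_fs" "$early_fs"
```

With BEGIN, the root line is reported as root and 0 like every other line; without it, the first line is handled differently from the rest.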
The above is my summary of awk, and I hope it helps you. I wrote it myself rather than just copying it.