Linux Three Musketeers: awk
dking~ Sharing
1.1 awk Execution Process
awk reads the first line of input.
It checks whether the line satisfies the pattern, for example NR>=2 (the pattern decides whether the line is let through the gate).
If the pattern matches, awk runs the corresponding action, for example {print $0}.
If the pattern does not match, awk skips the action.
In either case, awk then reads the next line.
This process repeats until the last line has been read (EOF: end of file).
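The loop described above can be seen in a small self-contained run. This is only a sketch: the three input lines and the variable name flow_out are made up for illustration, and NR>=2 is the example pattern from the steps.

```shell
# Three illustrative input lines; awk tests NR>=2 against each line in turn
# and runs the action only for the lines that pass.
flow_out=$(printf 'first\nsecond\nthird\n' | awk 'NR>=2 {print NR, $0}')
printf '%s\n' "$flow_out"
# prints:
# 2 second
# 3 third
```

The first line fails the pattern and is skipped; the other two are printed with their record numbers.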
1.2 Records and Fields
A record is a row: each record corresponds to one line of input.
A field is a column: each field corresponds to one column within a record.
1.2.1 Records (lines)
By default, awk treats each line as one record.
RS (record separator) is the input record separator: it marks where one record ends and the next begins. By default it is a newline, so each line is a record. awk stores it in the built-in variable RS, which can be redefined in a BEGIN block.
NR (number of records) is the number of the record (line) currently being processed; it increases by 1 each time a new record is read.
ORS (output record separator) is the separator printed after each output record.
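A short sketch of RS, NR, and ORS working together, assuming a comma-separated input string invented for this example (the variable name rs_out is also illustrative):

```shell
# RS="," makes every comma-separated chunk a record.
# ORS=";" ends each printed record with ";" instead of a newline.
# The input deliberately has no trailing newline, so the last record is just "c".
rs_out=$(printf 'a,b,c' | awk 'BEGIN{RS=","; ORS=";"} {print NR "=" $0}')
printf '%s\n' "$rs_out"
# prints: 1=a;2=b;3=c;
```

Both separators are set in the BEGIN block so they take effect before the first record is read.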
1.2.2 Fields (columns)
Each record is made up of multiple fields. By default, fields are separated by whitespace (spaces and tabs). The separator is stored in the built-in variable FS, and the number of fields in the current record is stored in the built-in variable NF.
FS (field separator) is the input field (column) separator. Think of it as a knife that cuts a line into multiple fields.
The -F option simply sets FS: awk -F ':' is equivalent to awk 'BEGIN{FS=":"}'.
NF (number of fields) is the number of columns in the current record.
OFS (output field separator) is the separator printed between fields on output.
The $ symbol selects a field: $1 is the first field, $NF is the last field, and $0 is the whole record.
RS and FS determine how awk reads its input; ORS and OFS can be changed to control how awk writes its output.
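As an illustration of the field variables, the following sketch splits one colon-separated line and joins selected fields with a different output separator. The input line and the variable name fs_out are assumptions made for the example.

```shell
# FS=":" splits the record on colons; OFS="-" is placed between the
# comma-separated arguments of print.
fs_out=$(printf 'root:x:0:0\n' | awk 'BEGIN{FS=":"; OFS="-"} {print NF, $1, $NF}')
printf '%s\n' "$fs_out"
# prints: 4-root-0
```

NF is 4 because the line has four colon-separated fields, $1 is the first of them, and $NF is the last.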
1.3 New awk Parameters and Symbols
~ means "matches" (the value contains the pattern).
!~ means "does not match" (the value does not contain the pattern).
gsub is the substitution function:
gsub(/pattern to find/, "replacement text", target field)
i=i+1 is equivalent to i++ (counting).
i=i+$n is equivalent to i+=$n (cumulative sum).
-v var=value defines a variable; it is used to pass an external value into awk.
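A minimal sketch of -v and the !~ operator. The shell variable limit, the awk variable min, and the sample input lines are all hypothetical names and data chosen for illustration.

```shell
# Pass a shell value into awk with -v; min+0 forces a numeric comparison.
limit=100
v_out=$(printf 'a 50\nb 150\nc 200\n' | awk -v min="$limit" '$2 >= min+0 {print $1}')
printf '%s\n' "$v_out"
# prints:
# b
# c

# !~ keeps only the records whose first field does NOT contain "an".
match_out=$(printf 'apple 1\nbanana 2\ncherry 3\n' | awk '$1 !~ /an/ {print $1}')
printf '%s\n' "$match_out"
# prints:
# apple
# cherry
```

Using -v avoids splicing shell variables directly into the awk program text, which keeps quoting simple.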
1.3.1 gsub: the replacement function
# awk '{gsub(/:/, "¥", $NF); print}' reg.txt
Zhang Dandan 41117397¥250¥100¥175
Zhang Xiaoyu 390320151¥155¥90¥201
gsub syntax: gsub(/pattern to find/, "replacement text", target field). If the target is omitted, gsub works on the whole record ($0). It returns the number of replacements made.
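The same substitution can be tried without reg.txt by feeding awk a single line in the same layout. This sketch swaps the replacement for an ASCII "-" and also prints the return value of gsub; the variable name gsub_out is illustrative.

```shell
# gsub edits the last field in place and returns how many colons it replaced.
gsub_out=$(printf 'Zhang Dandan 41117397:250:100:175\n' | awk '{n = gsub(/:/, "-", $NF); print n, $0}')
printf '%s\n' "$gsub_out"
# prints: 3 Zhang Dandan 41117397-250-100-175
```

Because a field was modified, awk rebuilds $0 from the fields before printing it.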
1.3.2 i++: counting
In the commands below, N stands for the upper bound passed to seq.
# seq N | awk '{i=i+1; print i}'   count and show each step
# seq N | awk '{i=i+1} END{print i}'   count and show only the final result
i=i+1 can also be written as i++.
# seq N | awk '{i=i+$1} END{print i}'   sum the first column
# seq N | awk '{i+=$1} END{print i}'   i+=$n is equivalent to i=i+$n (here n is 1)
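Counting and summing can be combined in one pass. A minimal sketch, assuming seq is available and using 5 as an arbitrary upper bound (the variable names n, sum, and count_out are illustrative):

```shell
# n counts the records; sum accumulates the first field of every record.
count_out=$(seq 5 | awk '{n++; sum += $1} END {print n, sum}')
printf '%s\n' "$count_out"
# prints: 5 15
```

Both variables start out uninitialized, which awk treats as 0 in a numeric context, so no setup in BEGIN is needed.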
1.4 awk Patterns and Actions
awk [options] 'pattern {action}' file
The parts are: the awk command, its options, a pattern, and an action.
In plain terms, a pattern and action say what to do under a given condition: awk 'find whom {do what}', that is, 'pattern {action}'.
Common awk patterns:
Regular expressions as patterns (awk uses extended regular expressions, ERE)
Comparison expressions as patterns, for example NR>10
Range patterns
Special patterns BEGIN and END
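The pattern types listed above can be mixed in a single program. The following sketch uses four made-up input lines and an illustrative variable name pat_out; it shows a BEGIN pattern, a comparison pattern, a regular expression pattern, and an END pattern side by side.

```shell
# BEGIN runs before any input, END after the last record;
# the other two rules are tested against every record.
pat_out=$(printf 'a\nb\nc\nd\n' | awk '
BEGIN  {print "start"}
NR > 2 {print NR, $0}
/a/    {print "regex:", $0}
END    {print "total:", NR}')
printf '%s\n' "$pat_out"
# prints:
# start
# regex: a
# 3 c
# 4 d
# total: 4
```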
1.4.1 Regular expressions as patterns
Example 1-1: finding content in a specific column
# awk '$3~/0+/' /server/files/reg.txt
Zhang Xiaoyu 390320151:155:90:201
Example 1-2: $(NF-1) (the second-to-last field) differs from $NF-1 (the last field minus 1)
# awk -F '[ :]+' '/Zhang/{print $1,$2,$(NF-1)}' reg.txt
Zhang Dandan 100
Zhang Xiaoyu 90
# awk -F '[ :]+' '/Zhang/{print $1,$2,$NF-1}' reg.txt
Zhang Dandan 174
Zhang Xiaoyu 200
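The difference between the two expressions can be checked on one line in the reg.txt layout, without the file itself. A small sketch (the variable name nf_out is illustrative):

```shell
# With FS='[ :]+' the fields are: Zhang Dandan 41117397 250 100 175
# $(NF-1) is the second-to-last field; $NF-1 is the last field minus 1.
nf_out=$(printf 'Zhang Dandan 41117397:250:100:175\n' | awk -F '[ :]+' '{print $(NF-1), $NF-1}')
printf '%s\n' "$nf_out"
# prints: 100 174
```

The parentheses matter because $ binds more tightly than subtraction.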
1.4.2 Range patterns
A range pattern is easiest to understand as "from where to where".
It matches every line from the first line that satisfies condition 1 through the line that satisfies condition 2.
# awk 'NR==1,NR==3' reg.txt
Zhang Dandan 41117397:250:100:175
Zhang Xiaoyu 390320151:155:90:201
Meng Feixue 80042789:250:60:50
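Range endpoints do not have to be line numbers; regular expressions work as well. A minimal sketch, assuming made-up START and END marker lines in the input (the variable name range_out is illustrative):

```shell
# Prints every line from the first match of /START/ through the next match of /END/.
range_out=$(printf 'a\nSTART\nb\nEND\nc\n' | awk '/START/,/END/')
printf '%s\n' "$range_out"
# prints:
# START
# b
# END
```

With no action given, awk prints each matching line, and both endpoint lines are included in the range.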
1.5 awk Arrays
An array stores values under an index, which makes looking up content easy and quick. The index can be a number or a string, which is why awk arrays are called associative arrays. An array does not have to be declared in advance, and neither does its size. Array elements are initialized to 0 or the empty string, depending on the context.
# awk 'BEGIN{h[1]="Dou"; h[2]="Qin"; h[3]="Feng"; print h[1],h[2],h[3]}'
Dou Qin Feng
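The points about string indexes and automatic initialization can be sketched as follows. The array name count and its keys are hypothetical, chosen only for this example.

```shell
# count["x"] is created on first use and incremented twice.
# count["y"] and count["z"] were never set: they compare equal to 0 and to "".
arr_out=$(awk 'BEGIN{count["x"]++; count["x"]++; print count["x"], (count["y"] == 0), (count["z"] == "")}')
printf '%s\n' "$arr_out"
# prints: 2 1 1
```

A comparison evaluates to 1 when it is true, which is why the last two values are 1.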
1.5.1 Analyzing user login logs
Find each attacker's IP address and its number of attempts:
# awk '$6~/Failed/{h[$(NF-3)]++} END{for(pol in h) print pol,h[pol]}' secure-20161219 | sort -nk2 | column -t | tail
Find the user names that were attacked and the number of attempts on each:
# awk '$6~/Failed/{h[$(NF-5)]++} END{for(pol in h) print pol,h[pol]}' secure-20161219 | sort -nk2 | column -t | tail
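The counting idiom can be tried without the real log file. The sketch below uses four hypothetical sshd-style lines; it assumes the sixth field is "Failed" for failed logins and that the IP address is the fourth field from the end, which may not hold for every log format.

```shell
# Hypothetical log lines for illustration; real logs may be laid out differently.
log='Dec 19 03:42:01 host sshd[101]: Failed password for root from 10.0.0.5 port 4001 ssh2
Dec 19 03:42:03 host sshd[102]: Failed password for root from 10.0.0.5 port 4002 ssh2
Dec 19 03:42:07 host sshd[103]: Failed password for admin from 10.0.0.9 port 4003 ssh2
Dec 19 03:42:09 host sshd[104]: Accepted password for alice from 10.0.0.7 port 4004 ssh2'

# Count failed attempts per IP, then sort by the count (second column).
# for (ip in h) visits keys in no fixed order, so sort makes the output stable.
log_out=$(printf '%s\n' "$log" | awk '$6 ~ /Failed/ {h[$(NF-3)]++} END {for (ip in h) print ip, h[ip]}' | sort -k2,2n)
printf '%s\n' "$log_out"
# prints:
# 10.0.0.9 1
# 10.0.0.5 2
```

The "Accepted" line is skipped because its sixth field does not match the pattern.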
1.5.2 Sort Command Summary
1.5.3 sort
sort orders the lines of a file.
-b: ignore leading blanks on each line;
-c: check whether the file is already sorted;
-d: dictionary order; consider only letters, digits, and blanks when sorting;
-f: treat lowercase letters as uppercase when sorting;
-n: sort by numeric value;
-o <output file>: write the sorted result to the given file;
-r: sort in reverse order;
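A brief sketch of -n and -r on three made-up numbers (the variable names sort_out and sort_rev are illustrative). Without -n, sort would compare the lines as text and place 100 before 9.

```shell
# Numeric sort, ascending.
sort_out=$(printf '10\n9\n100\n' | sort -n)
printf '%s\n' "$sort_out"
# prints:
# 9
# 10
# 100

# Numeric sort, descending.
sort_rev=$(printf '10\n9\n100\n' | sort -rn)
printf '%s\n' "$sort_rev"
# prints:
# 100
# 10
# 9
```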
1.5.4 uniq
uniq reports or omits duplicate lines in a file. It only compares adjacent lines, so it is typically used together with sort.
-c or --count: prefix each line with the number of times it occurs;
-d or --repeated: show only lines that are repeated;
-u or --unique: show only lines that occur exactly once;
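The usual sort-then-uniq pipeline can be sketched with three made-up input lines. The trailing awk step is an addition for this example only: it strips the padding that uniq -c puts in front of the count, since that padding varies between implementations.

```shell
# sort brings duplicates together so uniq can count them.
uniq_out=$(printf 'b\na\nb\n' | sort | uniq -c | awk '{print $1, $2}')
printf '%s\n' "$uniq_out"
# prints:
# 1 a
# 2 b

# -d shows only the lines that occur more than once.
dup_out=$(printf 'b\na\nb\n' | sort | uniq -d)
printf '%s\n' "$dup_out"
# prints: b
```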
1.5.5 column
column formats its input into aligned columns.
-t: determine the columns from whitespace and align them as a table