Introduction and use of AWK

Source: Internet
Author: User
awk is a powerful text analysis tool. Compared with grep (searching) and sed (editing), awk is particularly strong at analyzing data and generating reports. Put simply, awk reads input line by line, splits each line into fields using whitespace as the default separator, and then analyzes and processes the resulting fields.
awk exists in three versions: awk, nawk, and gawk. In general, awk here means gawk, the GNU version of AWK.
The name awk comes from the initials of its creators' surnames: Alfred Aho, Peter Weinberger, and Brian Kernighan. AWK is in fact a language in its own right, the AWK programming language, which its three creators formally define as a "pattern scanning and processing language".
It lets you write short programs that read input files, sort and process data, perform calculations on the input, and generate reports, among many other uses.
There are three ways to invoke awk:
1. Command line
awk [-F field-separator] 'commands' input-file(s)
commands is the actual awk program. [-F field-separator] is optional; if it is omitted, whitespace is used as the separator. input-file(s) is the file or files to process.
2. Shell-style script
Put all the awk commands in a file, make the file executable, and name the awk interpreter on the first line of the script, so that the program can be run by typing the script name.
This is the same mechanism as the first line of a shell script: #!/bin/sh is replaced by #!/bin/awk -f (the -f makes awk read the script file as its program).
3. Put all the awk commands in a separate file and call it with:
awk -f awk-script-file input-file(s)
The -f option loads the awk script from awk-script-file. input-file(s) is the same as above.
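The three invocation styles can be compared side by side. The following is only a sketch: the file names sample.txt and first.awk and their contents are made up for illustration, and the interpreter path is looked up with command -v rather than assumed to be /bin/awk.

```shell
# Work in a scratch directory (hypothetical file names throughout).
dir=$(mktemp -d)
printf 'a:1\nb:2\n' > "$dir/sample.txt"

# 1. Program given directly on the command line.
r1=$(awk -F: '{print $1}' "$dir/sample.txt")

# 2. Executable script whose first line names the awk interpreter.
printf '#!%s -f\nBEGIN { FS = ":" }\n{ print $1 }\n' "$(command -v awk)" > "$dir/first.awk"
chmod +x "$dir/first.awk"
r2=$("$dir/first.awk" "$dir/sample.txt")

# 3. The same program file loaded explicitly with -f.
r3=$(awk -f "$dir/first.awk" "$dir/sample.txt")

printf '%s\n' "$r1" "$r2" "$r3"
rm -rf "$dir"
```

All three calls should print the first field of each line, so the choice is mostly about how long the program is and how often it is reused.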
######################################## ######################################## ##################
awk's built-in variables hold information about the environment and the current input. Many of them can be changed. The most commonly used ones are listed below.
$0 is the entire current record. $1 is the first field of the current line, $2 is the second field, and so on.
ARGC      number of command-line arguments
ARGV      array of command-line arguments
ENVIRON   array giving access to the system environment variables
FILENAME  name of the file awk is currently reading
FNR       number of records read so far from the current file
FS        input field separator, equivalent to the -F command-line option
NF        number of fields in the current record
NR        number of records read so far
OFS       output field separator
ORS       output record separator
RS        input record separator
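A small sketch of how FILENAME, NR, FNR, NF, FS, and OFS interact. The two input files one.csv and two.csv and their contents are hypothetical and exist only for this illustration.

```shell
dir=$(mktemp -d)
printf 'a,b,c\nd,e\n' > "$dir/one.csv"   # hypothetical sample files
printf 'f,g,h,i\n'    > "$dir/two.csv"

# FS splits the input on commas; OFS joins the comma-separated print arguments.
# NR counts records across all files; FNR restarts at 1 for each file.
vars=$(cd "$dir" && awk 'BEGIN { FS = ","; OFS = " | " }
    { print FILENAME, NR, FNR, NF, $1 }' one.csv two.csv)
printf '%s\n' "$vars"
rm -rf "$dir"
```

For the third record this prints "two.csv | 3 | 1 | 4 | f": NR has reached 3 while FNR is back at 1.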
The test data for the examples is an excerpt from an Oracle alert log written at startup:
[oracle@bys3 ~]$ cat awktest.log    -- the last two lines were edited by hand (a colon was added) to make the experiments easier
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
SMON:started with pid=13, OS id=22878
RECO:started with pid=14, OS id=22882

(I have only written up the features I consider commonly used and have not covered every awk parameter; for those, see DAVE's blog: http://blog.csdn.net/tianlesoftware/article/details/6278273)
######################################## ######################################## ##################
Output formatting, file merging, row/column conversion, and so on. awk provides two output functions, print and printf:
The arguments to print can be variables, numbers, or strings. Strings must be enclosed in double quotation marks, and the arguments are separated by commas. Without commas, the arguments are concatenated with nothing between them and cannot be told apart. The comma here stands for the output field separator, which is a space by default.
The printf function is similar to printf in C and formats strings. When the output is complex, printf is easier to use and the code is easier to read.
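The difference between the two functions is easiest to see on a single line. This is a minimal sketch using the first line of the test data; the format string "%-6s|%8s|" is just an arbitrary illustration.

```shell
line='MMAN started with pid=9, OS id=22862'

# print with a comma: the arguments are joined by OFS (a space by default).
with_comma=$(printf '%s\n' "$line" | awk '{print $1, $2}')

# print without a comma: the values are concatenated directly.
no_comma=$(printf '%s\n' "$line" | awk '{print $1 $2}')

# printf: explicit widths; no newline is added unless the format contains \n.
formatted=$(printf '%s\n' "$line" | awk '{printf "%-6s|%8s|\n", $1, $2}')

printf '%s\n' "$with_comma" "$no_comma" "$formatted"
```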
Use built-in variables to display the input file name, the line number, the number of columns, and the content of each line. If the data arrives through a pipe (|) instead of a named file, FILENAME is shown as -
[oracle@bys3 ~]$ awk '{print "filename:" FILENAME ",linenumber:" NR ",columns:" NF ",linecontent:" $0}' awktest.log
filename:awktest.log,linenumber:1,columns:6,linecontent:MMAN started with pid=9, OS id=22862    -- only the first line of output is shown
Join every two lines into one line
[oracle@bys3 ~]$ awk '{if (NR%2==0) {print $0} else {printf "%s ", $0}}' awktest.log
MMAN started with pid=9, OS id=22862 DBW0 started with pid=10, OS id=22866    -- only the first line of output is shown
Print one line out of every three:
[oracle@bys3 ~]$ awk '(NR%3==0) {print $0}' awktest.log
LGWR started with pid=11, OS id=22870
RECO:started with pid=14, OS id=22882
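Both examples above rely on NR modulo a constant. As a sketch of how the same idea generalizes, the group size can be passed in as a variable; the name n and the numbered stand-in input are assumptions made for this illustration.

```shell
# Six numbered lines as stand-in data; n is passed in with -v.
joined=$(printf '%s\n' 1 2 3 4 5 6 |
    awk -v n=3 '{ printf "%s%s", $0, (NR % n == 0 ? "\n" : " ") }')
printf '%s\n' "$joined"
```

Every nth record is followed by a newline and every other record by a space, so the input is folded into rows of n items.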
File merging and splitting
[oracle@bys3 ~]$ cat awktest.log > awkt
[oracle@bys3 ~]$ awk '{print FILENAME, $0}' awktest.log awkt > a.log    -- merge awktest.log and awkt into a.log
[oracle@bys3 ~]$ cat a.log    -- output partially truncated
awktest.log MMAN started with pid=9, OS id=22862
awktest.log DBW0 started with pid=10, OS id=22866
awkt MMAN started with pid=9, OS id=22862
awkt DBW0 started with pid=10, OS id=22866
[oracle@bys3 ~]$ rm -rf awkt*    -- removes both awkt and awktest.log
Split the file merged in the previous step back into the two original files. The new file names are taken from the first column of a.log.
[oracle@bys3 ~]$ awk '$1 != fd {close(fd); fd = $1} {print substr($0, index($0, " ") + 1) > $1}' a.log
[oracle@bys3 ~]$ cat awkt
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
SMON:started with pid=13, OS id=22878
RECO:started with pid=14, OS id=22882
[oracle@bys3 ~]$ cat awktest.log    -- same content as cat awkt
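The split step can be tried in isolation on a tiny merged file. This is a sketch under stated assumptions: merged.log, x.txt, and y.txt are hypothetical names, the variable out plays the role of fd above, and the lines belonging to each target file are assumed to be contiguous (a file reopened with > after close would be truncated).

```shell
dir=$(mktemp -d)
# Hypothetical merged file: the first column is the target file name.
printf 'x.txt alpha one\nx.txt beta\ny.txt gamma\n' > "$dir/merged.log"

(cd "$dir" && awk '
    $1 != out { if (out != "") close(out); out = $1 }   # a new target file starts
    { print substr($0, index($0, " ") + 1) > out }       # drop the first column
' merged.log)

x_content=$(cat "$dir/x.txt")
y_content=$(cat "$dir/y.txt")
printf '%s\n---\n%s\n' "$x_content" "$y_content"
rm -rf "$dir"
```

index($0, " ") finds the first space, and substr from the following character keeps the rest of the line unchanged, including any later spaces.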
######################################## ######################################## ##################
Separator examples: by default, fields are separated by spaces or tabs.
[oracle@bys3 ~]$ cat awktest.log | awk '{print $5}'    -- use the default separator
OS
OS
OS
OS
id=22878
id=22882
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '{print $4 "#@#" $5 "\t" $9}'    -- use four separators at once: comma, equals sign, colon, and space. Fields 4, 5, and 9 are printed, joined by the specified strings.
pid#@#9	22862
pid#@#10	22866
pid#@#11	22870
pid#@#12	22874
pid#@#13	22878
pid#@#14	22882
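It may not be obvious why the OS id ends up in field 9. A short sketch on the first line of the test data: with a bracket expression, every single separator character ends a field, so the comma followed by a space produces an empty field 6. The second command, with + added to the expression, is an alternative not used in the text above and is shown only for contrast.

```shell
line='MMAN started with pid=9, OS id=22862'

# Each single character in [ ,=:] ends a field, so ", " yields an empty $6.
single=$(printf '%s\n' "$line" | awk -F'[ ,=:]' '{print NF ": " $5 "/" $6 "/" $9}')

# Adding + treats a run of separators as one, which removes the empty field.
merged=$(printf '%s\n' "$line" | awk -F'[ ,=:]+' '{print NF ": " $5 "/" $6 "/" $8}')

printf '%s\n' "$single" "$merged"
```

The first command reports 9 fields with nothing between the two slashes; the second reports 8 fields, and the OS id moves to field 8.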
######################################## ######################################## ##################

Using BEGIN and END
BEGIN: specifies the actions performed before the first input record is processed. Global variables can be set here.
END: specifies the actions performed after the last input record has been read.
The awk workflow is as follows: first the BEGIN block is executed; then the input is read one record at a time, with records separated by the \n newline character; each record is split into fields according to the specified field separator, and the fields are filled in ($0 is the whole record, $1 the first field, $n the nth field); then the actions whose patterns match are executed. awk then reads the next record, and so on until all records have been read, after which the END block is executed.
In this example, BEGIN and END each simply print one line of text.
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' 'BEGIN {print "header-a ### bb ospid"} {print $4 "#@#" $5 "\t" $9} END {print "Hello everyone, my name is leifeng!"}'
header-a ### bb ospid
pid#@#9	22862
pid#@#10	22866
pid#@#11	22870
pid#@#12	22874
pid#@#13	22878
pid#@#14	22882
Hello everyone, my name is leifeng!
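BEGIN and END are more often used to initialize and report an accumulated value than to print fixed text. A minimal sketch, with made-up input lines and a hypothetical variable name total:

```shell
summary=$(printf 'pid=9\npid=10\npid=11\n' | awk -F= '
    BEGIN { print "start"; total = 0 }          # runs once, before any input
    { total += $2 }                             # runs for every record
    END { print "records=" NR, "sum=" total }   # runs once, after the last record
')
printf '%s\n' "$summary"
```

In the END block NR still holds the number of the last record read, so it can serve as the record count.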
######################################## ######################################## ##################
Filtering and displaying the required lines: first use '/MMAN/' to select the lines containing MMAN, then pipe them into a second awk. \n is a newline and \t is a TAB.
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '/MMAN/' | awk -F '[ ,=:]' '{print $9 "\n" $1 "\t" $0}'
22862
MMAN	MMAN started with pid=9, OS id=22862
This can be simplified to:
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '/MMAN/ {print $9 "\n" $1 "\t" $0}'    -- only the line containing MMAN is printed
22862
MMAN	MMAN started with pid=9, OS id=22862
Print from a line starting with LGWR through a line starting with CKPT. If another line starting with LGWR appears after the CKPT line, printing resumes there and continues through the next line starting with CKPT; if there is no further line starting with CKPT, printing continues to the end of the file.
[oracle@bys3 ~]$ cat awktest.log | awk '/^LGWR/,/^CKPT/'
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
[oracle@bys3 ~]$ cat awktest.log | awk '/^LGWR/,/^CKPq/'    -- no line starts with CKPq, so output continues to the end of the file
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
SMON:started with pid=13, OS id=22878
RECO:started with pid=14, OS id=22882
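The restart behavior of a range pattern is not visible in the test data, because LGWR occurs only once. A sketch with made-up input in which the start pattern occurs twice and the second range is never closed:

```shell
# Hypothetical input where the start pattern occurs twice.
ranges=$(printf '%s\n' A START b END c START d |
    awk '/^START/,/^END/ { print NR ": " $0 }')
printf '%s\n' "$ranges"
```

Lines 2 to 4 form the first range, line 5 is skipped, and the second range opens at line 6 and runs to the end of the input.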
######################################## ######################################## ##################
Comparisons: greater than, less than, and equal to; addition, subtraction, multiplication, and division can also be performed.
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '{print $5}'
9
10
11
12
13
14
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '$5>12 {print $5}'
13
14
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '$5==10 {print $5 "#\t#" $0}'
10#	#DBW0 started with pid=10, OS id=22866
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '$5<10 {print $5 "\t" $0}'
9	MMAN started with pid=9, OS id=22862
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '$5<10 {print $5*9 "\t" $0}'    --- $5*9 is evaluated as 9*9, i.e. 81
81	MMAN started with pid=9, OS id=22862
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '$5<10 {print $5/9 "\t" $0}'    --- $5/9 is evaluated as 9/9, i.e. 1
1	MMAN started with pid=9, OS id=22862
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '$5<10 {print $5 "\t" $9}'
9	22862
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '$5*$9<220000 {print $5*$9 "###" $5 "\t" $9}'    -- prints the lines where $5*$9 is less than 220000, showing $5*$9, $5, and $9
205758###9	22862
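A common slip with the equality test is writing = instead of ==. The following sketch, on made-up numeric input, shows how differently the two behave when used as a pattern.

```shell
nums=$(printf '%s\n' 9 10 11)

# == compares: only the matching record is selected.
eq=$(printf '%s\n' "$nums" | awk '$1 == 10 { print $1 }')

# = assigns: every record has $1 overwritten with 10, and the pattern is true.
assign=$(printf '%s\n' "$nums" | awk '($1 = 10) { print $1 }')

printf '%s\n' "$eq" "---" "$assign"
```

With == one line is printed; with = all three records are printed, each showing 10, because the assignment itself evaluates to a non-zero value.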
######################################## ######################################## ##################
Regular expression examples (partial matching): find the lines starting with MM
[oracle@bys3 ~]$ cat awktest.log | awk '/^MM/'
MMAN started with pid=9, OS id=22862
Find the lines starting with MM, D, or L.
[oracle@bys3 ~]$ cat awktest.log | awk '/^(MM|D|L)/'
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
Find the lines starting with the letter M, D, or L.
[oracle@bys3 ~]$ cat awktest.log | awk '/^[MDL]/'
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
Match when the last two characters of the specified field are digits in the ranges 0-9 and 0-2 respectively. ~ is the match operator, and the trailing $ means the match is counted from the end of the field.
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '$5 ~ /[0-9][0-2]$/ {print $5}'
10
11
12
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '$5 ~ /[3-4]$/ {print $5}'    -- find the values ending in 3 or 4
13
14
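The match operator ~ has a negated counterpart, !~, which is not used in the examples above. A brief sketch on the same pid values, typed in directly as stand-in input, shows both side by side:

```shell
pids=$(printf '%s\n' 9 10 11 12 13 14)

# ~ keeps values that match; $ anchors the pattern to the end of the value.
matched=$(printf '%s\n' "$pids" | awk '$1 ~ /[0-9][0-2]$/ { printf "%s ", $1 }')

# !~ keeps the values that do not match the same pattern.
unmatched=$(printf '%s\n' "$pids" | awk '$1 !~ /[0-9][0-2]$/ { printf "%s ", $1 }')

printf '%s\n' "$matched" "$unmatched"
```

The value 9 falls into the second group because the pattern needs two digits before the end of the field.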
######################################## ######################################## ##################
Common logical operators: greater than, less than, equal to, not equal to, AND (&&), and OR (||)
[oracle@bys3 ~]$ cat awktest.log
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
SMON:started with pid=13, OS id=22878
RECO:started with pid=14, OS id=22882
Show the lines where $5 equals 10 or $9 is greater than 22880 ($5==10 || $9>22880)
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '$5==10 || $9>22880 {print $5 "\t" $9}'
10	22866
14	22882
Show the lines where $5 is greater than 10 and $9 is greater than 22880 ($5>10 && $9>22880)
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '$5>10 && $9>22880 {print $5 "\t" $9}'
14	22882
Show the lines where $5 is neither 10 nor 11 ($5!=10 && $5!=11)
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '$5!=10 && $5!=11 {print $5 "\t" $9}'
9	22862
12	22874
13	22878
14	22882
Conditional expression in print: in print ($5>12 ? a : b), if $5>12 is true the value before the colon is printed; otherwise the value after the colon is printed.
[oracle@bys3 ~]$ cat awktest.log | awk -F '[ ,=:]' '{print ($5>12 ? "OK\t" $5 : "error\t" $5)}'
error	9
error	10
error	11
error	12
OK	13
OK	14
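The logical operators and the conditional expression can be combined in one rule. The sketch below uses the same pid values as stand-in input; the variable name picked and the chosen condition are illustrative only.

```shell
labels=$(printf '%s\n' 9 10 11 12 13 14 | awk '{
    # && binds tighter than ||, so parentheses make the grouping explicit.
    picked = ($1 == 10 || ($1 > 12 && $1 < 14))
    print $1, (picked ? "yes" : "no")
}')
printf '%s\n' "$labels"
```

Only 10 and 13 are labeled yes: 10 through the left side of ||, and 13 as the only value that is both greater than 12 and less than 14.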
