Processing multiple files with awk

Source: Internet
Author: User

awk can take its input from two sources: standard input and files. When reading files, it accepts more than one file on the command line. For example:

1. Shell pathname expansion:

    awk '{...}' *.txt

The shell expands *.txt to every matching file in the current directory before awk runs. If the current directory contains 1.txt and 2.txt, the command that actually runs is awk '{...}' 1.txt 2.txt.

2. Naming the files directly:

    awk '{...}' a.txt b.txt c.txt ...

awk handles multiple files by reading the contents of each one in order: first a.txt, then b.txt, and so on.

So, when several files are being processed, how can the program tell which file awk is currently reading and act accordingly?

######################## Processing two files ########################

When awk reads only two files, there are two common methods:

(1) awk 'NR==FNR{...} NR>FNR{...}' file1 file2
    or
    awk 'NR==FNR{...} NR!=FNR{...}' file1 file2

(2) awk 'NR==FNR{...; next} {...}' file1 file2

Both are easy to follow once the built-in variables FNR and NR and the next statement are understood:

FNR   The input record number in the current input file, that is, the number of records read so far from the current file.
NR    The total number of input records seen so far, across all files.
next  Stop processing the current input record. The next input record is read and processing starts over with the first pattern in the AWK program. If the end of the input data is reached, the END block(s), if any, are executed.
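To see the two counters side by side, here is a minimal sketch. The file names first.txt and second.txt and their contents are made up for illustration and are not part of the original article. It prints FILENAME, NR and FNR for every record, then uses the NR==FNR test to label each line by the file it came from.

```shell
# Work in a scratch directory so no existing files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Two hypothetical input files (names and contents are illustrative only).
printf 'a\nb\n' > first.txt
printf 'c\nd\ne\n' > second.txt

# Print the current file name, the global record number (NR)
# and the per-file record number (FNR) for every input line.
counters=$(awk '{ print FILENAME, NR, FNR }' first.txt second.txt)
printf '%s\n' "$counters"

# Label each line by its source file using the NR == FNR test.
labels=$(awk 'NR == FNR { print "first:", $0; next } { print "second:", $0 }' first.txt second.txt)
printf '%s\n' "$labels"
```

The first command should show NR counting from 1 to 5 while FNR restarts at 1 when second.txt begins. One caveat worth keeping in mind: if the first file is empty, NR==FNR is also true while the second file is being read, so the test mislabels those records.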
awk 'NR==FNR{...} NR>FNR{...}' file1 file2

While file1 is being read, FNR (the number of records read from file1) equals NR (the total number of records read), because file1 is the first file awk reads. The first block {...} therefore runs. While file2 is being read, NR is always greater than FNR, so the second block {...} runs.

awk 'NR==FNR{...; next} {...}' file1 file2

While file1 is being read, NR==FNR holds and the first block runs; because that block ends with next, the second block {...} is skipped. While file2 is being read, NR==FNR is false, so the first block is skipped and only the second block runs.

######################## Processing more than two files ########################

When awk processes more than two files, the methods above no longer work: from the second file onward, NR>FNR (NR!=FNR) is always true, so the second file cannot be told apart from the third or any later one. A more general method is needed:

1. ARGIND, the index of the command-line argument currently being processed (a gawk extension):

    awk 'ARGIND==1{...} ARGIND==2{...} ARGIND==3{...} ...' file1 file2 file3 ...

2. ARGV, the array of command-line arguments:

    awk 'FILENAME==ARGV[1]{...} FILENAME==ARGV[2]{...} FILENAME==ARGV[3]{...} ...' file1 file2 file3 ...

3. Comparing FILENAME against the file name directly:

    awk 'FILENAME=="file1"{...} FILENAME=="file2"{...} FILENAME=="file3"{...} ...' file1 file2 file3 ...

######################## Example 1 ########################

Two files, file1 and file2, are given.
file1 has two columns:

    no1 name1
    no2 name2
    no3 name2
    no4 name3
    no5 name4
    no6 name4
    no7 name4
    no8 name5
    no9 name6
    no10 name6

file2 has up to six columns; some rows leave the later columns empty:

    name1 data1 dada2 data3 data4 dada5
    name2 dada6 data7 dada8
    name3 data9 dada10 data11 dada12
    name4 data13 dada14
    name5 data15 dada16
    name6 data17 data18

Wherever column 2 of file1 matches column 1 of file2, the two rows are merged into one. The merged data is:

    no1 name1 data1 dada2 data3 data4 dada5
    no2 name2 dada6 data7 dada8
    no3 name2 dada6 data7 dada8
    no4 name3 data9 dada10 data11 dada12
    no5 name4 data13 dada14
    no6 name4 data13 dada14
    no7 name4 data13 dada14
    no8 name5 data15 dada16
    no9 name6 data17 data18
    no10 name6 data17 data18

Program:

    awk 'NR==FNR{a[$1]=$0} NR>FNR{print $1" "a[$2]}' file2 file1

file2 is listed first so that its rows are stored in the array a, keyed by name, before file1 is read. For each row of file1, the program then prints column 1 followed by the stored file2 row for the name in column 2.

######################## Example 2 ########################

file1:

    sina.com has been 35

file2:

    www.news.sina.com sina.com has baidu.com has sohu.com has sina.com has baidu.com has sina.com has sohu.com 20

Merged results:

    www.news.sina.com sina.com 80 baibaidu.com 20 baisohu.com 50 baisina.com 60 baisohu.com 70 baisohu.com 30 baisina.com 10 baibaidu.com 50 baisina.com 60 baisohu.com 20 42.5

Program:

    awk 'NR==FNR{a[$1]=$2; next} {print $0, a[$2]}' file1 file2

While file1 is read, column 2 is stored in the array a under the key in column 1. For each row of file2, the program prints the whole row followed by the value stored for the key in its column 2.
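The same store-then-look-up idea extends to three or more files once the NR==FNR test is replaced by one of the FILENAME tests described earlier. The following is a minimal sketch, not taken from the original article: the files names.txt, prices.txt and orders.txt and their contents are hypothetical, and the FILENAME==ARGV[n] form is used because it does not depend on gawk.

```shell
# Work in a scratch directory so no existing files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Three hypothetical files, all keyed by their first column.
printf 'k1 apple\nk2 pear\n' > names.txt
printf 'k1 3\nk2 5\n'        > prices.txt
printf 'k2 10\nk1 4\n'       > orders.txt

# Choose the action from the file being read. ARGV[1], ARGV[2] and ARGV[3]
# hold the three file operands in command-line order.
report=$(awk '
    FILENAME == ARGV[1] { name[$1]  = $2; next }                # first file: key -> name
    FILENAME == ARGV[2] { price[$1] = $2; next }                # second file: key -> price
    FILENAME == ARGV[3] { print $1, name[$1], $2 * price[$1] }  # third file: join and compute
' names.txt prices.txt orders.txt)
printf '%s\n' "$report"
```

With this sample data the output should be `k2 pear 50` followed by `k1 apple 12`. Note that comparing FILENAME with ARGV[n] cannot distinguish the operands if the same file is named twice on the command line; gawk's ARGIND does not have that limitation, at the cost of portability.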
