[Copyright statement: reprinted. Please retain the Source: blog.csdn.net/gentleliu. Mail: shallnew at 163 dot com]
Awk has many built-in variables that hold information about its environment and its input. Most of them can be changed by the program. The built-in variables of awk are:
ARGC      number of command-line arguments
ARGV      array of command-line arguments
ENVIRON   array of the system environment variables
FILENAME  name of the file awk is currently reading
FNR       number of records read so far in the current file
FS        input field separator, equivalent to the -F command-line option
NF        number of fields in the current record
NR        number of records read so far
OFS       output field separator
ORS       output record separator
RS        input record separator
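As a quick illustration of the first three entries, the sketch below prints the arguments that awk itself receives and looks up one environment variable. The function name show_args and the variable DEMO_USER are made up for this example.

```shell
# Hypothetical helper: list the operands awk receives.
show_args() {
    awk 'BEGIN {
        # ARGV[0] is the program name; operands start at ARGV[1].
        for (i = 1; i < ARGC; i++)
            print i ": " ARGV[i]
    }' "$@"
}

# A BEGIN-only program never opens its operands, so they need not be real files.
show_args alpha beta

# ENVIRON is indexed by variable name (DEMO_USER is set only for this command).
DEMO_USER=jack awk 'BEGIN { print ENVIRON["DEMO_USER"] }'
```

The first command prints one numbered line per argument and the second prints jack.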
1. Built-in Variables
NF holds the number of fields in the current record. Awk sets it automatically each time a record is read, by splitting the record on the current field separator, so it is also called the "field count" variable.
NR holds the number of the record (line) currently being processed.
FILENAME holds the name of the file currently being processed.
Now we can retrieve the row number of the file group_file1 and the number of fields in each row.
# awk '{print NR, NF}' group_file1
1 3
2 3
3 3
4 4
5 3
Now we pass two files to awk and print the current file name together with NR and NF for each line:
# awk '{print FILENAME,NR, NF}' group_file1 group_file2
group_file1 1 3
group_file1 2 3
group_file1 3 3
group_file1 4 4
group_file1 5 3
group_file2 6 1
group_file2 7 1
group_file2 8 1
group_file2 9 1
group_file2 10 1
This shows that NR is not the line number within the current file but the total number of records read so far across all input files. It is incremented for every record awk reads. FNR, by contrast, counts records within the current file only.
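The difference between the running count and the per-file count can be seen by printing NR and FNR side by side. This is a minimal sketch: the scratch files f1 and f2 and the function name nr_fnr are invented for the example.

```shell
# Build two small sample files in a scratch directory (names are made up).
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/f1"
printf 'c\nd\ne\n' > "$tmpdir/f2"

# NR keeps counting across files; FNR restarts at 1 for each new file.
nr_fnr() {
    awk '{ print NR, FNR, $0 }' "$@"
}

nr_fnr "$tmpdir/f1" "$tmpdir/f2"
rm -rf "$tmpdir"
```

For the second file the first column continues at 3 while the second column starts again at 1.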
With NF and NR our commands become more flexible, because we can combine them with conditions and regular expressions. The following prints the lines that have exactly three fields and whose third field matches 98:
# awk '{if (NF == 3 && $3~/98/) print}' group_file1
wireshark x 987
usbmon x 986
jackuser x 985
We can pass in the current absolute path and analyze the current directory name:
# pwd
/home/Myprojects/shell_text_filter/awk
# pwd | awk -F"/" '{print $NF }'
awk
In the same way, the file name can be extracted from the absolute path of an input file.
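The same $NF idea can be wrapped in small helpers. The sketch below assumes a path with at least two components and no trailing slash; the function names base_name and parent_name and the sample path are illustrative only.

```shell
# Hypothetical helpers built on $NF; each takes one path as its argument.
base_name() {
    # Last "/"-separated field: the file name.
    printf '%s\n' "$1" | awk -F'/' '{ print $NF }'
}

parent_name() {
    # Second-to-last field: the name of the containing directory.
    printf '%s\n' "$1" | awk -F'/' '{ print $(NF-1) }'
}

base_name /home/Myprojects/shell_text_filter/awk/group_file1
parent_name /home/Myprojects/shell_text_filter/awk/group_file1
```

Here $(NF-1) is an ordinary field reference whose index is computed from NF.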
OFS is the output field separator: awk prints it between the comma-separated items of a print statement. We can easily redefine OFS so that awk inserts whatever field separator we prefer. If OFS is not set, it defaults to a single space.
# awk 'BEGIN{OFS=" - "}{print $1, $3}' group_file1
wireshark - 987
usbmon - 986
jackuser - 985
vboxusers - 984
aln - 1001
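A short sketch of when OFS actually takes effect: it is inserted only where a comma separates the items of print, and it is a single space unless changed. The sample line is taken from the output above; the ":" separator is an arbitrary choice.

```shell
line='wireshark x 987'

# Default OFS: the two fields are joined by a single space.
printf '%s\n' "$line" | awk '{ print $1, $3 }'

# A custom OFS replaces that space wherever a comma appears in print.
printf '%s\n' "$line" | awk 'BEGIN { OFS = ":" } { print $1, $3 }'

# Without a comma the values are simply concatenated and OFS is not used.
printf '%s\n' "$line" | awk 'BEGIN { OFS = ":" } { print $1 $3 }'
```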
ORS is the output record separator, which awk prints automatically at the end of every print statement. Its default value is a newline ("\n"), so each print statement starts a new line of output. To double-space the output, set ORS to "\n\n". To separate records with a single space instead of a line break, set ORS to " ". For example, take the following two files:
# cat 1
1 2 3 4 5
# cat 2
6 7 8 9 0
To join their contents into a single line, we can use:
# awk 'BEGIN{ORS=" "}{print}' 1 2
1 2 3 4 5 6 7 8 9 0
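The double-spacing case mentioned above can be sketched the same way. The function name double_space is made up for this example; it reads standard input when called without file arguments.

```shell
# Hypothetical helper: print every input line followed by an empty line.
double_space() {
    # ORS is written after each print, so "\n\n" ends the line and adds a blank one.
    awk 'BEGIN { ORS = "\n\n" } { print }' "$@"
}

printf 'a\nb\n' | double_space
```

Two input lines produce four output lines, every second one empty.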
The following joins the lines of group_file1 with the " - " separator:
# awk 'BEGIN{ORS=" - "}{print}END{ORS="\n";print "\n"}' group_file1
wireshark x 987 - usbmon x 986 - jackuser x 985 - vboxusers x 984 allen - aln x 1001 -
The RS variable is the input record separator: it tells awk where the current record ends and the next one begins. By setting RS we can split a single line into several records for analysis. For example, if we pipe the output of the previous awk command into a second awk, setting RS to " - " splits it back into separate records:
# awk 'BEGIN{ORS=" - "}{print}END{ORS="\n";print "\n"}' group_file1 | awk 'BEGIN{RS=" - "}{print $1}'
wireshark
usbmon
jackuser
vboxusers
aln
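A smaller sketch of the same idea uses a single-character RS, which is the portable form. A multi-character RS such as " - " works in gawk and mawk, but POSIX leaves that case unspecified. The function name split_records and the comma-separated sample input are assumptions for the example.

```shell
# Hypothetical helper: treat each comma-separated chunk of stdin as one record.
split_records() {
    # Each chunk is still split into fields on whitespace as usual.
    awk 'BEGIN { RS = "," } { print NR ": " $1 }'
}

printf 'wireshark 987,usbmon 986,jackuser 985' | split_records
```

Each chunk becomes its own record, so NR counts chunks rather than lines.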
2. General Variables
Giving fields meaningful names is a good habit in awk. The usual form is var = $n, where var is the name chosen for the field and n is the field number.
# awk '{name=$1;id=$3;if (id==985)print name}' group_file1
jackuser
It is usually helpful to assign values in the BEGIN block, which saves a lot of trouble when the awk expression has to be changed later. For example, to print the names of the user groups whose ID is greater than 1000:
# awk 'BEGIN{id=1000}{if($3>id)print $1}' group_file1
aln
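A closely related variation, not used in the original commands, is to pass the threshold in from the shell with awk's standard -v option, so the awk program text does not need editing at all. The function name groups_above is hypothetical.

```shell
# Hypothetical helper: $1 is the threshold id, $2 is the file to scan.
groups_above() {
    # -v assigns the awk variable "id" before any input is read.
    awk -v id="$1" '$3 > id { print $1 }' "$2"
}

# Small sample in the same three-column layout as group_file1.
sample=$(mktemp)
printf 'wireshark x 987\naln x 1001\n' > "$sample"
groups_above 1000 "$sample"
rm -f "$sample"
```

With a threshold of 1000 only the aln line qualifies.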
Awk can assign a new value to a field and print the result. Assigning to a field changes only awk's in-memory copy of the record, so the input file itself is never modified. To change the original file, redirect awk's output to a new file and then rename that file over the original.
# awk '{if($1=="aln")$3=$3-1; print $1,$2,$3}' group_file1
wireshark x 987
usbmon x 986
jackuser x 985
vboxusers x 984
aln x 1000
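The "write to a new file, then rename" step described above could look like the following sketch. The function name decrement_aln is invented, and the temporary file comes from mktemp, so the rewritten file ends up with the temporary file's permissions.

```shell
# Hypothetical helper: subtract 1 from field 3 of the "aln" row, in place.
decrement_aln() {
    tmp=$(mktemp) || return 1
    # awk only writes to stdout; the original is replaced only if awk succeeds.
    awk '{ if ($1 == "aln") $3 = $3 - 1; print }' "$1" > "$tmp" && mv "$tmp" "$1"
}

# Demonstration on a scratch copy rather than on a real group file.
demo=$(mktemp)
printf 'jackuser x 985\naln x 1001\n' > "$demo"
decrement_aln "$demo"
cat "$demo"
rm -f "$demo"
```

Using a bare print keeps every field of the other rows, whereas print $1,$2,$3 would drop any fourth field.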
Fields can be added, removed, or modified in the same way.
Awk also makes it convenient to total a column over all rows:
# awk 'BEGIN{total=0}{total+=$3}END{print "total="total}' group_file1
total=4943
Building on the example above, we can write a script that sums the sizes of all the files in the current directory:
#!/bin/sh
ls -al | awk 'BEGIN { total = 0 }
{
    if (/^[^d]/) {    # do not count directories
        total += $5
        print $1 "\t" $5
    }
}
END { print "total size: " total }'
Shell text filtering programming (4): awk built-in variables and general Variables