Use AWK to process Apache logs
Introduction to Awk:
Created at Bell Labs in 1977, AWK is a programming language for processing text data. Its name comes from the surname initials of its three authors (Alfred Aho, Peter Weinberger, and Brian Kernighan). gawk is the GNU implementation of awk.
Built-in variables:
ARGC number of command-line arguments
ARGV array of command-line arguments
FILENAME name of the current input file
FNR record number within the current file
FS input field separator, a space by default (which splits on any run of spaces or tabs)
RS input record separator, a newline by default
NF number of fields in the current record
NR number of records read so far
OFS output field separator
ORS output record separator
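As a minimal sketch of how a few of these variables behave, the following prints the record number, the field count, and the first and last fields of a single log line, joined by a custom OFS. The temporary file and the " | " separator are assumptions chosen for illustration only.

```shell
# Write one sample log line to a temporary file (illustrative data).
log=$(mktemp)
printf '%s\n' '127.0.0.1 - - [02/Sep/2009:17:12:03 +0800] "GET /index.php?a1=100&a2=good HTTP/1.1" 200' > "$log"

# NR is the record number and NF the number of fields in the record.
# OFS is inserted between the comma-separated arguments of print.
# $NF refers to the last field of the record.
summary=$(awk 'BEGIN { OFS = " | " } { print NR, NF, $1, $NF }' "$log")
echo "$summary"

rm -f "$log"
```

With the default FS the line splits into 9 fields, so $NF here is the status code.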
Usually the Apache log format looks like this:
127.0.0.1 - - [02/Sep/2009:17:12:03 +0800] "GET /index.php?a1=100&a2=good HTTP/1.1" 200
Assuming that the above log exists in the file My-access_log, we use the following command to parse the log:
# awk '{printf "%s\n%s\n%s\n%s\n%s\n%s\n%s\n%s\n%s\n", $1,$2,$3,$4,$5,$6,$7,$8,$9}' My-access_log
127.0.0.1
-
-
[02/Sep/2009:17:12:03
+0800]
"GET
/index.php?a1=100&a2=good
HTTP/1.1"
200
As the output shows, awk splits each line of text into fields separated by whitespace: 127.0.0.1 is the first field and the status code 200 is the last.
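Whitespace splitting breaks the quoted request into three separate fields ($6 to $8). A possible alternative, sketched here under the assumption that the request is the only double-quoted part of the line, is to set the field separator to a double quote with -F so that the whole request becomes $2.

```shell
# One sample log line (illustrative data).
line='127.0.0.1 - - [02/Sep/2009:17:12:03 +0800] "GET /index.php?a1=100&a2=good HTTP/1.1" 200'

# -F'"' sets FS to a double quote: $1 is everything before the request,
# $2 is the request itself, and $3 is what follows it (the status code).
request=$(printf '%s\n' "$line" | awk -F'"' '{ print $2 }')
echo "$request"
```

This keeps the method, URL, and protocol together, which may be more convenient than reassembling $6, $7, and $8.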
Examples:
1. Count the log entries whose request contains a2=good
# awk '$7 ~ /a2=good/' My-access_log | wc -l
2. AND operation: a2=good and request time later than 17:00:00 (this is a string comparison, so it only orders timestamps within the same day)
# awk '{if ($7 ~ /a2=good/ && $4 > "[02/Sep/2009:17:00:00") print $0}' My-access_log | wc -l
3. Run an awk program stored in a file
# awk -f Commond.awk
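The contents of the program file are not shown above. As a rough sketch of what such a file might contain, the following writes a small awk program to disk and runs it with -f. The file names count_good.awk and sample_log, and the three log lines, are hypothetical and made up for this illustration.

```shell
dir=$(mktemp -d)

# A hypothetical awk program file: count requests whose URL has a2=good.
cat > "$dir/count_good.awk" <<'EOF'
# Field 7 is the request URL under default whitespace splitting.
$7 ~ /a2=good/ { n++ }
# n + 0 forces a numeric result, so 0 is printed when nothing matched.
END { print n + 0 }
EOF

# Made-up sample log with two matching lines and one non-matching line.
cat > "$dir/sample_log" <<'EOF'
127.0.0.1 - - [02/Sep/2009:17:12:03 +0800] "GET /index.php?a1=100&a2=good HTTP/1.1" 200
127.0.0.1 - - [02/Sep/2009:17:15:40 +0800] "GET /index.php?a1=100&a2=bad HTTP/1.1" 200
127.0.0.1 - - [02/Sep/2009:16:01:09 +0800] "GET /index.php?a1=7&a2=good HTTP/1.1" 404
EOF

# -f reads the awk program from a file instead of the command line.
count=$(awk -f "$dir/count_good.awk" "$dir/sample_log")
echo "$count"

rm -rf "$dir"
```

Counting inside an END block gives the same number as piping matched lines to wc -l, without the extra process.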
The conditional and relational operators are similar to those in C:
AND, OR, NOT (&&, ||, !)
Greater than, less than, equal to, not equal to (>, <, ==, !=)
Regular expression matching operators:
Match (~)
No match (!~)
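The operators above can be combined in a pattern. The following is a small sketch, using three made-up log lines in a temporary file, that counts lines once with ~ and && and once with !~ and ||.

```shell
# Made-up sample log (illustrative data only).
log=$(mktemp)
cat > "$log" <<'EOF'
127.0.0.1 - - [02/Sep/2009:17:12:03 +0800] "GET /index.php?a1=100&a2=good HTTP/1.1" 200
127.0.0.1 - - [02/Sep/2009:17:15:40 +0800] "GET /index.php?a1=100&a2=bad HTTP/1.1" 200
127.0.0.1 - - [02/Sep/2009:16:01:09 +0800] "GET /index.php?a1=7&a2=good HTTP/1.1" 404
EOF

# ~ with &&: the URL contains a2=good AND the status code equals 200.
both=$(awk '$7 ~ /a2=good/ && $9 == 200 { n++ } END { print n + 0 }' "$log")

# !~ with ||: the URL lacks a2=good OR the status code is not 200.
either=$(awk '$7 !~ /a2=good/ || $9 != 200 { n++ } END { print n + 0 }' "$log")

echo "$both $either"
rm -f "$log"
```

The two conditions are logical complements of each other, so the two counts add up to the number of lines in the file.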
To be continued. These are a beginner's study notes and are for reference only.