awk 19: Calculating bandwidth traffic peaks from logs
Network-interface bandwidth is usually graphed by polling counters over SNMP. In some cases, though, it is useful for analysis or troubleshooting to compute a service's access traffic directly from its logs. The method is simple: bucket entries by the request time recorded in the log (Squid records the time the response completed; to be precise you could subtract the response duration, but in general a request finishes well within its 5-minute bucket), sum the bytes in each 5-minute bucket, and divide by 300 seconds.
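For example, a 5-minute bucket that sums to 1 GiB of response bytes averages 1073741824 * 8 / 300 / 1024 / 1024 ≈ 27.3 Mbps; that is exactly the *8/300/1024/1024 conversion used in the commands below.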
The following one-liner finds the peak 5-minute bandwidth across an entire log:
cat $ACCESS_LOG | awk -F'[ :]' '{a[$5":"$6]+=$14} END{for(i in a){print i,a[i]}}' | sort | awk '{a+=$2; if(NR%5==0){if(a>b){b=a;c=$1}; a=0}} END{print c, b*8/300/1024/1024}'
(The log is assumed to be in standard Apache format; the byte-count field, $14 here, depends on your exact format.) Change the last awk to '{a+=$2; if(NR%5==0){print $1, a*8/300/1024/1024; a=0}}' and it will output a traffic value for every 5-minute interval, which can then be plotted with the GD library. (When there is time, have a look at Perl's GD::Graph module; it should make this easy.)
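For readability, here is the same pipeline written out with comments, as a sketch under the same assumptions (after splitting on spaces and colons, the hour and minute land in fields 5 and 6 and the byte count in field 14; adjust the indexes to your own log format):

cat "$ACCESS_LOG" | awk -F'[ :]' '
  { bytes[$5":"$6] += $14 }                  # sum response bytes per hh:mm
  END { for (m in bytes) print m, bytes[m] } # one line per one-minute bucket
' | sort | awk '
  { sum += $2 }                              # accumulate five one-minute buckets
  NR % 5 == 0 {
    if (sum > max) { max = sum; at = $1 }    # remember the largest 5-minute window
    sum = 0
  }
  END { print at, max * 8 / 300 / 1024 / 1024 }  # bytes over 300s -> Mbps
'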
Case study: use awk and sort -nr to analyze an access log and count the number of requests per returned status code.
The access log format:
113.31.27.213 www.5iops.com - [15/Apr/2012:04:06:17 +0800] "GET /faq/ HTTP/1.0" 200 2795 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.2 Safari/535.11" 118.119.120.248, 222.215.136.44 0.003 192.168.0.25:80 200 3.31
The approach we reach for most often is: cat /home/logs/nginx/www.5iops.com.access.log | awk '{print $(NF-1)}' | sort -nr | uniq -c. In fact, this processing is inefficient:
time cat /home/logs/nginx/www.5iops.com.access.log | awk '{print $(NF-1)}' | sort -nr | uniq -c
1 200"
3 "404"
4 "304"
7377 "200"
48 "-"

real 0m0.107s
user 0m0.102s
sys 0m0.013s
time cat /home/logs/nginx/www.5iops.com.access.log | awk '{a[$(NF-1)]++} END{for(i in a) print i" "a[i]}'
"304" 4
"404" 3
"200" 7399
"-" 49
200" 1

real 0m0.018s
user 0m0.013s
sys 0m0.008s
As the timings show, doing the counting inside awk is about six times faster here (0.018s vs 0.107s), because it avoids sorting every line of the log; the gap widens as the log grows, since sort is O(n log n) while the awk pass is linear.
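A minor variant of the same idea, which drops the extra cat and sorts only the handful of aggregated lines instead of every request:

awk '{c[$(NF-1)]++} END{for(s in c) print c[s], s}' /home/logs/nginx/www.5iops.com.access.log | sort -rn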
Another case: the following text-analysis task, handled with a single line of shell commands.
There is a text file in which each line records the peak bandwidth of one 5-minute interval, covering a whole month (8,640 lines in total). We need to compute the daily peak and sort the results:
Traffic file format:
-bash-4.1$ cat traffic.txt | more
2012-04-01 00:00 1952.34 Mbps
2012-04-01 00:05 2198.34 Mbps
2012-04-01 00:10 2117.07 Mbps
2012-04-01 00:15 2104.83 Mbps
2012-04-01 00:20 1878.73 Mbps
...
A common approach:
-bash-4.1$ for i in `cat traffic.txt | awk '{print $1}' | sort | uniq`; do cat traffic.txt | grep $i | sort -nr -k3 | head -1; done
2012-04-01 21:35 3876.02 Mbps
2012-04-02 21:15 3577.66 Mbps
2012-04-03 21:35 3371.59 Mbps
2012-04-04 21:10 3087.17 Mbps
2012-04-05 21:35 3202.44 Mbps
2012-04-06 20:45 3703.53 Mbps
2012-04-07 20:40 4177.43 Mbps
2012-04-08 14:25 3837.9 Mbps
2012-04-09 20:50 3082.46 Mbps
...
A more efficient solution:
-bash-4.1$ cat traffic.txt | awk '{if($3 > a[$1]) a[$1]=$3} END{for(i in a) print i" "a[i]}'
2012-04-28 5369.81
2012-04-19 3474.73
2012-04-29 4824.24
2012-04-10 2979.91
2012-04-01 3876.02
2012-04-20 3866.19
2012-04-11 3548.73
2012-04-02 3577.66
2012-04-30 4077.35
...
Again, the single-pass awk version is about ten times faster, because the loop above rescans the entire file once per day while awk reads it once. (Note that for (i in a) emits the days in arbitrary order; pipe the output through sort to order it by date.) Clearly, good use of awk can significantly improve the efficiency of text and log analysis.
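A condensed sketch of the same one-liner with the output sorted by date, per the note above:

awk '$3 > a[$1] { a[$1] = $3 }          # keep the largest Mbps value per day
     END { for (d in a) print d, a[d] } # emit "date peak"
' traffic.txt | sort                    # order the days chronologically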
(Source: Linux Tutorial Network)