Nginx log analysis shell script

Source: Internet
Author: User
Tags: grep, regular expression, http 200

A few commands

The commands below are written for one specific log format. If your log format is different, adjust the field numbers passed to awk.

Analyze the User-Agent in the log

The code is as follows:
cat access_20130704.log | awk -F '"' '{print $(NF-3)}' | sort | uniq -c | sort -nr | head -20

The command above lists the 20 most frequent User-Agents in the log file.
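
To see why splitting on double quotes and taking `$(NF-3)` picks out the User-Agent, it can help to run the split on a single line. The sketch below uses a made-up log line in a format where the User-Agent is followed by one more quoted field (assumed here to be an X-Forwarded-For value). The sample line and the `ua_field` helper name are illustrative and not part of the original article; with a different log format the offset from `NF` will differ.

```shell
#!/bin/bash
# Hypothetical sample line: combined log format plus a trailing quoted
# X-Forwarded-For field (an assumption about the log format in use).
sample='1.2.3.4 - - [04/Jul/2013:10:00:00 +0800] "GET /index.html HTTP/1.1" 200 1024 "http://example.com/" "Mozilla/5.0 (X11; Linux x86_64)" "-"'

# Split each line on double quotes and print the field three from the end.
# For the sample line, the quote-delimited fields are:
#   2 = request, 4 = referer, 6 = User-Agent, 8 = X-Forwarded-For, 9 = empty
ua_field() {
    awk -F '"' '{print $(NF-3)}'
}

printf '%s\n' "$sample" | ua_field
```

For the sample line this prints the User-Agent string, `Mozilla/5.0 (X11; Linux x86_64)`.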

Find the IP addresses with the most requests

The code is as follows:
cat access_20130704.log | awk '{print $1}' | sort | uniq -c | sort -nr | head -20
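
All of these one-liners end with the same counting idiom: sort the values so duplicates are adjacent, count each run with `uniq -c`, sort the counts in descending numeric order, and keep the first few lines. A minimal sketch of that idiom on its own, with a hypothetical `top_n` helper and made-up IP addresses:

```shell
#!/bin/bash
# top_n N: read one value per line on stdin and print the N most
# frequent values, each prefixed by its count.
top_n() {
    sort | uniq -c | sort -nr | head -n "$1"
}

# Hypothetical sample data standing in for the first column of a log.
printf '%s\n' 10.0.0.1 10.0.0.2 10.0.0.1 10.0.0.3 10.0.0.1 | top_n 1
```

Here `10.0.0.1` appears three times, so it is the single line printed, with a count of 3.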

Find the most requested URLs

The code is as follows:

cat access_20130704.log | awk -F '"' '{print $(NF-5)}' | sort | uniq -c | sort -nr | head -20

Now for the main topic. grep is powerful in many ways; here we use grep regular expressions to analyze nginx logs. To make the analysis easy to repeat, it is wrapped in a script, tentatively named nla.sh.

You can modify the grep items as needed to get the desired results.
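
Before adding a new grep item, it can be useful to check the pattern against a line you know should match. A minimal sketch, using made-up log lines and a hypothetical `count_matches` helper; the patterns are written as basic regular expressions, which is what plain grep expects.

```shell
#!/bin/bash
# count_matches PATTERN: count the stdin lines that match a basic
# regular expression, one count per pattern.
count_matches() {
    grep -c -- "$1"
}

# Hypothetical sample lines (not taken from a real log).
lines='1.2.3.4 - - [13/May/2012:10:00:00 +0800] "GET /a.mp3 HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 5_1 like Mac OS X)"
5.6.7.8 - - [13/May/2012:10:00:01 +0800] "GET /b.jpg HTTP/1.1" 404 162 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1)"'

printf '%s\n' "$lines" | count_matches 'iPhone.*like Mac OS X'
printf '%s\n' "$lines" | count_matches '" 404 [0-9]\{3\}'
```

Each call prints 1 here, because exactly one of the two sample lines matches each pattern.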

The script is as follows:

#!/bin/bash

##################################################
#
# This is a default nginx log analysis script.
# It mainly uses grep to do the work.
#
# Many people split logs by date and gzip them,
# so a simple check for the gz format is included.
# Compressed files are restored after the analysis is complete.
# ccshaowei#gmail.com
# 2012/05/13
# http://shaowei.info/
##################################################

# Set the following line to the log location
log_dir='access.log-*'
##########
# key[] holds the grep patterns and word[] the labels to print.
# They must correspond one to one and have the same number of entries.
##########
# HTTP status codes
key[0]='" 200 [0-9]\{3\}'; word[0]='http 200'
key[1]='" 206 [0-9]\{3\}'; word[1]='http 206'
key[2]='" 404 [0-9]\{3\}'; word[2]='http 404'
key[3]='" 503 [0-9]\{3\}'; word[3]='http 503'
##########
# Search engine crawlers
key[4]='Googlebot.*google.com/bot.html'; word[4]='from Google crawlers'
key[5]='Baiduspider.*baidu.com/search/spider.html'; word[5]='from Baidu Spider'
key[6]='bingbot.*bing.com/bingbot.htm'; word[6]='from Bing crawlers'
# Soso 'Sosospider.*soso.com/webspider.htm'
# Youdao 'YoudaoBot.*youdao.com/help/webmaster/spider/'
# Yahoo China 'Yahoo! Slurp China'
##########
# Browsers
key[7]='MSIE'; word[7]='MSIE'
key[8]='Gecko/.*Firefox'; word[8]='Firefox'
key[9]='AppleWebKit.*like Gecko'; word[9]='Webkit'
key[10]='Opera.*Presto'; word[10]='Opera'
# 360 Secure Browser 'MSIE.*360SE', or with the IE engine version: 'MSIE 6.0.*360SE' 'MSIE 7.0.*360SE' 'MSIE 8.0.*360SE' 'MSIE 9.0.*360SE'
# 360 Speed Browser 'AppleWebKit.*QIHU 360EE'
##########
# Operating systems
key[11]='Windows NT 6.1'; word[11]='Windows 7'
key[12]='Macintosh; Intel Mac OS X'; word[12]='Mac OS X'
key[13]='X11.*Linux'; word[13]='Linux with X11'
key[14]='Android;'; word[14]='Android'
# Windows series: Windows 2000 'Windows NT 5.0', Windows XP 'Windows NT 5.1', Windows Vista 'Windows NT 6.0', Windows 7 'Windows NT 6.1'
# SymbianOS 'SymbianOS'
##########
# Devices
key[15]='iPad.*like Mac OS X'; word[15]='iPad'
key[16]='Nokia'; word[16]='Nokia Series'
key[17]='Nokia5800'; word[17]='Nokia5800 XpressMusic'
# iPhone 'iPhone.*like Mac OS X'
##########
# Others
key[18]='GET /.*\.mp3 HTTP'; word[18]="access mp3 files"
key[19]='GET /.*\.jpg HTTP'; word[19]="access jpg files"

# End of configuration
################################################################################

log_num=$(ls ${log_dir} | wc -l)
fileid=0
isgz=0
# gz check: decompress .gz logs and remember which ones to restore later
for file in $(ls ${log_dir})
do
    # ${file##*.} is the text after the last dot, i.e. the extension
    if [ "${file##*.}" = "gz" ]; then
        isgz[$fileid]=1
        gzip -dvf "$file"
        logfile[$fileid]=$(echo "$file" | sed 's/\.gz$//')
        ((fileid++))
    else
        isgz[$fileid]=0
        logfile[$fileid]=$file
        ((fileid++))
    fi
done
# Check that key and word have the same number of entries
if [ ${#word[@]} -ne ${#key[@]} ]
then
    echo "Configuration error: the number of keys and words does not match"
else

    checkid=0
    while [ $checkid -lt $log_num ]
    do
        filename=${logfile[$checkid]}
        totle=$(cat "$filename" | wc -l)
        echo "Log ${filename} contains ${totle} lines and ${#key[@]} items need to be processed"
        echo "Number of source IP addresses: $(cat "$filename" | awk '{print $1}' | sort | uniq | wc -l)"
        i=0
        while [ $i -lt ${#key[@]} ]
        do
            s1=${word[$i]}
            # Number of lines matching this key
            s2=$(cat "$filename" | grep "${key[$i]}" | wc -l)
            # Matches as a percentage of all lines, two decimal places
            s3=$(awk -v n="$s2" -v t="$totle" 'BEGIN{printf "%.2f%%\n", (n/t)*100}')
            echo "${s3} ${s1}: ${s2}"
            ((i++))
        done
        ((checkid++))
        echo "-----------------"
    done
fi
# Restore the files that were originally compressed
gzid=0
while [ $gzid -lt $log_num ]
do
    if [ "${isgz[$gzid]}" = "1" ]
    then
        gzip -v "${logfile[$gzid]}"
    fi
    ((gzid++))
done
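
The script decompresses each .gz file in place and compresses it again once the analysis is finished. One possible alternative, which is not part of the original script, is to stream compressed files through `gzip -dc` so that the file on disk is never modified. The `read_log` helper name and the temporary sample files below are hypothetical.

```shell
#!/bin/bash
# read_log FILE: write the log to stdout, decompressing .gz files on the
# fly so that the file on disk is left untouched.
read_log() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac
}

# Usage sketch with throwaway files in a temporary directory.
tmp=$(mktemp -d)
printf 'line one\nline two\n' > "$tmp/access.log-a"
printf 'line one\nline two\nline three\n' | gzip -c > "$tmp/access.log-b.gz"

for f in "$tmp"/access.log-*; do
    echo "$f: $(read_log "$f" | wc -l) lines"
done
```

With a helper like this, each `cat $filename` in the script could be replaced by `read_log "$filename"`, at the cost of decompressing the file once per grep item instead of once per run.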

The running result is as follows:

[root@hostname temp]# ls -lh
total 299M
-rw-r----- 1 root  11M May 14 13:25 access.log-20120508.gz
-rw-r----- 1 root 158M May 14 13:25 access.log-20120509
-rw-r----- 1 root 2.2M May 14 13:25 access.log-20120510.gz
-rw-r----- 1 root 129M May 14 13:25 access.log-20120511
-rwxr-xr-x 1 root 3.4K May 14 13:10 nla.sh
[root@hostname temp]# sh nla.sh
access.log-20120508.gz: 93.5% -- replaced with access.log-20120508
access.log-20120510.gz: 93.9% -- replaced with access.log-20120510
Log access.log-20120508 contains 643281 lines and 20 items need to be processed
Number of source IP addresses: 7483
44.52% http 200: 286400
3.55% http 206: 22824
20.23% http 404: 130128
14.31% http 503: 92029
1.94% from Google crawlers: 12491
2.01% from Baidu Spider: 12943
0.90% from Bing crawlers: 5780
76.53% MSIE: 492291
2.21% Firefox: 14209
7.03% Webkit: 45215
0.27% Opera: 1736
25.17% Windows 7: 161935
1.37% Mac OS X: 8830
0.03% Linux with X11: 202
0.03% Android: 190
0.11% iPad: 677
0.50% Nokia Series: 3207
0.02% Nokia5800 XpressMusic: 102
36.06% access mp3 files: 231959
23.10% access jpg files: 148600
-----------------
Log access.log-20120509 contains 608316 lines and 20 items need to be processed
Number of source IP addresses: 7429
45.15% http 200: 274651
1.79% http 206: 10884
15.59% http 404: 94854
19.95% http 503: 121376
2.83% from Google crawlers: 17245
1.80% from Baidu Spider: 10970
0.23% from Bing crawlers: 1410
78.96% MSIE: 480324
1.28% Firefox: 7783
7.85% Webkit: 47774
0.43% Opera: 2597
22.85% Windows 7: 139022
0.63% Mac OS X: 3827
0.06% Linux with X11: 389
0.06% Android: 372
0.06% iPad: 351
0.19% Nokia Series: 1158
0.00% Nokia5800 XpressMusic: 4
34.94% access mp3 files: 212555
23.46% access jpg files: 142702
-----------------
Log access.log-20120510 contains 141224 lines and 20 items need to be processed
Number of source IP addresses: 2040
50.15% http 200: 70823
1.67% http 206: 2354
14.15% http 404: 19987
17.37% http 503: 24534
4.53% from Google crawlers: 6399
2.66% from Baidu Spider: 3754
0.44% from Bing crawlers: 622
69.34% MSIE: 97921
1.19% Firefox: 1682
9.54% Webkit: 13470
0.53% Opera: 742
19.37% Windows 7: 27351
1.23% Mac OS X: 1737
0.03% Linux with X11: 45
0.00% Android: 0
0.09% iPad: 130
0.86% Nokia Series: 1220
0.00% Nokia5800 XpressMusic: 0
30.29% access mp3 files: 42777
23.91% access jpg files: 33768
-----------------
Log access.log-20120511 contains 473259 lines and 20 items need to be processed
Number of source IP addresses: 5093
44.91% http 200: 212551
1.96% http 206: 9286
15.14% http 404: 71671
21.20% http 503: 100322
2.44% from Google crawlers: 11548
1.40% from Baidu Spider: 6616
3.40% from Bing crawlers: 16068
76.75% MSIE: 363224
0.93% Firefox: 4388
6.75% Webkit: 31937
0.31% Opera: 1444
28.62% Windows 7: 135444
0.43% Mac OS X: 2057
0.02% Linux with X11: 116
0.00% Android: 0
0.09% iPad: 419
0.23% Nokia Series: 1094
0.00% Nokia5800 XpressMusic: 0
35.77% access mp3 files: 169274
22.46% access jpg files: 106299
-----------------
access.log-20120508: 93.5% -- replaced with access.log-20120508.gz
access.log-20120510: 93.9% -- replaced with access.log-20120510.gz
[root@hostname temp]# ls -lh
total 299M
-rw-r----- 1 root  11M May 14 13:25 access.log-20120508.gz
-rw-r----- 1 root 158M May 14 13:25 access.log-20120509
-rw-r----- 1 root 2.2M May 14 13:25 access.log-20120510.gz
-rw-r----- 1 root 129M May 14 13:25 access.log-20120511
-rwxr-xr-x 1 root 3.4K May 14 13:10 nla.sh
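
Each percentage in the output is the number of matching lines divided by the total number of lines in that file. A small sketch of the calculation with awk, checked against the first `http 200` figure above; the `percent` helper name is hypothetical.

```shell
#!/bin/bash
# percent COUNT TOTAL: print COUNT as a percentage of TOTAL with two
# decimal places and a trailing percent sign.
percent() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.2f%%\n", (n / t) * 100 }'
}

# http 200 matches out of all lines in access.log-20120508
percent 286400 643281
```

This prints 44.52%, matching the first line of the report for access.log-20120508.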
