Using Python to count client IP visits from the Nginx access log

Source: Internet
Author: User
Professional analytics services such as Baidu Statistics, Google Analytics, and CNZZ give webmasters the commonly used metrics: UV, PV, time online, IP count, and so on. For network reasons, I found that the IP counts reported by Google Analytics and Baidu Statistics differ by several hundred, so I wanted to write my own script to find out the actual number of visits. Counts taken from the Nginx access log come out higher than those of the analytics backends, because visits from many spiders are counted, and so are requests for static files. With a better algorithm, this useless traffic could be filtered out completely. Today I am sharing only the most basic statistics with fellow developers, which is also a chance to learn and review the Python language.

For example, suppose the server's Nginx access log contains the following lines:

221.221.155.54 - - [02/Aug/2014:15:16:11 +0800] "GET / HTTP/1.1" 8482 "http://www.zuidaima.com/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36" "-" "-" 0.020
221.221.155.53 - - [02/Aug/2014:15:16:11 +0800] "GET / HTTP/1.1" 8482 "http://www.zuidaima.com/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36" "-" "-" 0.020
221.221.155.54 - - [02/Aug/2014:15:16:11 +0800] "GET / HTTP/1.1" 8482 "http://www.zuidaima.com/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36" "-" "-" 0.020

The statistics script is as follows:

stat_ip.py

# encoding: utf-8
import re

zuidaima_nginx_log_path = "/usr/local/nginx/logs/www.zuidaima.com.access.log"
# Match the IPv4 address at the start of each log line
pattern = re.compile(r'^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')


def stat_ip_views(log_path):
    """Return a dict mapping each client IP to its number of views."""
    ret = {}
    with open(log_path, "r") as f:
        for line in f:
            match = pattern.match(line)
            if match:
                ip = match.group(0)
                if ip in ret:
                    views = ret[ip]
                else:
                    views = 0
                views = views + 1
                ret[ip] = views
    return ret


def run():
    ip_views = stat_ip_views(zuidaima_nginx_log_path)
    max_ip_view = {}
    # Iterate in sorted order so the printed order is reproducible
    for ip in sorted(ip_views):
        views = ip_views[ip]
        if len(max_ip_view) == 0:
            max_ip_view[ip] = views
        else:
            # dict.keys() cannot be indexed in Python 3, so take the only key
            _ip = next(iter(max_ip_view))
            _views = max_ip_view[_ip]
            if views > _views:
                max_ip_view[ip] = views
                max_ip_view.pop(_ip)
        print("IP:", ip, ", Views:", views)
    # Total number of distinct IPs
    print("Total:", len(ip_views))
    # IP with the most views
    print("Max_ip_view:", max_ip_view)


run()
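
The counting loop and the search for the busiest IP can also be written with collections.Counter from the standard library. This is only a sketch of an alternative, not the script above: the function name count_ip_views and the shortened sample lines are made up for illustration, and it reads from a list of strings rather than from the log file.

```python
import re
from collections import Counter

# Same leading-IPv4 pattern as in the script above
IP_PATTERN = re.compile(r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def count_ip_views(lines):
    """Count how many lines start with each IPv4 address (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = IP_PATTERN.match(line)
        if match:
            counts[match.group(0)] += 1
    return counts


# Shortened, illustrative lines; real log lines carry more fields
sample_lines = [
    '221.221.155.54 - - [02/Aug/2014:15:16:11 +0800] "GET / HTTP/1.1"',
    '221.221.155.53 - - [02/Aug/2014:15:16:11 +0800] "GET / HTTP/1.1"',
    '221.221.155.54 - - [02/Aug/2014:15:16:11 +0800] "GET / HTTP/1.1"',
]

counts = count_ip_views(sample_lines)
print("Total:", len(counts))          # Total: 2
print("Top:", counts.most_common(1))  # Top: [('221.221.155.54', 2)]
```

To run it against a real log, an open file object can be passed in place of sample_lines, since the function only iterates over lines.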

Running the script produces the following output:

IP: 221.221.155.53 , Views: 1
IP: 221.221.155.54 , Views: 2
Total: 2
Max_ip_view: {'221.221.155.54': 2}

This gives the number of views for every client IP, along with the IP that has the most views.
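
The introduction notes that spider visits and static-file requests inflate the numbers taken from the access log. One possible way to filter them is sketched below. Everything beyond the IP pattern is an assumption: the keyword and suffix lists are illustrative and incomplete, the request is assumed to appear as "METHOD /path HTTP/x.y", and the helper names, the second IP, and the sample paths are invented.

```python
import re

IP_PATTERN = re.compile(r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")
# Assumed request shape: "METHOD /path HTTP/x.y"
REQUEST_PATTERN = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')
# Illustrative lists only; a real deployment would need longer ones
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")
BOT_KEYWORDS = ("spider", "bot", "crawler")


def is_page_view(line):
    """Crude check: not a known crawler and not a static-file request."""
    lowered = line.lower()
    # Searches the whole line, so a path or referrer containing "bot" is also dropped
    if any(keyword in lowered for keyword in BOT_KEYWORDS):
        return False
    request = REQUEST_PATTERN.search(line)
    if request is None:
        return False
    path = request.group(1).split("?", 1)[0].lower()
    return not path.endswith(STATIC_SUFFIXES)


def stat_ip_page_views(lines):
    """Count views per IP, skipping crawler and static-file lines."""
    views = {}
    for line in lines:
        match = IP_PATTERN.match(line)
        if match and is_page_view(line):
            ip = match.group(0)
            views[ip] = views.get(ip, 0) + 1
    return views


# Shortened, made-up lines for illustration
sample_lines = [
    '221.221.155.54 - - [02/Aug/2014:15:16:11 +0800] "GET / HTTP/1.1" "Mozilla/5.0 Chrome/31.0"',
    '221.221.155.54 - - [02/Aug/2014:15:16:12 +0800] "GET /static/site.css HTTP/1.1" "Mozilla/5.0 Chrome/31.0"',
    '10.0.0.7 - - [02/Aug/2014:15:16:13 +0800] "GET /index HTTP/1.1" "Mozilla/5.0 (compatible; Baiduspider/2.0)"',
]

print(stat_ip_page_views(sample_lines))  # {'221.221.155.54': 1}
```

Matching on user-agent keywords only catches crawlers that identify themselves, so the result is still an approximation of real visitor traffic.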

