We found that a large number of IPs were crawling our API, so I wrote this script to find the IPs that access only a single interface and no others. Such behavior is generally abnormal.
The log format of the front-end Nginx load balancer being analyzed is as follows:
114.249.4.96 - - [15/Jan/2016:23:59:47 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 48 "-" "-" "-"
222.128.172.215 - - [15/Jan/2016:23:59:47 +0800] "POST /api2/button_log/ HTTP/1.1" 200 48 "-" "-" "-"
110.72.182.177 - - [15/Jan/2016:23:59:47 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 48 "-" "-" "-"
58.63.7.92 - - [15/Jan/2016:23:59:48 +0800] "POST /api2/getgoodsdetail/ HTTP/1.1" 200 877 "-" "-" "-"
117.177.160.218 - - [15/Jan/2016:23:59:48 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 82 "-" "-" "-"
117.177.160.218 - - [15/Jan/2016:23:59:48 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 82 "-" "-" "-"
163.142.55.76 - - [15/Jan/2016:23:59:48 +0800] "POST /api2/getuserinfo/ HTTP/1.1" 200 546 "-" "-" "-"
114.112.89.34 - - [15/Jan/2016:23:59:48 +0800] "POST /api2/getgoodslist/ HTTP/1.1" 200 9532 "-" "-" "-"
58.61.225.110 - - [15/Jan/2016:23:59:49 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 82 "-" "-" "-"
114.244.195.163 - - [15/Jan/2016:23:59:49 +0800] "POST /api2/getgoodslist/ HTTP/1.1" 200 47834 "-" "-" "-"
114.244.195.163 - - [15/Jan/2016:23:59:49 +0800] "POST /api2/getgoodslist/ HTTP/1.1" 200 47834 "-" "-" "-"
114.112.89.34 - - [15/Jan/2016:23:59:49 +0800] "POST /api2/getgoodslist/ HTTP/1.1" 200 9532 "-" "-" "-"
125.39.170.239 - - [15/Jan/2016:23:59:49 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 30 "-" "-" "-"
110.84.169.57 - - [15/Jan/2016:23:59:50 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 48 "-" "-" "-"
42.81.46.142 - - [15/Jan/2016:23:59:50 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 48 "-" "-" "-"
110.84.169.57 - - [15/Jan/2016:23:59:50 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 82 "-" "-" "-"
117.136.40.148 - - [15/Jan/2016:23:59:50 +0800] "POST /api2/getgoodslist/ HTTP/1.1" 200 1024 "-" "-" "-"
117.12.243.251 - - [15/Jan/2016:23:59:50 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 48 "-" "-" "-"
117.12.243.251 - - [15/Jan/2016:23:59:50 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 82 "-" "-" "-"
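The script in this post picks the IP and the URL out of each line by position, so it may help to see where those positions come from. The sketch below is only an illustration, and it assumes the usual space-separated Nginx layout (IP, two dashes, bracketed timestamp, quoted request line, status, size, three quoted fields). The variable names `sample` and `fields` are made up for this example.

```python
# One sample line, assuming the usual space-separated Nginx access-log layout.
sample = ('58.63.7.92 - - [15/Jan/2016:23:59:48 +0800] '
          '"POST /api2/getgoodsdetail/ HTTP/1.1" 200 877 "-" "-" "-"')

# Splitting on whitespace also cuts the timestamp and the request line apart.
fields = sample.split()

# Show each field next to its index.
for index, field in enumerate(fields):
    print(index, field)

ip = fields[0]   # client address
url = fields[6]  # request path, between the method and the protocol
```

Under that assumption the timestamp occupies indexes 3 and 4 and the method index 5, which is why the URL ends up at index 6.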
Python analysis code:
#!/usr/bin/env python
# coding: utf-8
__author__ = 'Dairu'
"""
Check the nginx access log for IPs that look like programs crawling interface data.
Rule: find the IPs that access only the getgoodslist interface and no other interfaces.
"""
log_path = '/home/logs/nginx/access.log'
# Empty dictionary holding the number of times each IP accessed each URL,
# such as {'10.0.0.1': {'/api2/getgoodslist/': 15}}
ip_info = {}
with open(log_path, 'r') as f:
    for line in f:
        fields = line.split()
        # Skip blank or truncated lines that have no URL field
        if len(fields) < 7:
            continue
        # Get the IP address
        ip = fields[0]
        # Get the URL of the accessed interface
        url = fields[6]
        # If the IP is not in the dictionary, add it as a key whose value is a
        # second-level dictionary with the URL as key and an access count of 1.
        # If the IP is present but the URL is not in its second-level dictionary,
        # add the URL as a key with an access count of 1.
        # If both the IP and the URL are present, add 1 to the URL's count.
        if ip not in ip_info:
            ip_info[ip] = {url: 1}
        else:
            if url not in ip_info[ip]:
                ip_info[ip][url] = 1
            else:
                ip_info[ip][url] += 1
# Iterate over the results and print every IP that accessed fewer than 3
# interfaces and hit the getgoodslist interface more than 100 times
for ip, value in ip_info.items():
    if len(value) < 3 and value.get('/api2/getgoodslist/', 0) > 100:
        print("ip:%s url-count:%s" % (ip, value))
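The same counting can be written more compactly with the standard library's `collections` module. The following is a minimal sketch of that variant, not the original script: the function names `count_requests` and `find_suspects`, their parameters, and the demo addresses are all hypothetical, and the log lines are assumed to be space-separated as in the usual Nginx layout.

```python
from collections import Counter, defaultdict


def count_requests(lines):
    """Return {ip: Counter({url: hits})} for an iterable of access-log lines."""
    ip_info = defaultdict(Counter)
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated lines
        ip_info[fields[0]][fields[6]] += 1
    return ip_info


def find_suspects(ip_info, target='/api2/getgoodslist/', max_urls=3, min_hits=100):
    """Keep IPs that hit fewer than max_urls URLs and target more than min_hits times."""
    return {
        ip: dict(urls)
        for ip, urls in ip_info.items()
        if len(urls) < max_urls and urls.get(target, 0) > min_hits
    }


# Tiny synthetic input (made-up documentation addresses, not real log data).
line = '%s - - [15/Jan/2016:23:59:49 +0800] "POST %s HTTP/1.1" 200 48 "-" "-" "-"'
demo = [line % ('192.0.2.1', '/api2/getgoodslist/')] * 101
demo += [line % ('192.0.2.2', '/api2/getgoodslist/')] * 5

suspects = find_suspects(count_requests(demo))
print(suspects)
```

With a real log, `count_requests` could be given the open file object directly, since iterating over a file yields its lines. Keeping the thresholds as parameters makes it easy to try a stricter or looser rule without editing the loop.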
Analysis results:
ip:58.63.7.92 url-count:{'/api2/getgoodsdetail/': 3383, '/api2/getgoodslist/': 550}
ip:58.63.4.71 url-count:{'/api2/getgoodsdetail/': 4499, '/api2/getgoodslist/': 275}
ip:118.122.120.146 url-count:{'/api2/getgoodslist/': 443}
ip:114.244.195.163 url-count:{'/api2/getgoodslist/': 568}
ip:124.72.23.174 url-count:{'/api2/getgoodslist/': 132}
ip:183.30.79.59 url-count:{'/api2/getgoodslist/': 322, '/api2/realtimetrack/': 6}
ip:61.140.50.120 url-count:{'/api2/getgoodslist/': 1402}
ip:171.221.25.108 url-count:{'/api2/getgoodslist/': 1136}