Filter invalid IP addresses and robots in access logs
Source: Internet
Author: User
The two scripts below filter invalid IP addresses (here, the company's own addresses) out of the access log and regularly update the list of addresses to exclude.

Script for regularly updating the company IP address:

    #!/bin/sh
    # update the company IP address on a regular basis to filter
    # author: Felix Zhang
    # date: 2012-12-29
    filedir=/opt/logdata/companyip
    adate=$(date -d "today" +"%Y%m%d")
    filename="${filedir}/ip.${adate}"
    # "host" prints "<name> has address <ip>", so the address is field 4
    ip=$(/usr/bin/host yourcompany.3322.org | awk '{print $4}')
    # nothing to record if the lookup failed
    [ -n "$ip" ] || exit 1
    # exit if this address is already recorded for today
    if [ -f "${filename}" ] && grep -qxF "$ip" "${filename}"; then
        exit 0
    fi
    # append, so that several addresses seen on the same day accumulate
    echo "$ip" >> "${filename}"
    # Set how long you want to save
    save_days=30
    # delete IP files older than ${save_days} days
    find "${filedir}" -type f -mtime +"${save_days}" -exec rm -f {} \;

Log analysis script:

    #!/bin/sh
    ipdir=/opt/logdata/companyip
    adate=$(date -d "today" +"%Y%m%d")
    ipfile="${ipdir}/ip.${adate}"
    ipreg="127.0.0.1"
    if [ -e "${ipfile}" ]; then
        # join all lines of the IP file with "|" (GNU sed)
        ipreg=$(sed ':a;N;s/\n/|/;ta' "${ipfile}")
        echo "1"    # debug marker: IP file found
    fi
    if [ "${ipreg}" = "" ]; then
        ipreg="127.0.0.1"
        echo "2"    # debug marker: IP file was empty, using the default
    fi
    echo "${ipreg}"
    # cat ip.test | grep -E -v '127.0.0.1|126.23.23.44'
    fileName=$1
    echo 'Analysis file' "$fileName"
    grep -E -v "${ipreg}" "$fileName" | awk '{print $7}'

In this way, you can filter out your company's IP addresses when analyzing logs. Of course, robots can also be filtered out based on their characteristics. Only a few bots are listed here:

    cat "${logfile}" | grep -E -v "${ipreg}" | grep -E -v "DNSPod-monitor|bot.htm|spider.htm|webmasters.htm" > "${cleanlogfile}"
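The scripts above hand the addresses to grep as an unescaped, unanchored pattern, so an entry such as 203.0.113.7 would also exclude requests from 203.0.113.70. A minimal sketch of a stricter variant follows. It assumes the client address is the first field of each log line, and the function names build_ip_regex and filter_log are hypothetical, not part of the original scripts.

```shell
#!/bin/sh
# Sketch only: build_ip_regex and filter_log are hypothetical helper names.

# Turn a file with one IP address per line into an extended regex such as
#   ^(203\.0\.113\.7|198\.51\.100\.20)[[:space:]]
# Dots are escaped and the match is anchored to the first log field.
# Falls back to 127.0.0.1 when the file is missing or empty.
build_ip_regex() {
    alternation=$(sed -e '/^[[:space:]]*$/d' -e 's/\./\\./g' "$1" 2>/dev/null | paste -sd '|' -)
    if [ -z "$alternation" ]; then
        alternation='127\.0\.0\.1'
    fi
    printf '^(%s)[[:space:]]\n' "$alternation"
}

# Read an access log on stdin and drop lines that come from the listed
# addresses or that carry one of the bot markers used in the article.
filter_log() {
    grep -E -v "$(build_ip_regex "$1")" |
        grep -E -v 'DNSPod-monitor|bot\.htm|spider\.htm|webmasters\.htm'
}
```

Assuming the IP file for the day exists, it could be used as: filter_log /opt/logdata/companyip/ip.20121229 < access.log | awk '{print $7}'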