A small Python applet for analyzing squid access logs
Over the past two weeks, several people in the group have wanted to learn Python, so we created an environment and atmosphere for everyone to learn in.
Yesterday I posted a requirement in the group: count and sort the IP addresses and URLs in the squid access log. Many of you implemented it; here is my simple implementation. Comments and criticism are welcome.
The log format is as follows:
%ts.%03tu %6tr %{X-Forwarded-For}>h %Ss/%03Hs %<st %rm %ru %un %Sh/%<A %mt "%{Referer}>h" "%{User-Agent}>h" %{Cookie}>h
A sample log entry:
1372776321.285 0 100.64.19.225 TCP_HIT/200 8560 GET http://img1.jb51.net/games/0908/19/1549401_3_80x100.jpg - NONE/- image/jpeg "http://www.bkjia.com/" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; QQDownload 734; .NET4.0C; .NET CLR 2.0.50727)" pcsuv=0;%20pcuvdata=lastAccessTime=1372776317582;%20u4ad=33480hn;%20c=14arynt;%20uf=1372776310453
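To see which whitespace-separated fields the script below relies on, you can split a (shortened) version of the sample line yourself: index 2 is the client IP and index 6 is the URL. The quoted Referer and User-Agent fields contain spaces, so only the leading fields are safe to take by position.

```python
# Split a shortened copy of the sample squid log line on whitespace and
# pick out the client IP (index 2) and the request URL (index 6).
sample = ('1372776321.285 0 100.64.19.225 TCP_HIT/200 8560 GET '
          'http://img1.jb51.net/games/0908/19/1549401_3_80x100.jpg '
          '- NONE/- image/jpeg')

fields = sample.split()
ip = fields[2]
url = fields[6]
print(ip)   # 100.64.19.225
print(url)  # http://img1.jb51.net/games/0908/19/1549401_3_80x100.jpg
```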
The script is as follows:
#!/usr/bin/python
# -*- coding: UTF-8 -*-
import sys
from optparse import OptionParser
'''
Just a test on log files: count the ip addresses (and urls) in access.log
'''
try:
    f = open('/data/proclog/log/squid/access.log')
except IOError, e:
    print "can't open the file: %s" % (e)

def log_report(field):
    '''
    Return the requested field of each access log line
    '''
    if field == "ip":
        return [line.split()[2] for line in f]
    if field == "url":
        return [line.split()[6] for line in f]

def log_count(field):
    '''
    Return a dict like {field: number}
    '''
    fields2 = {}
    fields = log_report(field)
    for field_tmp in fields:
        if field_tmp in fields2:
            fields2[field_tmp] += 1
        else:
            fields2[field_tmp] = 1
    return fields2

def log_sort(field, number=10, reverse=True):
    '''
    Print the sorted fields to output
    '''
    for v in sorted(log_count(field).iteritems(), key=lambda x: x[1], reverse=reverse)[0:int(number)]:
        print v[1], v[0]

if __name__ == "__main__":
    parser = OptionParser(usage="%prog [-i|-u] [-n num|-r]", version="1.0")
    parser.add_option('-n', '--number', dest="number", type=int, default=10, help="print top lines of the output")
    parser.add_option('-i', '--ip', dest="ip", action="store_true", help="print ip information of the access log")
    parser.add_option('-u', '--url', dest="url", action="store_true", help="print url information of the access log")
    parser.add_option('-r', '--reverse', action="store_true", dest="reverse", help="reverse the output")
    (options, args) = parser.parse_args()
    if len(sys.argv) < 2:
        parser.print_help()
    if options.ip and options.url:
        parser.error('-i and -u cannot be used at the same time')
    if options.ip:
        log_sort("ip", options.number, bool(options.reverse))
    if options.url:
        log_sort("url", options.number, bool(options.reverse))
    f.close()
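On Python 3, where `dict.iteritems` and the `print` statement are gone, the same counting-and-sorting logic collapses into a few lines with `collections.Counter`. This is only a sketch, assuming the same whitespace-separated log format and the same field positions as the script above:

```python
# Python 3 sketch of the same counting logic using collections.Counter.
# Field indexes match the script above: 2 = client IP, 6 = URL.
from collections import Counter

def log_sort(lines, field="ip", number=10, reverse=True):
    """Return the top `number` (count, value) pairs for the chosen field."""
    index = 2 if field == "ip" else 6
    counts = Counter(line.split()[index] for line in lines if line.strip())
    pairs = sorted(counts.items(), key=lambda x: x[1], reverse=reverse)
    return [(n, v) for v, n in pairs[:number]]

# Tiny synthetic log for demonstration (not real squid output).
lines = [
    "1372776321.285 0 100.64.19.225 TCP_HIT/200 8560 GET http://a/ - NONE/- image/jpeg",
    "1372776322.100 0 100.64.19.225 TCP_MISS/200 120 GET http://b/ - NONE/- text/html",
    "1372776323.003 0 100.64.19.226 TCP_HIT/304 90 GET http://a/ - NONE/- image/jpeg",
]
for count, ip in log_sort(lines, "ip", number=2):
    print(count, ip)  # 2 100.64.19.225 / 1 100.64.19.226
```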
If the squid access log on Linux grows too large, you can disable log writing in the squid configuration file. For example, change
cache_access_log /squid/logs/access.log
to
cache_access_log none
and squid will no longer generate access logs.
If you do not disable it in squid.conf, squid will write a large volume of log entries, so you must rotate the log files periodically to keep them from growing too large. Squid writes a lot of important information to its logs; if it cannot write to them, squid will report an error and exit.
To rotate the log files, run:
% squid -k rotate
For example, the following crontab entry rotates the logs at 04:00 every day:
0 4 * * * /usr/local/squid/sbin/squid -k rotate
This command does two things. First, it closes the currently open log files. Then it renames cache.log, store.log, and access.log by appending a numeric extension: cache.log becomes cache.log.0, cache.log.0 becomes cache.log.1, and so on.
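The renaming scheme described above can be sketched in Python. This is only an illustration of the shift-up-by-one pattern, not squid's actual implementation; the `keep` limit is an assumed parameter:

```python
import os

def rotate(path, keep=10):
    """Mimic numeric-extension log rotation for a single file:
    path.8 -> path.9, ..., path.0 -> path.1, then path -> path.0."""
    # Shift the highest-numbered copies first so nothing is overwritten.
    for i in range(keep - 1, -1, -1):
        src = "%s.%d" % (path, i)
        if os.path.exists(src):
            os.rename(src, "%s.%d" % (path, i + 1))
    if os.path.exists(path):
        os.rename(path, path + ".0")
```

After the rename, the daemon must reopen its log file, which is exactly what `squid -k rotate` signals squid to do.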
crontab is the Linux scheduled-task facility; it runs programs automatically at the times you configure. There is plenty of material online on its usage.
For squid, we recommend the authoritative Chinese Squid guide at home.arcor.de/pangj/squid/, which covers most of these topics.
I want to write a log analysis tool in Java that pulls the content we need out of the logs according to certain rules.
Here is one idea: run a thread that reads the log, matches lines against the exception rules, and outputs the matching records to the page.
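That idea can be sketched as follows (in Python rather than Java, to match the rest of this post; the rule pattern, the queue, and the sample lines are all illustrative): a worker thread scans log lines against an exception rule and hands matching records to the output side.

```python
import re
import threading
import queue  # Python 3; the module is named Queue on Python 2

# Illustrative "exception" rule: cache denials/misses or 4xx/5xx statuses.
EXCEPTION_RULE = re.compile(r"TCP_(MISS|DENIED)|/[45]\d\d ")

def watch(lines, out):
    """Scan log lines on a worker thread; push matching records to `out`."""
    for line in lines:
        if EXCEPTION_RULE.search(line):
            out.put(line)
    out.put(None)  # sentinel: no more input

# Tiny synthetic log for demonstration.
lines = [
    "1372776321.285 0 100.64.19.225 TCP_HIT/200 8560 GET http://a/",
    "1372776322.100 0 100.64.19.226 TCP_DENIED/403 120 GET http://b/",
]
out = queue.Queue()
t = threading.Thread(target=watch, args=(lines, out))
t.start()
t.join()

matches = []
while True:
    item = out.get()
    if item is None:
        break
    matches.append(item)
print(matches)  # only the TCP_DENIED/403 line matches
```

In a real tool the reader thread would tail the live log file instead of iterating a list, and the consumer would render the queue's contents to the page.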