How to analyze access logs using Python

Source: Internet
Author: User

Objective

After a WAF goes live, most of the work goes into eliminating false positives.

False positives have many causes. For example, the Web application may have been written so that clients submit too many cookies, or a single parameter may carry an overly large value.

Once false positives are down to an acceptable level, turn to false negatives. A WAF is not infallible, and any WAF can be bypassed. You therefore also need to locate the attacks that were missed and work out why they were missed, so that the WAF policy can be updated.

To locate missed attacks, you have to analyze the Web application's access log. A single site generates roughly 1 GB of access logs per day, so reading them by eye is unrealistic. This is where Python can help automate the analysis.
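As a rough illustration of what parsing an access log line can look like, the sketch below splits one line into named fields. It assumes the log uses Apache's Common Log Format, which the article does not state; the names CLF_PATTERN and parse_access_line and the sample line are made up for this example.

```python
import re

# Apache Common Log Format: host ident user [time] "request" status size
# Anything after the size field (referer, user agent) is ignored here.
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

def parse_access_line(line):
    """Return a dict of fields for one access log line, or None if it does not parse."""
    m = CLF_PATTERN.match(line)
    return m.groupdict() if m else None

# Hypothetical sample line, not taken from a real log.
sample = '10.0.0.1 - - [10/Oct/2016:01:02:03 +0800] "GET /index.php?id=1 HTTP/1.1" 200 512'
fields = parse_access_line(sample)
print(fields["request"], fields["status"])
```

Splitting out the request field is optional for simple keyword matching, but it helps keep matches on the referer or user agent from being mistaken for matches on the URL.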

Implementation approach

Take one of our Web systems as an example:

Apache has access logging turned on.

The logging rule generates one log file per hour, named after the site with the date and hour as a suffix, for example: special.XXXXXX.com.cn.2016101001
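The naming rule can be checked in code before any file is read. The sketch below assumes the suffix is a 10-digit YYYYMMDDHH stamp, which is inferred from the example filename; LOG_NAME and parse_log_name are hypothetical names.

```python
import re
from datetime import datetime

# Site name, a dot, then a 10-digit YYYYMMDDHH suffix (assumed layout).
LOG_NAME = re.compile(r'^(?P<site>.+)\.(?P<stamp>\d{10})$')

def parse_log_name(filename):
    """Split an hourly log filename into (site, datetime); None if it does not fit."""
    m = LOG_NAME.match(filename)
    if m is None:
        return None
    try:
        hour = datetime.strptime(m.group("stamp"), "%Y%m%d%H")
    except ValueError:
        # Ten digits that do not form a valid date and hour.
        return None
    return m.group("site"), hour

print(parse_log_name("special.XXXXXX.com.cn.2016101001"))
```

Parsing the suffix into a datetime makes it easy to restrict an analysis run to a particular day or range of hours.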

To analyze these scattered log files, my approach is as follows:

1. Get the log file directory from the user's command-line input;

2. Traverse all files in the directory and merge them into one file;

3. Define common payload strings for Web attacks:

SQLI: select, union, +--+

Struts: ognl, java

Webshell: base64, eval, execute

4. Match the merged log line by line with a regular expression and copy every hit to a separate file.
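The keyword list in step 3 can be sketched as one case-insensitive pattern per attack category, so that each hit also says which kind of attack it resembles. This is an illustration only: SIGNATURES and classify are hypothetical names, and simple keywords such as "java" will also match harmless requests.

```python
import re

# Keywords grouped by the attack categories listed above (illustrative, not exhaustive).
SIGNATURES = {
    "sqli": re.compile(r"select|union|\+--\+", re.IGNORECASE),
    "struts": re.compile(r"ognl|java", re.IGNORECASE),
    "webshell": re.compile(r"base64|eval\(|execute", re.IGNORECASE),
}

def classify(line):
    """Return the sorted list of categories whose keywords occur in the line."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern.search(line))

# Hypothetical request lines for demonstration.
print(classify('GET /news.php?id=1+UNION+SELECT+1,2 HTTP/1.1'))
print(classify('GET /index.html HTTP/1.1'))
```

Using re.IGNORECASE avoids spelling out both cases of every letter, and search() finds the keyword anywhere in the line without wrapping it in ".*".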

Implementation code

The code is as follows:

    # -*- coding: utf-8 -*-
    import os
    import re
    import sys

    if len(sys.argv) != 2:
        print('Usage: python logaudit.py <path>')
        sys.exit(1)

    # Log directory taken from the command-line argument
    logpath = sys.argv[1]

    # Hourly log files end with a 10-digit date + hour suffix
    merge = re.compile(r'.*\d{10}$')

    # Walk the log directory and merge every hourly log into log.txt
    with open('log.txt', 'a') as log2txt:
        for root, dirs, files in os.walk(logpath):
            for name in files:
                if merge.match(name) is None:
                    continue
                with open(os.path.join(root, name), 'r') as logread:
                    log2txt.write(logread.read())

    # Keywords that suggest SQL injection, Struts (OGNL) or webshell traffic
    auditstring = re.compile(
        r'[^_]select[^.]|union|base[^.]|ognl|eval\(|execute',
        re.IGNORECASE)

    # Copy every line that hits a keyword into result.txt
    with open('log.txt', 'r') as log, open('result.txt', 'a') as writelog:
        for line in log:
            if auditstring.search(line):
                writelog.write(line.rstrip('\n') + '\n')
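Merging everything into one file makes it hard to tell which hourly log a hit came from. A possible variant, sketched below, scans each file in place and reports the file and line number of every hit. It is not the article's method: scan_logs, AUDIT, HOURLY and the demo file are hypothetical, and the keywords simply follow the list given earlier.

```python
import os
import re
import tempfile

# Keywords follow the list in this article; all names here are illustrative.
AUDIT = re.compile(r"[^_]select[^.]|union|base[^.]|ognl|eval\(|execute", re.IGNORECASE)
HOURLY = re.compile(r".*\d{10}$")

def scan_logs(log_dir):
    """Yield (path, line_number, line) for every suspicious line under log_dir."""
    for root, _dirs, files in os.walk(log_dir):
        for name in sorted(files):
            if not HOURLY.match(name):
                continue
            path = os.path.join(root, name)
            # errors="replace" keeps the scan going if a request contains odd bytes.
            with open(path, "r", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if AUDIT.search(line):
                        yield path, number, line.rstrip("\n")

# Small demonstration on a temporary directory with one fake hourly log.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "site.example.2016101001"), "w") as f:
    f.write('GET /index.html HTTP/1.1\n')
    f.write('GET /item.php?id=1 union select 1 HTTP/1.1\n')

hits = list(scan_logs(demo_dir))
for path, number, line in hits:
    print(os.path.basename(path), number, line)
```

Because the files are read one line at a time, memory use stays small even when a day's logs add up to about 1 GB.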

Summary

That is all for this article. I hope it is of some help in your study or work. If you have questions, leave a message to discuss.
