Using Log Parser for real-time web log (nginx) analysis

Source: Internet
Author: User

Recently I worked on a website monitoring tool for chinamoocs, and one of its main tasks is log analysis. I first tried AWStats, but with this volume of data it was too slow, and it is hard to get specific analyses and statistics out of it and into a database. So I use Log Parser to analyze the nginx logs and store the statistics in the database; the detailed records are stored in Lucene for convenient querying and statistics. The nginx logs for each city amount to about 100 million records per day.

Of these, 5234229 records were extracted. The analysis time was 1000 ms, the Lucene indexing time 8667.606000005 ms, the storage time 3630.24890008 ms, and the extraction time 848.597200004 ms.

1. Approach and related tools

1.1. Because nginx runs on Linux, the logs must be split and backed up daily. Code (RHEL 5):

logbackup.sh -- compresses the logs from two days ago, backs up the previous day's logs, and restarts nginx. I use cron to run it every day.
logs_path="/var/www/nginxlog/"
# Yesterday's archive directory: <logs_path>/YYYY/MM/DD/
date_dir=${logs_path}$(date -d "-1 day" +"%Y")/$(date -d "-1 day" +"%m")/$(date -d "-1 day" +"%d")/
# The directory from two days ago, whose logs get compressed
gzip_date_dir=${logs_path}$(date -d "-2 day" +"%Y")/$(date -d "-2 day" +"%m")/$(date -d "-2 day" +"%d")/

mkdir -p ${date_dir}
mv ${logs_path}*.log ${date_dir}
# Signal the nginx master process so that it starts writing fresh log files
kill -HUP `cat /var/www/nginxlog/nginx.pid`
/usr/bin/gzip ${gzip_date_dir}*.log
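
The dated directory layout used by the script can be sketched as a small helper. This is only a minimal sketch, assuming bash and GNU date (as on RHEL); the function name dated_dir is hypothetical and is not part of the original script.

```shell
#!/bin/bash
# dated_dir BASE OFFSET
# Hypothetical helper (not in the original script): prints BASE/YYYY/MM/DD/
# for the day OFFSET days from today, e.g. OFFSET=-1 for yesterday.
# Assumes GNU date, which accepts relative offsets with -d.
dated_dir() {
    local base="${1%/}"   # drop a trailing slash so we do not emit "//"
    local offset="$2"
    printf '%s/%s/\n' "$base" "$(date -d "${offset} day" +%Y/%m/%d)"
}

# Usage: the two directories the backup script works with.
date_dir=$(dated_dir /var/www/nginxlog -1)
gzip_date_dir=$(dated_dir /var/www/nginxlog -2)
echo "$date_dir"
echo "$gzip_date_dir"
```

Calling date once per path, rather than three times, also avoids the small chance that the year, month and day are computed on different sides of midnight.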

1.2. Configure bcompare.exe and create a BAT file that runs as a daily scheduled task. For details on how to use bcompare, see its help.

autoparser.bat -- synchronizes the logs and then analyzes them. A Windows scheduled task runs it at six o'clock every day.
"C:\Program Files\Beyond Compare\bcompare.exe" "@F:\logs\scripts\nginx.txt"
"C:\tools\app\logparsertool.exe" 2

F:\logs\scripts\nginx.txt -- the Beyond Compare script that the first command loads.
load "Sync nginxlog"
sync update:left->right

2. Analysis with the Log Parser API

2.1. Register the COM DLL

regsvr32 "C:\tools\app\LogParser.dll"

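2.2. Reference the COM component

Add a reference to the registered COM component in the C# project. After that, you can use the API (the MSUtil namespace).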

2.3. Core code

Add the using directives:
using LogQuery = MSUtil.LogQueryClassClass;
using NcsaLogInputFormat = MSUtil.COMIISNCSAInputContextClassClass;
using LogRecordSet = MSUtil.ILogRecordSet;

Query and read the logs:
LogQuery oLogQuery = new LogQuery();
NcsaLogInputFormat oInputFormat = new NcsaLogInputFormat();
string query = @"select RemoteHostName as clientip, User-Agent, DateTime as time, StatusCode as sc-status, BytesSent as sc-bytes, Referer, Request from {0}";

// path is the log file to analyze
LogRecordSet oRecordSet = oLogQuery.Execute(string.Format(query, path), oInputFormat);
for (; !oRecordSet.atEnd(); oRecordSet.moveNext())
{
    MSUtil.ILogRecord o = oRecordSet.getRecord();
    // Turn the request string back into bytes using gb2312, then decode those bytes as UTF-8
    string req = System.Text.Encoding.UTF8.GetString(System.Text.Encoding.GetEncoding("gb2312").GetBytes(o.getValue(6).ToString()));
    // The request looks like "<method> <url> HTTP/<version>"
    int firstSpace = req.IndexOf(' '), firstUrlChar = req.IndexOf('/');
    string method = req.Substring(0, firstSpace);
    // Trim() drops the space that sits between the URL and "HTTP/"
    string url = req.Substring(firstUrlChar, req.LastIndexOf("HTTP/") - firstUrlChar).Trim();

    // Split off the query string, then take the file extension
    string ext = string.Empty;
    string sFile = url;
    int idx = url.IndexOf("?");
    if (idx != -1) {
        sFile = url.Substring(0, idx);
        ext = System.IO.Path.GetExtension(sFile);
    } else {
        ext = System.IO.Path.GetExtension(url);
    }
    // Anything longer than 5 characters (dot included) is not treated as an extension
    if (ext.Length > 5) {
        ext = "";
        sFile = url;
    }
    DateTime dtNow1 = DateTime.Now;
    // Column indexes follow the order of the select list; empty columns come back as DBNull
    DetailObject item = new DetailObject() {
        IP = (string)(o.getValue(0) is System.DBNull ? "" : o.getValue(0)),
        Agent = (string)(o.getValue(1) is System.DBNull ? "" : o.getValue(1)),
        Time = (DateTime)o.getValue(2),
        Status = (int)(o.getValue(3) is System.DBNull ? 0 : o.getValue(3)),
        Size = (int)(o.getValue(4) is System.DBNull ? 0 : o.getValue(4)),
        Refer = (string)(o.getValue(5) is System.DBNull ? "" : o.getValue(5)),
        Url = url,
        Method = method,
        Ext = ext,
        File = sFile
    };
}
Note that this code is C#; compared with running Log Parser directly, it is about 10 times less efficient.
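
The request-line handling in the core code (method, URL, file and extension) can also be illustrated outside C#. The following is a minimal bash sketch of the same splitting rules, not the author's code: the function name parse_request is hypothetical, it takes everything after the first space as the start of the URL (the C# code looks for the first '/'), and it skips the gb2312/UTF-8 re-decoding step.

```shell
#!/bin/bash
# parse_request "METHOD URL HTTP/x.y"
# Hypothetical helper: splits an nginx request line into method, URL,
# file (URL without query string) and extension, mirroring the C# rules.
parse_request() {
    local req="$1"
    local method="${req%% *}"      # text before the first space
    local rest="${req#* }"         # text after the first space
    local url="${rest% HTTP/*}"    # cut at the last " HTTP/"
    local file="${url%%\?*}"       # strip the query string, if any
    local name="${file##*/}"       # last path segment
    local ext=""
    case "$name" in
        *.*) ext=".${name##*.}" ;; # extension including the dot
    esac
    # As in the C# code: anything longer than 5 characters (dot included)
    # is not treated as an extension, and file falls back to the full URL.
    if [ "${#ext}" -gt 5 ]; then
        ext=""
        file="$url"
    fi
    printf 'method=%s url=%s file=%s ext=%s\n' "$method" "$url" "$file" "$ext"
}

parse_request 'GET /a/b.php?x=1 HTTP/1.1'
# prints: method=GET url=/a/b.php?x=1 file=/a/b.php ext=.php
```

Cutting at " HTTP/" including the leading space keeps the URL free of a trailing blank, which would otherwise end up inside the extension.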

The rest is straightforward.

A piece of the IP monitoring code is attached; it is used to detect when an IP address exceeds the threshold. Here, item is the DetailObject built above.

#region Record per-IP statistics; this must come first
UserViewObject sdiIp;
// One entry per IP address, user agent and day
string keyIp = string.Format("IP_{0}_{1}_{2}", item.IP, item.Agent, item.Time.ToString("yyyyMMdd"));
if (dailyIpItem.ContainsKey(keyIp))
{
    sdiIp = dailyIpItem[keyIp];
}
else
{
    sdiIp = new UserViewObject()
    {
        ItemName = "ip",
        IP = item.IP,
        Agent = item.Agent,
        Day = item.Time.Date
    };
    dailyIpItem.Add(keyIp, sdiIp);
}
sdiIp.PV++;
// Stores the size of the most recent request for this key
sdiIp.Size = item.Size;
#endregion
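
The counting idea behind this snippet (one page-view counter per IP, user agent and day) can be sketched with awk. This is a minimal illustration only, not the author's code: the function name flag_busy_ips, the tab-separated input layout and the threshold argument are all assumptions, and the sample hits are made up.

```shell
#!/bin/bash
# flag_busy_ips THRESHOLD
# Hypothetical helper: reads tab-separated "ip<TAB>agent<TAB>yyyyMMdd" lines
# on stdin, counts page views per (ip, agent, day) key, and prints the keys
# whose count is greater than THRESHOLD, followed by the count.
flag_busy_ips() {
    awk -F'\t' -v limit="$1" '
        { pv["IP_" $1 "_" $2 "_" $3]++ }   # same key shape as keyIp above
        END {
            for (key in pv)
                if (pv[key] > limit)
                    printf "%s\t%d\n", key, pv[key]
        }
    ' | sort
}

# Usage with three made-up hits; only the first key has more than 1 view.
printf '%s\t%s\t%s\n' \
    10.0.0.1 curl 20240101 \
    10.0.0.1 curl 20240101 \
    10.0.0.2 curl 20240101 | flag_busy_ips 1
```

Tabs are used as the separator because user agent strings normally contain spaces.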
