Recently I built a website-monitoring tool for chinamoocs, and one of its main tasks is log analysis. I first used AWStats, but with a large data volume the efficiency was too low, and it was difficult to run specific analyses and load the statistics into a database. So LogParser is now used to analyze the nginx logs: the statistics go into the database, and the detailed records are stored in Lucene for convenient querying and statistics. The daily nginx log for each city is about 100 million records, and the processing statistics are as follows:
5,234,229 records were extracted; the analysis time was 1000 ms, the Lucene indexing time 8667.606000005 ms, the storage time 3630.24890008 ms, and the extraction time 848.597200004 ms.
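To make the extraction step concrete, here is a minimal, hypothetical Python sketch (not the C# code used in the article) that parses one nginx access-log record into the same fields the article extracts later (client IP, time, request, status, bytes, referer, agent). It assumes the standard nginx "combined" log format; the sample line is invented for illustration:

```python
import re

# Regex for the default nginx "combined" log format (assumption: the site
# uses this format; the field names here are illustrative).
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_record(line):
    """Split one access-log line into a field dict, or None if malformed."""
    m = LINE_RE.match(line)
    if not m:
        return None
    d = m.groupdict()
    d["status"] = int(d["status"])
    d["bytes"] = 0 if d["bytes"] == "-" else int(d["bytes"])
    return d

sample = ('1.2.3.4 - - [10/Oct/2012:13:55:36 +0800] '
          '"GET /course/1.html?page=2 HTTP/1.1" 200 2326 '
          '"http://www.chinamoocs.com/" "Mozilla/5.0"')
rec = parse_record(sample)
print(rec["ip"], rec["status"], rec["request"])
```

In the article itself this parsing is delegated to LogParser's NCSA input format rather than hand-written regexes.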
1. The specific ideas and related tools are as follows:
1.1. Because nginx runs on Linux, the daily logs must be backed up and rotated. Script (RHEL 5):
logbackup.sh -- compresses the logs from two days ago, backs up the previous day's logs, and reloads nginx. I use cron to run it every day.
#!/bin/bash
logs_path="/var/www/nginxlog/"
date_dir=${logs_path}$(date -d "-1 day" +"%Y")/$(date -d "-1 day" +"%m")/$(date -d "-1 day" +"%d")/
gzip_date_dir=${logs_path}$(date -d "-2 day" +"%Y")/$(date -d "-2 day" +"%m")/$(date -d "-2 day" +"%d")/
mkdir -p ${date_dir}
mv ${logs_path}*.log ${date_dir}
kill -HUP `cat /var/www/nginxlog/nginx.pid`
/usr/bin/gzip ${gzip_date_dir}*.log
1.2. Configure bcompare.exe and create a BAT file that is scheduled to run daily. For details on how to use Beyond Compare, see its help.
autoparser.bat -- synchronizes the logs and then analyzes them; a Windows scheduled task runs it at six o'clock every day.
"C: \ Program Files \ beyond compare \ bcompare.exe" "@ F: \ logs \ scripts \ nginx.txt"
"C: \ tools \ app \ logparsertool.exe" 2
F: \ logs \ scripts \ nginx.txt
Load "Sync nginxlog"
Sync update: Left-> right
2. LogParser API analysis approach
2.1. Register the COM DLL
regsvr32 "C:\tools\app\LogParser.dll"
2.2. Reference the COM component
Add a reference to the registered LogParser COM component in the project (the interop namespace is MSUtil); after that you can use the API.
2.3. Core code
Add the using aliases:
using LogQuery = MSUtil.LogQueryClassClass;
using NcsaLogInputFormat = MSUtil.COMIISNCSAInputContextClassClass;
using LogRecordSet = MSUtil.ILogRecordset;
Query and read the logs:
LogQuery oLogQuery = new LogQuery();
NcsaLogInputFormat oInputFormat = new NcsaLogInputFormat();
string query = @"select RemoteHostName as clientip, User-Agent, DateTime as time, StatusCode as sc-status, BytesSent as sc-bytes, Referer, Request from {0}";
LogRecordSet oRecordSet = oLogQuery.Execute(string.Format(query, path), oInputFormat);
for (; !oRecordSet.atEnd(); oRecordSet.moveNext())
{
    MSUtil.ILogRecord o = oRecordSet.getRecord();
    // Re-encode the request string (the original logs are GB2312-encoded).
    string req = System.Text.Encoding.UTF8.GetString(System.Text.Encoding.GetEncoding("gb2312").GetBytes(o.getValue(6).ToString()));
    int firstSpace = req.IndexOf(' '), firstUrlChar = req.IndexOf('/');
    string method = req.Substring(0, firstSpace);
    string url = req.Substring(firstUrlChar, req.LastIndexOf("HTTP/") - firstUrlChar);
    string ext = string.Empty;
    string sFile = url;
    int idx = url.IndexOf("?");
    if (idx != -1)
    {
        sFile = url.Substring(0, idx);
        ext = System.IO.Path.GetExtension(sFile);
    }
    else
    {
        ext = System.IO.Path.GetExtension(url);
    }
    if (ext.Length > 5)
    {
        ext = "";
        sFile = url;
    }
    DateTime dtNow1 = DateTime.Now;
    DetailObject item = new DetailObject()
    {
        Ip = (string)(o.getValue(0) is System.DBNull ? "" : o.getValue(0)),
        Agent = (string)(o.getValue(1) is System.DBNull ? "" : o.getValue(1)),
        Time = (DateTime)o.getValue(2),
        Status = (int)(o.getValue(3) is System.DBNull ? 0 : o.getValue(3)),
        Size = (int)(o.getValue(4) is System.DBNull ? 0 : o.getValue(4)),
        Refer = (string)(o.getValue(5) is System.DBNull ? "" : o.getValue(5)),
        Url = url,
        Method = method,
        Ext = ext,
        File = sFile
    };
}
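The fiddliest part of the loop above is splitting the raw request string (e.g. "GET /path?x=1 HTTP/1.1") into method, URL, file part, and extension. Here is a small Python sketch of the same logic, for illustration only (the sample request is invented):

```python
import os

def split_request(req):
    """Split 'GET /a/b.html?x=1 HTTP/1.1' into (method, url, file, ext),
    mirroring the request-handling logic described in the article."""
    method = req[:req.index(" ")]                 # text before first space
    start = req.index("/")                        # URL starts at first '/'
    url = req[start:req.rindex("HTTP/")].strip()  # up to trailing 'HTTP/x.y'
    qpos = url.find("?")
    sfile = url if qpos == -1 else url[:qpos]     # strip the query string
    ext = os.path.splitext(sfile)[1]
    if len(ext) > 5:                              # same sanity check as above
        ext, sfile = "", url
    return method, url, sfile, ext

print(split_request("GET /course/view.html?id=7 HTTP/1.1"))
```

Discarding "extensions" longer than five characters is the same heuristic the article's code uses to reject false positives such as dots inside path segments.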
Note that the code is C#; using LogParser directly (rather than through this code) was about ten times less efficient.
The later steps are straightforward. A snippet of the IP-monitoring code is attached below; it counts requests so that an alert can be raised when an IP address exceeds the threshold. Here item is the DetailObject from the code above.
#region Record the IP address -- this must come first
UserViewObject sdiIp;
string keyIp = string.Format("ip_{0}_{1}_{2}", item.Ip, item.Agent, item.Time.ToString("yyyyMMdd"));
if (dailyIpItem.ContainsKey(keyIp))
{
    sdiIp = dailyIpItem[keyIp];
}
else
{
    sdiIp = new UserViewObject()
    {
        ItemName = "ip",
        Ip = item.Ip,
        Agent = item.Agent,
        Day = item.Time.Date
    };
    dailyIpItem.Add(keyIp, sdiIp);
}
sdiIp.Pv++;
sdiIp.Size += item.Size;
#endregion
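The same bookkeeping can be sketched in Python: one counter object per (ip, agent, day) key, flagging IPs whose page views exceed a threshold. The threshold value and class names here are assumptions for illustration, not from the article:

```python
from dataclasses import dataclass

@dataclass
class UserView:
    ip: str
    agent: str
    day: str
    pv: int = 0          # page views accumulated for this key
    size: int = 0        # bytes sent accumulated for this key

daily_ip = {}            # key "ip_{ip}_{agent}_{yyyymmdd}" -> UserView
THRESHOLD = 1000         # illustrative threshold, not from the article

def record(ip, agent, day, size):
    """Update the per-(ip, agent, day) counters for one log record."""
    key = "ip_{}_{}_{}".format(ip, agent, day)
    entry = daily_ip.setdefault(key, UserView(ip, agent, day))
    entry.pv += 1
    entry.size += size
    return entry.pv > THRESHOLD   # True once this IP exceeds the threshold

for _ in range(3):
    flagged = record("1.2.3.4", "Mozilla/5.0", "20121010", 512)
print(daily_ip["ip_1.2.3.4_Mozilla/5.0_20121010"].pv)
```

Keying on (ip, agent, day) rather than IP alone distinguishes different clients behind the same address, which is what the composite key in the C# snippet does as well.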