Use PHP as a shell script (reprint)
Task: Filter the Apache access log for the records dated 2010-08-18 and store them in a local database.
Solution: Write two PHP scripts and connect them with a pipe.
Assume a Linux system.
Assume everything is UTF-8.
Assume php is already in $PATH.
Suppose there is a log file /site/data/log/access_log_20100818 whose content looks like this:
[120.42.16.230] [-] [-] [2010-08-17 08:36:41] [GET] [www.site.com] [/membercenter/ordinary/score] [] [HTTP/1.1] [200] [25 [] [-] [Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; QQDownload 618; GTB6.5; 360SE)]
[121.229.144.193] [-] [-] [2010-08-17 08:36:41] [GET] [www.site.com] [/bbs/jiehunzhenhao/wosikainv_49602.html] [] [HTTP/1.1] [] [12631] [http://www.site.com/bbs/forum/jiehunzhenhao/filter/0/orderby/2/ascdesc/desc/page/4] [Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8]
[121.229.144.193] [-] [-] [2010-08-17 08:36:41] [POST] [www.site.com] [/bbsmanage/moderatorsetajax] [] [HTTP/1.1] [+] [] [http://www.site.com/bbsmanage/moderatorset?id=4650] [Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; CIBA; 360SE)]
[60.190.125.3] [-] [-] [2010-08-17 08:36:41] [GET] [www.site.com] [/bbs/fangchanzatan/jiangjiatong_49458.html] [] [HTTP/1.1] [] [10435] [http://www.site.com/membercenter/ordinary/bbssend?page=6] [Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; SE 2.X)]
[118.120.207.138] [-] [-] [2010-08-17 08:36:41] [GET] [www.site.com] [/bbs/jingcaitietu/tianshangrenjian_51533.html] [[] [HTTP/1.1] [] [13418] [http://www.site.com/bbs/forum/jingcaitietu/] [Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; QQDownload 627; GTB6.5; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1); .NET CLR 2.0.50727)]
[121.229.144.193] [-] [-] [2010-08-18 08:36:41] [GET] [www.site.com] [/bbsmanage/setmoderator] [] [HTTP/1.1] [] [451] [http://www.site.com/mange/magframe] [Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; CIBA; 360SE)]
[121.229.144.193] [-] [-] [2010-08-18 08:36:42] [POST] [www.site.com] [/bbsmanage/moderatorxml] [] [HTTP/1.1] [ [3699] [http://www.site.com/bbsmanage/setmoderator] [Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; CIBA; 360SE)]
[60.211.96.212] [-] [-] [2010-08-18 08:36:42] [GET] [www.site.com] [/member/index/id/7651] [] [HTTP/1.1] [200] [53 [/HTTP]Www.site.com/membercenter/ordinary/friend] [Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; SE 2.X; .NET CLR 2.0.50727; .NET CLR 4.0.20506)]
[113.205.59.70] [-] [-] [2010-08-18 08:36:43] [POST] [www.site.com] [/register/checkcaptcha] [] [HTTP/1.1] [+] [] [http://www.site.com/register/ordinary/member_id/8326] [Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648)]
[123.4.197.242] [-] [-] [2010-08-18 08:36:43] [GET] [www.site.com] [/bbsoperate/tuijian] [act=tuijian&id=33936] [HTTP/1.1] [] [4448] [-] [Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Cncdialer)]
.....
The real file is of course large: hundreds of MB.
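Each record is a run of bracketed fields separated by single spaces, and both scripts depend on that layout. As a rough sketch only, one record could be split into its fields as shown here. The function name split_record and the shortened sample line are hypothetical, and the pattern assumes that no field contains a literal "]".

```php
<?php
// Hypothetical helper (not from the original article): split one log
// record into the contents of its bracketed fields.
// Assumes no field contains a literal ']' character.
function split_record(string $line): array
{
    preg_match_all('#\[(.*?)\]#', $line, $matches);
    return $matches[1]; // captured text between each pair of brackets
}

// Shortened sample record modelled on the excerpt above (illustrative only).
$line = '[60.211.96.212] [-] [-] [2010-08-18 08:36:42] [GET] [www.site.com] '
      . '[/member/index/id/7651] [] [HTTP/1.1] [200] [53] [-] [Mozilla/4.0]';

$fields = split_record($line);
echo count($fields), "\n"; // 13
echo $fields[3], "\n";     // 2010-08-18 08:36:42
```

An empty field such as [] comes back as an empty string, so the field positions stay stable.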
The contents of the shell_filter.php file are as follows:
#!/usr/local/php/bin/php
<?php
// Note: the first lines of this script were lost in the source. Opening
// standard input and the argument check are reconstructed from what remains.
$handle = fopen('php://stdin', 'r');

// The date to filter on is the first command-line argument
if ($argc > 1)
    $date = $argv[1];
else
    $date = '2010-01-01';

// Iterate over the input line by line
$j = 0;
while (!feof($handle)) {
    $buffer = fgets($handle);
    process($buffer);
}

// Close the input stream and finish
fclose($handle);

// Filter function
function process($str) {
    global $j;
    global $date;
    $str = strval($str);
    $str = trim($str);
    $str = preg_replace('#\n|\r\n#', '', $str);
    // First make sure the log format is compliant (13 bracketed fields)
    if (preg_match('#\[.*?\] \[.*?\] \[.*?\] \[.*?\] \[.*?\] \[.*?\] \[.*?\] \[.*?\] \[.*?\] \[.*?\] \[.*?\] \[.*?\] \[.*?\]#', $str)) {
        if (!preg_match('#::1#', $str)) { // this is a useless record
            if (preg_match('#' . $date . '#', $str)) { // key point: match the date
                $j++;
                echo $str . "\n"; // output through the pipe to the next script
            }
        }
    }
}
?>
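The three nested checks in process() can also be written as one function that returns true or false, which makes them easy to try out without a pipe. This is only a sketch: keep_record is a hypothetical name, the sample record is shortened and illustrative, and the two literal checks use strpos instead of preg_match so that the date needs no regex escaping.

```php
<?php
// Hypothetical helper (not part of the original script): returns true when
// a log line passes the same three checks as process().
function keep_record(string $line, string $date): bool
{
    $line = trim($line);

    // 1. Format check: 13 bracketed fields separated by single spaces.
    if (!preg_match('#(\[.*?\] ){12}\[.*?\]#', $line)) {
        return false;
    }
    // 2. Skip records containing "::1", the IPv6 loopback address.
    if (strpos($line, '::1') !== false) {
        return false;
    }
    // 3. Keep the record only if it contains the requested date.
    return strpos($line, $date) !== false;
}

// Shortened sample record modelled on the log excerpt (illustrative only).
$good = '[60.211.96.212] [-] [-] [2010-08-18 08:36:42] [GET] [www.site.com] '
      . '[/member/index/id/7651] [] [HTTP/1.1] [200] [53] [-] [Mozilla/4.0]';
$old   = str_replace('2010-08-18', '2010-08-17', $good);   // wrong day
$local = str_replace('60.211.96.212', '::1', $good);       // loopback request

var_dump(keep_record($good, '2010-08-18'));            // bool(true)
var_dump(keep_record($old, '2010-08-18'));             // bool(false)
var_dump(keep_record($local, '2010-08-18'));           // bool(false)
var_dump(keep_record('not a log line', '2010-08-18')); // bool(false)
```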
The contents of the save_echo.php file are as follows:
#!/usr/local/php/bin/php
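<?php
// The opening of this script did not survive in the source. Judging from what
// remains, it read each line from standard input inside a loop, split the
// line into $arr, kept a counter $i, set $engine_name and prepared a
// database object $db.
// ...
    $result = array(
        '...'         => $arr[0], // key name lost in the source
        'access_time' => $arr[1],
        'get_post'    => $arr[2],
        'httphost'    => $arr[3],
        'url'         => $arr[4],
        'http_type'   => $arr[5],
        'code'        => $arr[6],
        'length'      => $arr[7],
        'source'      => $arr[8],
        'agent'       => substr($arr[9], 0 /* , length lost in the source */),
        'engine_name' => $engine_name,
    );
    $db->insert('table1', $result);
    // Console output, only to show progress
    echo $i . ': ' . $arr[1] . ' ' . $arr[0] . "\n";
}
?>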
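Because the opening of save_echo.php is missing, the following is only a guess at how a filtered line might become the $result row. It assumes that the two [-] placeholder fields and the query string are skipped so that ten values remain, and that $arr[0] is the client IP, stored here under a hypothetical key ip. The function name line_to_row is also hypothetical. The database call and engine_name are left out because their logic is not shown in the source.

```php
<?php
// Hypothetical helper: turn one filtered log line into a row array.
function line_to_row(string $line): ?array
{
    // A complete record has exactly 13 bracketed fields.
    if (preg_match_all('#\[(.*?)\]#', trim($line), $m) !== 13) {
        return null;
    }
    $f = $m[1];

    // ASSUMPTION: skip the two "-" placeholders (fields 1 and 2) and the
    // query string (field 7) so that ten values remain, as in $arr[0..9].
    $arr = [$f[0], $f[3], $f[4], $f[5], $f[6], $f[8], $f[9], $f[10], $f[11], $f[12]];

    return [
        'ip'          => $arr[0], // ASSUMPTION: key name is not in the source
        'access_time' => $arr[1],
        'get_post'    => $arr[2],
        'httphost'    => $arr[3],
        'url'         => $arr[4],
        'http_type'   => $arr[5],
        'code'        => $arr[6],
        'length'      => $arr[7],
        'source'      => $arr[8],
        'agent'       => $arr[9],
    ];
}

// Shortened sample record modelled on the log excerpt (illustrative only).
$line = '[60.211.96.212] [-] [-] [2010-08-18 08:36:42] [GET] [www.site.com] '
      . '[/member/index/id/7651] [] [HTTP/1.1] [200] [53] [-] [Mozilla/4.0]';

$row = line_to_row($line);
echo $row['access_time'], ' ', $row['get_post'], ' ', $row['url'], "\n";
// 2010-08-18 08:36:42 GET /member/index/id/7651
```

Returning null for an incomplete record lets the caller skip it instead of inserting a half-filled row.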
Finally, go to the directory that contains the two PHP files and run:
cat /site/data/log/access_log_20100818 | php shell_filter.php 2010-08-18 | php save_echo.php
Explanation:
cat outputs the contents of the log file; buffering is handled automatically by the system.
The pipe passes that output to shell_filter.php as input.
shell_filter.php keeps only the 2010-08-18 records and prints them. To filter on another day, pass a different date as the argument.
The pipe passes that output to save_echo.php as input.
save_echo.php saves each record to the database and prints a progress line to the console.