How to write a server log regular expression?

Source: Internet
Author: User
2013-06-23 04:33:51 W3SVC1539885 198.56.185.162 GET /robots.txt - 80 - 66.249.75.65 Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) 404 0 2

I want to match the date (2013-06-23), the time (04:33:51), the server IP (198.56.185.162), the requested file (robots.txt), the spider IP (66.249.75.65), the spider info (Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html)) and the status code (404 0 2). How can these be matched precisely?


Reply to discussion (solution)

It would help to give a specific example together with the expected result. As written, the problem is hard to understand.

The log line above is the specific example. I want to extract the values listed below it: write a regular expression, use preg_match to capture them into an array, and then do further processing on that array.

I think the format should be fixed, so you can simply split each line on spaces, for example:

$log = '2013-06-23 04:33:51 W3SVC1539885 198.56.185.162 GET /robots.txt - 80 - 66.249.75.65 Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) 404 0 2';
var_dump( explode(' ', $log) );
/**
array(14) {
  [0]=>
  string(10) "2013-06-23"
  [1]=>
  string(8) "04:33:51"
  [2]=>
  string(12) "W3SVC1539885"
  [3]=>
  string(14) "198.56.185.162"
  [4]=>
  string(3) "GET"
  [5]=>
  string(11) "/robots.txt"
  [6]=>
  string(1) "-"
  [7]=>
  string(2) "80"
  [8]=>
  string(1) "-"
  [9]=>
  string(12) "66.249.75.65"
  [10]=>
  string(72) "Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html)"
  [11]=>
  string(3) "404"
  [12]=>
  string(1) "0"
  [13]=>
  string(1) "2"
}
*/
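
If numeric indexes are awkward to work with, the split result can be given names. The sketch below assumes the 14 columns follow a common IIS W3C field order; the thread never shows the log's "#Fields:" line, so the names in FIELD_NAMES and the helper splitLogLine() are illustrative guesses and should be checked against the real log header.

```php
<?php
// Assumed column names (not confirmed by the thread); compare with the
// "#Fields:" directive at the top of the actual log file.
const FIELD_NAMES = [
    'date', 'time', 's-sitename', 's-ip', 'cs-method', 'cs-uri-stem',
    'cs-uri-query', 's-port', 'cs-username', 'c-ip', 'cs(User-Agent)',
    'sc-status', 'sc-substatus', 'sc-win32-status',
];

/**
 * Split one log line on spaces and label each column.
 * Returns null when the column count does not match FIELD_NAMES.
 */
function splitLogLine(string $line): ?array
{
    $parts = explode(' ', trim($line));
    if (count($parts) !== count(FIELD_NAMES)) {
        return null;
    }
    return array_combine(FIELD_NAMES, $parts);
}
```

With the $log string from the reply above, splitLogLine($log)['c-ip'] would hold the spider IP and splitLogLine($log)['sc-status'] the status code.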

However, not every line in the server log looks like this. Many lines start with #, so I want to use a regular expression to filter out lines in any other format.
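
A regular expression can do both jobs at once: lines that start with # or have a different shape simply fail to match. The following is a minimal sketch, assuming every data line has the same 14 space-separated columns as the sample. The constant LOG_PATTERN, the function parseLogLine() and the group names are made up for illustration, and the IP sub-pattern only checks the dotted shape, not that each number is at most 255.

```php
<?php
// One data line: date, time, site name, server IP, method, file,
// three skipped columns, client IP, user agent, three status numbers.
const LOG_PATTERN = '/^(?<date>\d{4}-\d{2}-\d{2})\s+'
    . '(?<time>\d{2}:\d{2}:\d{2})\s+'
    . '\S+\s+'                                    // site name, not captured
    . '(?<server_ip>\d{1,3}(?:\.\d{1,3}){3})\s+'
    . '(?<method>[A-Z]+)\s+'
    . '(?<file>\S+)\s+'
    . '\S+\s+\S+\s+\S+\s+'                        // query, port, user name
    . '(?<client_ip>\d{1,3}(?:\.\d{1,3}){3})\s+'
    . '(?<user_agent>\S+)\s+'
    . '(?<status>\d{3})\s+(?<substatus>\d+)\s+(?<win32_status>\d+)\s*$/';

/**
 * Return the named fields of a data line, or null for "#" lines,
 * blank lines and anything else that does not fit the pattern.
 */
function parseLogLine(string $line): ?array
{
    $line = trim($line);
    if ($line === '' || $line[0] === '#') {
        return null;
    }
    if (preg_match(LOG_PATTERN, $line, $m) !== 1) {
        return null;
    }
    // preg_match stores each named group twice (by name and by number);
    // keep only the string keys.
    return array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);
}
```

For the sample line, parseLogLine() would return an array whose 'file' entry is /robots.txt and whose 'status' entry is 404. The early # check is not strictly needed, because such lines cannot match the date at the start of the pattern, but it avoids running the regex on them.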

The points offered for this question should not go to waste.

Log files are usually large, so avoid loading the whole file into memory at once.
Read the file line by line in a loop and split each line into an array.
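
The line-by-line approach could look roughly like this. It is only a sketch: readDataLines() is a hypothetical helper name, and it assumes comment lines start with # and columns are separated by single spaces, as in the sample line.

```php
<?php
/**
 * Yield each data line of a log file as an array of columns.
 * fgets() reads one line per call, so memory use stays small
 * even for very large files.
 */
function readDataLines(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            $line = rtrim($line, "\r\n");
            // Skip blank lines and "#" directive lines.
            if ($line === '' || $line[0] === '#') {
                continue;
            }
            yield explode(' ', $line);
        }
    } finally {
        fclose($handle);
    }
}
```

A caller would then loop with foreach (readDataLines($path) as $fields) { ... }, where $path points at the log file, and handle one line's columns per iteration.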
