Apache log processing under Linux: Record search engine crawl

Source: Internet
Author: User
Tags apache log

1, identify the search engine:

Before the "/etc/httpd/conf/httpd.conf" file "Logformat", add the following to determine whether the spider is crawling or real user access:

Setenvifnocase user-agent "(googlebot| mediapartners-google| baiduspider| Msnbot|sogou spider| sosospider| Yodaobot| yahoo| Yahoo) "Robot

2. Define Log format:

Add a row under "httpd.conf" File "Logformat" to set a new log format:

Logformat "%{%y-%m-%d%h:%m:%s}t%>s%V%H%b \%r" "%{user-agent}i\" "big

3, Record search engine log:

If more than one site is on the server, add the following line in "VirtualHost", otherwise add the following line under "Customlog" in httpd.conf:

Customlog Logs/weiyule.cn-robot Big Env=robot

The above is the second step of the definition of the log format, robot is the first step to determine whether the search engine variables.

4. Test the configuration file and reload the configuration file:

Httpd-t
Service httpd Reload

Note: If you want to generate Apache log files by log, you can write the following

Logformat "%h%l%u%t \%r\"%>s%b \ "%{referer}i\" \ "%{user-agent}i\" "combined
Customlog "|bin/rotatelogs.exe-l logs/www.111cn.net/access-%y-%m-%d.log 86400" combined

In this way, the Www.111cn.net directory under the Apache logs generates log files by date Access-2015-05-21.log Oh.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.