1. Identify the search engine:
In "/etc/httpd/conf/httpd.conf", add the following line above the "LogFormat" directives to determine whether a request comes from a spider or a real user:
SetEnvIfNoCase User-Agent "(googlebot|mediapartners-google|baiduspider|msnbot|sogou spider|sosospider|yodaobot|yahoo)" robot
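The case-insensitive User-Agent match that SetEnvIfNoCase performs can be sanity-checked outside Apache. A minimal Python sketch, with the alternation copied from the directive above (the sample User-Agent strings are illustrative):

```python
import re

# Same alternation as the SetEnvIfNoCase line; re.IGNORECASE mirrors
# Apache's "NoCase" matching.
SPIDER_RE = re.compile(
    r"googlebot|mediapartners-google|baiduspider|msnbot|"
    r"sogou spider|sosospider|yodaobot|yahoo",
    re.IGNORECASE,
)

def is_spider(user_agent: str) -> bool:
    """Return True if the User-Agent string matches the spider pattern."""
    return bool(SPIDER_RE.search(user_agent))

# Illustrative User-Agent strings:
print(is_spider("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))  # True
print(is_spider("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/90.0"))  # False
```

If the function returns True, Apache would set the "robot" environment variable for that request.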
2. Define the log format:
Add a line below the "LogFormat" directives in "httpd.conf" to define a new log format:
LogFormat "%{%Y-%m-%d %H:%M:%S}t %>s %V %H %b \"%r\" \"%{User-Agent}i\"" big
3. Record search engine visits:
If the server hosts more than one site, add the following line inside the relevant <VirtualHost> block; otherwise add it below the existing "CustomLog" line in httpd.conf:
CustomLog logs/weiyule.cn-robot big env=robot
Here "big" is the log format defined in step 2, and "robot" is the environment variable set in step 1 to flag search engine requests.
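Putting steps 1 through 3 together, a single virtual host might look like the following sketch (the ServerName and log path are illustrative, based on the example site above):

```apache
# Step 1: flag spider requests with the "robot" environment variable.
SetEnvIfNoCase User-Agent "(googlebot|mediapartners-google|baiduspider|msnbot|sogou spider|sosospider|yodaobot|yahoo)" robot

# Step 2: define the "big" log format.
LogFormat "%{%Y-%m-%d %H:%M:%S}t %>s %V %H %b \"%r\" \"%{User-Agent}i\"" big

<VirtualHost *:80>
    ServerName weiyule.cn
    # Step 3: only requests where "robot" was set are written to this log.
    CustomLog logs/weiyule.cn-robot big env=robot
</VirtualHost>
```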
4. Test and reload the configuration file:
httpd -t
service httpd reload
Note: If you want Apache to rotate its log files by date, you can write the following:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
CustomLog "|bin/rotatelogs.exe -l logs/www.111cn.net/access-%Y-%m-%d.log 86400" combined
This way, Apache generates one log file per day under the logs/www.111cn.net directory, such as access-2015-05-21.log.
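Once the robot log is being written, it can be summarized offline. A short Python sketch that counts entries per date and User-Agent, assuming lines in the "big" format defined in step 2 (the sample lines below are made up for illustration):

```python
import re
from collections import Counter

# Fields of the "big" format: date, time, status, vhost, protocol,
# bytes, quoted request line, quoted User-Agent.
LINE_RE = re.compile(
    r'^(?P<date>\d{4}-\d{2}-\d{2}) \S+ (?P<status>\d{3}) \S+ \S+ \S+ '
    r'"(?P<request>[^"]*)" "(?P<agent>[^"]*)"$'
)

def spider_counts(lines):
    """Count robot-log entries per (date, User-Agent) pair."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m:
            counts[(m.group("date"), m.group("agent"))] += 1
    return counts

# Hypothetical sample entries:
sample = [
    '2015-05-21 10:23:01 200 weiyule.cn HTTP/1.1 5120 "GET / HTTP/1.1" "Googlebot/2.1"',
    '2015-05-21 10:24:17 200 weiyule.cn HTTP/1.1 981 "GET /robots.txt HTTP/1.1" "Googlebot/2.1"',
]
print(spider_counts(sample))
```

In practice you would pass `open("logs/weiyule.cn-robot")` instead of the sample list.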