I recently installed AWStats to analyze our Apache logs and have been watching the reports for the past two days:
Reported period: January 2010
First visit: January 12, 2010 11:04
Last visit: January 13, 2010 23:59

Viewed traffic *
  Unique visitors: 77
  Number of visits: 226 (2.93 visits/visitor)
  Pages: 508979 (2252.11 pages/visit)
  Hits: 509492 (2254.38 hits/visit)
  Bandwidth: 13.67 GB (63430.28 KB/visit)
Not viewed traffic *
  Pages: 117312
  Hits: 122716
  Bandwidth: 736.24 MB
The result was puzzling. According to Google Analytics the site gets around 20,000 unique IPs, which is nowhere near the 77 unique visitors shown here, while the page and hit counts do match reality. After some digging I found that the IP addresses recorded in the Apache log were wrong: most of them were CDN node addresses. The cause is clearly the CDN, and the site's back-end code had run into the same problem earlier when reading the user's IP address. In PHP you can run print_r($_SERVER) to find the real user IP; on this site it is in $_SERVER['HTTP_CDN_SRC_IP']. This is the real client IP address passed along by the CDN (whether the user is behind a proxy is a separate matter). But how can this value be used in the Apache log records? I searched Google and Baidu for a long time without finding relevant material or a solution, so I had to work it out myself.
Take a closer look at the LogFormat directive that configures Apache's log records:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
I thought about how %{Referer}i and %{User-Agent}i are obtained. Both values are used often in application code, and the client sends both to the server as request headers. So I dumped all the request headers for a single access, as follows:
Array
(
[Cdn-Src-Ip] => 222.44.46.58
[Accept] => image/gif, image/jpeg, image/pjpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
[Accept-Language] => zh-cn
[User-Agent] => Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; CIBA; .NET CLR 2.0.50727)
[Host] => www.875.cn
[Cookie] => __utma=217127135.1188793388.1263188369.1263364666.1263368206.5; __utmz=217127135.1263368206.5.2.utmcsr=211.167.92.250|utmccn=(referral)|utmcmd=referral|utmcct=/cgi-bin/awstats/awstats.pl; viewedshopsid=621; viewedshopspp=%u6b27%u5c1a%u574a
[Accept-Encoding] => gzip
[Via] => 1.1 hnay40:80 (Cdn Cache Server V2.0)
[Connection] => keep-alive
)
A Referer header appears here too whenever the request carries one, alongside Cookie, Host, User-Agent and the rest, and all of these can be used as variables in the Apache configuration file. Cdn-Src-Ip is exactly the real client IP address I want, so presumably %{Cdn-Src-Ip}i should also work in a log format. (The trailing i marks the value as an incoming request header, and header names are matched without regard to case.) That led to the following solution.
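To make the reasoning concrete, here is a minimal Python sketch of how a %{Name}i token can be resolved against the headers dumped above. It is only an illustration, not Apache's implementation: expand_request_headers is a hypothetical helper that handles just the %{...}i form and leaves every other directive untouched.

```python
import re

# Hypothetical helper for illustration only; this is not how Apache is
# implemented. It expands %{Name}i tokens and nothing else.
HEADER_TOKEN = re.compile(r"%\{([^}]+)\}i")


def expand_request_headers(fmt, headers):
    """Replace each %{Name}i in fmt with the matching request header value."""
    # HTTP header names are case-insensitive, so compare them in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Apache writes "-" when a requested value is absent; mimic that here.
    return HEADER_TOKEN.sub(
        lambda m: lowered.get(m.group(1).lower(), "-"), fmt
    )


# A subset of the headers from the dump above.
headers = {
    "Cdn-Src-Ip": "222.44.46.58",
    "Host": "www.875.cn",
    "Via": "1.1 hnay40:80 (Cdn Cache Server V2.0)",
}

print(expand_request_headers('%{cdn-src-ip}i "%{Referer}i"', headers))
# prints: 222.44.46.58 "-"
```

The Referer falls back to "-" because this request did not send one, which is also what shows up in a real access log for a missing header.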
Add a new LogFormat definition:
LogFormat "%{Cdn-Src-Ip}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combinedcdn
Then use it for logging in the site's configuration:
CustomLog "|/usr/local/sbin/cronolog /usr/local/apache/logs/www.875.cn-access_log.%Y%m%d" combinedcdn env=!IMAGES
Restart the Apache service, visit the site, and check the log: the client IP address is now recorded correctly.
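One rough way to double-check the new log is to count the distinct values in the first field. The Python sketch below assumes lines in the combinedcdn layout; the sample lines are made up for illustration (only 222.44.46.58 comes from the header dump), and summarize is a hypothetical helper name. Requests that reach the server without a Cdn-Src-Ip header are logged with "-" in that field, so the sketch counts them separately.

```python
import re
from collections import Counter

# The first field is whatever the LogFormat puts first: %h in "combined",
# %{Cdn-Src-Ip}i in "combinedcdn".
LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+')


def summarize(lines):
    """Return (requests per client IP, number of requests with no header)."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    # "-" means the request carried no Cdn-Src-Ip header at all.
    missing = counts.pop("-", 0)
    return counts, missing


# Made-up sample lines, not taken from the real log.
sample = [
    '222.44.46.58 - - [13/Jan/2010:23:59:01 +0800] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/4.0"',
    '192.0.2.17 - - [13/Jan/2010:23:59:02 +0800] "GET /a.html HTTP/1.1" 200 1024 "-" "Mozilla/4.0"',
    '222.44.46.58 - - [13/Jan/2010:23:59:03 +0800] "GET /b.html HTTP/1.1" 304 - "-" "Mozilla/4.0"',
    '- - - [13/Jan/2010:23:59:04 +0800] "GET /c.html HTTP/1.1" 200 2048 "-" "Mozilla/4.0"',
]

clients, missing = summarize(sample)
print(len(clients), missing)
# prints: 2 1
```

If the count of "-" entries is large, some traffic is probably reaching Apache without passing through the CDN, and those requests would need %h instead.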