Disclaimer: The optimization measures in this article come from the official AWStats website and from other people's blogs (my apologies for any offense, and thank you for sharing in the spirit of the internet). The optimization process itself reflects my own hands-on use.
I. Status Quo
The website receives about 500,000 visits per month. The nginx log grows by 62 MB per day (about 300,000 records). Analyzing it with AWStats takes at least 2 hours, which is very slow and hardly lives up to AWStats's billing as a solution for logs with tens of millions of records.
II. System Environment
CPU: quad-core Xeon(R) E5504, 2.00 GHz
Memory 3 GB
Perl 5.10 (latest version: 5.14)
AWStats 6.7 (latest version 7.0)
AWStats performance benchmark: http://awstats.sourceforge.net/docs/awstats_benchmark.html
III. Optimization Measures
1. Disable reverse DNS lookup in the AWStats configuration file (DNSLookup=0).
DNSLookup looks up the visitor's domain/country information from the visitor's IP address. DNS queries are generally slow, and how slow depends on the network environment and system configuration. Disabling DNSLookup can save about 99% of the analysis time. In my test, analyzing a 62 MB log (300,000 records) took more than two hours with DNSLookup enabled and about 1 minute with it disabled. The cost of disabling DNSLookup is that the visitor's country can no longer be determined; the AWStats documentation recommends replacing DNSLookup with the more precise GeoIP plugin.
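As a minimal sketch of step 1, the shell function below forces DNSLookup=0 in a configuration file. The function name set_dnslookup_off and the example path are hypothetical and not part of AWStats; adjust them to your own installation.

```shell
#!/bin/sh
# Sketch only: set_dnslookup_off is a hypothetical helper, not an AWStats tool.
# It rewrites an existing (uncommented) DNSLookup line, or appends one.
set_dnslookup_off() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    # Replace any active DNSLookup setting; all other lines pass through unchanged.
    sed 's/^[[:space:]]*DNSLookup[[:space:]]*=.*/DNSLookup=0/' "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"    # write back in place, keeping the file's permissions
    rm -f "$tmp"
    # Append the directive if the file did not contain one.
    grep -q '^DNSLookup=0$' "$conf" || echo 'DNSLookup=0' >> "$conf"
}
```

It could be called as set_dnslookup_off /etc/awstats/awstats.test.conf, where the path is only an assumed example. Back up the configuration file before running anything like this against it.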
2. Disable URLWithQuery, URLReferrerWithQuery, and URLWithAnchor. AWStats disables all three options by default.
3. Upgrade Perl (Perl 5.8 is about 5% faster than Perl 5.6) and replace ActiveState Perl with the standard Perl distribution (ActiveState leaks memory, so the analysis gets slower and slower until the last records cannot be analyzed at all).
4. Rotate your logs: split the log into smaller parts and use crontab to run the analysis more frequently, which makes each AWStats run faster.
5. Upgrade AWStats (AWStats 6.0 is about 15% faster than AWStats 5.9).
6. Make sure the HostAliases parameter in the AWStats configuration file is complete.
7. Use zcat to read the .gz file directly, and filter out images, JS files, CSS files, and other static files.
The command is as follows:
./awstats.pl -update -config=test -LogFile="/bin/zcat test.log.gz | grep -v '\.gif\|\.png\|\.jpg\|\.js\|\.css' |"
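To check what such a filter keeps before handing it to AWStats, the pipeline can be tried on its own. The sketch below is a variation on the command above, not the author's exact filter: it uses gzip -dc (equivalent to zcat) and an extended regular expression that only matches when the extension is followed by a space, a "?", or the end of the line, so that a URL such as /data.json is not dropped along with .js files. The function name filter_static is hypothetical.

```shell
#!/bin/sh
# Sketch only: filter_static is a hypothetical helper.
# Prints a gzipped access log to stdout, minus requests for static files.
filter_static() {
    # gzip -dc decompresses to stdout, like zcat.
    # grep -Ev drops lines whose URL ends in one of the listed extensions.
    gzip -dc "$1" | grep -Ev '\.(gif|png|jpg|js|css)([? ]|$)'
}
```

Running filter_static test.log.gz | wc -l and comparing the result with gzip -dc test.log.gz | wc -l gives a rough idea of how many records the filter removes.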
8. Edit the awstats.pl file. $LIMITFLUSH defaults to 5000 and can be raised according to the server's memory. I raised it to 100000, which should improve performance.
# vi /usr/lib/cgi-bin/awstats.pl
$LIMITFLUSH = 5000; # Nb of records in data arrays after how we need to flush data on disk
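For a scripted alternative to editing the file by hand, the sketch below changes the value with sed. It assumes the assignment sits on a single line of the form $LIMITFLUSH = 5000; as shown above, which may not hold for every AWStats version. The function name set_limitflush is hypothetical.

```shell
#!/bin/sh
# Sketch only: set_limitflush is a hypothetical helper.
# Usage: set_limitflush /path/to/awstats.pl 100000
set_limitflush() {
    script="$1"
    value="$2"
    # Accept only a non-empty string of digits as the new value.
    case "$value" in
        ''|*[!0-9]*) echo "value must be a positive integer" >&2; return 1 ;;
    esac
    tmp="$(mktemp)" || return 1
    # Keep everything up to the number, swap the number, keep the trailing comment.
    sed 's/^\([[:space:]]*[$]LIMITFLUSH[[:space:]]*=[[:space:]]*\)[0-9][0-9]*;/\1'"$value"';/' \
        "$script" > "$tmp" && cat "$tmp" > "$script"
    rm -f "$tmp"
}
```

Work on a backup copy of awstats.pl first, and note that the change will be lost when AWStats is upgraded.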
IV. Optimization Results
Before optimization, analyzing 300,000 records took at least two hours. With DNSLookup disabled, analyzing the same 300,000 records takes about one minute.
The optimized result meets current needs. The other optimization measures had no obvious effect once applied.
V. Performance Bottleneck
AWStats offers a rich set of statistics: spider identification, browser recognition, file type statistics, and more. Its polished report pages are beyond the reach of Webalizer and Analog, and all of this inevitably costs extra analysis time.
For a high-traffic website, however, analysis with AWStats is very painful. The AWStats documentation recommends using Analog or Webalizer, which provide fewer statistics but run faster, once traffic exceeds 4,000,000 visits/month.