Summary: An introduction to using AWStats, with notes on configuration and some improvements. A welcome change as of AWStats 6.3: Chinese users basically only need to enable LoadPlugin="decodeutfkeys" in the configuration file, and the problems with Chinese search engine statistics largely disappear. The release also adds three search engines under "# Minor Chinese search engines": 'baidu\.com', 'search\.sina\.com' and 'search\.sohu\.com'. A patch with definitions of the main domestic search engines and spiders is also provided (after unpacking, use its lib\ directory to overwrite the one in the original program directory).
A log statistics system plays an important role in analyzing user behavior on a site. Statistics on keyword visits from search engines in particular are a very effective source of user behavior data. After many years of Internet development, web log statistics tools have become increasingly mature and feature-rich. Many of them are open source, and AWStats is a very good one.
AWStats: Advanced Web Statistics
AWStats is a Perl-based web log analysis tool that is developing rapidly on SourceForge. Compared with Webalizer, another very good open source log analysis tool, AWStats has the following advantages:
- Friendly interface: the interface language is selected automatically according to the browser (a Simplified Chinese version is available).
Reference output sample: http://www.chedong.com/cgi-bin/awstats/awstats.pl?config=chedong
- Based on Perl: this solves the cross-platform problem well. The system itself runs on GNU/Linux or on Windows (after installing ActivePerl), and it directly parses logs in the Apache combined format and in the IIS format (which needs some modification). Webalizer also has a Windows version, but it lacks maintenance.
With AWStats, a single system can therefore produce unified statistics for the different web servers of one site: GNU/Linux/Apache and Windows/IIS.
- High efficiency: AWStats outputs many more statistical items than Webalizer, yet it still reaches about 1/3 of Webalizer's speed. For a site with millions of visits per day, this is fast enough.
- Convenient configuration and customization: the configuration rules are flexible but the defaults are very reasonable. Only 3 or 4 default settings need to be changed before the first run, and there are many plug-ins for modification and extension.
- Designed for precise "human visits": AWStats filters out accesses by many search engine robots, so its figures may be lower than those of other log statistics tools. Access from within the company can also be excluded through IP filter settings.
- Extended parameter statistics: the ExtraXXXX series of configuration options generates parameter analysis for specific applications, which is useful for product analysis.
For more comparisons with other tools such as Webalizer and Analog, please refer to:
AWStats Installation Notes
AWStats operates as follows:
- Log analysis: after each run, the statistics for the analyzed log are saved to an AWStats database (plain text);
- Output, in one of two forms:
- the CGI program reads the statistics database and generates the output on request;
- a background script exports the output to static files.
Here are 2 statistics examples based on the logs of a personal site:
One is CGI output on GNU/Linux;
the other is a static page export on Windows 2000.
Download the installation package from http://sourceforge.net/projects/awstats/ and then:
GNU/Linux: tar zxf awstats-version.tgz
The AWStats scripts and static files are under the wwwroot directory by default. Deploy all files under its cgi-bin directory (including the awstats.pl program) to /home/apache/cgi-bin/awstats/
Copy the icon directory and other static file directories to the web server's HTML publishing directory, here /home/apache/htdocs/
More batch update scripts can be found in the tools directory; they can be placed under cgi-bin/awstats/ as well.
Windows 2000: to run in background script mode, unpack the package and move it to the D:\AWStats directory
Copy the icon directory to the IIS publishing directory: Inetpub/icon
Data source: log format and daily rotation rules
- For Apache: set the log format to combined. Daily log rotation is a little more cumbersome: the cronolog tool has to be installed so that the log is split by day:
CustomLog "|/usr/local/sbin/cronolog /path/to/apache/logs/access_log.%Y%m%d" combined
For example: logs/access_log.20030326
If the log is in a compressed format, gzip -d can be used to read it.
- For IIS: the default daily log rotation rule is fine, but the default IIS log format is not well suited to AWStats statistics.
So it is best to clear all the log field selections and then strictly follow this list:
- Date: date
- Client IP address: c-ip
- User name: cs-username
- Method: cs-method
- URI resource: cs-uri-stem
- Protocol status: sc-status
- Bytes sent: sc-bytes
- Protocol version: cs-version
- User agent: cs(User-Agent)
- Referrer: cs(Referer)
Compared to the IIS default settings, the fields removed are:
- Server IP address
- Server port
- URI query
The fields added are:
- Bytes sent
- Protocol version
Configuration file naming rule: awstats.sitename.conf
The main AWStats program, awstats.pl, automatically loads the configuration file matching the site name: awstats.sitename.conf
For example, running ./awstats.pl -config=chedong uses the awstats.chedong.conf configuration file in the same directory;
If -config is not specified, awstats.conf in the current directory or /etc/awstats.conf is used as the default configuration file.
So it is best to rename the default awstats.model.conf to awstats.yoursite.conf, for example: awstats.chedong.conf.
For statistics on multiple sites, the include feature of the AWStats configuration file (supported since version 5.4) is very useful: put the generic configuration in one file, include it at the head of each site-specific configuration file, and then override the corresponding options of the generic configuration with the site-specific settings that follow.
Minimal configuration changes: LogFile, SiteDomain, LogFormat
For statistics on Apache logs under GNU/Linux, only 2 options need to be changed: LogFile and SiteDomain
- GNU/Linux: LogFile="/path/to/apache/logs/access_log.%YYYY-24%MM-24%DD-24"
Windows: LogFile="D:\iis_logs\W3SV3\ex%YY-24%MM-24%DD-24.log"
This setting builds the log file name from the year, month and day as they were 24 hours ago;
- SiteDomain="www.chedong.com"
The name of the site. The default is empty, and AWStats refuses to run if it is left empty;
- For statistics on IIS logs, one more option needs to be changed: LogFormat
The default value 1 is the Apache log format; 2 is the IIS log format
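The "24 hours ago" tags in the LogFile setting above can be illustrated with a small Python sketch. It is only a model of the substitution described in this article, not AWStats code: the function name expand_logfile and the run time used in the example are assumptions made for illustration, and only the year, month, day and hour tags are handled.

```python
import re
from datetime import datetime, timedelta

# Tag name -> strftime format. Only the tags used in this article
# (plus the hour tag) are modelled here.
_TAG_FORMATS = {"YYYY": "%Y", "YY": "%y", "MM": "%m", "DD": "%d", "HH": "%H"}
# YYYY must come before YY so the longer tag is matched first.
_TAG_RE = re.compile(r"%(YYYY|YY|MM|DD|HH)-(\d+)")


def expand_logfile(template, now):
    """Replace each %TAG-N with the value TAG had N hours before `now`."""
    def substitute(match):
        tag, hours = match.group(1), int(match.group(2))
        return (now - timedelta(hours=hours)).strftime(_TAG_FORMATS[tag])
    return _TAG_RE.sub(substitute, template)


if __name__ == "__main__":
    run_time = datetime(2003, 3, 27, 8, 10)  # hypothetical 8:10 run
    print(expand_logfile(
        "/path/to/apache/logs/access_log.%YYYY-24%MM-24%DD-24", run_time))
    print(expand_logfile(
        r"D:\iis_logs\W3SV3\ex%YY-24%MM-24%DD-24.log", run_time))
```

With a run at 8:10 on 2003-03-27, both templates resolve to the file for 2003-03-26, which is why a daily job in the morning picks up the previous day's complete log.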
Other things to look out for:
By default AWStats does not filter SWF files, so .swf requests are counted as page views. If the SWF files on the site are mainly advertisements, it is best to filter them out.
Running awstats.pl with -config=chedong automatically loads the awstats.chedong.conf configuration file
Running log statistics automatically
GNU/Linux: crontab -e, run at 8:10 every day
10 8 * * * (cd /path/to/apache/cgi-bin/awstats/; ./awstats.pl -update -config=chedong)
Windows 2000: schedule the following to run at 8:10 every day
D:\Perl\bin\perl.exe d:\awstats\tools\awstats_buildstaticpages.pl -update -config=chedong -lang=cn -dir=c:\inetpub\awstats\ -awstatsprog=d:\awstats\wwwroot\cgi-bin\awstats.pl
Multi-site log statistics
AWStats has a batch tool, tools/awstats_updateall.pl, which traverses all the configuration files in a directory and runs the statistics in batch. The remaining work is then mainly a matter of synchronizing the logs.
With multiple sites, many configuration options are duplicated, and maintaining every configuration file separately is cumbersome. Since version 5.4, AWStats supports including one configuration file in another, so the shared settings can be kept in a generic file, such as: common.conf
The configuration for each of the other sites then includes this file; options that differ from the shared defaults are overridden by setting them after the include.
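The include-then-override behaviour described above can be modelled in a few lines of Python. This is a simplified sketch, not the AWStats parser: the in-memory FILES table, the awstats.iis-site.conf name, the iis.example.com domain and the option values are all hypothetical, and the Include "file" line syntax is assumed to be the form used in AWStats configuration files.

```python
import re

# Hypothetical in-memory "files" standing in for configs on disk.
FILES = {
    "common.conf": 'LogFormat=1\nLang="cn"\n',
    "awstats.chedong.conf": (
        'Include "common.conf"\n'
        'LogFile="/path/to/apache/logs/access_log.%YYYY-24%MM-24%DD-24"\n'
        'SiteDomain="www.chedong.com"\n'
    ),
    "awstats.iis-site.conf": (
        'Include "common.conf"\n'
        'SiteDomain="iis.example.com"\n'
        'LogFormat=2\n'
    ),
}

_INCLUDE_RE = re.compile(r'^Include\s+"([^"]+)"$')
_OPTION_RE = re.compile(r'^(\w+)\s*=\s*"?([^"]*)"?$')


def load_config(name, files=FILES):
    """Return the effective options; later lines override earlier ones."""
    options = {}
    for raw in files[name].splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        include = _INCLUDE_RE.match(line)
        if include:
            # Pull in the shared options at the point of the include.
            options.update(load_config(include.group(1), files))
            continue
        option = _OPTION_RE.match(line)
        if option:
            options[option.group(1)] = option.group(2)
    return options


if __name__ == "__main__":
    print(load_config("awstats.chedong.conf"))
    print(load_config("awstats.iis-site.conf"))
```

In this model the IIS site inherits Lang from common.conf but ends up with LogFormat set to 2, because its own line comes after the include.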
Description of statistical indicators
- Visitors: counted by distinct visitor IP addresses; one IP represents one visitor;
- Number of visits: a visitor may visit several times within 1 day (e.g., once in the morning and once in the afternoon), so visits are counted as the number of distinct IPs within a certain period of time (for example, 1 hour);
- Bytes: the total volume of data sent to clients;
- Data from the referer: the referer field in the log records the address the user came from before accessing the page. If the user reaches the site by clicking a search engine result, the field contains the address of the user's query at that search engine, and parsing this address yields the keywords the user searched for:
2003-03-26 15:43:58 22.214.171.124 - GET /index.html HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+5.01;+Windows+NT+5.0) http://www.google.com/search?q=chedong
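As a rough illustration of how keywords can be pulled out of such a referer address, here is a minimal Python sketch. It is not how AWStats itself is implemented: the SEARCH_ENGINES table is a hypothetical one-entry stand-in for a real engine list, and extract_keywords is a name invented for this example.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical, tiny engine table: host suffix -> query parameter
# that carries the search terms.
SEARCH_ENGINES = {"google.com": "q"}


def extract_keywords(referer):
    """Return the list of search keywords in a referer URL, or []."""
    parts = urlsplit(referer)
    host = (parts.hostname or "").lower()
    for domain, param in SEARCH_ENGINES.items():
        if host == domain or host.endswith("." + domain):
            values = parse_qs(parts.query).get(param)
            if values:
                # parse_qs already decodes "+" and %XX escapes.
                return values[0].split()
    return []


if __name__ == "__main__":
    print(extract_keywords("http://www.google.com/search?q=chedong"))
```

A referer that is not from a known search engine, or an empty "-" field, simply yields an empty list.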
AWStats has fairly complete statistics for search engine key phrases and keywords: it can identify more than 300 kinds of robots from around the world, as well as most of the mainstream international search engines and local-language search engines of many regions.
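Returning to the "visitors" and "number of visits" indicators defined in the list above, the difference between them can be sketched in Python. This is one possible reading of the definition, stated here as an assumption: a new visit is counted when an IP has sent no request for longer than the window (1 hour by default). The function name and the sample hits are made up for illustration.

```python
from datetime import datetime, timedelta


def count_visitors_and_visits(hits, window=timedelta(hours=1)):
    """hits: iterable of (timestamp, ip). Returns (visitors, visits)."""
    last_seen = {}  # ip -> time of that IP's most recent request
    visits = 0
    for timestamp, ip in sorted(hits):
        previous = last_seen.get(ip)
        # First request from this IP, or a gap longer than the window.
        if previous is None or timestamp - previous > window:
            visits += 1
        last_seen[ip] = timestamp
    return len(last_seen), visits


if __name__ == "__main__":
    sample = [
        (datetime(2003, 3, 26, 9, 0), "10.0.0.1"),
        (datetime(2003, 3, 26, 9, 20), "10.0.0.1"),  # same visit
        (datetime(2003, 3, 26, 15, 0), "10.0.0.1"),  # afternoon visit
        (datetime(2003, 3, 26, 9, 5), "10.0.0.2"),
    ]
    print(count_visitors_and_visits(sample))
```

In the sample, two IPs give 2 visitors, while the first IP coming back in the afternoon brings the number of visits to 3.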
Installing geo-information plug-ins:
GeoIP and Geo::IPfree (AWStats 5.5+)
GeoIP and Geo::IPfree provide free country/IP mapping tables, which are more accurate and faster than reverse DNS lookups of domain names. The GeoIP API is free and its default database is free; the fee is for the data update service. For Geo::IPfree, not only the code but also the database is public.
First download the C library (GeoIP C) and unpack it.
Then download the Perl library (GeoIP Perl) and, after unpacking, build it:
% perl Makefile.PL; make
For Geo::IPfree, download and unpack the package.
Configuration: enable the GeoIP-related plug-ins in the configuration file:
LoadPlugin="geoip GEOIP_STANDARD /home/apache/chedong.com/cgi-bin/awstats/geoip.dat"
LoadPlugin="geoip_city_maxmind GEOIP_STANDARD /home/apache/chedong.com/cgi-bin/awstats/geolitecity.dat"
MaxMind currently offers free GeoIP and GeoLite City data packages, which are updated monthly and can be downloaded regularly from the following address: