AWStats Introduction: Apache/IIS Log Analysis Tool

Source: Internet
Author: User

If you don't have the patience to read everything, brief installation instructions follow.
Install
Download the installation package from http://sourceforge.net/projects/awstats/
GNU/Linux: tar zxf awstats-version.tgz
The AWStats scripts and static files are in the wwwroot directory by default. Deploy all files under its cgi-bin directory to the web server's cgi-bin directory, for example /home/apache/cgi-bin/awstats/:
mv awstats-version/wwwroot/cgi-bin /path/to/apache/cgi-bin/awstats
Copy the static file directories, such as the icons, to the web server's HTML publishing directory, for example /home/apache/htdocs/
Additional scripts in the tools directory, such as the batch update scripts, can also be placed under cgi-bin/awstats/
Upgrade the definitions of the main search engines and spiders, then install the GeoIP C library:
Download it from http://www.maxmind.com/download/geoip/api/c/ then unpack, compile, and install it.
Install the Perl module with perl -MCPAN -e 'install "Geo::IP"' or use the pure-Perl package: perl -MCPAN -e 'install "Geo::IP::PurePerl"'
Download the GeoIP and GeoLiteCity data packages, then unpack them and deploy them to the AWStats directory:
http://www.maxmind.com/download/geoip/database/geolitecity.dat.gz
http://www.maxmind.com/download/geoip/database/GeoIP.dat.gz

Configuration
Rename the default awstats.model.conf to common.conf
Then modify the following configuration options:
LoadPlugin="decodeutfkeys"
LoadPlugin="geoip GEOIP_STANDARD /home/apache/chedong.com/cgi-bin/awstats/geoip.dat"
LoadPlugin="geoip_city_maxmind GEOIP_STANDARD /home/apache/chedong.com/cgi-bin/awstats/geolitecity.dat"

Under the AWStats directory, create a data directory for the statistical output.

Set up the site configuration file following this sample:
Include "common.conf"
LogFile="/home/apache/logs/access_log.%YYYY-24%MM-24%DD-24"
SiteDomain="www.chedong.com"
HostAliases="chedong.com"
DefaultFile="index.html"
DirData="/home/apache/cgi-bin/awstats/data/"

Content summary: an introduction to using AWStats, with notes on some configuration improvements. It is nice to see that, starting with AWStats 6.3, Chinese users essentially only need to enable LoadPlugin="decodeutfkeys" in the configuration file to avoid problems with Chinese search engine statistics. The search engine definitions now also include 3 entries under "# Minor Chinese Search Engines": 'baidu\.com', 'search\.sina\.com', and 'search\.sohu\.com'. A patch containing definitions of the main domestic search engines and spiders is also available: after unpacking it, overwrite the lib\ directory of the original program with the patched one.

A log statistics system plays an important role in analyzing user behavior on a site. Statistics on the keywords that bring visitors in from search engines are an especially effective source of user behavior data. After many years of Internet development, web log statistics tools have become more and more mature and feature-rich. Many of them are open source, and AWStats is a very good one.

AWStats: Advanced Web Statistics

AWStats is a Perl-based web log analysis tool that is developing quickly on SourceForge. Compared with Webalizer, another very good open source log analysis tool, AWStats has the following advantages:

    1. Friendly interface: the report is automatically displayed in the language of the visitor's browser (a Simplified Chinese version is included).
      Sample output: http://www.chedong.com/cgi-bin/awstats/awstats.pl?config=chedong
    2. Based on Perl, which solves the cross-platform problem well: the system itself runs on GNU/Linux or on Windows (after installing ActivePerl), and it directly parses logs in the Apache combined format and in the IIS format (which needs some adjustment). Webalizer has a Windows version, but it is poorly maintained.
      With AWStats, a single system can produce unified statistics for a site that runs different web servers, such as GNU/Linux with Apache and Windows with IIS.
    3. High efficiency: AWStats reports many more statistical items than Webalizer, yet it still reaches about 1/3 of Webalizer's speed. For a site with millions of visits a day, this speed is sufficient.
    4. Easy configuration and customization: the configuration rules are flexible and the defaults are sensible. No more than 3 or 4 defaults need to be changed before the first run, and many plug-ins are available for modification and extension.
    5. AWStats is designed to count "human visits" precisely, so accesses by many search engine robots are filtered out. Its figures may therefore be lower than those of other log statistics tools. Accesses from inside the company can also be excluded through the IP filter settings.
    6. Extended parameter statistics: the ExtraXXXX series of options can generate analyses of application-specific parameters, which is useful for product analysis.

For more comparisons with other tools such as Webalizer and Analog, see:
http://awstats.sourceforge.net/#COMPARISON

AWStats Installation Notes

AWStats works as follows:

    1. Log analysis: each run archives the statistics for the processed log into the AWStats database (plain text);
    2. Output, in one of two forms:
        • a CGI program reads the statistics database and generates the report on request;
        • a background script exports the report as static files.

      Here are 2 examples of statistics for a single site's logs:
      one uses CGI output on GNU/Linux,
      the other uses static page export on Windows 2000.

      Download/Install

      Download the installation package from http://sourceforge.net/projects/awstats/ and then:

      GNU/Linux: tar zxf awstats-version.tgz
      The AWStats scripts and static files are in the wwwroot directory by default. Deploy all files under its cgi-bin directory, including the awstats.pl program, to /home/apache/cgi-bin/awstats/:
      mv awstats-version/wwwroot/cgi-bin /path/to/apache/cgi-bin/awstats
      # Copy the icon and other static file directories to the web server's HTML publishing directory, for example /home/apache/htdocs/
      Additional batch update scripts from the tools directory can also be placed in the cgi-bin/awstats/ directory.

      Windows 2000: run in background script mode. Unpack the archive and move it to the D:\AWStats directory.
      Copy the icon directory to the IIS publishing directory: Inetpub/icon

      Source log format and daily log rotation rules

        1. For Apache: set the log format to combined. Daily rotation is a bit cumbersome: you need to install the cronolog tool and configure it to split the log by day:
          CustomLog "|/usr/local/sbin/cronolog /path/to/apache/logs/access_log.%Y%m%d" combined
          For example: logs/access_log.20030326
          If the logs are compressed, they can be unpacked with gzip -d.
        2. For IIS: the default daily log rotation works well, but the default IIS log format is not well suited to AWStats statistics.
          It is best to clear all the log field selections and then select exactly the fields in this list:
          • Date (date)
          • Time (time)
          • Client IP Address (c-ip)
          • User Name (cs-username)
          • Method (cs-method)
          • URI Stem (cs-uri-stem)
          • Protocol Status (sc-status)
          • Bytes Sent (sc-bytes)
          • Protocol Version (cs-version)
          • User Agent (cs(User-Agent))
          • Referer (cs(Referer))
          Compared with the IIS default settings,
          the following fields are removed:
          • Server IP Address
          • Server Port
          • URI Query
          and the following fields are added:
          • Bytes Sent
          • Protocol Version
          • Referer
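
      As a rough illustration of the field list above, the following Python sketch splits one W3C-style log line into those eleven fields, in that order. The sample line and the helper name parse_iis_line are made up for illustration, and this is not how AWStats itself parses logs.

```python
# Field order taken from the list above (W3C extended log field names).
FIELDS = [
    "date", "time", "c-ip", "cs-username", "cs-method", "cs-uri-stem",
    "sc-status", "sc-bytes", "cs-version", "cs(User-Agent)", "cs(Referer)",
]

def parse_iis_line(line):
    """Split one space-separated log line into a dict keyed by field name.

    Lines starting with '#' are W3C directives (e.g. '#Fields:') and yield None.
    """
    if line.startswith("#"):
        return None
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(values)))
    return dict(zip(FIELDS, values))

# Hypothetical sample line; spaces inside the user agent are logged as '+'.
sample = ("2003-03-26 15:43:58 123.123.123.123 - GET /index.html 200 5120 "
          "HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+5.01) "
          "http://www.google.com/search?q=chedong")
record = parse_iis_line(sample)
print(record["sc-status"], record["cs(Referer)"])
```

      A line with a different number of fields raises an error here, which is one way to notice that the IIS field selection does not match the list.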

      Configuration file naming rules: awstats.sitename.conf

      The AWStats main program, awstats.pl, automatically loads the configuration file for a site based on the site name: awstats.sitename.conf
      For example, running ./awstats.pl -config=chedong loads the awstats.chedong.conf configuration file in the same directory.
      If -config is not specified, awstats.conf in the current directory or /etc/awstats.conf is used as the default configuration file.
      So it is best to rename the default awstats.model.conf to awstats.yoursite.conf, for example awstats.chedong.conf.
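
      The naming rule can be sketched in a few lines of Python. This is only a model of the rule as described here: the function names are hypothetical, and the real awstats.pl may look in more directories than the two mentioned in the text.

```python
import os

def candidate_configs(site=None, cwd="."):
    """Return config paths to try, in order, for a given -config value.

    Only the locations named in the text are modelled here; the real
    awstats.pl may search additional directories.
    """
    if site:
        return [os.path.join(cwd, "awstats.%s.conf" % site)]
    return [os.path.join(cwd, "awstats.conf"), "/etc/awstats.conf"]

def resolve_config(site=None, cwd="."):
    """Return the first candidate that exists on disk, or None."""
    for path in candidate_configs(site, cwd):
        if os.path.isfile(path):
            return path
    return None

print(candidate_configs("chedong", "/home/apache/cgi-bin/awstats"))
```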

      For statistics on multiple sites, the Include feature of the AWStats configuration file (supported since version 5.4) is very useful. Put the generic configuration in one file, then include it at the top of each site-specific configuration file with an Include directive. The options that follow then override the corresponding values from the generic configuration, for example:
      Include "common.conf"
      LogFile="/path/to/bbs/access_log"
      SiteDomain="bbs.chedong.com"

      Minimal configuration file changes: LogFile, SiteDomain, LogFormat

      To analyze Apache logs on GNU/Linux, only 2 options need to be changed: LogFile and SiteDomain.

        1. GNU/Linux: LogFile="/path/to/apache/logs/access_log.%YYYY-24%MM-24%DD-24"
          Windows: LogFile="D:\iis_logs\W3SV3\ex%YY-24%MM-24%DD-24.log"
          This setting builds the log file name from the year, month, and day as they were 24 hours ago.
        2. SiteDomain="www.chedong.com"
          The domain name of the site. The default is empty, and AWStats refuses to run if it is left empty.
        3. For IIS logs, one more option must be changed:
          LogFormat=2
          The default value, 1, is the Apache log format; 2 is the IIS log format.
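
      To make the "24 hours ago" rule in item 1 concrete, here is a minimal Python sketch that expands the %YYYY-n, %YY-n, %MM-n and %DD-n tags for a fixed point in time. The function expand_logfile is a hypothetical helper written for this illustration and handles only the tags used above; AWStats performs the substitution itself.

```python
import re
from datetime import datetime, timedelta

def expand_logfile(template, now):
    """Expand %YYYY-n, %YY-n, %MM-n and %DD-n tags, where n is an hour offset."""
    def repl(match):
        tag, hours = match.group(1), int(match.group(2))
        moment = now - timedelta(hours=hours)
        return {
            "YYYY": "%04d" % moment.year,
            "YY": "%02d" % (moment.year % 100),
            "MM": "%02d" % moment.month,
            "DD": "%02d" % moment.day,
        }[tag]
    # YYYY must come before YY in the alternation so the longer tag wins.
    return re.sub(r"%(YYYY|YY|MM|DD)-(\d+)", repl, template)

# Pretend the update job runs at 8:10 on 27 March 2003.
now = datetime(2003, 3, 27, 8, 10)
print(expand_logfile("/path/to/apache/logs/access_log.%YYYY-24%MM-24%DD-24", now))
print(expand_logfile(r"D:\iis_logs\W3SV3\ex%YY-24%MM-24%DD-24.log", now))
```

      Run the morning after, the job therefore picks up the previous day's complete log, which matches the cronolog file name access_log.20030326 shown earlier.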

      Other things to watch out for:
      By default AWStats does not filter out SWF files, so every .swf request is counted as a page view. If the SWF files on the site are mainly advertisements, it is best to filter them out.

      Log analysis

      ./awstats.pl -update -config=sitename -lang=cn
      For example: ./awstats.pl -update -config=chedong
      This automatically loads the awstats.chedong.conf configuration file.

      Statistical output

      GNU/Linux: http://localhost/cgi-bin/awstats/awstats.pl?config=chedong
      Windows: http://localhost/awstats/awstats.chedong.html

      Running the log statistics automatically

      GNU/Linux: use crontab -e to run the update at 8:10 every day:
      # update AWStats
      10 8 * * * (cd /path/to/apache/cgi-bin/awstats/; ./awstats.pl -update -config=chedong)

      On Windows 2000: schedule the following command to run at 8:10 every day:
      D:\Perl\bin\perl.exe d:\awstats\tools\awstats_buildstaticpages.pl -update -config=chedong -lang=cn -dir=c:\inetpub\awstats\ -awstatsprog=d:\awstats\wwwroot\cgi-bin\awstats.pl

      Multi-site log statistics

      AWStats includes a batch tool, tools/awstats_updateall.pl, that walks through all the configuration files in a directory and runs the statistics for each of them. The remaining work is mainly a matter of synchronizing the logs.
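
      The idea of walking a directory of configuration files can be sketched as follows in Python. This is not the code of awstats_updateall.pl: the helper names are hypothetical, the commands are only built and not executed, and skipping awstats.model.conf is a choice made in this sketch.

```python
import glob
import os
import tempfile

def list_site_configs(directory):
    """Return sorted site names for every awstats.<site>.conf in directory."""
    names = []
    for path in glob.glob(os.path.join(directory, "awstats.*.conf")):
        base = os.path.basename(path)
        site = base[len("awstats."):-len(".conf")]
        # Skip the shipped template; it is not a real site (sketch assumption).
        if site and site != "model":
            names.append(site)
    return sorted(names)

def update_commands(directory, awstats_prog="./awstats.pl"):
    """Build one update command line per site (not executed here)."""
    return ["%s -update -config=%s" % (awstats_prog, site)
            for site in list_site_configs(directory)]

# Demo in a throwaway directory with hypothetical config names.
demo_dir = tempfile.mkdtemp()
for name in ("awstats.www.chedong.conf", "awstats.bbs.chedong.conf",
             "awstats.model.conf"):
    open(os.path.join(demo_dir, name), "w").close()
print(update_commands(demo_dir))
```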

      With multiple sites, many configuration options are repeated, and modifying and maintaining every configuration file separately is cumbersome. Since version 5.4, AWStats has supported including one configuration file in another, so the shared options can go into a generic file such as common.conf.

      The configuration for each site then includes the generic file, and the options that follow the Include override any defaults that differ:
      awstats.bbs.chedong.conf
      Include "chedong.common.conf"
      LogFile="/path/to/bbs_log"
      SiteDomain="bbs.chedong.com"

      awstats.www.chedong.conf
      Include "chedong.common.conf"
      LogFile="/path/to/www_log"
      SiteDomain="www.chedong.com"
      HostAliases="chedong.com"
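
      The override behaviour described above (shared values first, site-specific values winning) can be modelled with a small Python sketch. It is a toy reader written for this illustration, not the AWStats parser: the load_config helper is hypothetical, files are held in a dict instead of on disk, and the contents of the common file are invented.

```python
import re

def load_config(name, files):
    """Resolve one config into a dict; later assignments override earlier ones.

    `files` maps file names to their text, standing in for the file system.
    """
    options = {}
    for raw in files[name].splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        include = re.match(r'Include\s+"([^"]+)"$', line)
        if include:
            # Merge the included file at the point where it is referenced.
            options.update(load_config(include.group(1), files))
            continue
        key, _, value = line.partition("=")
        options[key.strip()] = value.strip().strip('"')
    return options

files = {
    # Invented shared defaults for the demonstration.
    "chedong.common.conf": 'LogFormat=1\nSiteDomain="www.chedong.com"\n',
    "awstats.bbs.chedong.conf": (
        'Include "chedong.common.conf"\n'
        'LogFile="/path/to/bbs_log"\n'
        'SiteDomain="bbs.chedong.com"\n'
    ),
}
print(load_config("awstats.bbs.chedong.conf", files))
```

      Because the Include line comes first, the site file's own SiteDomain replaces the shared one while LogFormat is inherited unchanged.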

      Description of statistical indicators

        • Visitors: counted by unique IP address; each IP represents one visitor.
        • Number of visits: a visitor may visit several times within 1 day (for example, once in the morning and once in the afternoon), so visits are counted as the number of unique IPs within a certain time window (for example, 1 hour).
        • Number of pages: the total number of page requests, excluding images, CSS, JavaScript files, and the like. If a page uses more than one frame, each frame is counted as a page request.
        • Number of files: the total number of file requests from browser clients, including images, CSS, JavaScript, and so on. When a user requests a page that contains images and other resources, the browser issues multiple file requests to the server, so the number of files is generally much larger than the number of pages.
        • Bytes: the total amount of data sent to clients.
        • Data from the referer: the Referer field in the log records the address the user was on before requesting the page. If the user reached the site by clicking a search engine result, the field contains the user's query URL on that search engine, and the keywords the user searched for can be extracted by parsing that URL.
          For example:
          2003-03-26 15:43:58 123.123.123.123 - GET /index.html HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+5.01;+Windows+NT+5.0) http://www.google.com/search?q=chedong
          AWStats has fairly complete statistics for search engine key phrases and keywords: it can identify more than 300 kinds of robots from around the world, as well as most mainstream international search engines and the local-language search engines of many regions.
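
      Two of the indicators above can be illustrated with a short Python sketch: counting visitors and visits from (IP, time) pairs with a 1-hour window, and pulling the search phrase out of a referer URL. Both functions are hypothetical helpers written for this illustration, not AWStats code, and the q parameter is simply what the Google example above uses; other search engines use other parameter names, which this sketch does not cover.

```python
from datetime import datetime, timedelta
from urllib.parse import urlparse, parse_qs

def count_visitors_and_visits(hits, timeout=timedelta(hours=1)):
    """hits: iterable of (ip, datetime) sorted by time.

    A visitor is a distinct IP; a new visit starts when the same IP has been
    silent for longer than `timeout`.
    """
    last_seen = {}
    visits = 0
    for ip, when in hits:
        if ip not in last_seen or when - last_seen[ip] > timeout:
            visits += 1
        last_seen[ip] = when
    return len(last_seen), visits

def search_keyphrase(referer, param="q"):
    """Return the query phrase carried in a search-result referer, or None."""
    values = parse_qs(urlparse(referer).query).get(param)
    return values[0] if values else None

# Invented hits: one IP in the morning and afternoon, plus a second IP.
hits = [
    ("123.123.123.123", datetime(2003, 3, 26, 9, 0)),
    ("123.123.123.123", datetime(2003, 3, 26, 9, 20)),
    ("123.123.123.123", datetime(2003, 3, 26, 15, 43)),
    ("10.0.0.7", datetime(2003, 3, 26, 15, 50)),
]
print(count_visitors_and_visits(hits))  # (2, 3): 2 visitors, 3 visits
print(search_keyphrase("http://www.google.com/search?q=chedong"))  # chedong
```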

      Hacking AWStats

      Installing the plug-ins based on geographic information:

      GeoIP and Geo::IPfree (AWStats 5.5+)
      GeoIP and Geo::IPfree are free country/IP mapping tables, which are more accurate and faster than resolving domain names through DNS. The GeoIP API is free and its default database is free; the fee is for its data update service. With Geo::IPfree, not only the code but also the database is public.

      GeoIP installation:
      First download the GeoIP C library. After unpacking it:
      % ./configure; make
      # make install

      Then download the GeoIP Perl library. After unpacking it:
      % perl Makefile.PL; make
      # make install

      Geo::IPfree installation:
      After downloading Geo::IPfree, unpack it and run:
      % perl Makefile.PL
      % make
      # make install

      Configuration: enable the GeoIP plug-ins in the configuration file:

      LoadPlugin="geoip GEOIP_STANDARD /home/apache/chedong.com/cgi-bin/awstats/geoip.dat"
      LoadPlugin="geoip_city_maxmind GEOIP_STANDARD /home/apache/chedong.com/cgi-bin/awstats/geolitecity.dat"

      MaxMind currently offers free GeoIP and GeoLiteCity data packages, which can be downloaded regularly (every month) from the following addresses:

      http://www.maxmind.com/download/geoip/database/geolitecity.dat.gz
      http://www.maxmind.com/download/geoip/database/GeoIP.dat.gz
