Apache Log format and configuration

Source: Internet
Author: User
Apache Log Configuration

Sometimes we need to customize the format and content of Apache's default logs: for example, to record more or less information, or to change the layout of the default log file. This article describes the information that can be logged and how to configure Apache to record it.

First, define the log format (April 3)

Long ago, log files had only one format, the "common log format" (CLF), which many people became accustomed to. Custom log formats appeared later and have become the more popular mechanism; even the common log format itself is now defined as a custom log format. This article explains how to customize the format of log files so that they record the information you want.

Customizing the log format involves two directives, LogFormat and CustomLog. The default httpd.conf file provides several examples of both.

The LogFormat directive defines a format and assigns it a name, which we can then refer to directly. The CustomLog directive sets the log file and indicates the format the file uses (usually by the name of the format).

For example, in the default httpd.conf file, we can find the following LogFormat line:

LogFormat "%h %l %u %t \"%r\" %>s %b" common

This directive creates a log format called "common"; the format itself is the string in double quotes. Each variable in the format string represents a specific piece of information, and these are written to the log file in the order given by the format string.

The Apache documentation lists all the variables that can be used in the format string and their meanings, as follows:

%...a: Remote IP address

%...A: Local IP address

%...B: Number of bytes sent, not including HTTP headers

%...b: Number of bytes sent, not including HTTP headers, in CLF format; that is, when no data is sent, '-' is written instead of 0

%...{FOOBAR}e: Contents of the environment variable FOOBAR

%...f: File name

%...h: Remote host

%...H: The request protocol

%...{Foobar}i: Contents of the Foobar: header line in the request sent to the server

%...l: Remote logname (from identd, if provided)

%...m: The request method

%...{Foobar}n: Contents of the note "Foobar" from another module

%...{Foobar}o: Contents of the Foobar: header line in the reply

%...p: Port the server uses to respond to the request

%...P: ID of the child process that responded to the request

%...q: Query string (prefixed with "?" if a query string exists, otherwise an empty string)

%...r: The first line of the request

%...s: Status. For requests that were internally redirected, this is the status of the *original* request; use %...>s for the status of the final request

%...t: Time, in common log format time format (standard English format)

%...{format}t: Time, in the specified format

%...T: Time taken to respond to the request, in seconds

%...u: Remote user (from auth; may be bogus if the return status (%s) is 401)

%...U: URL path requested by the user

%...v: ServerName of the server responding to the request

%...V: Server name according to the UseCanonicalName setting

In all of the variables listed above, "..." stands for an optional condition, and it may be omitted entirely. If a condition is given and the request does not meet it, the value of the variable is replaced with "-". Looking again at the LogFormat example from the default httpd.conf file, we can see that the "common" format includes: remote host, remote logname, remote user, request time, first line of the request, request status, and number of bytes sent.
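To make the structure of a format string concrete, here is a minimal Python sketch that splits a format string into its directives. The regular expression and the names DIRECTIVE, MEANINGS, and directives are assumptions made for this illustration; they are not how Apache itself parses the string, and the sketch covers only the syntax described above.

```python
import re

# One directive: '%', an optional condition ('!' plus comma-separated status
# codes), an optional '<' or '>', an optional '{name}' argument, then the
# single-letter variable. This pattern is an illustrative assumption.
DIRECTIVE = re.compile(r"%(!?\d+(?:,\d+)*)?([<>])?(?:\{([^}]*)\})?([A-Za-z])")

# Meanings of the variables used by the "common" format, taken from the list above.
MEANINGS = {
    "h": "remote host",
    "l": "remote logname (from identd)",
    "u": "remote user",
    "t": "request time",
    "r": "first line of the request",
    "s": "status",
    "b": "bytes sent, '-' when nothing was sent",
}


def directives(format_string):
    """Return a (condition, modifier, argument, letter) tuple per directive."""
    return [m.groups() for m in DIRECTIVE.finditer(format_string)]


common = '%h %l %u %t "%r" %>s %b'
for condition, modifier, argument, letter in directives(common):
    print(letter, "->", MEANINGS.get(letter, "not in this table"))
```

The same function also picks out the condition and header name in a string such as %404{Referer}i, which is handy when checking a hand-written format.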

Sometimes we want to record a piece of information only for certain requests, and this is where "..." comes in. If one or more HTTP status codes are placed between "%" and the variable, the content of the variable is recorded only if the status code returned for the request is one of those specified. For example, to record all the broken links pointing at a Web site, we can use:

LogFormat %404{Referer}i BrokenLinks

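Conversely, to record a value only when the status code is not one of those specified, put a "!" in front of the status codes. For example, %!200,304,302{Referer}i records the Referer header for every request that did not return 200, 304, or 302.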
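The effect of a condition can be modelled in a few lines of Python. This is only a sketch of the rule described above (log the value when the status matches, otherwise log "-"); the function name conditional_value and its arguments are hypothetical and do not correspond to anything in Apache.

```python
def conditional_value(condition, status, value):
    """Return value if status satisfies the condition, otherwise '-'.

    condition is the text between '%' and the variable, for example
    "404" or "!200,304,302"; an empty condition always matches.
    """
    if not condition:
        return value
    negate = condition.startswith("!")
    codes = {int(code) for code in condition.lstrip("!").split(",")}
    matched = status in codes
    if negate:
        matched = not matched
    return value if matched else "-"


# A referer is kept for a 404 response and replaced by '-' for a 200 response.
print(conditional_value("404", 404, "http://www.example.com/links.html"))
print(conditional_value("404", 200, "http://www.example.com/links.html"))
```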

Apache log: Access log (i)

Want to know who is browsing which content on the site, and when? Apache's access log can tell you. The access log is Apache's standard log; this article explains its contents in detail, along with the configuration of the related options.

Format of Access log

Apache has a built-in ability to record server activity: its logging facility. This "Apache Log" series describes the Apache access log and error log, how to analyze log data, how to customize the Apache logs, how to generate statistical reports from log data, and so on.

If Apache is installed with the default settings, two log files are created as soon as the server runs: access_log (access.log on Windows) and error_log (error.log on Windows). With a default installation, these files are found under /usr/local/apache/logs; on Windows they are saved in the logs subdirectory of the Apache installation directory. Different package managers put the log files in different locations, so you may need to look elsewhere, or check the configuration file to see where the log files are configured.

As its name suggests, the access log access_log records all access to the Web server. The following is a typical record in the access log:

(remote host address) - - [19/Aug/2000:14:47:37 -0400] "GET / HTTP/1.0" 200 654

This line consists of 7 items. Two of them are blank in the example above, but the line is still divided into 7 items.

The first item is the address of the remote host, which indicates who is visiting the site. Incidentally, the address in the example above belongs to a machine called si3001.inktomi.com (to find this out, you can query DNS with the nslookup tool); inktomi.com belongs to a company that makes web search software. As you can see, we can learn a good deal about our visitors from the first item of a log record alone.

By default, the first item is just the IP address of the remote host, but we can ask Apache to resolve host names and write them to the log file in place of IP addresses. This is generally not recommended, because it greatly slows down the server's logging and so reduces the efficiency of the whole site. Moreover, many tools can convert the IP addresses in a log file to host names afterwards, so it is not worth having Apache record host names itself.

However, if it really is necessary for Apache to look up the name of the remote host, we can use the following directive:

HostnameLookups on

If HostnameLookups is set to double instead of on, the server also performs a forward lookup on the host name it found, verifying that the name really does resolve to the IP address that originally appeared. HostnameLookups is off by default.
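The idea behind a double lookup can be sketched in Python with the standard socket module, for instance when post-processing a log file instead of asking Apache to resolve names. This is an illustration of the check, not Apache's implementation; the function name double_reverse_lookup and its reverse and forward parameters are assumptions, added so that the resolver calls can be swapped out.

```python
import socket


def double_reverse_lookup(ip, reverse=socket.gethostbyaddr,
                          forward=socket.gethostbyname_ex):
    """Return the host name for ip if a forward lookup confirms it, else None."""
    try:
        # Reverse lookup: IP address -> host name.
        hostname, _aliases, _addresses = reverse(ip)
        # Forward lookup: host name -> list of IP addresses.
        _name, _aliases, addresses = forward(hostname)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
    # Accept the name only if it points back at the address we started from.
    return hostname if ip in addresses else None
```

Calling double_reverse_lookup with only an IP address performs real DNS queries, so the result depends on the network and on how the address's DNS records are set up.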

The second item in the example log record is blank, replaced with a "-" placeholder. In practice this is the case most of the time. This position records the visitor's identity, which need not be the visitor's login name; it could be an email address or another unique identifier. The information is returned by identd, or directly by the browser. In the early days, when Netscape 0.9 still dominated, this position often held the visitor's email address. However, because people used it to harvest addresses for spam, that did not last long, and almost all browsers dropped the feature long ago. Today the chance of seeing an email address in the second item of a log record is slim.

The third item in the log record is also blank. This position records the name the visitor supplied for authentication. Naturally, it is not blank if some content on the site requires users to authenticate. For most Web sites, however, this item is blank in most records of the log file.

The fourth item in the log record is the time of the request. It is enclosed in square brackets and uses the so-called "common log format" or "standard English format". The example record therefore shows a request made at 14:47:37 on August 19, 2000. The "-0400" at the end indicates that the server's time zone is 4 hours behind UTC.
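For readers who process logs with a script, the time item can be turned into a time-zone-aware value. The following Python sketch uses the standard datetime module; the helper name parse_clf_time is an assumption for this example, and the month abbreviation is matched according to the current locale, so an English or C locale is assumed.

```python
from datetime import datetime, timezone


def parse_clf_time(text):
    """Parse a time item such as '[19/Aug/2000:14:47:37 -0400]'."""
    # Strip the square brackets, then parse day/month/year:time and UTC offset.
    return datetime.strptime(text.strip("[]"), "%d/%b/%Y:%H:%M:%S %z")


stamp = parse_clf_time("[19/Aug/2000:14:47:37 -0400]")
print(stamp.isoformat())                           # local time with its offset
print(stamp.astimezone(timezone.utc).isoformat())  # the same instant in UTC
```

Converting to UTC makes it possible to compare records from servers in different time zones.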

The fifth item may be the most useful information in the entire log record: it tells us what request the server received. Its typical format is "METHOD RESOURCE PROTOCOL".

In the example above, the method is GET; the other common methods are POST and HEAD. There are several other valid methods, but these three are the main ones.

RESOURCE is the document, or URL, that the browser requests from the server. In this case the browser requests "/", that is, the home page or root of the site. In most cases "/" maps to the index.html document in the DocumentRoot directory, but it may map to another file depending on the server configuration.

PROTOCOL is usually HTTP, followed by the version number. The version number is either 1.0 or 1.1, with 1.0 seen more often. HTTP is the protocol on which the Web is built; HTTP/1.0 is an earlier version of the protocol, and 1.1 is the most recent one. Most current Web clients still use HTTP/1.0.

The sixth item of the log record is the status code. It tells us whether the request succeeded or what kind of error occurred. Most of the time the value is 200, meaning the server responded to the browser's request successfully. A complete list of status codes and their meanings is beyond the scope of this article; please refer to other material for that. Generally speaking, though, a status code beginning with 2 indicates success, one beginning with 3 indicates that the request was redirected elsewhere for some reason, one beginning with 4 indicates an error on the client side, and one beginning with 5 indicates an error on the server.
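The grouping by first digit is easy to express in code. A small Python sketch follows; the function name status_class and the label strings are choices made for this example only.

```python
def status_class(code):
    """Classify an HTTP status code by its first digit."""
    classes = {
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    # Integer division by 100 yields the first digit of a three-digit code.
    return classes.get(code // 100, "other")


for code in (200, 302, 404, 500):
    print(code, status_class(code))
```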

The seventh item of the log record is the total number of bytes sent to the client. It tells us whether a transfer was interrupted (that is, whether the value matches the size of the file). Adding up these values across log records tells us how much data the server sent in a day, a week, or a month.
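Putting the seven items together, a record in the common format can be split with one regular expression and the byte counts added up. The Python sketch below is a simplification: the names CLF, parse_line, and total_bytes are assumptions, the pattern does not handle a request line that itself contains a double quote, and the host 192.0.2.1 in the sample is a documentation address chosen for the example, not the address from the original record.

```python
import re

# The seven items: host, logname, user, [time], "request", status, bytes.
CLF = re.compile(
    r'(?P<host>\S+) (?P<logname>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_line(line):
    """Return a dict of the seven items, or None if the line does not match."""
    match = CLF.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A '-' in the bytes item means that no data was sent.
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    # Split the request into METHOD RESOURCE PROTOCOL, padding missing parts.
    parts = record["request"].split()
    record["method"], record["resource"], record["protocol"] = (parts + [None] * 3)[:3]
    return record


def total_bytes(lines):
    """Add up the bytes item of every line that parses."""
    return sum(r["bytes"] for r in map(parse_line, lines) if r is not None)


# Hypothetical sample record; only the host address differs from the text.
SAMPLE = '192.0.2.1 - - [19/Aug/2000:14:47:37 -0400] "GET / HTTP/1.0" 200 654'
print(parse_line(SAMPLE))
```

Feeding total_bytes the lines of a day's log gives the day's traffic in bytes, excluding HTTP headers.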

Second, configure access log

The location of the access log file is actually a configuration option. If we examine the httpd.conf configuration file, we can see that the following line is in the file:

CustomLog /usr/local/apache/logs/access_log common

Note that this line may be slightly different on an older Apache server: it may use the TransferLog directive instead of CustomLog. If your server falls into this category, it is recommended that you upgrade it as soon as possible.

The CustomLog directive specifies where the log file is saved and the format of the log. How to customize the format and content of the log file is discussed later in this "Apache Log" series. The line above specifies the common log format, which has been the standard format ever since Web servers first appeared. This also explains why access logs still keep the second item even though almost no client programs supply user identity information to the server any more.

The path in the CustomLog directive is the path to the log file. Note that the log file is opened by the user who starts the server (normally root, before the server switches to the account named by the User directive), so the directory that holds it must be protected to prevent other users from writing to it or replacing the file.
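One simple check related to this advice is whether the log directory can be written to by accounts other than its owner. The Python sketch below assumes a POSIX system with traditional permission bits; the function name writable_by_others is hypothetical, and the check does not cover access control lists or the permissions of parent directories.

```python
import os
import stat


def writable_by_others(path):
    """Return True if the group or other permission bits allow writing to path."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))
```

For example, writable_by_others("/usr/local/apache/logs") could be run against the directory from the CustomLog line above; a True result suggests the directory permissions deserve a closer look.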

The following sections of the Apache Log series will continue to describe the Apache error log, the format and content of the custom log, how to write the log contents to the specified program instead of the file, how to get some very useful statistics from the log file, and so on.
