Interpreting Apache logs: what does each column represent?

Source: Internet
Author: User

Apache Log Series (1): Access log

Want to know when people visit your site and what they browse? Apache's access log has the answer. The access log is one of Apache's standard logs; this article explains its contents in detail, along with the related configuration options.

I. Format of the access log

Apache has a built-in ability to record server activity: its logging facility. This "Apache Log" series describes the Apache access log and error log, how to analyze log data, how to customize the Apache logs, how to generate statistical reports from log data, and so on.

With a default installation, Apache creates two log files as soon as the server runs: access_log (access.log on Windows) and error_log (error.log on Windows). With the default installation method these files are found under /usr/local/apache/logs; on Windows they are saved in the logs subdirectory of the Apache installation directory. Different package managers put the log files in different places, so you may need to look elsewhere, or check the configuration file to see where the log files are configured.

As its name suggests, the access log access_log records every access to the Web server. The following is a typical record in the access log:

216.35.116.91 - - [19/Aug/2000:14:47:37 -0400] "GET / HTTP/1.0" 200 654

This line consists of 7 items. Two of them are blank in the example above (shown as "-"), but the line is still divided into 7 items.
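To make the seven items concrete, here is a minimal Python sketch that splits the sample record, written with its usual spacing, into named fields. The field names (host, ident, user and so on) and the function name parse_clf are labels chosen for this illustration, not anything Apache defines.

```python
import re

# One named group per item of a Common Log Format record.
# The group names are illustrative labels, not Apache terminology.
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_clf(line):
    """Return the seven items of one access log record as a dict, or None."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None


record = parse_clf(
    '216.35.116.91 - - [19/Aug/2000:14:47:37 -0400] "GET / HTTP/1.0" 200 654'
)
for name, value in record.items():
    print(f"{name:8} {value}")
```

Lines that do not match the pattern return None, so a caller can count or skip malformed records instead of failing.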

The first item is the address of the remote host, which indicates who is visiting the site. In the example above, the host accessing the Web site is 216.35.116.91. Incidentally, this address belongs to a machine called si3001.inktomi.com (you can find this out with a DNS lookup tool such as nslookup); Inktomi is a company that makes web search software. As you can see, the first item of a log record alone already tells us a good deal about our visitors.

By default, the first item is just the IP address of the remote host, but we can ask Apache to look up every host name and write it to the log file in place of the IP address. This is generally not recommended, because the lookups slow down logging considerably and so reduce the efficiency of the whole site. Besides, there are many tools that convert the IP addresses in a log file to host names afterwards, so having Apache record host names instead of IP addresses is not worth the cost.

However, if it is really necessary for Apache to look up the names of remote hosts, we can use the following directive:

HostnameLookups on

If HostnameLookups is set to double instead of on, the logger also performs a forward lookup on the host name it finds, verifying that the name really does point to the IP address that originally appeared. HostnameLookups is off by default.
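As an illustration of doing the lookup after the fact rather than inside Apache, the following Python sketch resolves an IP address taken from a log and applies the same kind of double check (reverse lookup, then forward lookup). The function name resolve_host, its parameters and the optional cache are assumptions made for this example; only the socket calls are standard library functions.

```python
import socket


def resolve_host(ip, reverse=socket.gethostbyaddr,
                 forward=socket.gethostbyname_ex, cache=None):
    """Return a verified host name for ip, or ip itself if lookup fails."""
    if cache is not None and ip in cache:
        return cache[ip]
    result = ip
    try:
        name = reverse(ip)[0]          # reverse lookup: IP address -> name
        addresses = forward(name)[2]   # forward lookup: name -> IP addresses
        if ip in addresses:            # the "double" check
            result = name
    except OSError:
        # No DNS record, or the resolver is unreachable: keep the IP address.
        pass
    if cache is not None:
        cache[ip] = result
    return result
```

Calling resolve_host("216.35.116.91", cache={}) would query the system resolver; passing a dictionary as cache avoids repeating lookups for addresses that appear many times in one log.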

The second item in the sample record is blank, represented by a "-" placeholder. In fact, this is the case most of the time. This position records the visitor's identity, which need not be a login name; it could be an email address or some other unique identifier. The information is returned by identd, or directly by the browser. In the early days, when Netscape 0.9 still dominated, this position often recorded the visitor's email address. However, because people used it to harvest email addresses and send spam, the practice did not last long, and almost all browsers dropped the feature long ago. So today there is very little chance of seeing an email address in the second item of a log record.

The third item in the sample record is also blank. This position records the name the visitor supplied for authentication. It is of course not blank if the requested content requires the user to authenticate, but for most Web sites it remains blank in most log records.

The fourth item in the log record is the time of the request. It is enclosed in square brackets and uses the so-called "Common Log Format" time format, also known as the "Standard English format". In the sample record, the request was made on August 19, 2000 at 14:47:37. The trailing "-0400" indicates that the server's time zone is 4 hours behind UTC.
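The time item can be turned into a real date object with the standard library. The sketch below is one way to do it in Python; the function name parse_access_time is made up for this example, and the month abbreviation is assumed to be in English, as Apache writes it.

```python
from datetime import datetime, timezone


def parse_access_time(stamp):
    """Parse the bracketed access log time, e.g. 19/Aug/2000:14:47:37 -0400."""
    # %b expects an English month abbreviation (the default C locale);
    # %z reads the UTC offset, so the result is timezone-aware.
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")


when = parse_access_time("19/Aug/2000:14:47:37 -0400")
print(when.isoformat())                           # 2000-08-19T14:47:37-04:00
print(when.astimezone(timezone.utc).isoformat())  # 2000-08-19T18:47:37+00:00
```

Converting to UTC, as in the last line, makes records from servers in different time zones directly comparable.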

The fifth item may be the most useful information in the entire record: it tells us what request the server received. Its typical format is "METHOD RESOURCE PROTOCOL".

In the example above, the method is GET; the other methods that appear frequently are POST and HEAD. Many other methods are valid, but these three are the main ones.

RESOURCE is the document, or URL, that the browser requests from the server. In this case the visitor requested "/", that is, the home page or root of the site. In most cases "/" maps to the index.html document in the DocumentRoot directory, but depending on the server configuration it may map to another file.

PROTOCOL is usually HTTP, followed by the version number. The version number is either 1.0 or 1.1, with 1.0 being more common. HTTP is the protocol the Web is built on, and HTTP/1.0 is its earlier version. At the time this article was written, 1.1 was the most recent version, and most Web client programs still used 1.0.
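The three parts of the request item can be separated with a simple split. This Python sketch assumes a well-formed request of exactly three space-separated parts; the function name split_request and the dictionary keys are chosen for illustration.

```python
def split_request(request):
    """Split 'METHOD RESOURCE PROTOCOL' into its parts, or return None."""
    parts = request.split()
    if len(parts) != 3:
        # Malformed or empty requests do occur in real logs.
        return None
    method, resource, protocol = parts
    # The protocol looks like HTTP/1.0; keep the version separately too.
    name, _, version = protocol.partition("/")
    return {
        "method": method,
        "resource": resource,
        "protocol": name,
        "version": version,
    }


print(split_request("GET / HTTP/1.0"))
```

Returning None for anything that does not have three parts keeps malformed requests from being mistaken for valid ones.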

The sixth item of the log record is the status code. It tells us whether the request succeeded, or what kind of error occurred. Most of the time this value is 200, which means the server responded to the browser's request successfully and everything is fine. This article does not give a complete list of status codes and their meanings; please consult other references for that. Generally speaking, though, a status code beginning with 2 indicates success, one beginning with 3 indicates that the request was redirected to another location for some reason, one beginning with 4 indicates some kind of client error, and one beginning with 5 indicates that the server encountered an error.
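The first-digit rule translates directly into a small lookup. The following Python sketch groups status codes into the four classes just described; the class labels and the sample list of codes are invented for the example.

```python
from collections import Counter

# First digit of the status code -> class, as described in the text.
STATUS_CLASSES = {
    "2": "success",
    "3": "redirection",
    "4": "client error",
    "5": "server error",
}


def classify_status(code):
    """Map a status code (int or str) to its class by first digit."""
    return STATUS_CLASSES.get(str(code)[:1], "other")


# Invented sample of status codes, as they might be read from a log.
sample_codes = [200, 200, 304, 404, 500]
summary = Counter(classify_status(code) for code in sample_codes)
print(summary)
```

A summary like this gives a quick impression of how healthy a site is without listing every individual code.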

The seventh item of the log record is the total number of bytes sent to the client. It tells us whether the transfer was interrupted (that is, whether the value matches the size of the file). Adding up these values across log records tells you how much data the server sent in a day, a week, or a month.
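Adding up the byte counts is easy to script. This Python sketch totals the last item per calendar day; it assumes the common log format, where the byte count is the final field and "-" means that no body was sent. The function name and the second and third sample lines are invented for the example.

```python
from collections import defaultdict


def bytes_per_day(lines):
    """Sum the bytes item of access log records, grouped by day."""
    totals = defaultdict(int)
    for line in lines:
        start = line.find("[")
        if start == -1:
            continue                      # no time item: skip the line
        end = line.find(":", start)
        if end == -1:
            continue
        day = line[start + 1:end]         # e.g. 19/Aug/2000
        size = line.rsplit(None, 1)[-1]   # last item of the record
        totals[day] += int(size) if size.isdigit() else 0
    return dict(totals)


# The first line is the sample record; the other two are invented.
sample = [
    '216.35.116.91 - - [19/Aug/2000:14:47:37 -0400] "GET / HTTP/1.0" 200 654',
    '192.0.2.7 - - [19/Aug/2000:15:02:11 -0400] "GET /logo.gif HTTP/1.0" 304 -',
    '192.0.2.7 - - [20/Aug/2000:09:30:00 -0400] "GET /about.html HTTP/1.0" 200 1200',
]
totals = bytes_per_day(sample)
print(totals)
```

Grouping by week or month instead only requires changing how the day key is derived from the time item.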

II. Configuring the access log

The location of the access log file is in fact a configuration option. If we examine the httpd.conf configuration file, we find the following line:

CustomLog /usr/local/apache/logs/access_log common

Note that this line may look slightly different on an older Apache server, which may use the TransferLog directive instead of CustomLog. If your server falls into this category, you are advised to upgrade it as soon as possible.

The CustomLog directive specifies where to save the log file and which format to use. How to customize the format and content of the log file is discussed later in this "Apache Log" series. The line above specifies the common log format, which has been the standard format since Web servers first appeared. This also explains why access logs keep the second item even though almost no client programs supply identity information to the server any more.

The path in the CustomLog directive is the path to the log file. Note that the log files are opened by the user that starts httpd (typically root, before the server switches to the account named in the User directive), so make sure the log directory is not writable by other users; otherwise the files could be overwritten arbitrarily.

The following parts of the Apache Log series go on to describe the Apache error log, the format and content of custom logs, how to write log output to a program instead of a file, how to extract useful statistics from the log files, and so on.

Apache Log Series (2): Error log

The error log, like the access log, is one of Apache's standard logs. This article analyzes the contents of the error log and describes how to set the options related to error logging, the two main classes of error (document errors and CGI errors), and how to view the log conveniently.

I. Location and content

The previous article discussed the Apache access log: its content, its format, and how to set the related options. In this article we discuss the other standard Apache log, the error log.

The error log differs from the access log in both format and content. Like the access log, however, it provides a wealth of information that we can use to analyze how the server is running and where problems occur.

The file name of the error log is error_log, or error.log on Windows. The location of the error log can be set with the ErrorLog directive:

ErrorLog logs/error.log

The file location is relative to the ServerRoot directory unless it begins with "/". With a default installation of Apache the error log should be under /usr/local/apache/logs. However, if Apache was installed with some kind of package manager, the error log is likely to be somewhere else.

As its name suggests, the error log records the various errors encountered while the server is running, as well as some general diagnostic information, such as when the server started and when it shut down.

We can set the level of detail recorded in the error log, which controls how many messages are recorded and of what kind. This is done with the LogLevel directive, which defaults to error, meaning that events of error level and above are recorded. For a complete list of the options this directive allows, see the Apache documentation at http://www.apache.org/docs/mod/core.html#loglevel.

In most cases, the entries we see in the error log fall into two categories: document errors and CGI errors. Occasionally the error log also contains configuration errors, as well as the server startup and shutdown messages mentioned earlier.

II. Document errors

Document errors correspond to the 4xx series of status codes in the server's response. The most common is the 404 error, Document Not Found. Besides 404 errors, user authentication errors are also common.

A 404 error occurs when the resource (that is, the URL) requested by the user does not exist, either because the user typed the URL incorrectly or because a document that used to exist on the server has been deleted or moved.

Incidentally, according to Jakob Nielsen, we should never move or delete any resource on a Web site without providing a redirect or some other remedy. For more of Nielsen's articles, see http://www.zdnet.com/devhead/alertbox/.

When a user requests a document that cannot be opened on the server, the record in the error log looks like this:

[Fri Aug 18 22:36:26 2000] [error] [client 192.168.1.6] File does not exist: /usr/local/apache/bugletdocs/img/south-korea.gif

As you can see, an error record is divided into several items, just like a record in access_log.

An error record begins with a date/time stamp. Note that its format differs from the date/time format in access_log. The format used in access_log is called the "Standard English format", which may be one of history's little jokes, but it is too late to change it now.

The second item in the error record is the level of the record, which indicates the severity of the problem. It can be any of the levels listed in the documentation for the LogLevel directive (see the LogLevel link given earlier); the error level lies between the warn level and the crit level. A 404 is logged at the error level, which means a real problem was encountered but the server can keep running.
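The ordering of levels can be modelled with a short list. The Python sketch below lists the LogLevel values from most to least severe and checks whether a message of a given level would be recorded at a configured level. It is a simplified model that ignores any special cases in how Apache treats individual levels, and the function name is_logged is made up for the example.

```python
# LogLevel values from most severe to least severe.
LOG_LEVELS = ["emerg", "alert", "crit", "error", "warn", "notice", "info", "debug"]


def is_logged(message_level, configured_level="error"):
    """True if a message of message_level passes the configured LogLevel."""
    # A message is recorded when it is at least as severe as the setting.
    return LOG_LEVELS.index(message_level) <= LOG_LEVELS.index(configured_level)


print(is_logged("crit"))            # True: crit is more severe than error
print(is_logged("warn"))            # False: warn is less severe than error
print(is_logged("warn", "debug"))   # True: debug records everything
```

Seen this way, raising the configured level towards debug only ever adds messages; it never hides the more severe ones.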

The third item in the error record indicates the IP address used by the user when making the request.

The last item in the record is the actual error message. For a 404 error it also gives the full path of the file the server tried to access. This is useful when we get a 404 error for a file we expected to be in place: the cause is often a server configuration error, the file belonging to a different virtual host from the one we expected, or some other unexpected circumstance.
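The items of an error record can be pulled apart in much the same way as an access record. The Python sketch below assumes the layout shown in the sample record (time, level, optional client, message, all on one line); the pattern, the group names and the function name parse_error_line are illustrative.

```python
import re
from datetime import datetime

# time, level, optional client address, then the free-form message.
ERROR_PATTERN = re.compile(
    r'\[(?P<time>[^\]]+)\] \[(?P<level>\w+)\] '
    r'(?:\[client (?P<client>[^\]]+)\] )?(?P<message>.*)'
)


def parse_error_line(line):
    """Return the items of one error log record as a dict, or None."""
    match = ERROR_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # The error log time has no UTC offset, unlike the access log time.
    record["time"] = datetime.strptime(record["time"], "%a %b %d %H:%M:%S %Y")
    return record


entry = parse_error_line(
    "[Fri Aug 18 22:36:26 2000] [error] [client 192.168.1.6] "
    "File does not exist: /usr/local/apache/bugletdocs/img/south-korea.gif"
)
# For a "File does not exist" message, the path follows the first colon.
missing_path = entry["message"].split(": ", 1)[1]
print(entry["level"], entry["client"], missing_path)
```

Records without a client item, such as startup and shutdown messages, still match; their client value is simply None.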

The error records that occur because of user authentication issues are as follows:

[Tue Apr 11 22:13:21 2000] [error] [client 192.168.1.3] user rbowen@rcbowen.com: authentication failure for "/cgi-bin/hirecareers/company.cgi": password mismatch

Note that because document errors are a direct result of user requests, they also have records in the access log.

III. CGI errors

Perhaps the most important use of the error log is diagnosing misbehaving CGI programs. To make further analysis easier, everything a CGI program writes to STDERR (standard error) goes directly into the error log. This means that when any well-written CGI program runs into trouble, the error log will tell us more about the problem.

However, sending CGI error output to the error log has a drawback: the error log ends up containing a lot of content that is not in the standard format, which makes it hard for automated error log analyzers to extract useful information.

The following is an example of an error record that appears in the error log when debugging Perl CGI code:

[Wed Jun 14 16:16:37 2000] [error] [client 192.168.1.3] Premature end of script headers: /usr/local/apache/cgi-bin/hypercalpro/announcement.cgi

Global symbol "$RV" requires explicit package name at /usr/local/apache/cgi-bin/hypercalpro/announcement.cgi line 81.

Global symbol "%details" requires explicit package name at /usr/local/apache/cgi-bin/hypercalpro/announcement.cgi line 84.

Global symbol "$Config" requires explicit package name at /usr/local/apache/cgi-bin/hypercalpro/announcement.cgi line 133.

Execution of /usr/local/apache/cgi-bin/hypercalpro/announcement.cgi aborted due to compilation errors.

As you can see, the CGI error starts in the same format as the earlier 404 error: date/time, error level, client address, and error message. But this CGI error is followed by several more lines of error messages, which often confuses error log analysis software.
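One way an analysis script might cope with these extra lines is to treat any line that does not begin with a bracketed time stamp as a continuation of the record before it. The Python sketch below applies that heuristic; it is an assumption made for illustration, not how any particular analyzer works, and it would misfile STDERR output that happens to start with "[". The sample lines are shortened from the records shown in this article.

```python
def group_error_records(lines):
    """Group raw error log lines into records (lists of lines)."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue                  # ignore blank lines
        if line.startswith("[") or not records:
            records.append([line])    # a time stamp starts a new record
        else:
            records[-1].append(line)  # raw STDERR output: continuation
    return records


sample = [
    "[Wed Jun 14 16:16:37 2000] [error] [client 192.168.1.3] Premature end of script headers: /usr/local/apache/cgi-bin/hypercalpro/announcement.cgi",
    'Global symbol "$RV" requires explicit package name at /usr/local/apache/cgi-bin/hypercalpro/announcement.cgi line 81.',
    "Execution of /usr/local/apache/cgi-bin/hypercalpro/announcement.cgi aborted due to compilation errors.",
    "[Fri Aug 18 22:36:26 2000] [error] [client 192.168.1.6] File does not exist: /usr/local/apache/bugletdocs/img/south-korea.gif",
]
records = group_error_records(sample)
for record in records:
    print(len(record), "line(s):", record[0][:60])
```

Each group can then be handled as a single error, with the first line supplying the time, level and client, and the remaining lines supplying the details.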

With error messages like these, even people unfamiliar with Perl can learn a lot about the error; at the very least they can easily tell which lines of code are at fault. Perl's error reporting is quite thorough. Of course, different programming languages write different information to the error log.

Because of the peculiarities of the CGI environment, most CGI program errors would be very hard to solve without the help of the error log.

Many people complain on mailing lists or newsgroups that they have a CGI program and that the server returns an error such as "Internal Server Error" when the page is opened. We can be fairly sure that these people have never looked at the server's error log, or do not even know that it exists. In most cases the error log pinpoints exactly where the CGI error is and how to fix it.

IV. Viewing the log files

I often tell people that while developing I keep an eye on the server logs constantly, so that I know immediately when something goes wrong. The response I get is often silence. At first I thought this silence meant "of course, everyone does that"; later I realized that what it really meant was "I don't know what other people do, but I never do that myself."

Even so, let's look at how to view the server log files conveniently. Connect to the server with Telnet and enter the following command:

tail -f /usr/local/apache/logs/error_log

The command will display the last few lines of the log file, and if new content is added to the log file, it will immediately display the newly added content.

Windows users can use the same approach with one of the various UNIX tool packages for Windows. My personal favorite is a tool called Aintx, which can be found at http://maxx.mc.net/~jlh/nttools/index.htm.

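An alternative approach is to use the following Perl code, which relies on a module called File::Tail:

use File::Tail;

# Open the log file and follow it as new lines are appended.
$file = File::Tail->new("/some/log/file");

# read() blocks until another line is available.
while (defined($line = $file->read)) {
    print "$line";
}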
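For readers without Perl at hand, the same idea can be sketched in Python using only the standard library. This is a simple polling loop written for illustration, not a port of File::Tail: it does not notice log rotation, and the function name follow and its parameters are invented for the example.

```python
import time


def follow(path, poll_interval=1.0, max_idle_polls=None):
    """Yield lines appended to path, roughly like tail -f.

    With max_idle_polls=None the generator never stops; otherwise it
    ends after that many consecutive polls without new data.
    """
    with open(path, "r") as handle:
        handle.seek(0, 2)                 # start at the end of the file
        idle = 0
        while max_idle_polls is None or idle < max_idle_polls:
            line = handle.readline()
            if line:
                idle = 0
                yield line
            else:
                idle += 1
                time.sleep(poll_interval)  # wait for the server to write more
```

A call such as `for line in follow("/some/log/file"): print(line, end="")` then prints new entries as they arrive, until the script is interrupted.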

Whichever method you use, it is good practice to keep several terminal windows open at the same time: for example, one showing the error log and another showing the access log. That way we always know what is happening on the site and can resolve problems immediately.

In the next article in the Apache Log series, we will talk about customizing the server logs, that is, how to record all the information we want in a log file and leave out everything we don't want.

After that, we will discuss processing log files, that is, how to generate statistical reports from them. In the last few articles we will discuss how to redirect logging to a program instead of a file, so that the program can process newly generated log data in real time, for example saving the log data to a database, or emailing the system administrator when a critical error occurs, and so on.

Apache Log Series (3): Custom log

Sometimes we need to customize the format and content of Apache's default logs, for example adding or removing logged information or changing the format of the default log file. This article describes the information that is available for logging and how to set up Apache to record it.

I. Defining the log format (April 3)

Long ago, log files had only one format, the "common" format, and many people grew accustomed to it. Then custom log formats appeared, and they seem to have become more popular; in fact, even the common log format itself is now defined using the custom log format mechanism. This article explains how to customize the format of log files and how to make them record the information you want.

Customizing the log format involves two directives, LogFormat and CustomLog; the default httpd.conf file provides several examples of both.

The LogFormat directive defines a format and assigns it a name, which we can then refer to directly. The CustomLog directive sets the log file and indicates which format that file uses (usually by the name of the format).

For example, in the default httpd.conf file we can find the following line:

LogFormat "%h %l %u %t \"%r\" %>s %b" common

This directive creates a log format named "common"; the format itself is given in double quotes. Each variable in the format string stands for a specific piece of information, and the pieces are written to the log file in the order given by the format string.
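Because the format string fixes the order of the items, a script can derive a matching pattern from it. The Python sketch below translates the "common" format string (written without the configuration file's backslash escapes) into a regular expression. It knows only the seven variables used by that format; the table of patterns, the group names and the function name format_to_regex are assumptions made for this example.

```python
import re

# Regex fragment for each variable of the "common" format only.
DIRECTIVE_PATTERNS = {
    "%h": r'(?P<host>\S+)',
    "%l": r'(?P<ident>\S+)',
    "%u": r'(?P<user>\S+)',
    "%t": r'\[(?P<time>[^\]]+)\]',
    "%r": r'(?P<request>[^"]*)',
    "%>s": r'(?P<status>\d{3})',
    "%b": r'(?P<bytes>\d+|-)',
}


def format_to_regex(log_format):
    """Translate a LogFormat string into a compiled regular expression."""
    tokens = sorted(DIRECTIVE_PATTERNS, key=len, reverse=True)
    pattern = ""
    position = 0
    while position < len(log_format):
        for token in tokens:
            if log_format.startswith(token, position):
                pattern += DIRECTIVE_PATTERNS[token]
                position += len(token)
                break
        else:
            if log_format[position] == "%":
                raise ValueError("unsupported variable at offset %d" % position)
            # Anything else (spaces, quotes) must appear literally.
            pattern += re.escape(log_format[position])
            position += 1
    return re.compile(pattern + r"$")


COMMON_REGEX = format_to_regex('%h %l %u %t "%r" %>s %b')
match = COMMON_REGEX.match(
    '216.35.116.91 - - [19/Aug/2000:14:47:37 -0400] "GET / HTTP/1.0" 200 654'
)
print(match.groupdict())
```

Supporting further variables would mean adding entries to the table; variables that are not listed raise an error rather than silently producing a pattern that never matches.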

The Apache documentation lists all the variables that can be used in the format string and their meanings, as follows:

----------------------------------------------------------------------

%...a: Remote IP address

%...A: Local IP address

%...B: Number of bytes sent, excluding HTTP headers

%...b: Number of bytes sent, excluding HTTP headers, in CLF format; that is, when no data is sent, '-' is written instead of 0

%...{FOOBAR}e: The contents of the environment variable FOOBAR

%...f: File name

%...h: Remote host

%...H: The request protocol

%...{Foobar}i: The contents of the Foobar: header line(s) in the request sent to the server

%...l: Remote login name (from identd, if supplied)

%...m: The request method

%...{Foobar}n: The contents of the note "Foobar" from another module

%...{Foobar}o: The contents of the Foobar: header line(s) in the reply

%...p: The port on which the server answered the request

%...P: The ID of the child process that serviced the request

%...q: The query string (if a query string exists, the "?" followed by the rest; otherwise an empty string)

%...r: The first line of the request

%...s: Status. For requests that were internally redirected, this is the status of the *original* request; use %...>s for the status of the final request

%...t: The time, in Common Log Format time format (or Standard English format)
