Use a custom PHP application to obtain status information from the Web server. Most website hosting companies give customers access to website statistics, but the status information generated by the server often feels incomplete. For example, if a misconfigured Web server does not recognize certain file types, files of those types will not appear in the status information. Fortunately, you can use PHP to write your own status collection program so that you get the information you need.
Structure of Common Logfile Format (CLF)
CLF was originally designed by NCSA for HTTPd (its Web server software). CERN HTTPd is a public domain Web server maintained by the World Wide Web Consortium (W3C), and the W3C website lists the log file specification. Both Microsoft and UNIX-based Web servers can generate log files in CLF. The format is as follows:
Host Ident Authuser [Time_Stamp] "Request" Status_Code File_Size
For example:
21.53.48.83 - - [22/Apr/2002:22:19:12 -0500] "GET /cnet.gif HTTP/1.0" 200 8237
The fields of a log entry are as follows:
Host is the IP address or DNS name of the website visitor. In the preceding example, Host is 21.53.48.83.
Ident is the remote identity of the visitor (RFC 931). A dash means "not specified".
Authuser is the user ID, if the Web server has authenticated the website visitor.
Time_Stamp is the time reported by the server, in the format day/month/year:hour:minute:second followed by the time zone offset.
Request is the HTTP request made by the website visitor, such as a GET or POST.
Status_Code is the status code returned by the server. For example, "200" means "OK": the browser request succeeded.
File_Size is the size of the file requested by the user. In this example, it is 8237 bytes.
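To make the field layout concrete, here is a minimal sketch that splits one CLF line back into the seven fields listed above. The function name parse_clf_line and the array keys are hypothetical, chosen for illustration, and the regular expression assumes a well-formed line with single spaces between fields.

```php
<?php
// Hypothetical helper (not part of the original application):
// split one CLF line into its seven fields.
// Returns null when the line does not match the expected layout.
function parse_clf_line(string $line): ?array
{
    $pattern = '/^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-)$/';
    if (!preg_match($pattern, trim($line), $m)) {
        return null;
    }
    return [
        'host'      => $m[1],
        'ident'     => $m[2],
        'authuser'  => $m[3],
        'timestamp' => $m[4],
        'request'   => $m[5],
        'status'    => (int) $m[6],
        // A dash in the size field is treated as zero bytes.
        'size'      => $m[7] === '-' ? 0 : (int) $m[7],
    ];
}

$entry = parse_clf_line('21.53.48.83 - - [22/Apr/2002:22:19:12 -0500] "GET /cnet.gif HTTP/1.0" 200 8237');
echo $entry['host'], ' ', $entry['status'], ' ', $entry['size'], PHP_EOL;
```

Run against the example entry, this prints the host, status code, and file size.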
Server Status Code
You can find the server status code specifications developed by W3C in the HTTP standard. The status code generated by the server indicates whether the exchange between the browser and the server succeeded. These codes are generally passed to the browser (for example, the well-known 404 "Not Found" error) or added to the server log.
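As a rough illustration of how status codes are grouped, the first digit of a code identifies its class. The sketch below maps a code to that class; the function name status_class and the label strings are assumptions made for this example, not part of any standard API.

```php
<?php
// Hypothetical helper: name the class of an HTTP status code
// from its first digit (1xx to 5xx).
function status_class(int $code): string
{
    $classes = [
        1 => 'informational',
        2 => 'success',
        3 => 'redirection',
        4 => 'client error',
        5 => 'server error',
    ];
    // intdiv() requires PHP 7 or later.
    return $classes[intdiv($code, 100)] ?? 'unknown';
}

echo status_class(200), PHP_EOL;
echo status_class(404), PHP_EOL;
```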
Collect data
The first step in creating a custom application is to obtain user data. Whenever a user requests a website resource, we want to create a corresponding log entry. Fortunately, server variables let us obtain data about the user's browser and request.
Server variables carry the information transmitted from the browser to the server. REMOTE_ADDR is an example of a server variable; it returns the user's IP address. The following PHP code displays the IP address of the current user:
echo $_SERVER['REMOTE_ADDR'];
Example output: 27.234.125.222
Let's take a look at the code of our PHP application. First, we need to define the website resources we want to track and specify the file size:
// Name and size of the file we want to record
$fileName = "cnet-banner.gif";
$fileSize = "92292";
You do not need to hard-code these values in variables. If you want to track many entries, you can save them in an array or a database. In that case, you may want to look up each entry through an external link that identifies the record for a file such as cnet-banner.gif.
Then, we use server variables to query the user's browser. In this way, we obtain the data required to add a new entry to our log file:
// Obtain the CLF information of the website visitor
$host = $_SERVER['REMOTE_ADDR'];
// REMOTE_IDENT and REMOTE_USER are often not set, so default them to "" (PHP 7+)
$ident = $_SERVER['REMOTE_IDENT'] ?? "";
$auth = $_SERVER['REMOTE_USER'] ?? "";
$timeStamp = date("d/M/Y:H:i:s O");
$reqType = $_SERVER['REQUEST_METHOD'];
$servProtocol = $_SERVER['SERVER_PROTOCOL'];
$statusCode = "200";
Then, we check whether the server returned any empty values. According to the CLF specification, an empty value should be replaced by a dash. The next code block therefore looks for empty values and replaces each one with a dash:
// Replace empty values with a dash (according to the specification)
if ($host == "") { $host = "-"; }
if ($ident == "") { $ident = "-"; }
if ($auth == "") { $auth = "-"; }
if ($reqType == "") { $reqType = "-"; }
if ($servProtocol == "") { $servProtocol = "-"; }
Once the necessary information is obtained, the values are assembled into a CLF-compliant string:
// Create a string in CLF format
$clfString = $host . " " . $ident . " " . $auth . " [" . $timeStamp . "] \"" . $reqType . " /" . $fileName . " " . $servProtocol . "\" " . $statusCode . " " . $fileSize . "\r\n";
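The collection steps above (read the server variables, replace empty values with a dash, assemble the string) can also be sketched as a single function. This is only one possible arrangement: the name build_clf_line and its parameters are hypothetical, and passing the server array and timestamp in as arguments is a choice made here so the function can be tried outside a Web request.

```php
<?php
// Hypothetical helper: build one CLF line from an array shaped like $_SERVER.
// Missing or empty values become "-", as the CLF convention requires.
function build_clf_line(array $server, string $fileName, int $fileSize, int $statusCode, int $time): string
{
    $field = function (string $key) use ($server): string {
        $value = isset($server[$key]) ? (string) $server[$key] : '';
        return $value === '' ? '-' : $value;
    };

    return sprintf(
        "%s %s %s [%s] \"%s /%s %s\" %d %d\r\n",
        $field('REMOTE_ADDR'),
        $field('REMOTE_IDENT'),
        $field('REMOTE_USER'),
        date('d/M/Y:H:i:s O', $time),
        $field('REQUEST_METHOD'),
        $fileName,
        $field('SERVER_PROTOCOL'),
        $statusCode,
        $fileSize
    );
}

// Inside a Web request, $_SERVER holds the visitor's data.
echo build_clf_line($_SERVER, 'cnet-banner.gif', 92292, 200, time());
```

When run from the command line rather than through a Web server, the request-related fields are simply written as dashes.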
Create a custom log file
Now the formatted data can be stored in our custom log file. First, we establish a file naming convention and write the code that generates a new log file each day. In the example given in this article, each file name starts with "weblog-", followed by the date in month/day/year order, and ends with the .log extension. The .log extension generally indicates a server log file. (In fact, most log analyzers search for .log files.)
// Use the current date to name the log file
$logPath = "./log/";
$logFile = $logPath . "weblog-" . date("mdy") . ".log";
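The naming convention can be wrapped in a small function so that it can be checked for any date. The sketch below assumes a hypothetical function name, daily_log_file, and takes the timestamp as a parameter instead of always using the current time.

```php
<?php
// Hypothetical helper: build the daily log file name for a given timestamp,
// following the "weblog-" + month/day/year + ".log" convention.
function daily_log_file(string $logDir, int $time): string
{
    return rtrim($logDir, '/') . '/weblog-' . date('mdy', $time) . '.log';
}

echo daily_log_file('./log/', time()), PHP_EOL;
```

Because the date is part of the name, the path changes at midnight (in the server's configured time zone) and a new file is started.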
Now, we need to determine whether the current log file exists. If it exists, we append entries to it; otherwise, the application creates a new log file. (A new log file is generally created when the date changes, because the file name changes.)
// Check whether the log file already exists
if (file_exists($logFile)) {
    // If so, open the existing log file for appending
    $fileWrite = fopen($logFile, "a");
} else {
    // Otherwise, create a new log file
    $fileWrite = fopen($logFile, "w");
}
If you receive a "Permission denied" error message when writing or appending to a file, change the permissions of the target log folder to allow write operations. The default permission on most Web servers is "readable and executable". You can use the chmod command or an FTP client to change the folder permissions.
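One way to catch permission problems before writing is to check the log folder up front. The following is a minimal sketch, assuming a hypothetical helper name, ensure_log_dir, and a 0755 mode for newly created folders; the right mode depends on how your host runs PHP.

```php
<?php
// Hypothetical helper: make sure the log directory exists and is writable.
// Returns false if it cannot be created or PHP may not write to it.
function ensure_log_dir(string $dir): bool
{
    if (!is_dir($dir) && !mkdir($dir, 0755, true)) {
        return false;
    }
    return is_writable($dir);
}

// Example using a folder under the system temp directory (an arbitrary choice).
$dir = sys_get_temp_dir() . '/weblog-example';
if (!ensure_log_dir($dir)) {
    error_log('Log directory is not writable: ' . $dir);
}
```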
Then, we create a file lock mechanism so that when two or more users access the log file at the same time, only one of them can write to the file:
// Acquire an exclusive (write) lock; LOCK_SH would only be a shared read lock
flock($fileWrite, LOCK_EX);
Finally, we write the content of the entry:
// Write the CLF entry
fwrite($fileWrite, $clfString);
// Unlock the file
flock($fileWrite, LOCK_UN);
// Close the log file
fclose($fileWrite);
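The open, lock, write, unlock, and close steps can be gathered into one function with basic error handling. This is a sketch under a few assumptions: the function name append_log_entry is hypothetical, and it relies on fopen mode "a", which appends to an existing file and creates the file when it is missing.

```php
<?php
// Hypothetical helper: append one entry to a log file under an exclusive lock.
function append_log_entry(string $logFile, string $entry): bool
{
    // Mode "a" appends and creates the file if it does not exist yet.
    $handle = fopen($logFile, 'a');
    if ($handle === false) {
        return false;
    }
    $written = false;
    if (flock($handle, LOCK_EX)) {
        $written = fwrite($handle, $entry) !== false;
        fflush($handle);          // flush buffered data before releasing the lock
        flock($handle, LOCK_UN);
    }
    fclose($handle);
    return $written;
}

// Example: write two entries to a temporary file and show the result.
$tmp = tempnam(sys_get_temp_dir(), 'clf');
append_log_entry($tmp, "first entry\r\n");
append_log_entry($tmp, "second entry\r\n");
echo file_get_contents($tmp);
unlink($tmp);
```

For simple cases, PHP's built-in file_put_contents() with the FILE_APPEND | LOCK_EX flags performs a comparable locked append in a single call.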
Process log data
Once the system is in production, customers will want a detailed statistical analysis of the collected visitor data. Since all of the custom log files follow a standard format, any log analyzer can process them. A log analyzer is a tool that analyzes large log files and generates pie charts, histograms, and other statistical graphs. Log analyzers are also used to report which users access your website and how many hits it receives.
The following are several popular log analyzers:
WebTrends is a very good log analyzer, suitable for large websites and enterprise-level networks.
Analog is a popular free log analyzer.
Webalizer is a free analysis program. It generates HTML reports, so its reports can be viewed in most Web browsers.
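For a quick check that does not need a full analyzer, the log lines can be tallied directly in PHP. The sketch below counts entries per status code; the function name count_by_status and the sample lines are made up for illustration, and the pattern assumes each line ends with the status code and file size, as in the CLF layout.

```php
<?php
// Hypothetical helper: count log entries per status code.
function count_by_status(array $lines): array
{
    $counts = [];
    foreach ($lines as $line) {
        // The status code is the three-digit field right after the quoted request.
        if (preg_match('/" (\d{3}) (?:\d+|-)\s*$/', $line, $m)) {
            $counts[$m[1]] = ($counts[$m[1]] ?? 0) + 1;
        }
    }
    ksort($counts);
    return $counts;
}

// Sample lines invented for this example.
$sample = [
    '21.53.48.83 - - [22/Apr/2002:22:19:12 -0500] "GET /cnet.gif HTTP/1.0" 200 8237',
    '21.53.48.84 - - [22/Apr/2002:22:19:15 -0500] "GET /missing.gif HTTP/1.0" 404 0',
    '21.53.48.85 - - [22/Apr/2002:22:20:01 -0500] "GET /cnet.gif HTTP/1.0" 200 8237',
];
print_r(count_by_status($sample));
```

To run it on a real log, the lines could be read with file($path, FILE_IGNORE_NEW_LINES) and passed to the same function.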
Compliance with standards
We can easily extend this application to support other types of logging. In this way, you can capture more data, such as the browser type and the referrer (the previous webpage that linked to the current webpage). The lesson here is that following standards and conventions in your programming will ultimately simplify your work.
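As one possible extension, the referrer and browser type can be appended to each entry as two extra quoted fields, which is the layout used by the widely supported combined log format. The sketch below assumes a hypothetical function name, append_referrer_and_agent, and reads the values from the standard HTTP_REFERER and HTTP_USER_AGENT server variables.

```php
<?php
// Hypothetical helper: extend a CLF line with quoted referrer and user-agent fields.
function append_referrer_and_agent(string $clfLine, array $server): string
{
    $quoted = function (string $key) use ($server): string {
        $value = isset($server[$key]) ? (string) $server[$key] : '';
        // Strip double quotes so the field stays easy to parse.
        $value = str_replace('"', '', $value);
        return '"' . ($value === '' ? '-' : $value) . '"';
    };

    return rtrim($clfLine, "\r\n")
        . ' ' . $quoted('HTTP_REFERER')
        . ' ' . $quoted('HTTP_USER_AGENT')
        . "\r\n";
}

$line = "21.53.48.83 - - [22/Apr/2002:22:19:12 -0500] \"GET /cnet.gif HTTP/1.0\" 200 8237\r\n";
echo append_referrer_and_agent($line, $_SERVER);
```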