Manually Analyzing Nginx Logs
For most people, Nginx logs are an untapped treasure. Drawing on past experience building a log analysis system, this article shares a method for analyzing Nginx logs by hand.
Nginx log configuration involves two directives: access_log and log_format.
Default format:
access_log /data/logs/nginx-access.log;
log_format old '$remote_addr [$time_local] $status $request_time $body_bytes_sent '
'"$request" "$http_referer" "$http_user_agent"';
Most people who have used Nginx are familiar with this default log format and its content. However, while the default format is human-readable, it is hard to compute over.
Nginx also supports policies for flushing logs to disk. For example, with a 32 KB buffer, the log is written to disk when the buffer fills, or forcibly flushed every 5 seconds if it has not filled:
access_log /data/logs/nginx-access.log buffer=32k flush=5s;
These settings determine whether you can view logs in real time and how much disk I/O logging generates.
Nginx can record many variables that do not appear in the default configuration, for example:
- Request data size: $request_length
- Returned data size: $bytes_sent
- Request time: $request_time
- Connection serial number: $connection
- Number of requests on the current connection: $connection_requests
The default Nginx format is not computable, so you need to convert it to a computable one, for example by separating fields with the control character ^A (typed as Ctrl+V then Ctrl+A in the terminal).
The format of log_format can be changed to the following:
log_format new '$remote_addr^A$http_x_forwarded_for^A$host^A$time_local^A$status^A'
'$request_time^A$request_length^A$bytes_sent^A$http_referer^A$request^A$http_user_agent';
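A line in this format might then look like the following, with ^A standing for the invisible separator (the values are hypothetical):
203.0.113.7^A-^Awww.example.com^A21/Apr/2015:18:20:34 +0800^A200^A0.023^A612^A1532^Ahttp://example.com/^AGET /index.html HTTP/1.1^AMozilla/5.0 (compatible; ExampleBot/1.0)
The buffering options shown earlier combine with the new format name in the same access_log directive:
access_log /data/logs/nginx-access.log new buffer=32k flush=5s;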
Logs in this format can then be analyzed with common Linux command-line tools:
- Find the most frequently accessed URLs and their counts (a ranked top-10 variant appears after this list):
cat access.log | awk -F '^A' '{print $10}' | sort | uniq -c
- Find requests in the current log file that returned a 500 error:
cat access.log | awk -F '^A' '{if($5 == 500) print $0}'
- Count the 500 errors in the current log file:
cat access.log | awk -F '^A' '{if($5 == 500) print $0}' | wc -l
- Count the 500 errors within one minute:
cat access.log | awk -F '^A' '{if($5 == 500) print $0}' | grep '09:00' | wc -l
- Find slow requests that take more than 1 s:
tail -f access.log | awk -F '^A' '{if($6 > 1) print $0}'
- Suppose you only want to view certain fields:
tail -f access.log | awk -F '^A' '{if($6 > 1) print $3"|"$4}'
- Find the URLs with the most 502 errors ($10 is the $request field in this format):
cat access.log | awk -F '^A' '{if($5 == 502) print $10}' | sort | uniq -c
- Find 200 responses that returned blank pages (fewer than 100 bytes sent):
cat access.log | awk -F '^A' '{if($5 == 200 && $8 < 100) print $3"|"$4"|"$10"|"$6}'
- View the real-time log stream:
tail -f access.log | cat -e
or
tail -f access.log | tr '^A' '|'
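As promised in the first list item, the same pattern extends to ranked and aggregate queries. The two sketches below follow the field layout defined above; the top-10 cutoff is an arbitrary choice:
- Find the top 10 URLs by access count:
cat access.log | awk -F '^A' '{print $10}' | sort | uniq -c | sort -rn | head -10
- Compute the average request time over the whole file:
cat access.log | awk -F '^A' '{sum += $6; n++} END {if (n) print sum/n}'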
Summary
Following this idea, many other analyses can be done the same way: the most frequently seen user agents, the most frequent client IPs, request-time distributions, response-size distributions, and so on; a sketch of the first two follows below.
This is the prototype of a large-scale web log analysis system, and this format is also very convenient for later batch and stream processing.
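For example, the user-agent and client-IP analyses mentioned above reduce to the same one-liner pattern (a sketch, assuming the ^A field layout defined earlier):
- Find the top 10 client IPs:
cat access.log | awk -F '^A' '{print $1}' | sort | uniq -c | sort -rn | head -10
- Find the top 10 user agents:
cat access.log | awk -F '^A' '{print $11}' | sort | uniq -c | sort -rn | head -10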