For most people, Nginx logs are an untapped treasure. Having previously built a log analysis system, I summarized that experience here to share a purely manual method of analyzing Nginx logs.
Nginx logging is configured in two places: access_log and log_format.
The default configuration:
access_log /data/logs/nginx-access.log;
The default 'combined' format this uses is equivalent to:
'$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"';
Most people who have used Nginx will be familiar with this default configuration and with what the resulting log looks like. The default format is readable, but it is awkward to compute over.
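For example, a line in the default format looks roughly like this (hypothetical values):
192.168.1.10 - - [02/Sep/2014:10:05:13 +0800] "GET /index.html HTTP/1.1" 200 1234 "-" "Mozilla/5.0"
The mix of quoted and unquoted space-separated fields is easy for a human to read but awkward to split into columns.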
The policy for flushing the Nginx log to disk can also be configured.
For example, to buffer log writes and flush to disk when the buffer reaches 32k, or force a flush every 5s if the buffer has not filled:
access_log /data/logs/nginx-access.log buffer=32k flush=5s;
This determines whether log lines appear in real time and how much disk I/O logging generates.
Nginx can also log a number of variables that do not appear in the default configuration, for example (see the log_format sketch after this list):
Request data size: $request_length
Response data size: $bytes_sent
Request time: $request_time
Connection serial number: $connection
Number of requests on the current connection: $connection_requests
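A minimal sketch of a log_format that records these extra variables; the format name timing_log is my own placeholder, not from the original article:
log_format timing_log '$remote_addr [$time_local] "$request" $status '
    '$request_length $bytes_sent $request_time '
    '$connection $connection_requests';
access_log /data/logs/nginx-access.log timing_log;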
The default Nginx format is not computation-friendly, so you need to convert it into a computable format, for example by separating every field with the control character ^A (typed in the terminal as Ctrl+V then Ctrl+A).
The log_format can be changed to this:
'$remote_addr^A$http_x_forwarded_for^A$host^A$time_local^A$status^A'
'$request_time^A$request_length^A$bytes_sent^A$http_referer^A$request^A$http_user_agent';
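Note that ^A in the commands below stands for the single control byte 0x01, not a caret followed by the letter a; type it inside the quotes with Ctrl+V Ctrl+A. In bash you can also pass it with ANSI-C quoting, an alternative of my own rather than something from the original article:
awk -F $'\x01' '{print $5}' access.log | sort | uniq -c
This counts status codes ($5 is $status in the format above).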
The log can then be analyzed with common Linux command-line tools:
1. Find the most frequently accessed URLs and their counts:
cat access.log | awk -F '^A' '{print $10}' | sort | uniq -c
2. Find the requests in the current log file that returned a 500 error:
cat access.log | awk -F '^A' '{if($5 == 500) print $0}'
3. Count the 500 errors in the current log file:
cat access.log | awk -F '^A' '{if($5 == 500) print $0}' | wc -l
4. Count the 500 errors within one minute (replace 09:00 with the minute you care about):
cat access.log | awk -F '^A' '{if($5 == 500) print $0}' | grep '09:00' | wc -l
5. Find slow requests that take more than 1s:
tail -f access.log | awk -F '^A' '{if($6 > 1) print $0}'
If you only want to see certain fields, such as the host and the time:
tail -f access.log | awk -F '^A' '{if($6 > 1) print $3 "|" $4}'
6. Find the URLs with the most 502 errors ($10 is the request line):
cat access.log | awk -F '^A' '{if($5 == 502) print $10}' | sort | uniq -c
7. Find 200 responses that returned a blank page (here, fewer than 100 bytes sent; adjust the threshold to taste):
cat access.log | awk -F '^A' '{if($5 == 200 && $8 < 100) print $3 "|" $4 "|" $11 "|" $6}'
8. Watch the real-time log data stream:
tail -f access.log | cat -e
or, to display the ^A separators as pipes:
tail -f access.log | tr '^A' '|'
(Here again ^A is the literal control character, typed as Ctrl+V Ctrl+A.)
9. Count the IPs with the most accesses:
tail -n 10000 access.log | awk -F '^A' '{print $1}' | sort | uniq -c | sort -rn | head -10 | more
10. Find the busiest times of day in the log ($4 is $time_local, e.g. 02/Sep/2014:10:05:13, so cut -c 13-17 extracts hour:minute; adjust the range if your timestamps differ):
tail -n 10000 access.log | awk -F '^A' '{print $4}' | cut -c 13-17 | sort | uniq -c | sort -rn | head -10 | more
Summary
Many other analyses can be done the same way: the most frequent user agents, the IPs with the most accesses, request latency analysis, response size analysis, and so on.
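For instance, a sketch of the user-agent tally under the ^A format above ($11 is $http_user_agent):
cat access.log | awk -F '^A' '{print $11}' | sort | uniq -c | sort -rn | head -10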
What you end up with is a prototype of a large web log analysis system, and the same computable format remains very handy later for large-scale batch and stream processing.
Link: http://blog.eood.cn/nginx_logs
Quite a while ago someone in this forum asked about Nginx log analysis. I happened to see this article today, tried it out myself, and it works well, so I'm sharing it here :)