Linux Log Analysis Script

Source: Internet
Author: User

Note: adjust the field numbers in the commands below to match your Apache log format.

View the number of Apache processes
ps aux | grep httpd | grep -v grep | wc -l

Count established TCP connections on port 80
netstat -tan | grep "ESTABLISHED" | grep ":80" | wc -l

Sniff port 80 traffic with tcpdump to see which source IPs send the most packets
tcpdump -i eth0 -tnn dst port 80 -c 1000 | awk -F"." '{print $1"."$2"."$3"."$4}' | sort | uniq -c | sort -nr

Count the distinct IPs that connected during a time period
grep "2014:0[7-8]" www20110519.log | awk '{print $1}' | sort | uniq -c | sort -nr | wc -l

The 20 addresses with the most connections to the current web server
netstat -ntu | awk '{print $5}' | sort | uniq -c | sort -n -r | head -n 20

View the 10 client IPs that appear most often in a log
awk '{print $1}' access.log | sort | uniq -c | sort -nr | head -n 10
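The counting idiom used throughout this page (extract a field, sort, `uniq -c`, sort by count) can be tried without a real log. The following is a minimal sketch; the function name `top_ips` and the sample lines are illustrative assumptions, and field 1 is assumed to hold the client address.

```shell
# top_ips N: read access-log lines on standard input and print the N
# most frequent values of field 1 (assumed to be the client IP).
top_ips() {
    awk '{print $1}' | sort | uniq -c | sort -nr | head -n "$1"
}

# Hypothetical sample data in common log format.
top_ips 2 <<'EOF'
192.0.2.1 - - [04/May/2012:10:00:01 +0800] "GET /index.php HTTP/1.1" 200 512
192.0.2.2 - - [04/May/2012:10:00:01 +0800] "GET /a.exe HTTP/1.1" 200 300000
192.0.2.1 - - [04/May/2012:10:00:02 +0800] "GET /index.php HTTP/1.1" 404 128
192.0.2.3 - - [04/May/2012:10:00:02 +0800] "GET /b.php HTTP/1.1" 200 256
192.0.2.1 - - [04/May/2012:10:00:03 +0800] "GET /b.php HTTP/1.1" 200 256
192.0.2.2 - - [04/May/2012:10:00:03 +0800] "GET /index.php HTTP/1.1" 200 512
EOF
```

With this sample, 192.0.2.1 is listed first with a count of 3, followed by 192.0.2.2 with a count of 2. The first `sort` is needed because `uniq` only merges adjacent equal lines.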

View the most requested files among the last 10000 log entries
cat access_log | tail -n 10000 | awk '{print $7}' | sort | uniq -c | sort -nr | less

List files whose transfer took longer than 30 seconds
cat access_log | awk '($NF > 30) {print $7}' | sort -n | uniq -c | sort -nr | head -n 20

List the most time-consuming pages (more than 60 seconds) and how many times each occurs
cat access_log | awk '($NF > 60 && $7 ~ /\.php/) {print $7}' | sort -n | uniq -c | sort -nr | head -n 100

Find the 20 IPs with the most connections (often used to locate the source of an attack)
netstat -anlp | grep 80 | grep tcp | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr | head -n 20

Sniff port 80 traffic with tcpdump and list the top 20 source IPs
tcpdump -i eth0 -tnn dst port 80 -c 1000 | awk -F"." '{print $1"."$2"."$3"."$4}' | sort | uniq -c | sort -nr | head -n 20

Find the addresses with the most TIME_WAIT connections
netstat -n | grep TIME_WAIT | awk '{print $5}' | sort | uniq -c | sort -rn | head -n 20

Find the addresses with the most SYN connections
netstat -an | grep SYN | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr | more

List the 20 most frequent URLs in the log for 2012-05-04, sorted by count
cat access.log | grep '04/May/2012' | awk '{print $11}' | sort | uniq -c | sort -nr | head -n 20

Find the client IPs whose requests have a field 11 URL containing a given site name (replace SITE with the name to match)
cat access_log | awk '($11 ~ /SITE/) {print $1}' | sort | uniq -c | sort -nr

The 20 most visited files or pages, and a count of all distinct client IPs
cat access.log | awk '{print $11}' | sort | uniq -c | sort -nr | head -n 20
awk '{print $1}' access.log | sort -n -r | uniq -c | wc -l

Query the log for activity at specific times
cat wangsu.log | egrep '06/Sep/2012:14:35|06/Sep/2012:15:05' | awk '{print $1}' | sort | uniq -c | sort -nr | head -n 10

List the largest EXE files transferred (useful when analyzing a download site)
cat access.log | awk '($7 ~ /\.exe/) {print $10 " " $1 " " $4 " " $7}' | sort -nr | head -n 20

List EXE files larger than 200000 bytes (about 200 KB) and how many times each was requested
cat access.log | awk '($10 > 200000 && $7 ~ /\.exe/) {print $7}' | sort -n | uniq -c | sort -nr | head -n 100

If the last column of the log records the page transfer time, list the most time-consuming pages served to clients
cat access.log | awk '($7 ~ /\.php/) {print $NF " " $1 " " $4 " " $7}' | sort -nr | head -n 100

List the most time-consuming pages (more than 60 seconds) and how many times each occurs
cat access.log | awk '($NF > 60 && $7 ~ /\.php/) {print $7}' | sort -n | uniq -c | sort -nr | head -n 100

List files whose transfer took longer than 30 seconds
cat access.log | awk '($NF > 30) {print $7}' | sort -n | uniq -c | sort -nr | head -n 20

Total website traffic (GB)
cat access.log | awk '{sum += $10} END {print sum/1024/1024/1024}'
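As an illustration of the traffic total, the sketch below sums field 10 and prints the result in GB. The function name `total_gb` and the sample lines are assumptions, and field 10 is assumed to be the response size in bytes. Unlike the one-liner, it skips entries whose size is logged as "-".

```shell
# total_gb: read access-log lines on standard input and print the sum of
# field 10 (assumed to be bytes sent) in GB, with six decimal places.
total_gb() {
    # LC_ALL=C keeps the decimal point independent of the locale.
    LC_ALL=C awk '$10 ~ /^[0-9]+$/ {sum += $10}
                  END {printf "%.6f\n", sum/1024/1024/1024}'
}

# Hypothetical sample: two 512 MB downloads and one entry with no body.
total_gb <<'EOF'
192.0.2.1 - - [04/May/2012:10:00:01 +0800] "GET /a.exe HTTP/1.1" 200 536870912
192.0.2.2 - - [04/May/2012:10:00:02 +0800] "GET /a.exe HTTP/1.1" 200 536870912
192.0.2.3 - - [04/May/2012:10:00:03 +0800] "GET /a.exe HTTP/1.1" 304 -
EOF
```

For this sample the function prints 1.000000.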

List requests that returned 404
awk '($9 ~ /404/)' access.log | awk '{print $9, $7}' | sort

Count requests by HTTP status code
cat access.log | awk '{print $9}' | sort | uniq -c | sort -rn

Requests per second (show the top ten)
cat access.log | awk '{if ($9 ~ /200|206|404/) count[$4]++} END {for (a in count) print a, count[a]}' | sort -k 2 -nr | head -n 10
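The per-second count can be checked on a few sample lines. This is a sketch under stated assumptions: `busiest_seconds` is a hypothetical name, field 4 is assumed to be the timestamp (with a leading "[") and field 9 the status code. It anchors the status match so that only the exact codes 200, 206 and 404 are counted, which is slightly stricter than the one-liner.

```shell
# busiest_seconds N: read access-log lines on standard input and print the
# N seconds with the most requests whose status is 200, 206 or 404.
busiest_seconds() {
    awk '$9 ~ /^(200|206|404)$/ {count[$4]++}
         END {for (t in count) print substr(t, 2), count[t]}' |
        sort -k2,2nr | head -n "$1"
}

# Hypothetical sample data; the 500 response is not counted.
busiest_seconds 2 <<'EOF'
192.0.2.1 - - [04/May/2012:10:00:01 +0800] "GET /index.php HTTP/1.1" 200 512
192.0.2.2 - - [04/May/2012:10:00:01 +0800] "GET /a.exe HTTP/1.1" 206 4096
192.0.2.3 - - [04/May/2012:10:00:01 +0800] "GET /missing HTTP/1.1" 404 128
192.0.2.1 - - [04/May/2012:10:00:02 +0800] "GET /b.php HTTP/1.1" 200 256
192.0.2.2 - - [04/May/2012:10:00:02 +0800] "GET /b.php HTTP/1.1" 200 256
192.0.2.3 - - [04/May/2012:10:00:03 +0800] "GET /c.php HTTP/1.1" 500 64
EOF
```

With this sample the first line of output is `04/May/2012:10:00:01 3` and the second is `04/May/2012:10:00:02 2`.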

Bandwidth statistics
cat access.log | awk '{if ($7 ~ /GET/) count++} END {print "client_request=" count}'
cat access.log | awk '{byte += $11} END {print "client_kbyte_out=" byte/1024 "KB"}'

See what one of the most active IPs has been requesting today (replace CLIENT_IP with the address to inspect):
cat access.log | grep "CLIENT_IP" | awk '{print $11}' | sort | uniq -c | sort -nr | head -n 10

View the number of concurrent Apache requests and their TCP connection states:
netstat -n | awk '/^tcp/ {++s[$NF]} END {for (a in s) print a, s[a]}'
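To see what the state tally produces without depending on a live server, the same awk program can be fed captured output. This is a minimal sketch; `tcp_states` is a hypothetical name, the sample text only imitates the layout of `netstat -n`, and the trailing `sort` is added to make the order predictable.

```shell
# tcp_states: read netstat-style output on standard input and count the
# lines that start with "tcp" by their last column (the connection state).
tcp_states() {
    awk '/^tcp/ {++s[$NF]} END {for (a in s) print a, s[a]}' | sort
}

# Hypothetical captured output; the header and udp lines are ignored.
tcp_states <<'EOF'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.10:80           198.51.100.7:51514      ESTABLISHED
tcp        0      0 192.0.2.10:80           198.51.100.8:40022      TIME_WAIT
tcp        0      0 192.0.2.10:80           198.51.100.9:40100      TIME_WAIT
udp        0      0 192.0.2.10:68           192.0.2.1:67            ESTABLISHED
EOF
```

For this sample the output is `ESTABLISHED 1` followed by `TIME_WAIT 2`. On a live host the function would be used as `netstat -n | tcp_states`.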

Another way to view TCP connection states:
netstat -nat | awk '{print $6}' | sort | uniq -c | sort -rn

The sort command sorts the lines of the specified files and writes the result to standard output. If no file is given, or "-" is given as the file, sort reads from standard input. Lines are compared using one or more sort keys extracted from each input line; a sort key defines the smallest sequence of characters used for the comparison.
-nr sorts numerically in descending order.
-m merges files that are already sorted, without sorting them.
-c checks whether the given file is already sorted; if it is not, prints an error message and does not sort.
-u when used with -c, checks for strict ordering; otherwise, outputs only the first of each run of equal lines.
-o file writes the sorted output to the named file. If the file does not exist, it is created.

The options that change the sort order include:
-d sorts in dictionary order; only blanks and alphanumeric characters are considered when comparing.
-f ignores letter case.
-i ignores nonprinting characters.
-M compares as months, in the order (unknown) < "JAN" < "FEB" < ... < "DEC".
-r sorts in reverse order. By default, output is in ascending order.
-k N1[,N2] uses as the sort key the text from the start of field N1 through the end of field N2. If N2 is omitted, the key runs from field N1 to the end of the line. N1 and N2 may be written as "x.y", where x is the field number and y is a character position within that field. Fields and characters are counted from 1.
-b ignores leading blanks (spaces or tabs) when comparing keys.
-t character uses the given character as the field separator.
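The effect of -t, -k and -n is easiest to see on a small input. The example below is a sketch using made-up colon-separated records; the variable name `data` and its contents are illustrative assumptions.

```shell
# Hypothetical records in the form name:requests:month
data='alpha:30:Feb
bravo:4:Jan
charlie:100:Mar
delta:7:Jan'

# -t: makes ":" the field separator; -k2,2 restricts the key to field 2;
# the n and r modifiers compare that key numerically, largest first.
printf '%s\n' "$data" | sort -t: -k2,2nr

# Without n, field 2 is compared as text, so "100" sorts before "30".
printf '%s\n' "$data" | sort -t: -k2,2

# The x.y form: the key is only the second character of field 1.
printf '%s\n' "$data" | sort -t: -k1.2,1.2
```

The first command lists charlie, alpha, delta, bravo (100, 30, 7, 4). The second lists charlie, alpha, bravo, delta, because the keys are compared character by character. The third lists delta, charlie, alpha, bravo, ordered by the letters e, h, l, r.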

uniq [options] file
Description: The uniq command reads the input file and compares adjacent lines, removing repeated lines. The result is written to the output file.
The input file and output file must be different. "-" means read from standard input.
-c prefixes each output line with the number of times it occurred, merging repeated lines into one.
-d prints only repeated lines.
-f N, --skip-fields=N ignores the first N fields when comparing.
-s N, --skip-chars=N ignores the first N characters when comparing.
-u prints only lines that are not repeated.
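Because uniq compares only adjacent lines, its options behave as described only on sorted input. The following sketch shows the difference; the variable name `codes` and its contents are made-up sample data.

```shell
# Hypothetical input: one status code per line, not sorted.
codes='200
404
200
200
500'

# Unsorted: the first 200 is not adjacent to the other two, so uniq -c
# reports 200 twice (four output lines in total).
printf '%s\n' "$codes" | uniq -c

# Sorted first: equal lines are adjacent and are merged into one count each.
printf '%s\n' "$codes" | sort | uniq -c

# -d prints only values that repeat; -u prints only values that do not.
printf '%s\n' "$codes" | sort | uniq -d
printf '%s\n' "$codes" | sort | uniq -u
```

With this input, `uniq -d` prints 200 and `uniq -u` prints 404 and 500. This is why the pipelines on this page run `sort` before `uniq -c`.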
