Assume that the apache log format is:
118.78.199.98 - - [09/Jan/2010:00:59:59 +0800] "GET /Public/Css/index.css HTTP/1.1" 304 - "http://www.a.cn/common/index.php" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; GTB6.3)"
Question 1: In the apache log, find the top 10 IP addresses with the most accesses.
awk '{print $1}' apache_log | sort | uniq -c | sort -nr | head -n 10
awk first extracts the IP address from each log line; if the log format is customized, you can use -F to define the delimiter and print the specified column.
The first sort groups identical records together;
uniq -c merges repeated lines and prefixes each with its repetition count;
sort -nr sorts in reverse numeric order by that count;
head -n 10 keeps the top 10.
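As a self-contained sketch of the whole pipeline (the log lines below are made up for illustration, not real traffic):

```shell
# Hypothetical sample log: three hits from 1.1.1.1, one from 2.2.2.2
printf '%s\n' \
  '1.1.1.1 - - [09/Jan/2010:00:59:59 +0800] "GET /a HTTP/1.1" 200 -' \
  '1.1.1.1 - - [09/Jan/2010:01:00:01 +0800] "GET /b HTTP/1.1" 200 -' \
  '2.2.2.2 - - [09/Jan/2010:01:00:02 +0800] "GET /a HTTP/1.1" 200 -' \
  '1.1.1.1 - - [09/Jan/2010:01:00:03 +0800] "GET /c HTTP/1.1" 200 -' \
  > apache_log

# Top 10 client IPs by hit count
awk '{print $1}' apache_log | sort | uniq -c | sort -nr | head -n 10
```

Here 1.1.1.1 comes out on top with a count of 3, followed by 2.2.2.2 with 1.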
The commands I reference are:
Display the 10 most commonly used commands:
sed -e 's/|/\n/g' ~/.bash_history | cut -d ' ' -f 1 | sort | uniq -c | sort -nr | head
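To see why the sed step matters, here is the same pipeline run against a small stand-in file (hist_sample is made up; the \n in the sed replacement assumes GNU sed):

```shell
# hist_sample stands in for ~/.bash_history (contents are hypothetical)
printf '%s\n' 'ls -l' 'cat access_log|grep 404' 'ls' > hist_sample

# sed splits piped commands onto their own lines, so "grep" after a "|"
# is counted too; then count the first word of each line
sed -e 's/|/\n/g' hist_sample | cut -d ' ' -f 1 | sort | uniq -c | sort -nr | head
```

ls is counted twice, while cat and grep each appear once.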
Question 2: Find the minutes with the most accesses in the apache log.
awk '{print $4}' access_log | cut -c 14-18 | sort | uniq -c | sort -nr | head
awk '{print $4}' extracts the timestamp field, e.g. [09/Jan/2010:00:59:59;
cut -c 14-18 extracts characters 14 through 18, i.e. the hour and minute;
the rest of the pipeline is the same as in question 1.
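The cut step can be checked in isolation on the timestamp field from the sample log line:

```shell
# $4 of a combined-log line is the timestamp, e.g. "[09/Jan/2010:00:59:59";
# characters 14-18 of that field are the hour and minute
echo '[09/Jan/2010:00:59:59' | cut -c 14-18
# prints: 00:59
```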
Question 3: Find the most visited page in the apache log:
awk '{print $11}' apache_log | sed 's/^.*cn\(.*\)"/\1/g' | sort | uniq -c | sort -rn | head
Similar to questions 1 and 2; the only special part is using sed's substitute command with a capture group to cut the path (/common/index.php) out of the referer "http://www.a.cn/common/index.php".
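The sed substitution can be tried on the referer value from the sample log line by itself:

```shell
# Everything up to "cn" and the trailing quote is stripped; only what
# the capture group \(.*\) matched is kept
echo '"http://www.a.cn/common/index.php"' | sed 's/^.*cn\(.*\)"/\1/g'
# prints: /common/index.php
```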
Question 4: In the apache log, find the minutes with the most accesses (the heaviest load), then check which IP addresses access the most during those times.
1. View the apache processes:
ps aux | grep httpd | grep -v grep | wc -l
2. View established TCP connections on port 80:
netstat -tan | grep "ESTABLISHED" | grep ":80" | wc -l
3. Check the number of connections per IP in the log for the current day, filtering duplicates:
cat access_log | grep "19/May/2011" | awk '{print $2}' | sort | uniq -c | sort -nr
4. What are the IP addresses with the most connections doing that day (it turned out to be a spider):
cat access_log | grep "19/May/2011:00" | grep "61.135.166.230" | awk '{print $8}' | sort | uniq -c | sort -nr | head -n 10
5. Top 10 URLs visited that day:
cat access_log | grep "19/May/2010:00" | awk '{print $8}' | sort | uniq -c | sort -nr | head -n 10
6. Use tcpdump to sniff access to port 80 and see which IP is highest:
tcpdump -i eth0 -tnn dst port 80 -c 1000 | awk -F"." '{print $1"."$2"."$3"."$4}' | sort | uniq -c | sort -nr
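The awk part of that pipeline can be understood on its own; the capture line below is hypothetical, but it has the shape tcpdump -tnn prints, and splitting on "." makes fields 1 through 4 cover the source IP:

```shell
# A typical "tcpdump -tnn" output line (hypothetical capture)
line='IP 61.135.166.230.2351 > 10.0.0.1.80: Flags [S]'

# Fields 1-4 (split on ".") reassemble into the source address
echo "$line" | awk -F"." '{print $1"."$2"."$3"."$4}'
# prints: IP 61.135.166.230
```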
Then check in the log what that IP address is doing:
cat access_log | grep 220.181.38.183 | awk '{print $1"\t"$8}' | sort | uniq -c | sort -nr | less
7. View the number of IP connections in a certain period of time:
grep "2006:0[7-8]" www20110519.log | awk '{print $2}' | sort | uniq -c | sort -nr | wc -l
8. The 20 IP addresses with the most connections to the current web server:
netstat -ntu | awk '{print $5}' | sort | uniq -c | sort -nr | head -n 20
9. View the top 10 IP addresses with the most visits in the log:
cat access_log | cut -d ' ' -f 1 | sort | uniq -c | sort -nr | head -n 10 | less
10. View IP addresses with more than 100 hits in the log:
cat access_log | cut -d ' ' -f 1 | sort | uniq -c | awk '{if ($1 > 100) print $0}' | sort -nr | less
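The awk filter works because uniq -c puts the repetition count in $1; a toy run with a lower threshold shows the idea:

```shell
# uniq -c prefixes each line with its count; awk keeps only lines whose
# count exceeds the threshold (100 in the command above, 2 in this toy run)
printf '%s\n' a a a b b c | sort | uniq -c | awk '{if ($1 > 2) print $0}'
# prints only the line for "a" (count 3)
```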
11. View the most frequently accessed files among the last 10,000 requests:
cat access_log | tail -10000 | awk '{print $7}' | sort | uniq -c | sort -nr | less
12. View pages accessed more than 100 times in the log:
cat access_log | cut -d ' ' -f 7 | sort | uniq -c | awk '{if ($1 > 100) print $0}' | less
13. List objects whose transfer took more than 30 seconds:
cat access_log | awk '($NF > 30){print $7}' | sort -n | uniq -c | sort -nr | head -20
14. List the most time-consuming pages (more than 60 seconds) and their occurrence counts:
cat access_log | awk '($NF > 60 && $7 ~ /\.php/){print $7}' | sort -n | uniq -c | sort -nr | head -100
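These last two commands assume the request duration is logged as the last field ($NF), e.g. via %T in the LogFormat. A toy run shows the selection logic; the made-up lines below have fewer fields than a real log, so the URL here is $3 rather than $7:

```shell
# Keep only slow .php requests: last field is the (assumed) duration,
# $3 is the URL in these shortened sample lines ($7 in a real combined log)
printf '%s\n' \
  '1.1.1.1 GET /slow.php 75' \
  '1.1.1.1 GET /fast.php 3' \
  '2.2.2.2 GET /img.png 90' \
  | awk '($NF > 60 && $3 ~ /\.php/) {print $3}'
# prints: /slow.php
```

/img.png is excluded despite its 90-second duration because it fails the .php match, and /fast.php is excluded by the duration test.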