Apache Website Log Analysis

1. Get the top 10 IP addresses by access count
cat access.log | awk '{print $1}' | sort | uniq -c | sort -nr | head -10
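The same count can also be produced by a single awk process with an associative array instead of the sort | uniq -c pipeline; a minimal sketch, assuming the client IP is field $1 as in the common and combined log formats:
cat access.log | awk '{ip[$1]++} END {for (i in ip) print ip[i], i}' | sort -nr | head -10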
2. Get the top 10 most-visited files or pages (field $11 here; in the standard combined LogFormat the request path is $7, so adjust the field number to your format)
cat access.log | awk '{print $11}' | sort | uniq -c | sort -nr | head -10
cat access.log | awk '{counts[$11]+=1}; END {for (url in counts) print counts[url], url}'
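A possible refinement, assuming a format where the request path is in $7: exclude common static assets so the ranking shows actual pages rather than images and stylesheets.
awk '$7 !~ /\.(gif|jpg|jpeg|png|css|js|ico)$/ {print $7}' access.log | sort | uniq -c | sort -nr | head -10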
3. List the exe files with the largest transfer sizes (useful when analyzing a download site)
cat access.log | awk '($7 ~ /\.exe/) {print $10 " " $1 " " $4 " " $7}' | sort -nr | head -20
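To make the output easier to read, the byte count can be converted to megabytes before printing; a sketch, assuming $10 is the response size in bytes:
awk '($7 ~ /\.exe/) {printf "%.1f MB %s %s\n", $10/1048576, $1, $7}' access.log | sort -nr | head -20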
4. List exe files larger than 100000 bytes (about 100 KB) together with their occurrence counts
cat access.log | awk '($10 > 100000 && $7 ~ /\.exe/) {print $7}' | sort | uniq -c | sort -nr | head -50
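Instead of counting hits, it can be more telling to total the bytes served per file; a minimal sketch under the same field assumptions ($10 = bytes, $7 = path):
awk '($7 ~ /\.exe/) {bytes[$7] += $10} END {for (f in bytes) print bytes[f], f}' access.log | sort -nr | head -10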
5. If the last column of the log records the page transfer time, list the pages that take the most time to reach the client
cat access.log | awk '($7 ~ /\.php/) {print $NF " " $1 " " $4 " " $7}' | sort -nr | head -50
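Single slow hits can be outliers; averaging the recorded time per page gives a steadier ranking. A sketch, assuming the last field is the transfer time in seconds (e.g. Apache's %T):
awk '($7 ~ /\.php/) {t[$7] += $NF; n[$7]++} END {for (p in t) printf "%.2f %s\n", t[p]/n[p], p}' access.log | sort -nr | head -20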
6. List the most time-consuming pages (taking more than 60 seconds) and how often each occurs
cat access.log | awk '($NF > 60 && $7 ~ /\.php/) {print $7}' | sort | uniq -c | sort -nr | head -100
7. List files whose transfer took longer than 30 seconds
cat access.log | awk '($NF > 30) {print $7}' | sort | uniq -c | sort -nr | head -20
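Items 6 and 7 hard-code their thresholds; awk -v turns the cutoff into a shell variable so one command covers both cases (a sketch, under the same assumption that the last field is the time in seconds):
threshold=30; awk -v t="$threshold" '($NF > t) {print $7}' access.log | sort | uniq -c | sort -nr | head -20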
8. Calculate total site traffic (in GB)
cat access.log | awk '{sum += $10} END {print sum/1024/1024/1024}'
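The same sum can be broken down per day to spot traffic spikes; a sketch, assuming $4 looks like [10/Oct/2023:13:55:36, so the date is the text between the opening bracket and the first colon:
awk '{split($4, t, ":"); day = substr(t[1], 2); sum[day] += $10} END {for (d in sum) printf "%s %.2f MB\n", d, sum[d]/1048576}' access.log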
9. Count 404 responses
awk '($9 ~ /404/) {print $9, $7}' access.log | sort
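To see which missing URLs are requested most often (assuming $9 is the status code and $7 the request path):
awk '($9 == 404) {print $7}' access.log | sort | uniq -c | sort -nr | head -20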
10. Count HTTP status codes
cat access.log | awk '{counts[$9]+=1}; END {for (code in counts) print code, counts[code]}'
cat access.log | awk '{print $9} ' | sort | uniq -c | sort -rn
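A variant that also shows each status code's share of total requests; a minimal sketch:
awk '{count[$9]++; total++} END {for (c in count) printf "%s %d %.2f%%\n", c, count[c], 100*count[c]/total}' access.log | sort -k2 -nr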
11. Crawler analysis: see which spiders are fetching content
/usr/sbin/tcpdump -i eth0 -l -s 0 -w - dst port 80 | strings | grep -i user-agent | grep -i -E 'bot|crawler|slurp|spider'
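tcpdump needs root and only sees unencrypted traffic on the wire; if the log is already at hand, the user-agent can be pulled from it directly. A sketch, assuming the combined format, where splitting on double quotes puts the user-agent in field 6:
awk -F'"' '{print $6}' access.log | grep -i -E 'bot|crawler|slurp|spider' | sort | uniq -c | sort -nr | head -20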