Common Linux troubleshooting commands, part 1.

1. View TCP connection status:
netstat -nat | awk '{print $6}' | sort | uniq -c | sort -rn
netstat -n | awk '/^tcp/ {++S[$NF]} END {for (a in S) print a, S[a]}'
netstat -n | awk '/^tcp/ {++state[$NF]} END {for (key in state) print key, "\t", state[key]}'
netstat -n | awk '/^tcp/ {++arr[$NF]} END {for (k in arr) print k, "\t", arr[k]}'
netstat -n | awk '/^tcp/ {print $NF}' | sort | uniq -c | sort -rn
netstat -ant | awk '{print $NF}' | grep -v '[a-z]' | sort | uniq -c
netstat -ant | awk '/ip:80/ {split($5, ip, ":"); ++S[ip[1]]} END {for (a in S) print S[a], a}' | sort -n
netstat -ant | awk '/:80/ {split($5, ip, ":"); ++S[ip[1]]} END {for (a in S) print S[a], a}' | sort -rn | head -n 10
awk 'BEGIN {printf("http_code\tcount_num\n")} {COUNT[$10]++} END {for (a in COUNT) printf a "\t" COUNT[a] "\n"}'   # reads an access log on stdin

2. Find the 20 client IPs making the most requests:
netstat -anlp | grep 80 | grep tcp | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr | head -n20
netstat -ant | awk '/:80/ {split($5, ip, ":"); ++A[ip[1]]} END {for (i in A) print A[i], i}' | sort -rn | head -n20

3. Use tcpdump to sniff port 80 and see which client IP is most active:
tcpdump -i eth0 -tnn dst port 80 -c 1000 | awk -F"." '{print $1"."$2"."$3"."$4}' | sort | uniq -c | sort -nr | head -20

4. Find the hosts with the most TIME_WAIT connections:
netstat -n | grep TIME_WAIT | awk '{print $5}' | sort | uniq -c | sort -rn | head -n20

5. Find the hosts with the most SYN connections:
netstat -an | grep SYN | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr | more

6. Find the process listening on a given port:
netstat -ntlp | grep 80 | awk '{print $7}' | cut -d/ -f1
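All of the state-count one-liners in item 1 produce the same summary; pick whichever reads best. On newer distributions netstat is deprecated in favor of ss from iproute2, which reads the same kernel data. A minimal sketch of a periodic state monitor, assuming ss is available and falling back to netstat where it is not:

#!/bin/bash
# Sketch: print TCP connection counts by state every 5 seconds.
while true; do
    date
    if command -v ss >/dev/null 2>&1; then
        # ss -ant: all TCP sockets, numeric; column 1 is the state, so skip the header line
        ss -ant | awk 'NR > 1 {++s[$1]} END {for (k in s) print k, s[k]}'
    else
        netstat -ant | awk '/^tcp/ {++s[$NF]} END {for (k in s) print k, s[k]}'
    fi
    sleep 5
done

A sudden spike in the SYN_RECV count here is the same signal item 5 looks for with grep.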
Website log analysis 1 (Apache):

1. Get the top 10 visiting IP addresses:
cat access.log | awk '{print $1}' | sort | uniq -c | sort -nr | head -10
cat access.log | awk '{counts[$(11)] += 1} END {for (url in counts) print counts[url], url}'

2. Get the 20 most-visited files or pages, and count the distinct client IPs:
cat access.log | awk '{print $11}' | sort | uniq -c | sort -nr | head -20
awk '{print $1}' access.log | sort -n -r | uniq -c | wc -l

3. List the exe files transferred most often (useful when analyzing download sites):
cat access.log | awk '($7 ~ /\.exe/) {print $10 " " $1 " " $4 " " $7}' | sort -nr | head -20

4. List exe files larger than 200000 bytes (about 200 KB) and how often each occurs:
cat access.log | awk '($10 > 200000 && $7 ~ /\.exe/) {print $7}' | sort -n | uniq -c | sort -nr | head -100

5. If the last column of the log records the page transfer time, list the pages that take the longest to reach the client:
cat access.log | awk '($7 ~ /\.php/) {print $NF " " $1 " " $4 " " $7}' | sort -nr | head -100

6. List the most time-consuming pages (over 60 seconds) and how often each occurs:
cat access.log | awk '($NF > 60 && $7 ~ /\.php/) {print $7}' | sort -n | uniq -c | sort -nr | head -100

7. List files whose transfer took more than 30 seconds:
cat access.log | awk '($NF > 30) {print $7}' | sort -n | uniq -c | sort -nr | head -20

8. Count total site traffic (GB):
cat access.log | awk '{sum += $10} END {print sum/1024/1024/1024}'

9. Count 404 responses:
awk '($9 ~ /404/)' access.log | awk '{print $9, $7}' | sort

10. Count HTTP status codes:
cat access.log | awk '{counts[$(9)] += 1} END {for (code in counts) print code, counts[code]}'
cat access.log | awk '{print $9}' | sort | uniq -c | sort -rn

11. Requests per second (concurrency):
awk '{if ($9 ~ /200|30|404/) COUNT[$4]++} END {for (a in COUNT) print a, COUNT[a]}' access.log | sort -k 2 -nr | head -n10

12. Bandwidth statistics:
cat apache.log | awk '{if ($7 ~ /GET/) count++} END {print "client_request=" count}'
cat apache.log | awk '{BYTE += $11} END {print "client_kbyte_out=" BYTE/1024 "KB"}'

13. Count the number of objects and the average object size:
cat access.log | awk '{byte += $10} END {print byte/NR/1024, NR}'
cat access.log | awk '{if ($9 ~ /200|30/) COUNT[$NF]++} END {for (a in COUNT) print a, COUNT[a], NR, COUNT[a]/NR*100 "%"}'

14. Extract the log for a 5-minute window:
if [ $DATE_MINUTE != $DATE_END_MINUTE ]; then   # check whether the start and end timestamps differ
    START_LINE=`sed -n "/$DATE_MINUTE/=" $APACHE_LOG | head -n1`   # if so, get the line number of the start timestamp
    # END_LINE=`sed -n "/$DATE_END_MINUTE/=" $APACHE_LOG | tail -n1`
    END_LINE=`sed -n "/$DATE_END_MINUTE/=" $APACHE_LOG | head -n1`   # and the line number of the end timestamp
    sed -n "${START_LINE},${END_LINE}p" $APACHE_LOG > $MINUTE_LOG   # use the line numbers to extract the 5 minutes of log into a temporary file
    GET_START_TIME=`sed -n "${START_LINE}p" $APACHE_LOG | awk -F'[' '{print $2}' | awk '{print $1}' | sed 's#/# #g' | sed 's#:# #'`   # get the start timestamp
    GET_END_TIME=`sed -n "${END_LINE}p" $APACHE_LOG | awk -F'[' '{print $2}' | awk '{print $1}' | sed 's#/# #g' | sed 's#:# #'`   # get the end timestamp via its line number
fi

15. See which crawlers are fetching content:
/usr/sbin/tcpdump -i eth0 -l -s 0 -w - dst port 80 | strings | grep -i user-agent | grep -i -E 'bot|crawler|slurp|spider'
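The field numbers in the items above assume Apache's combined LogFormat ("%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"). If your LogFormat differs, the column numbers shift; in particular, items that read URLs from $11 or transfer times from $NF assume a customized format with those extra fields. A quick hedged check of the mapping against your own log:

# Sketch: print the fields the one-liners above rely on, so you can verify
# they line up with your LogFormat.
#   $1 = client IP (%h), $4 = timestamp (with leading '['),
#   $7 = requested URL (second word of "%r"), $9 = status (%>s), $10 = bytes sent (%b)
awk '{print "ip=" $1, "time=" $4, "url=" $7, "status=" $9, "bytes=" $10}' access.log | head -5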
Website log analysis 2 (Squid):

2. Traffic statistics by domain:
zcat squid_access.log.tar.gz | awk '{print $10, $7}' | awk 'BEGIN {FS="[ /]"} {trfc[$4] += $1} END {for (domain in trfc) {printf "%s\t%d\n", domain, trfc[domain]}}'
A more efficient Perl version can be downloaded from: http://docs.linuxtone.org/soft/tools/tr.pl

Database section:

1. View the SQL statements being executed on the wire:
/usr/sbin/tcpdump -i eth0 -s 0 -l -w - dst port 3306 | strings | egrep -i 'SELECT|UPDATE|DELETE|INSERT|SET|COMMIT|ROLLBACK|CREATE|DROP|ALTER|CALL'

System debugging and analysis:

1. Debugging commands:
strace -p pid   # trace the system calls of the process with the given PID
gdb -p pid      # attach the debugger to the process with the given PID
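Both debuggers attach to a live process, which is useful when a daemon hangs or spins without logging anything. A few hedged invocations (PID 1234 is a placeholder; substitute the output of pidof or of the netstat -ntlp item above):

strace -p 1234 -f -c                                # follow forks; print a syscall count/time summary on Ctrl-C
strace -p 1234 -e trace=network -o /tmp/net.trace   # record only network-related syscalls to a file
gdb -p 1234                                         # attach gdb; 'bt' shows the stack, 'detach' releases the process

Note that attaching pauses the target process, so detach promptly on production systems.

Author: newcmd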