Learn shell log analysis in just one morning!
Many posts share shell scripts for log analysis, but they rarely explain what each command actually means, so the learning cost is high. Everything is summed up here to help you get started quickly.
1. Windows users who want to use shell commands should install Cygwin; Google the installation method (please use Google for technical questions; Baidu search isn't up to it).
2. Below is a rough introduction to the shell commands commonly used in SEO log analysis. To learn more about any command, use Google.
less filename  View a file's contents; press "q" to quit
cat filename  Print a file's contents; several files can be opened at once: cat 1.log 2.log, or cat *.log
grep -option filename
-i  Match case-insensitively
-v  Show only the lines that do NOT match
-c  Count the matching lines instead of printing them
egrep is an upgraded version of grep with more complete regular-expression support; when you work with regexes, egrep is recommended.
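A quick illustration of these flags, assuming a hypothetical access.log:
grep -i 'baiduspider' access.log             # lines mentioning Baiduspider, any case
grep -ic 'baiduspider' access.log            # just the count of such lines
grep -iv 'baiduspider' access.log            # every line EXCEPT those
egrep -i 'baiduspider|googlebot' access.log  # egrep regex alternation: either crawler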
head -2 filename  Show the first 2 lines of the file
head -100 filename | tail -10 >> a.log  Extract lines 91-100 of the file
wc -option filename  Count a file's bytes, characters, or lines
-c  Count bytes
-m  Count characters
-l  Count lines
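For example, on the same hypothetical access.log:
wc -l access.log  # number of lines; in a log, that is the number of requests
wc -c access.log  # size in bytes
wc -m access.log  # number of characters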
sort -option filename  Sort the file
-n  Sort numerically
-r  Reverse the sort order
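A minimal sketch, assuming a file counts.txt with one number per line:
sort -n counts.txt   # ascending numeric order
sort -nr counts.txt  # descending: largest values first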
uniq -option filename  De-duplicate the file; you need to run sort on it first, because uniq only collapses adjacent duplicate lines
-c  Prefix each line with the number of times it occurs
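Together they give the classic frequency count. A sketch on a hypothetical urls.txt (one URL per line):
sort urls.txt | uniq -c | sort -nr
# output is count then value, most frequent first, e.g. (made-up numbers):
#   37 /news/
#    5 /about/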
split -option filename  Cut a file into pieces
-100  One output file per 100 lines (equivalent to -l 100)
-C 25m/k  Split into files of at most 25 MB/KB without breaking lines (-b splits by exact byte count instead)
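A hedged sketch, assuming a large file big.log:
split -l 100 big.log part_   # part_aa, part_ab, ... each 100 lines
split -C 25m big.log chunk_  # chunks of at most 25 MB, lines kept whole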
|  Pipe: passes the output of the previous command to the next command as its input
">" in the ">" and ">>" redirect write file is equivalent to "W" emptied and written ">>" equivalent to "a" appended to the file
awk -F 'separator' 'pattern {action}' filename  Splits each line of data on the specified character; the default separator is whitespace (site logs are space-separated)
-F  Specifies the field separator
pattern is the condition under which the action runs; regular expressions can be used here
$n  The nth field of the line; $0 is the entire line
NF  The number of fields in the current record
$NF  The last field
BEGIN and END can both be used as the pattern: they give the program an initial state before any input is read and let it do cleanup work after the input is finished
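Putting those pieces together, a hedged sketch on a space-separated access log (the field positions are assumptions: $7 = URL, $9 = status code, $10 = response size in bytes):
awk '$9 == 404 {print $7}' access.log  # URLs that returned 404
awk '{print $NF}' access.log           # the last field of every line
awk 'BEGIN{total=0} {total+=$10} END{print total, "bytes in", NR, "lines"}' access.log
awk -F ':' '{print $1}' /etc/passwd    # -F: split on ":" and print the user name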
bash shell.sh  Run the shell.sh script
dos2unix xxoo.sh  Converts "\r\n" to "\n" (Windows --> Linux). Windows and Linux use different line endings, so a script written under Windows must be converted with dos2unix to Linux line endings, or running it will throw errors.
unix2dos xxoo.sh  Converts "\n" to "\r\n" (Linux --> Windows)
rm xx.txt  Delete the file xx.txt
3. Only a few simple commands are introduced here; if you want a deeper understanding of the shell, we suggest reading the relevant books.
Now let's start using the shell to analyze the log.
1. Cut out Baidu's crawl data (working on a file that contains only the crawler's records is more efficient):
cat log.log | grep -i 'Baiduspider' > baidu.log
2. Count the site's status codes (the status code is field 9 in a standard access log):
awk '{print $9}' baidu.log | sort | uniq -c | sort -nr
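The output is one line per status code, with the count first. Hypothetical, made-up output for illustration:
1023 200
  87 301
  12 404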
3. Baidu's total crawl count (every line of baidu.log is one fetch, so we just count lines):
wc -l baidu.log
4. Baidu's unique crawl count (distinct URLs only):
awk '{print $7}' baidu.log | sort | uniq | wc -l
5. The average amount of data per Baidu fetch (result in KB):
awk '{print $10}' baidu.log | awk 'BEGIN{a=0}{a+=$1}END{print a/NR/1024}'
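The same thing can be done in a single awk pass, a sketch assuming field 10 holds the response size in bytes:
awk '{a+=$10} END{print a/NR/1024}' baidu.log
# a accumulates the bytes and NR counts the records, so a/NR/1024 is the average in KB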
6. Home page fetch count:
awk '$7~/\.com\/$/' baidu.log | wc -l
7. Fetch count for a specific directory:
grep '/news/' baidu.log | wc -l
8. The 10 most-fetched pages:
awk '{print $7}' baidu.log | sort | uniq -c | sort -nr | head -10
9. Find the crawled pages that returned a 404 error:
awk '$9~/^404$/{print $7}' baidu.log | sort | uniq
10. Find which JS files were crawled and how many times each was fetched:
awk '$7~/\.js$/{print $7}' baidu.log | sort | uniq -c | sort -nr
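To avoid retyping these one-liners, the steps can be bundled into one script. A hedged sketch (the name seo_report.sh, the log name log.log, and the field positions $7/$9/$10 are all assumptions; adjust them to your log format), run with bash seo_report.sh:
#!/bin/bash
# seo_report.sh - hypothetical wrapper around the analyses above
LOG=log.log

# step 1: keep only Baidu's records
grep -i 'Baiduspider' "$LOG" > baidu.log

echo "== status codes =="
awk '{print $9}' baidu.log | sort | uniq -c | sort -nr

echo "== total fetches =="
wc -l < baidu.log

echo "== unique URLs fetched =="
awk '{print $7}' baidu.log | sort | uniq | wc -l

echo "== average fetch size (KB) =="
awk '{a+=$10} END{print a/NR/1024}' baidu.log

echo "== top 10 pages =="
awk '{print $7}' baidu.log | sort | uniq -c | sort -nr | head -10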