Some shell command tricks in log processing
My day-to-day work involves a fair amount of log analysis, and one inescapable part of it is processing large numbers of logs by hand, so I'm writing down what I've picked up.
The input is typically a text log file of a few gigabytes, and the goal is to pull a list, a count, or a ratio out of n such files. Either way, shell commands are not only a way to verify the accuracy of system data, they are also a good learning exercise.
Cutting log lines with the cut command
A typical Apache access log line starts with the client IP address, with the remaining fields separated by spaces. If you need to get the IP addresses you can use the cut command: -d ' ' splits each line on spaces, and -f1 takes the first field, which gives you the IP list.
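For instance (the post's original sample line was lost, so this one is made up in Apache's usual log shape):

```shell
# Hypothetical access-log line; real Apache logs start with the client IP.
line='127.0.0.1 - - [11/Jun/2014:03:42:25 +0800] "GET /index.html HTTP/1.1" 200 2326'
echo "$line" | cut -d ' ' -f1
# prints: 127.0.0.1
```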
Sometimes the file you get is tab-delimited instead. Tab is actually cut's default delimiter, but if you want to name it explicitly you have to write the extra $ in -d$'\t' so the shell expands the escape.
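A quick sketch of both forms, using printf to produce a tab-delimited line:

```shell
# Tab is cut's default delimiter, so -f1 alone works:
printf '10.0.0.1\tGET\t200\n' | cut -f1          # prints: 10.0.0.1
# Naming it explicitly needs bash's $'\t' quoting (hence the extra $):
printf '10.0.0.1\tGET\t200\n' | cut -d$'\t' -f2  # prints: GET
```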
Use the tr command to remove or replace characters:
-c (complement): replace every character NOT in SET1 with SET2
-d (delete): delete all characters in SET1; no translation is done
-s (squeeze-repeats): compress each run of repeated characters from SET1 into a single one
-t (truncate-set1): truncate SET1 to the length of SET2 before translating (usually the default)
A few examples (note there are two spaces between the d's and the s's in the input):
$ echo "aaacccddd  ss" | tr -s "a-c"     # squeeze repeats of a, b and c
acddd  ss
$ echo "aaacccddd  ss" | tr -s " " ","   # translate, then squeeze the repeated commas
aaacccddd,ss
$ echo "aaacccddd  ss" | tr -t " " ","
aaacccddd,,ss
$ echo "aaacccddd  ss" | tr -s "a" "b"   # translate, then squeeze the repeated b's
bcccddd  ss
tr can also replace every space with a comma to turn a space-separated file into CSV, or with -d remove the spaces outright.
Blank lines often show up after a round of log processing. tr removes them too; the trick is to squeeze each run of consecutive newlines down to a single one.
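A minimal sketch of all three uses (the post's own command lines did not survive extraction, so these are reconstructed):

```shell
echo "a b c" | tr -s " " ","          # to CSV: a,b,c
echo "a b c" | tr -d " "              # delete the spaces: abc
printf 'x\n\n\ny\n' | tr -s '\n'      # squeeze newline runs: the blank lines disappear
```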
Deduplicating with the uniq command
Suppose you want the list of distinct IPs that accessed the server: sort the IP list and pipe it through uniq (uniq only collapses adjacent duplicates, which is why the sort comes first).
If you want to count each IP's accesses as well, add the -c flag.
The resulting format puts the number of occurrences in front of each line.
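A sketch with an inline IP list standing in for the real cut output:

```shell
# Count occurrences per IP, most frequent first:
printf '%s\n' 1.1.1.1 2.2.2.2 1.1.1.1 | sort | uniq -c | sort -rn
#       2 1.1.1.1
#       1 2.2.2.2
```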
Using awk/sed to process logs
awk and sed are the ultimate tools for processing logs. It is true that they can do just about anything, and each is a deep subject in its own right. Here's a log I came across: each line carries a field like out='abc' along with a flag like isActive=1. Suppose I need the lines where isActive=1, and from each of them the part between the quotes of out='...', i.e. the abc above.
grep filters the lines on isActive=1. What follows awk in quotes is the awk program. Inside it, $0 always holds the current input line; match and substr are built-in awk functions, and the code in {} runs when the match succeeds. After a successful match, RSTART and RLENGTH are built-in awk variables: RSTART is the index in $0 where the match begins, and RLENGTH is the length of the matched text.
One catch: a literal single quote can't appear inside the single-quoted awk program, so use its hex escape \x27 instead. To look up a character's hex code you can use Python 2: "'".encode("hex") gives 27.
Of course, explaining awk this briefly doesn't even amount to a primer.
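Putting the pieces together, a sketch with a made-up log line (the post's original sample did not survive translation):

```shell
# \047 is the octal escape for a single quote; the post uses gawk's hex
# form \x27, which works the same way in gawk.
echo "time=100 out='abc' isActive=1" | grep 'isActive=1' |
  awk '{ if (match($0, "out=\047[a-z]*\047"))
           print substr($0, RSTART + 5, RLENGTH - 6) }'
# prints: abc   (RSTART+5 skips the out=' prefix, RLENGTH-6 drops it plus the closing quote)
```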
Collection operations
Imagine I have two IP lists and want their intersection, union or difference; this comes up constantly in statistics work. For example, "which of yesterday's visiting IPs also visited today?" is just the intersection of today's IP list with yesterday's.
Define two simple files first: a.txt holding the numbers 1 through 5 and b.txt holding 4 through 9, one per line.
If you want the intersection of a and b (4 5), sort the two files together and keep the duplicated lines.
If you want the union (1 through 9), sort them together and deduplicate.
If you want the difference a - b, that is, a with the intersection removed (1 2 3), concatenate a with two copies of b, sort, and keep only the lines that appear exactly once. In the same vein, the difference b - a swaps the roles of the two files.
comm -23 computes the same a - b difference, so the two approaches are equivalent.
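A reconstructed sketch of all of the operations above (the post's command lines were lost):

```shell
printf '%s\n' 1 2 3 4 5 > a.txt
printf '%s\n' 4 5 6 7 8 9 > b.txt

sort a.txt b.txt | uniq -d           # intersection:   4 5
sort a.txt b.txt | uniq              # union:          1 2 3 4 5 6 7 8 9
sort a.txt b.txt b.txt | uniq -u     # difference a-b: 1 2 3
sort b.txt a.txt a.txt | uniq -u     # difference b-a: 6 7 8 9
comm -23 a.txt b.txt                 # a-b again; -13 gives b-a (inputs must be sorted)
```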
The comm command is a comparison tool. With no options at all, what does it print? Three tab-indented columns: lines only in the first file, lines only in the second file, and lines common to both. And there's always the diff command, usually used to see what changed in code:
diff a.txt b.txt
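A sketch of comm's default three-column output on the same two files:

```shell
printf '%s\n' 1 2 3 4 5 > a.txt
printf '%s\n' 4 5 6 7 8 9 > b.txt
comm a.txt b.txt   # col 1: only in a.txt; col 2: only in b.txt; col 3: in both
```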
Summary && References
With these commands reasonably under control, handling a log is not much of a problem.
A blog post describing set operations in the shell:
http://wordaligned.org/articles/shell-script-sets
A shell post that has been sitting in my bookmarks folder for a long time:
Common techniques for Linux shells
The awk section of Linux Shell Advanced Tips is particularly well written.