The three major Linux file-processing tools (grep/sed/awk)

Source: Internet
Author: User

grep, sed, and awk are three very powerful file-processing tools on Linux.

grep finds, sed edits, and awk analyzes and processes text based on its content.

Now let's look at how these three tools differ (they certainly do differ, otherwise there would be no reason to have three of them).

grep (keyword: extract)

A text-search tool that becomes very powerful when combined with regular expressions
Main options:

-c: Output only the count of matching lines
-i: Case-insensitive matching
-h: Do not display file names when searching multiple files
-l: When searching multiple files, output only the names of files that contain a match
-n: Display each matching line together with its line number
-v: Show all lines that do not contain the matching text (I often use this to remove the grep process itself from the output)

Basic usage: grep 'pattern to match' filename, for example:

grep 'test' d*  displays the lines containing test in all files whose names begin with d
grep '[a-z]\{5\}' aa  displays every line of the file aa that contains a string of at least 5 consecutive lowercase letters

grep is a simple file-processing tool that prints the lines matching a keyword or pattern
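The options listed above can be tried out on a small scratch file. The following is only a sketch: the file name aa and its contents are made up for illustration, and the variable names are arbitrary.

```shell
# Create a throwaway directory with a hypothetical sample file named aa.
dir=$(mktemp -d)
printf 'test one\nTEST two\nno match here\nabc\n' > "$dir/aa"

# -c: print only the number of matching lines.
count=$(grep -c 'test' "$dir/aa")

# -c combined with -i: count matches while ignoring case.
count_i=$(grep -ci 'test' "$dir/aa")

# -n: prefix each matching line with its line number.
numbered=$(grep -n 'two' "$dir/aa")

# -v (here with -i): keep only the lines that do NOT match.
inverted=$(grep -vi 'test' "$dir/aa")

# Lines containing at least five consecutive lowercase letters.
five=$(grep '[a-z]\{5\}' "$dir/aa")

printf '%s\n' "$count" "$count_i" "$numbered" "$inverted" "$five"

rm -rf "$dir"
```

The -v option is also what the "remove grep itself" remark refers to: in a pipeline such as ps aux | grep name | grep -v grep, the last stage drops the line for the grep process.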

sed (keyword: edit)

A text-editing tool that works line by line. sed can modify a file directly, although doing so is generally not recommended; it can also process standard input
Basic usage: sed [-nef] '[action]' [input file]
-n: Quiet mode. Normally sed prints every line it reads to the screen; with -n, only the lines explicitly processed by sed (for example with the p action) are printed.
-e: Multiple edits. For example, to delete some lines and change others in the same run, use sed -e '1,5d' -e 's/abc/xxx/g' filename
-f: Write the sed actions in a file first, then run sed -f scriptfile to execute the actions in scriptfile (I did not get this to work in my own tests, so I do not recommend it)
-i: Edit in place. This really changes the contents of the file, whereas everything else only changes what is displayed. (Not recommended)
Actions:
a  append: a is followed by a string, which appears on a new line after the current line
c  change: c is followed by a string, which replaces the lines in the range n1,n2
d  delete: nothing follows d
i  insert: i is followed by a string, which appears on a new line before the current line
p  print: lists the selected lines; usually used together with sed -n, e.g. sed -n '3p' prints only line 3
s  substitute: similar to substitution in vi, e.g. 1,20s/old/new/g
q  quit: exit as soon as a line matches, which improves efficiency
r  read file: when a line matches, read in a file, e.g. sed '1r qqq' abc; note that the text read in is placed after line 1, that is, it starts on line 2
w  write file: when a line matches, write it to a file, e.g. sed -n '/m/w qqq' abc writes the lines of abc that contain m to the file qqq; note that this overwrites qqq
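The q, r, and w actions are easiest to understand by running them. The sketch below reuses the file names abc and qqq from the text, with made-up contents; the output file name out is a hypothetical choice.

```shell
# Scratch files: abc is the input, qqq holds the text to be read in.
dir=$(mktemp -d)
printf 'one\nm two\nthree\n' > "$dir/abc"
printf 'INSERTED\n' > "$dir/qqq"

# r: the contents of qqq are emitted after line 1 of abc.
read_out=$(sed "1r $dir/qqq" "$dir/abc")

# q: stop after line 2, so the rest of the file is never read.
quit_out=$(sed '2q' "$dir/abc")

# w: write every line containing m to the file out (overwriting it).
sed -n "/m/w $dir/out" "$dir/abc"
write_out=$(cat "$dir/out")

printf '%s\n---\n%s\n---\n%s\n' "$read_out" "$quit_out" "$write_out"

rm -rf "$dir"
```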

Examples:
sed '1d' abc  deletes the first line of abc. Note that every line except the first is displayed, because the first line has been deleted (the actual file is not changed; only the displayed output omits the line)
sed -n '1d' abc  displays nothing, because the only line sed processes is deleted, so nothing is shown
sed '2,$d' abc  deletes everything from the second line to the last line of abc. Note that in a regular expression $ means the end of a line, but here, used as an address, it means the last line of the input. With no address at all, a command applies to every line
sed '$d' abc  deletes only the last line, because $ on its own is an address for the last line
sed '/test/d' abc  deletes every line of abc that contains test
sed '/test/a rrrrrrr' abc  appends rrrrrrr on a new line after every line that contains test; a line range also works, e.g. sed '1,5a rrrrrrr' abc
sed '/test/c rrrrrrr' abc  replaces every line that contains test with rrrrrrr; a line range also works, e.g. sed '1,5c rrrrrrr' abc

awk (keyword: analysis & processing)

awk analyzes and processes input one line at a time: awk 'condition1 {action1} condition2 {action2}' filename. awk can also read standard input from a preceding command
Unlike sed, which usually operates on whole lines, awk prefers to split each line into several fields, with the default separator being a space or a tab
For example:
last -n 5 | awk '{print $1 "\t" $3}'  The spaces around "\t" inside the braces are optional, but it is best to include them. Also note that "\t" is in double quotes, because the whole program is already inside single quotes
$0 represents the entire line, $1 the first field, and so on
awk processes its input as follows:
1. Read the first line and store its contents in the variables $0, $1, $2, and so on
2. Perform the actions whose conditions are satisfied
3. Move on to the next line and repeat
In other words, awk processes one line at a time, and the smallest unit it works with is a field
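The field splitting described above can be seen without running last at all. In this sketch the two input lines are invented stand-ins for login records; only the whitespace-separated layout matters.

```shell
# Hypothetical whitespace-separated records (user, terminal, host).
input='alice pts/0 10.0.0.1
bob pts/1 10.0.0.2'

# For every line, print field 1 and field 3 separated by a tab.
fields=$(printf '%s\n' "$input" | awk '{print $1 "\t" $3}')

# $0 is the whole line, so this reproduces the input unchanged.
whole=$(printf '%s\n' "$input" | awk '{print $0}')

printf '%s\n' "$fields"
```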
There are also 3 built-in variables:
NF: the number of fields in the current line
NR: the number of the line currently being processed
FS: the current field separator
Comparison operators: > < >= <= == !=; assignment uses a single =
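cat /etc/passwd | awk '{FS=":"} $3<10 {print $1 "\t" $3}'  first sets the separator to :, then tests the condition (note that the condition is not written inside {}), then performs the action. FS=":" is an action, an assignment rather than a test, so it is written inside {}. Because this assignment runs only after the first line has already been split, the first line is still split on whitespace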
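As a small illustration of NR and NF together with a condition written outside the braces, the sketch below uses made-up input with the default whitespace separator.

```shell
# Three hypothetical lines with 3, 2, and 4 fields respectively.
input='a b c
d e
f g h i'

# The condition NF >= 3 sits outside {}; the action prints the line
# number and the field count of each line that passes it.
report=$(printf '%s\n' "$input" | awk 'NF >= 3 {print NR, NF}')

printf '%s\n' "$report"
```

Line 2 has only two fields, so it is skipped, while NR keeps counting every line that is read.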
BEGIN and END give the programmer a place for initialization and clean-up work: the actions in BEGIN {} are executed before awk starts scanning the input, and the actions in END {} are executed after the whole input has been scanned.
awk '/test/ {print NR}' abc  prints the line number of every line containing test; note that a regular expression can be used between the two slashes
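A short sketch of BEGIN, END, and a regular-expression condition follows. The records are invented, passwd-like lines with only three colon-separated fields; setting FS in BEGIN means the separator is in effect from the very first line.

```shell
# Hypothetical name:password:uid records.
input='root:x:0
daemon:x:1
alice:x:1000'

# BEGIN runs before any input is read, so every line is split on ":".
low=$(printf '%s\n' "$input" | awk 'BEGIN {FS=":"} $3<10 {print $1 "\t" $3}')

# END runs after the last line; here it prints the sum of field 3.
total=$(printf '%s\n' "$input" | awk 'BEGIN {FS=":"} {sum += $3} END {print sum}')

# A regular expression between slashes acts as the condition.
match_line=$(printf '%s\n' "$input" | awk '/daemon/ {print NR}')

printf '%s\n' "$low" "$total" "$match_line"
```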
Inside awk's {}, you can use if/else, for (i=0; i<10; i++), and i=1; while (i<NF)
As you can see, much of awk's syntax matches the C language, such as the "\t" escape, print formatting, if, while, for, and so on
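The C-like constructs can be combined in a single action. The following is a sketch on one made-up input line; the variable names i and kind are arbitrary.

```shell
result=$(printf 'a b c\n' | awk '{
    # for loop over the fields of the current line
    for (i = 1; i <= NF; i++) {
        # if/else decides how each field index is labelled
        if (i % 2 == 1) { kind = "odd" } else { kind = "even" }
        # printf takes a C-style format string
        printf "%d\t%s\t%s\n", i, $i, kind
    }
    # while loop: advance i until it reaches NF
    i = 1
    while (i < NF) { i++ }
    print "last index:", i
}')

printf '%s\n' "$result"
```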

awk is a fairly complex tool; I will add more here when I actually put it to use.
