Summary of grep, sed, awk, and find usage

Source: Internet
Author: User

0: Basics of Regular Expressions

^ Matches the start of a line

$ Matches the end of a line

. Matches any single character

? Matches zero or one occurrence of the preceding character

* Matches zero or more occurrences of the preceding character

[1-9] Matches any single character in the range 1 to 9

[^1-9] Matches any single character that is not in the range 1 to 9

\< Matches the start of a word

\> Matches the end of a word

\(\) Capture group; it can be referenced multiple times later as \1, \2, and so on

X\{m,n\} Matches X repeated at least m times and at most n times

Extended regular expressions

| Alternation: separates several patterns and matches any one of them

+ Similar to *, but matches one or more occurrences of the preceding character

() Groups several items into a single unit
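The metacharacters above can be tried out with grep. The following is a minimal sketch; the sample words and the variable names are made up for illustration, and a POSIX-style grep is assumed.

```shell
# Sample input; the words are invented for this illustration.
sample=$(printf '%s\n' 'cat' 'cart' 'concat cat' 'Cat9' 'caat')

# ^ and $ anchor the match to the start and end of the line.
anchored=$(printf '%s\n' "$sample" | grep -c '^cat$')

# . matches any single character; * repeats the preceding character.
wild=$(printf '%s\n' "$sample" | grep -c '^ca*.t$')

# [1-9] matches one character in the range 1 to 9.
digit=$(printf '%s\n' "$sample" | grep '[1-9]')

# \(a\) captures an "a" and \1 requires the same text again.
backref=$(printf '%s\n' "$sample" | grep '\(a\)\1')

# Extended syntax (-E): | is alternation, + is one or more, () groups.
extended=$(printf '%s\n' "$sample" | grep -E -c '^(ca+t|cart)$')

echo "$anchored $wild $digit $backref $extended"
```

Using -c keeps the output short by printing only the number of matching lines.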


1 grep

Common format: grep [options] regex file

-E: use extended regular expressions

-e pattern: use pattern as the regular expression

-f file: read the regular expressions from file

-i: ignore case (this can be slow; converting the text to a single case with tr first may perform better)

-v: invert the match and display the lines that do not match

-V: display the version number

Output control options:

-n: prefix each output line with its line number

-q: quiet mode; print nothing and report the result only through the exit status

-r: search directories recursively

-l: print only the names of files that contain a match

-L: print only the names of files that contain no match

-c: print only the number of matching lines

Common examples:

grep -i 'root' /etc/passwd prints the lines that contain root, ignoring case

grep -v '^root' /etc/passwd prints the lines that do not begin with root

grep -n 'root' /etc/passwd prints the lines that contain root, each prefixed with its line number

grep -lr 'root$' /etc searches /etc recursively and prints the names of files that contain a line ending in root

grep -Lr 'root' /etc searches /etc recursively and prints the names of files that do not contain root

grep -c 'root' /etc/passwd prints the number of lines that contain root

grep -E 'hello|world' 1.cpp prints the lines that contain hello or world
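The options above can be checked against a small scratch directory instead of /etc. This is only a sketch: the file names and contents are invented, and -r, -l, and -L are assumed to behave as in GNU grep.

```shell
# Build a small passwd-like file in a temporary directory (contents are made up).
dir=$(mktemp -d)
printf '%s\n' 'root:x:0:0' 'Root2:x:1:1' 'daemon:x:2:2:/root' > "$dir/passwd"
printf '%s\n' 'nothing here' > "$dir/other"

insensitive=$(grep -c -i 'root' "$dir/passwd")   # count matches, ignoring case
inverted=$(grep -v '^root' "$dir/passwd")         # lines that do not start with root
numbered=$(grep -n 'root$' "$dir/passwd")         # line number plus the matching line
with_match=$(cd "$dir" && grep -lr 'root' .)      # files that contain a match
without=$(cd "$dir" && grep -Lr 'root' .)         # files that contain no match

printf '%s\n' "$insensitive" "$inverted" "$numbered" "$with_match" "$without"
rm -r "$dir"
```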



2 sed

Common format: sed [options] [-e script] 'address command args' file

sed '/regex/ command args' file

sed processing model: sed reads one line of the file into the pattern space and checks it against the address. If the line matches, sed runs the command on it. By default sed then prints the pattern space, so unmatched lines are output unchanged and matched lines are output after processing.
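A small sketch of this model, using made-up input: the p command prints a matched line explicitly, so without -n the matched line appears twice, once from p and once from the automatic print at the end of the cycle.

```shell
# Three sample lines, invented for this illustration.
input=$(printf '%s\n' 'alpha' 'beta' 'gamma')

# Default behaviour: every line is printed after the script has run,
# and p prints the matching line one extra time.
default_out=$(printf '%s\n' "$input" | sed '/beta/p')

# With -n the automatic print is suppressed; only p produces output.
quiet_out=$(printf '%s\n' "$input" | sed -n '/beta/p')

printf '%s\n' "$default_out" '---' "$quiet_out"
```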

Common options:

-f file: read the sed script from file

-e script: add script to the commands to run

-n: suppress the default output

sed script commands:

sed '/root/a\text' /etc/passwd appends a new line containing text after each line that contains root

sed '/root/c\text' /etc/passwd replaces each line that contains root with text

sed '/root/i\text' /etc/passwd inserts text before each matching line

sed '/root/d' /etc/passwd deletes the lines that contain root

h/H copies (h) or appends (H) the pattern space to the hold space

g/G copies (g) or appends (G) the hold space to the current pattern space

sed -e '/root/{H;d;}' -e '$G' /etc/passwd moves the lines that contain root to the end of the file

p prints the line

sed -n '/root/{n;p;}' /etc/passwd prints the line that follows each line containing root

sed '1,3y/abcdef/ABCDEF/' /etc/passwd converts the letters a-f in lines 1-3 to upper case

s/XXX/YYY/g text replacement: replaces every occurrence of XXX with YYY
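The commands above can be exercised on a few invented lines rather than on /etc/passwd. This is a sketch under that assumption, not a reference; the variable names are arbitrary.

```shell
# Sample input, invented for this illustration.
input=$(printf '%s\n' 'root:x:0' 'bin:x:1' 'daemon:x:2')

deleted=$(printf '%s\n' "$input" | sed '/root/d')            # drop matching lines
replaced=$(printf '%s\n' "$input" | sed 's/x/PASS/g')        # substitute every x
upper=$(printf '%s\n' "$input" | sed '1,2y/abcdef/ABCDEF/')  # map a-f to A-F on lines 1-2 only
next=$(printf '%s\n' "$input" | sed -n '/root/{n;p;}')       # print the line after the match

# H appends the matching line to the hold space and d removes it from the
# output; on the last line, G appends the hold space to the pattern space.
# H always adds a newline first, so an empty line precedes the moved text.
moved=$(printf '%s\n' "$input" | sed -e '/root/{H;d;}' -e '$G')

printf '%s\n' "$moved"
```

Note that if the last line itself matched /root/, d would end the cycle before '$G' ran, so the held lines would not be printed.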

sed addresses:

sed -n '1,3p' /etc/passwd prints lines 1-3 of the file

sed -n '/root/,/sshd/p' /etc/passwd prints the lines from the line containing root through the line containing sshd

sed -n '5,/^northeast/p' file prints from line 5 through the next line that begins with northeast
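The three address forms can be compared on the same input. The sample lines below are invented so that each range selects something different.

```shell
# Seven sample lines, invented for this illustration.
input=$(printf '%s\n' 'one' 'two root' 'three' 'four sshd' 'five' 'north' 'seven')

by_number=$(printf '%s\n' "$input" | sed -n '1,3p')           # lines 1 through 3
by_regex=$(printf '%s\n' "$input" | sed -n '/root/,/sshd/p')  # root line through sshd line
mixed=$(printf '%s\n' "$input" | sed -n '5,/^north/p')        # line 5 through the next line starting with north

printf '%s\n' "$by_number" '---' "$by_regex" '---' "$mixed"
```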


3 awk

Common format: gawk 'pattern {action}' file

cmd | gawk 'pattern {action}'

If there is no pattern, the action applies to every line. If there is no action, the matching lines are printed. Built-in variables such as $0, NF, and NR can be used in the pattern.

Working principle: awk reads one line and stores it in the variable $0. The line is then split into fields on the field separator, which defaults to whitespace and can be set with the -F option or the FS variable. Each field is stored in a variable $i ($1, $2, and so on); some older awk implementations allow at most 100 fields.

gawk -F: '{print $1}' /etc/passwd prints all user names
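A short sketch of field splitting, using two made-up passwd-style records and plain awk rather than gawk (an assumption; any POSIX awk should behave the same here).

```shell
# Two records in /etc/passwd format, invented for this illustration.
records=$(printf '%s\n' 'root:x:0:0:root:/root:/bin/sh' \
                        'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin')

# Action without a pattern: runs on every line.
names=$(printf '%s\n' "$records" | awk -F: '{print $1}')

# NR is the record number, NF the field count, $NF the last field.
summary=$(printf '%s\n' "$records" | awk -F: '{print NR, NF, $NF}')

# Pattern without an action: matching lines are printed unchanged.
root_only=$(printf '%s\n' "$records" | awk -F: '$3 == 0')

printf '%s\n' "$names" "$summary" "$root_only"
```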

Formatted output:
print supports escape sequences, and numbers are output in the format defined by the OFMT variable
nawk '/Sally/ {print "\t\tHave a nice day, " $1, $2 "!"}' employees
nawk 'BEGIN {OFMT = "%.2f"; print 1.2456789, 12E-2}'
printf supports the features of the C function of the same name
echo "UNIX" | nawk '{printf "|%-15s|\n", $1}'
nawk '{printf "The name is: %-15s ID is %8d\n", $1, $3}' employees
Field separator:
nawk -F'[ :\t]' '{print $1, $2, $3}' employees
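The formatting features above can be checked without the employees file. A minimal sketch with inline input; awk stands in for nawk here, which is an assumption about the local system.

```shell
# %-15s left-justifies the string in a field 15 characters wide.
padded=$(echo "UNIX" | awk '{printf "|%-15s|\n", $1}')

# OFMT controls how print formats non-integer numbers.
rounded=$(awk 'BEGIN {OFMT = "%.2f"; print 1.2456789, 12E-2}')

# A bracket expression as the separator splits on a space or a colon.
fields=$(echo 'a:b c' | awk -F'[ :]' '{print $1, $2, $3}')

printf '%s\n' "$padded" "$rounded" "$fields"
```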

pattern {action statement; ...}
A pattern can be a regular expression, a conditional expression, or even an arithmetic expression.
~ is the match operator: nawk '$1 !~ /ly$/' employees
Comparison expressions: ==, >=, <=, !=, ~, and !~ are supported, and several comparisons can be combined with && and ||.
awk '$3 * $4 > 500' filename
Range pattern:
awk '/Tom/,/Suzanne/' filename
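The pattern types above can be compared on one small table. The names and numbers below are invented for illustration.

```shell
# name, quantity, price: sample data made up for this sketch.
data=$(printf '%s\n' 'Tom 10 60' 'Sally 20 30' 'Billy 5 50' 'Suzanne 30 40' 'Ann 1 1')

not_ly=$(printf '%s\n' "$data" | awk '$1 !~ /ly$/ {print $1}')         # names not ending in ly
big=$(printf '%s\n' "$data" | awk '$2 * $3 > 500 {print $1}')          # arithmetic in the pattern
both=$(printf '%s\n' "$data" | awk '$2 >= 10 && $3 < 50 {print $1}')   # two comparisons combined
range=$(printf '%s\n' "$data" | awk '/Tom/,/Suzanne/ {print $1}')      # from Tom through Suzanne

printf '%s\n' "$not_ly" '---' "$big" '---' "$both" '---' "$range"
```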

The action inside {} closely resembles a block of C statements and can be nested. It can use conditionals, loops, variables, user-defined functions, built-in variables and functions, and powerful facilities such as calling system commands and redirecting input and output.
Variables: var = value. An uninitialized variable is "" when used as a string and 0 when used as a number.
nawk '$1 ~ /Tom/ {wage = $2 * $3; print wage}' filename
Built-in variables:
ARGC number of command-line arguments
ARGIND index in ARGV of the current file being processed from the command line (gawk only)
ARGV array of command-line arguments
CONVFMT conversion format for numbers, %.6g by default
ENVIRON an array containing the values of the current environment variables passed in from the shell
ERRNO contains a string describing a system error occurring from redirection, when reading with the getline function, or when using the close function (gawk only)
FIELDWIDTHS a whitespace-separated list of field widths used instead of FS when splitting records of fixed field width (gawk only)
FILENAME name of the current input file
FNR record number in the current file
FS the input field separator, by default a space
IGNORECASE turns off case sensitivity in regular expressions and string operations (gawk only)
NF number of fields in the current record; $NF refers to the last field
NR number of the current record
OFMT output format for numbers
OFS output field separator
ORS output record separator
RLENGTH length of the string matched by the match function
RS input record separator
RSTART offset of the string matched by the match function
RT the record terminator; gawk sets it to the input text that matched the character or regex specified by RS
SUBSEP subscript separator
The action after the BEGIN pattern runs before awk processes any input. It can be used to initialize built-in variables or to perform other setup.
The action after the END pattern runs after awk has finished processing the input.
nawk '$4 >= 70 {print $1, $2 > "passing_file"}' filename
nawk 'BEGIN {while (("ls" | getline) > 0) print}'
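As a rough sketch of BEGIN, END, and getline: the first command sets OFS before any input is read and prints a total afterwards, and the second reads the output of a shell command line by line. The input values and the echo command are placeholders chosen for this example.

```shell
# BEGIN runs before the first record, END after the last one.
report=$(printf '%s\n' 'a 3' 'b 4' | awk '
BEGIN { OFS = "="; total = 0 }
      { total += $2; print $1, $2 }
END   { print "total", total }')

# "command" | getline var reads one line of the command output per call
# and returns 0 at end of input, so the loop tests for a result above 0.
lines=$(awk 'BEGIN {
    while (("echo one; echo two" | getline line) > 0) n++
    print n
}')

printf '%s\n' "$report" "$lines"
```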
Conditional statements:
{ if ($3 > 89 && $3 < 101) Agrade++
else if ($3 > 79) Bgrade++
else if ($3 > 69) Cgrade++
else if ($3 > 59) Dgrade++
else Fgrade++ }
Loops: the standard while and for loops are supported, along with break and continue.
for (x = 3; x <= NF; x++)
if ($x == 0) {print "Get next item"; continue}
Arrays: awk arrays are associative (map-like), so indexes can be numbers or strings. Multi-dimensional arrays are also supported.
nawk '{id[NR] = $3}; END {for (x = 1; x <= NR; x++) print id[x]}' employees
nawk '/^Tom/ {name[NR] = $1}; END {for (i in name) {print name[i]}}' db
nawk '{count[$2]++} END {for (name in count) print name, count[name]}' datafile4
split(string, array, fs) splits string into fields on the separator fs and stores them in array.
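The conditional chain, a string-indexed array, and split can be combined in one small script. This is a sketch only: the scores are made up, and a single grade variable replaces the separate counters used above.

```shell
# name and score: sample data invented for this illustration.
scores=$(printf '%s\n' 'ann 95' 'bob 82' 'cat 71' 'dan 88' 'eve 40')

grades=$(printf '%s\n' "$scores" | awk '
{
    if ($2 > 89 && $2 < 101) grade = "A"
    else if ($2 > 79)        grade = "B"
    else if ($2 > 69)        grade = "C"
    else if ($2 > 59)        grade = "D"
    else                     grade = "F"
    count[grade]++            # the array index is a string
}
END {
    # Walk a fixed list of keys so the output order is predictable;
    # for (key in count) would visit keys in an unspecified order.
    n = split("A B C D F", order, " ")
    for (i = 1; i <= n; i++)
        printf "%s%s=%d", (i > 1 ? " " : ""), order[i], count[order[i]]
    printf "\n"
}')

echo "$grades"
```

An index that was never assigned, such as D here, evaluates to 0 in a numeric context.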

Built-in functions:
sub(regex, replacement [, target]) replaces the first occurrence of regex in target (by default $0) with replacement; gsub replaces all occurrences.
index(string, substr) returns the position of substr in string.
length(string) returns the length of the string.
substr(string, start [, len]) returns the substring that begins at start and has length len.
match(string, regex) returns the position at which the regular expression matches in string.
sprintf() returns a string in the specified format
awk '{line = sprintf("%15s %6.2f", $1, $3); print line}' filename
Arithmetic functions: sin cos exp int log rand atan2 sqrt srand
substr is often used to split fields that have a fixed width but no separator. gsub is usually used to strip unwanted characters so that later processing works on the cleaned string.
Custom functions:
function name(parameter, ...) {
statements
return expression
}
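The string functions and a user-defined function can be tried together in a BEGIN block, which needs no input file. In this sketch the function name max and the sample string are hypothetical choices for illustration.

```shell
result=$(awk '
# max is a hypothetical helper, defined only for this illustration.
function max(a, b) {
    return (a > b) ? a : b
}
BEGIN {
    s = "hello world"
    print length(s)              # number of characters
    print index(s, "world")      # 1-based position of the substring
    print substr(s, 1, 5)        # five characters starting at position 1
    pos = match(s, /o w/)        # also sets RSTART and RLENGTH
    print pos, RSTART, RLENGTH
    t = s
    n = gsub(/o/, "0", t)        # replace every o in t; returns the count
    print n, t
    u = s
    sub(/o/, "0", u)             # replace only the first o in u
    print u
    print sprintf("%05.2f", 3.14159)
    print max(3, 7)
}')

printf '%s\n' "$result"
```

Copying s into t and u before calling gsub and sub keeps the original string intact, since both functions modify their target in place.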


4 find

Command format: find [options] [path] [expression]

The main find options are -P, -L, and -H, which control whether symbolic links are followed. -P never follows them, -L always follows them, and -H follows a link only when it appears in [path] on the command line.

[path] is the directory that find searches. If it is omitted, the current directory is used.

The power of find lies in the [expression], which can select files by name, by access or modification time, by size, by a name that matches a regular expression, and so on.

find accepts multiple expressions, which can be combined with -and, -or, and -not.

A complete expression consists of three parts: [options] [tests] [actions]

Example: find /tmp -maxdepth 1 -name "*.log" -exec cat {} \; where -maxdepth 1 is the option, -name "*.log" is the test, and -exec ... is the action
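The option, test, and action parts can be observed on a scratch directory instead of /tmp. In this sketch the file names are made up, and -maxdepth, -or, and -not are assumed to be available as in GNU find.

```shell
# Scratch tree; all names are invented for this illustration.
top=$(mktemp -d)
mkdir "$top/sub"
echo 'first'  > "$top/a.log"
echo 'second' > "$top/sub/b.log"
echo 'third'  > "$top/c.txt"

# Option -maxdepth 1, test -name, action -exec: only the top-level .log file is read.
shallow=$(find "$top" -maxdepth 1 -name '*.log' -exec cat {} \;)

# Tests combined with -or and -not; the parentheses are escaped for the shell.
picked=$(cd "$top" && find . -type f \( -name '*.txt' -or -not -name 'a*' \) | sort)

printf '%s\n' "$shallow" "$picked"
rm -r "$top"
```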


Common options

-maxdepth levels: descend at most levels directory levels below the paths given on the command line. 0 means that tests and actions are applied only to the command-line paths themselves.

-mindepth levels: do not apply tests or actions at depths smaller than levels. 1 means that everything below the command-line paths is processed, but the command-line paths themselves are not.

-noleaf: do not optimize by assuming that a directory contains two fewer subdirectories than its hard link count.

-regextype type: sets the regular expression syntax used by the -regex test. The default is emacs. Other types include posix-awk, posix-basic, and posix-egrep.

-xdev: do not descend into directories on other file systems


Common tests (+n means greater than n, -n means less than n, n means exactly n)

-name pattern: the file name matches the shell pattern

-wholename pattern: the whole path matches the shell pattern

-lname pattern: matches only symbolic links, whose target matches the pattern

-iname pattern: like -name, but the match is case-insensitive

-amin n: the file was last accessed n minutes ago

-cmin n: the file status was last changed n minutes ago

-mmin n: the file was last modified n minutes ago

-atime n, -ctime n, -mtime n: the same as the three tests above, with n measured in days


-uid n / -user uname: the file owner matches

-gid n / -group gname: the file group matches


-perm mode: file permission test

-regex pattern: the whole path matches the regular expression

-samefile name: the file is the same file as name

-size n[cwbkMG]: the file size is n units

-type t: file type test

-anewer file: the file was last accessed more recently than file was last modified
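Several of the tests above can be tried on a scratch directory. The sketch below assumes GNU find and uses invented file names and sizes.

```shell
# Scratch tree; names and sizes are made up for this illustration.
top=$(mktemp -d)
mkdir "$top/dir"
printf 'x' > "$top/small"
head -c 3000 /dev/zero > "$top/big"
chmod 644 "$top/small"
chmod 600 "$top/big"
ln -s small "$top/link"

dirs=$(cd "$top" && find . -mindepth 1 -type d)        # directories below the start point
large=$(cd "$top" && find . -type f -size +2k)         # regular files larger than 2 KiB
mode=$(cd "$top" && find . -type f -perm 644)          # permissions exactly 644
links=$(cd "$top" && find . -lname 'sm*')              # symbolic links whose target matches
recent=$(cd "$top" && find . -type f -mmin -5 | sort)  # modified less than 5 minutes ago

printf '%s\n' "$dirs" "$large" "$mode" "$links" "$recent"
rm -r "$top"
```

With the default -P behaviour the symbolic link is not followed, so -type f does not report it.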


find actions

-delete: deletes the matched files

-exec cmd: runs the command. The command after -exec is terminated by ";", which must be escaped as "\;" or quoted as ';' so that the shell does not interpret it.

-ok cmd: same as -exec, but asks for confirmation before running cmd

-print: the default action. Prints the path of each matched file.

-printf format: prints information about each matched file in the specified format

-quit: exits immediately

-prune: if -depth is not given, it is true and, if the file is a directory, find does not descend into it. If -depth is given, it has no effect.

Note that on some systems passing every matched file to a single command can exceed the limit on command-line argument length. xargs avoids this by splitting the file list into batches, so it can be used instead of -exec.
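The difference between the two approaches can be made visible by counting how many times echo runs. This is a sketch with invented file names, assuming GNU find; with only three files, xargs needs a single batch.

```shell
# Scratch tree; all names are invented for this illustration.
top=$(mktemp -d)
mkdir "$top/skip"
touch "$top/core1" "$top/core2" "$top/skip/core3"

# -exec ... \; runs the command once per file: three echo calls, three lines.
per_file=$(find "$top" -type f -name 'core*' -exec echo {} \; | wc -l)

# xargs collects the names and passes them to one echo call: one line.
batched=$(find "$top" -type f -name 'core*' -print | xargs echo | wc -l)

# -exec ... + batches the names inside find itself, much like xargs.
plus=$(find "$top" -type f -name 'core*' -exec echo {} + | wc -l)

# -prune stops find from descending into the directory named skip.
pruned=$(cd "$top" && find . -name skip -prune -o -type f -print | sort)

echo "$per_file $batched $plus"
printf '%s\n' "$pruned"
rm -r "$top"
```

Plain -print piped to xargs splits names on whitespace, so it is only safe when file names contain no spaces or newlines.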


find examples:

1. Check whether there are files named "aaa" in the current folder and subfolders.

find . -name aaa

2. Check whether a directory named "aaa" exists in the current folder and subfolders.

find . -type d -name aaa

3. Find all files with the suffix ".txt" in the current folder and subfolders.

find . -name "*.txt"

4. Find the files owned by the "root" user in the current directory and its subfolders.

find . -user root

5. Search for all files in the current folder and subfolders whose permissions are set to 644.

find . -perm 644

6. Search the current folder and subfolders for files whose path contains "b" and ends with "3", using a regular expression (-regex must match the whole path).

find . -regex '.*b.*3'

7. Output the content of every "core*" file that find locates.

find . -type f -name "core*" -exec cat {} \;

find /tmp -name "core*" -type f -print | xargs cat

8. Find the files in the current directory that were accessed within the last 5 minutes.

find . -amin -5

9. Search for all files larger than 10 MB in the current directory and subdirectories.

find . -size +10M

10. All the find commands above search the current directory and its subdirectories. To search only the current directory without descending into subdirectories:

find . -maxdepth 1 -name "*.c"
