grep, sed, and awk: practical notes on shell regular expressions

Last Update:2016-09-18 Source: Internet

Author: User

Tags uppercase character egrep


I have recently been studying shell scripting, and in some spare time I gathered the better examples from my own notes. What follows compares the basic regular expression syntax covered in VBird's Linux guide ("鸟哥的私房菜") and is suitable for beginners to review.

First, create an example file:

# vi regular_express.txt

-------------------------------

"Open Source" is a good mechanism to develop programs.

apple is my favorite food.

Football game isn't use feet only.

this dress doesn't fit me.

However, this dress was about $3183 dollars.

GNU is free air isn't free beer.

She hair is very beauty.

I can't finish the test.

Oh! The soup taste good.

motorcycle is cheap than car.

This window is clear.

the symbol '*' is represented as start.

Oh! My god!

The gd software is a library for drafting programs.

You are the mean of the best is. 1.

The world <Happy> are the same with "glad".

I like dog.

google is the best tools for search keyword.

goooooogle yes!

go! go! Let's go.

# I am VBird

--------------------------------

Set the locale to C:

# export LANG=C

grep:

1. Search for a specific string "the"

Note: -n displays line numbers.

# grep -n 'the' regular_express.txt

2. Reverse search: show the lines that do not contain "the"

# grep -vn 'the' regular_express.txt

3. Match the string "the" regardless of case

# grep -in 'the' regular_express.txt

4. Use brackets [] to search for a set of characters

To search for the two words test and taste, note that they share the form 't?st', so a single pattern can find both:

# grep -n 't[ae]st' regular_express.txt

This matches tast or test: the bracket expression stands for exactly one character, either a or e.
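The bracket expression can be tried on a throwaway file. This is a minimal sketch; the temporary file and its three lines are made up for illustration and are not part of regular_express.txt.

```shell
# Hypothetical sample data written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'I can finish the test.' 'The soup taste good.' 'A tost is not a word.' > "$sample"

# [ae] stands for exactly one character, either a or e, so "tost" is skipped.
bracket_hits=$(grep -n 't[ae]st' "$sample")
echo "$bracket_hits"

rm -f "$sample"
```

Only the first two lines are reported, because the third has an o between the two t characters.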

To search for the characters oo, use:

# grep -n 'oo' regular_express.txt

To exclude matches where oo is preceded by g, use the negated set [^]:

# grep -n '[^g]oo' regular_express.txt

To exclude matches where oo is preceded by a lowercase letter:

# grep -n '[^a-z]oo' regular_express.txt
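A negated set still has to consume one character, which leads to results that can look surprising at first. The sketch below uses four made-up words (an assumption for illustration) to show which ones '[^g]oo' accepts.

```shell
# Hypothetical sample words, one per line.
sample=$(mktemp)
printf '%s\n' 'good' 'food' 'goooooogle' 'oops' > "$sample"

# [^g] must match a real character that is not g, followed by oo:
#   good       -> no match (the only character before oo is g)
#   food       -> match    (f precedes oo)
#   goooooogle -> match    (an o precedes a later oo)
#   oops       -> no match (nothing precedes the oo at the start of the line)
negated_hits=$(grep -n '[^g]oo' "$sample")
echo "$negated_hits"

rm -f "$sample"
```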

Note: uppercase letters, lowercase letters, and digits can be written as [A-Z], [a-z], and [0-9], or combined:

[a-zA-Z0-9] matches one character that is a letter or a digit

To get the lines that contain digits:

# grep -n '[0-9]' regular_express.txt

Note: ranges written with the minus sign [-] assume contiguous encoding, and the locale can change the collating order, so you can also use [:lower:] instead of a-z and [:digit:] instead of 0-9.

# grep -n '[^[:lower:]]oo' regular_express.txt

# grep -n '[[:digit:]]' regular_express.txt
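The POSIX character classes can be checked the same way. This is a small sketch on invented data; LC_ALL=C is set per command here only to keep the result independent of the caller's locale.

```shell
# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' 'Room 42' 'no digits here' 'Zoo' > "$sample"

# Lines containing at least one digit.
digit_hits=$(LC_ALL=C grep -n '[[:digit:]]' "$sample")

# Lines where oo is preceded by something other than a lowercase letter.
non_lower_oo=$(LC_ALL=C grep -n '[^[:lower:]]oo' "$sample")

echo "$digit_hits"
echo "$non_lower_oo"
rm -f "$sample"
```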

5. Display lines that begin with 'the'

# grep -n '^the' regular_express.txt

Display lines that begin with a lowercase letter:

# grep -n '^[a-z]' regular_express.txt

6. Display lines that end with a period (.)

# grep -n '\.$' regular_express.txt

7. Display lines 5-10

# cat -An regular_express.txt | head -n 10 | tail -n 6
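The head/tail arithmetic is easy to verify on numbered input. In this sketch seq stands in for the file (an assumption for illustration), and sed -n with a line range is shown as an alternative way to select the same window.

```shell
# Keep the first 10 lines, then the last 6 of those: lines 5 through 10.
window=$(seq 1 20 | head -n 10 | tail -n 6)

# The same window selected directly by line range.
via_sed=$(seq 1 20 | sed -n '5,10p')

echo "$window"
```

In general, head -n N | tail -n M yields lines N-M+1 through N.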

8. Show blank lines

# grep -n '^$' regular_express.txt
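The line anchors from items 5, 6, and 8 can be compared side by side. The four sample lines below are hypothetical; the third one is intentionally empty.

```shell
# Hypothetical sample data; line 3 is blank.
sample=$(mktemp)
printf '%s\n' 'the start' 'At the end.' '' 'the end.' > "$sample"

starts_with_the=$(grep -n '^the' "$sample")   # ^ anchors at the beginning of the line
ends_with_dot=$(grep -n '\.$' "$sample")      # \. is a literal period, $ anchors at the end
blank_lines=$(grep -n '^$' "$sample")         # beginning immediately followed by end

echo "$starts_with_the"
echo "$ends_with_dot"
echo "$blank_lines"
rm -f "$sample"
```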

9. Find g??d strings: four characters that begin with g and end with d

# grep -n 'g..d' regular_express.txt

10. o* matches the empty string (no character at all) or one or more o characters, so grep -n 'o*' regular_express.txt prints every line.

11. oo* means one o followed by zero or more o characters, so grep -n 'oo*' regular_express.txt prints every line containing o, oo, ooo, and so on.

12. 'goo*g' matches gog, goog, gooog, and so on:

# grep -n 'goo*g' regular_express.txt
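The difference between o*, oo*, and goo*g shows up clearly when counting matching lines. A minimal sketch on five invented words:

```shell
# Hypothetical sample words.
sample=$(mktemp)
printf '%s\n' 'gg' 'gog' 'goog' 'gooooog' 'dog' > "$sample"

# o* also matches the empty string, so every line counts as a match.
star_count=$(grep -c 'o*' "$sample")

# oo* needs at least one o, so 'gg' drops out.
one_or_more_count=$(grep -c 'oo*' "$sample")

# g, at least one o, then g.
goog_hits=$(grep -n 'goo*g' "$sample")

echo "$star_count $one_or_more_count"
echo "$goog_hits"
rm -f "$sample"
```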

13. Find lines with a string that begins and ends with g (g...g)

Note: . represents any single character; .* represents zero or more of any character.

# grep -n 'g.*g' regular_express.txt

14. Find lines that contain digits

# grep -n '[0-9][0-9]*' regular_express.txt

or # grep -n '[0-9]' regular_express.txt

15. Find strings with two consecutive o characters

Note: in the basic regular expressions used by plain grep, the repetition braces must be written with backslashes as \{ and \}; without the backslash, { and } match literal brace characters.

# grep -n 'o\{2\}' regular_express.txt

Find strings where g is followed by 2 to 5 o characters and then another g

# grep -n 'go\{2,5\}g' regular_express.txt

Find strings where g is followed by 2 or more o characters and then another g

# grep -n 'go\{2,\}g' regular_express.txt
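The interval forms can be compared on words that differ only in the number of o characters. The four sample words are invented for this sketch and contain 1, 2, 5, and 6 o characters respectively.

```shell
# Hypothetical sample words with 1, 2, 5 and 6 o characters between the g's.
sample=$(mktemp)
printf '%s\n' 'gog' 'goog' 'gooooog' 'goooooog' > "$sample"

# Any line containing two consecutive o characters.
two_o_count=$(grep -c 'o\{2\}' "$sample")

# g, then 2 to 5 o characters, then g: the 6-o word is rejected.
two_to_five=$(grep -n 'go\{2,5\}g' "$sample")

# g, then 2 or more o characters, then g.
two_or_more=$(grep -n 'go\{2,\}g' "$sample")

echo "$two_o_count"
echo "$two_to_five"
echo "$two_or_more"
rm -f "$sample"
```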

Summary:

^word matches the string (word) at the beginning of a line

word$ matches the string (word) at the end of a line

. matches exactly one arbitrary character

\ is the escape character; placed before a special character, it removes that character's special meaning

* repeats the preceding RE (regular expression) character zero or more times

[list] matches any one character from the list

[n1-n2] matches any one character in the given range, for example [0-9], [a-z], [A-Z]

[^list] matches any one character that is not in the list or range, for example [^0-9] for non-digit characters and [^A-Z] for non-uppercase characters

\{n,m\} matches n to m occurrences of the preceding RE character

\{n,\} matches n or more occurrences of the preceding RE character

egrep summary:

+ repeats the preceding RE character one or more times

Example: egrep 'go+d' regular_express.txt

This searches for (god), (good), (goood), and so on; o+ means [one or more o]

? matches zero or one of the preceding RE character

Example: egrep 'go?d' regular_express.txt

This searches for (gd) or (god); o? means [empty or one o]

Note: the combined result set of egrep 'go+d' and egrep 'go?d' equals that of grep 'go*d'

| matches any one of several strings (or)

Example: egrep 'gd|good|dog' regular_express.txt

This searches for (gd) or (good) or (dog); | means or

() groups a string

Example: egrep 'g(la|oo)d' regular_express.txt

This searches for (glad) or (good)

()+ matches one or more repetitions of a group

Example: echo 'AxyzxyzxyzxyzxyzC' | egrep 'A(xyz)+C'

This searches for a string that begins with A and ends with C, with one or more 'xyz' in between
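The extended operators can be exercised together on a short word list. This sketch uses grep -E, which is the POSIX spelling of egrep; the six sample words are made up for illustration.

```shell
# Hypothetical sample words.
sample=$(mktemp)
printf '%s\n' 'gd' 'god' 'good' 'goood' 'glad' 'dog' > "$sample"

plus_hits=$(grep -En 'go+d' "$sample")           # one or more o
optional_hits=$(grep -En 'go?d' "$sample")       # zero or one o
either_hits=$(grep -En 'gd|good|dog' "$sample")  # any of three strings
group_hits=$(grep -En 'g(la|oo)d' "$sample")     # alternation inside a group

# A repeated group: A, then xyz one or more times, then C.
repeat_count=$(echo 'AxyzxyzxyzC' | grep -Ec 'A(xyz)+C')

echo "$plus_hits"
echo "$optional_hits"
echo "$either_hits"
echo "$group_hits"
rm -f "$sample"
```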

sed:

Insert/delete lines:

1. List /etc/passwd with line numbers, and delete lines 2-5

# nl /etc/passwd | sed '2,5d'

Note: sed '...' is shorthand for sed -e '...'; the script follows in single quotation marks.

Likewise, to delete only line 2:

# nl /etc/passwd | sed '2d'

Delete from line 3 to the last line

# nl /etc/passwd | sed '3,$d'
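The three delete addresses can be tried without touching /etc/passwd. In this sketch seq generates numbered stand-in lines (an assumption for illustration), so each surviving line shows its own original position.

```shell
# Delete a range of lines.
drop_2_to_5=$(seq 1 8 | sed '2,5d')

# Delete a single line.
drop_2=$(seq 1 4 | sed '2d')

# Delete from line 3 through the last line ($).
drop_from_3=$(seq 1 8 | sed '3,$d')

echo "$drop_2_to_5"
echo "$drop_2"
echo "$drop_from_3"
```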

2. Add a line containing test after line 2

# nl /etc/passwd | sed '2a test'

Add a line containing test before line 2

# nl /etc/passwd | sed '2i test'

Add two lines containing test after line 2

# nl /etc/passwd | sed '2a test\
> test'
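The one-line forms '2a test' and '2i test' are a GNU sed extension. As a hedged alternative, the sketch below uses the traditional form, where the command is followed by a backslash and the text starts on the next line; the three input lines are invented.

```shell
# Append a line after line 2 (a\ followed by the text on the next line).
appended=$(printf '%s\n' one two three | sed '2a\
test')

# Insert a line before line 2.
inserted=$(printf '%s\n' one two three | sed '2i\
test')

echo "$appended"
echo "$inserted"
```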

Replace lines:

3. Replace lines 2-5 with the text No 2-5 number

# nl /etc/passwd | sed '2,5c No 2-5 number'

4. List lines 5-7 of /etc/passwd

# nl /etc/passwd | sed -n '5,7p'
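Both commands can be checked on numbered stand-in input from seq (an assumption for illustration). The c command is written here in its traditional backslash-newline form rather than the one-line GNU form.

```shell
# Replace the whole range 2-5 with a single line of text.
replaced=$(seq 1 6 | sed '2,5c\
No 2-5 number')

# -n suppresses the default output, so only lines 5-7 are printed by p.
printed=$(seq 1 9 | sed -n '5,7p')

echo "$replaced"
echo "$printed"
```

Without -n, the p command would print lines 5-7 twice, once by default and once explicitly.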

Replace strings:

sed 's/string to be replaced/new string/g'

1. Get the line containing the local IP address

# /sbin/ifconfig eth0 | grep 'inet addr'

Delete the part before the IP address

# /sbin/ifconfig eth0 | grep 'inet addr' | sed 's/^.*addr://g'

Delete the part after the IP address

# /sbin/ifconfig eth0 | grep 'inet addr' | sed 's/^.*addr://g' | sed 's/Bcast:.*$//g'

-------------------

192.168.100.74

-------------------
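The two substitutions can be tried on a fixed string instead of live ifconfig output. The sample line below is a hypothetical line in the older "inet addr:" layout; the extra ' *' in the second pattern is an addition of this sketch that also strips the spaces left in front of Bcast.

```shell
# Hypothetical ifconfig line in the older net-tools layout.
line='          inet addr:192.168.100.74  Bcast:192.168.100.255  Mask:255.255.255.0'

# First remove everything up to and including "addr:",
# then remove everything from the spaces before "Bcast:" to the end of the line.
ip=$(echo "$line" | sed 's/^.*addr://g' | sed 's/ *Bcast:.*$//g')

echo "$ip"
```

Systems whose ifconfig prints a different layout would need the patterns adjusted accordingly.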

2. Use grep to extract the lines containing the keyword man

# cat /etc/man.config | grep 'man'

Delete comment lines

# cat /etc/man.config | grep 'man' | sed 's/^#.*$//g'

Delete blank lines

# cat /etc/man.config | grep 'man' | sed 's/^#.*$//g' | sed '/^$/d'
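Since /etc/man.config may not exist on every system, the comment and blank-line cleanup can be sketched on an inline sample. The configuration text below is invented, and the single-pass variant with two d commands is offered as an alternative, not as the original author's method.

```shell
# Hypothetical configuration text with comments and a blank line.
config='# manual page settings
MANPATH /usr/man

# MANPATH /usr/local/man
MANPATH /usr/share/man'

# Two steps, as in the text: blank out comment lines, then delete empty lines.
cleaned=$(printf '%s\n' "$config" | sed 's/^#.*$//g' | sed '/^$/d')

# One pass: delete comment lines and empty lines directly.
one_pass=$(printf '%s\n' "$config" | sed -e '/^#/d' -e '/^$/d')

echo "$cleaned"
```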

3. Use sed to replace the period at the end of each line in regular_express.txt with !

Note: the -i option modifies the file in place instead of printing the result to standard output.

# sed -i 's/\.$/!/g' regular_express.txt

4. Use sed to add #This is a test as the last line of the file

Note: $ addresses the last line, and a appends a line after it.

# sed -i '$a #This is a test' regular_express.txt
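Because -i rewrites the file, it is safer to experiment on a temporary copy. In this sketch the file contents are invented, and -i.bak (which keeps a backup with a .bak suffix) is a cautious choice of this example rather than something the text above uses.

```shell
# Hypothetical two-line working file.
work=$(mktemp)
printf '%s\n' 'Line one.' 'Line two' > "$work"

# Replace a trailing period with ! in place, keeping a .bak backup.
sed -i.bak 's/\.$/!/' "$work"

# Append a line after the last line ($), again in place.
sed -i.bak '$a\
#This is a test' "$work"

result=$(cat "$work")
echo "$result"
rm -f "$work" "$work.bak"
```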

Change SELinux from enforcing to disabled in its configuration file (this replaces line 6, so it assumes the SELINUX= setting is on that line):

# sed -i '6,6c SELINUX=disabled' /etc/selinux/config

Extended regular expressions:

With basic syntax, removing blank lines and comment lines takes two grep commands:

# grep -v '^$' regular_express.txt | grep -v '^#'

With extended syntax, one command is enough:

# egrep -v '^$|^#' regular_express.txt
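That the two spellings give the same result can be confirmed on a small file. The four sample lines are hypothetical, and grep -E is used as the POSIX equivalent of egrep.

```shell
# Hypothetical sample with a blank line and a comment line.
sample=$(mktemp)
printf '%s\n' 'first' '' '# comment' 'second' > "$sample"

# Basic syntax: two filters chained with a pipe.
two_step=$(grep -v '^$' "$sample" | grep -v '^#')

# Extended syntax: one filter with alternation.
one_step=$(grep -Ev '^$|^#' "$sample")

echo "$one_step"
rm -f "$sample"
```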

1. + repeats the preceding RE character one or more times

Example: egrep -n 'go+d' regular_express.txt

Basic equivalent: grep -n 'goo*d' regular_express.txt

2. ? matches zero or one of the preceding RE character

Example: egrep -n 'go?d' regular_express.txt

3. | matches any one of several strings (or)

Example: egrep -n 'gd|good' regular_express.txt

4. () groups a string

Example: egrep -n 'g(la|oo)d' regular_express.txt

That is, search for either of the two strings (glad) or (good)

5. ()+ matches one or more repetitions of a group

For example: echo 'AxyzxyzxyzxyzC' | egrep 'A(xyz)+C'

That is, find a string that begins with A and ends with C, with one or more 'xyz' in between

awk:

1. Use last to show the five most recent login records

# last -n 5

Extract the account and the login IP, separated by a tab

# last -n 5 | awk '{print $1 "\t" $3}'

Note: $1 represents the first field, where fields are separated by spaces or tabs, and so on.

$0 represents the entire line

# last -n 5 | awk '{print $1 "\t lines: " NR "\t columns: " NF}'

Note: NF is the total number of fields in the current line

NR is the number of the line awk is currently processing

FS is the field separator, which defaults to a space
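NR and NF can be observed on fixed input instead of live last output. The two records below are invented and deliberately have a different number of fields.

```shell
# Hypothetical records: three fields on the first line, two on the second.
report=$(printf '%s\n' 'root pts/1 192.168.1.100' 'alice tty1' |
  awk '{print $1 "\t lines: " NR "\t columns: " NF}')

echo "$report"
```

NR increases by one for every line read, while NF is recomputed for each line.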

2. /etc/passwd uses : as its field separator. List only the account and the third column for the entries whose third column is less than 10

# cat /etc/passwd | awk '{FS=":"} $3<10 {print $1 "\t" $3}'

Note: the first line is not displayed correctly, because it had already been split with the default separator when FS=":" was assigned; the new separator only takes effect from the second line on.

To handle the first line as well, set FS in a BEGIN block:

# cat /etc/passwd | awk 'BEGIN {FS=":"} $3<10 {print $1 "\t" $3}'
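The effect of where FS is set can be reproduced on passwd-style text. The three entries below are hypothetical; the -F: option is shown as an equivalent way to set the separator before any input is read.

```shell
# Hypothetical passwd-style entries (fields separated by colons).
passwd_like='root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
alice:x:1000:1000::/home/alice:/bin/bash'

# FS assigned in a main rule: the first record has already been split
# with the default separator, so it is typically printed unsplit.
late_fs=$(printf '%s\n' "$passwd_like" | awk '{FS=":"} $3<10 {print $1 "\t" $3}')

# FS assigned in BEGIN: applies to every record, including the first.
begin_fs=$(printf '%s\n' "$passwd_like" | awk 'BEGIN {FS=":"} $3<10 {print $1 "\t" $3}')

# -F: on the command line has the same effect as the BEGIN assignment.
dash_f=$(printf '%s\n' "$passwd_like" | awk -F: '$3<10 {print $1 "\t" $3}')

echo "$late_fs"
echo "$begin_fs"
```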

diff:

Compare the differences between two files (or, as here, two directories):

# diff /etc/rc3.d/ /etc/rc5.d/

-------------------

Only in /etc/rc3.d/: K30spice-vdagentd

Only in /etc/rc5.d/: S70spice-vdagentd

-------------------

Example:

1. Count TCP connection states

# netstat -na | awk '/^tcp/ {++S[$NF]} END {for (a in S) print a, S[a]}'

/^tcp/

Selects the lines that start with tcp; "^" is the regular expression anchor for the beginning of a line.

S[]

Defines an array named S. awk arrays are associative, so any string can serve as a subscript; numeric subscripts conventionally start at 1 instead of 0.

NF

The number of fields in the current record, separated by spaces by default; for a typical TCP record from netstat, NF equals 6.

$NF

The value of the last field of a line; for such a record $NF is $6, the value of the 6th field, that is SYN_RECV, TIME_WAIT, and so on.

S[$NF]

The array element that holds the number of connections in that state, for example S[TIME_WAIT].

++S[$NF]

Adds one to that element; for a TIME_WAIT record, this increments the number of connections in the S[TIME_WAIT] state.

The result is the final value of each element in the S array.

Example: S[TIME_WAIT]=final value, S[ESTABLISHED]=final value

END

for (key in S)

Traverses the S[] array.

print key, "\t", S[key]

Prints each key and value of the array, separated by a \t tab for clearer display.
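The counting idiom can be tested on canned input. The netstat-style lines below are invented, with the connection state as the last field; sort is appended because the order of a for (key in S) loop is not specified.

```shell
# Hypothetical netstat-style lines; the udp line must be ignored.
netstat_like='tcp 0 0 10.0.0.1:80 10.0.0.2:5000 ESTABLISHED
tcp 0 0 10.0.0.1:80 10.0.0.3:5001 TIME_WAIT
udp 0 0 0.0.0.0:68 0.0.0.0:*
tcp 0 0 10.0.0.1:80 10.0.0.4:5002 TIME_WAIT'

# Count lines per value of the last field, then print key and count.
counts=$(printf '%s\n' "$netstat_like" |
  awk '/^tcp/ {++S[$NF]} END {for (key in S) print key "\t" S[key]}' |
  sort)

echo "$counts"
```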

This article is from the "happy spicy small" blog; please do not reprint it.
