Shell regular expressions: grep, sed, and awk


I have recently been studying shell scripting and, having some free time, worked through examples from the better references I could find. The notes below summarize the basic regular-expression syntax covered in Vbird's Linux tutorial and are suitable for beginners to review.

First copy a sample:

# vi regular_express.txt
-------------------------------
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
However, this dress is about $ 3183 dollars.
GNU is free air not free beer.
Her hair is very beauty.
I can't finish the test.
Oh! The soup taste good.
motorcycle is cheap than car.
This window is clear.
the symbol '*' is represented as start.
Oh! My god!
The gd software is a library for drafting programs.
You are the best is mean you are the no. 1.
The world <Happy> is the same with "glad".
I like dog.
google is the best tools for search keyword.
goooooogle yes!
go! go! Let's go.
# I am VBird
--------------------------------

Set the locale to C so that character ranges follow ASCII order:

# export LANG=C

Grep

1. Search for a specific string, "the"
Note: -n displays line numbers

# grep -n 'the' regular_express.txt

2. Reverse search: show the lines that do not contain "the"

# grep -vn 'the' regular_express.txt

3. Match the string "the" regardless of case

# grep -in 'the' regular_express.txt
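The three options above can be compared side by side on a small throwaway file. This is only a sketch: the three sample lines are made up for illustration and are not the article's regular_express.txt.

```shell
# Build a tiny sample file (hypothetical content, not the article's file).
sample=$(mktemp)
printf '%s\n' 'the first line' 'The second line' 'no match here' > "$sample"

# -n prefixes each matching line with its line number.
with_n=$(grep -n 'the' "$sample")
# -v inverts the match: lines that do NOT contain lowercase "the".
inverted=$(grep -vn 'the' "$sample")
# -i ignores case, so "The" matches as well.
nocase=$(grep -in 'the' "$sample")

printf '%s\n' '-n:' "$with_n" '-vn:' "$inverted" '-in:' "$nocase"
rm -f "$sample"
```

Note that -n and -v together report the original line numbers of the lines that were kept.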

4. Use brackets [] to search for a set of characters
To search for the two words test and taste, note that they share the pattern 't?st', so you can search with:

# grep -n 't[ae]st' regular_express.txt

This actually matches the two separate strings tast and test.
To search for the string oo, use:

# grep -n 'oo' regular_express.txt

To exclude matches where oo is preceded by g, use the negated set [^]:

# grep -n '[^g]oo' regular_express.txt

To exclude matches where oo is preceded by a lowercase letter:

# grep -n '[^a-z]oo' regular_express.txt

Note: lowercase letters, uppercase letters, and digits can be written as [a-z], [A-Z], and [0-9], and the ranges can be combined.
[a-zA-Z0-9] matches any single letter or digit.
To get the lines that contain digits:

# grep -n '[0-9]' regular_express.txt

Note: because the locale affects how characters are ordered, ranges written with the minus sign [-] may not behave as expected; you can use [:lower:] instead of a-z and [:digit:] instead of 0-9.

# grep -n '[^[:lower:]]oo' regular_express.txt
# grep -n '[[:digit:]]' regular_express.txt
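As a quick check of how bracket sets, negated sets, and character classes behave, the following sketch runs them against a short word list. The words are invented for the example.

```shell
# Hypothetical word list, one entry per line.
words=$(mktemp)
printf '%s\n' 'test' 'taste' 'tost' 'goo' 'Foo' 'room 42' > "$words"

# [ae] matches exactly one character that is either a or e.
set_match=$(grep 't[ae]st' "$words")
# [^[:lower:]]oo needs a non-lowercase character right before "oo".
negated=$(grep '[^[:lower:]]oo' "$words")
# [[:digit:]] matches any single digit, independent of locale ordering.
digits=$(grep '[[:digit:]]' "$words")

printf '%s\n' "$set_match" "$negated" "$digits"
rm -f "$words"
```

A negated set still has to match one character, so a line that begins with "oo" would not match '[^g]oo' either.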

5. Show the lines that begin with "the"

# grep -n '^the' regular_express.txt

Show the lines that begin with a lowercase letter:

# grep -n '^[a-z]' regular_express.txt

6. Show the lines that end with a period (.)

# grep -n '\.$' regular_express.txt

7. Display lines 5-10 (head keeps the first 10 lines, then tail keeps the last 6 of those)

# cat -An regular_express.txt | head -n 10 | tail -n 6
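More generally, lines M through N can be selected with head -n N followed by tail -n (N-M+1). The sketch below uses generated stand-in data and the variable names first and last, which are assumptions for the example, and compares the result with the equivalent sed address range.

```shell
# Generate twelve numbered lines as stand-in data.
numbered=$(mktemp)
for i in 1 2 3 4 5 6 7 8 9 10 11 12; do echo "line $i"; done > "$numbered"

first=5
last=9
# head keeps lines 1..last, then tail keeps the final (last - first + 1) of them.
via_head_tail=$(head -n "$last" "$numbered" | tail -n "$((last - first + 1))")
# sed -n with an address range prints the same span directly.
via_sed=$(sed -n "${first},${last}p" "$numbered")

printf '%s\n' "$via_head_tail"
rm -f "$numbered"
```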

8. Show blank lines

# grep -n '^$' regular_express.txt
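The line anchors ^ and $ used in items 5, 6, and 8 can be tried on a few made-up lines. This is an illustrative sketch only; the sample text is not from the article's file.

```shell
# Five hypothetical lines; the fourth one is empty.
lines=$(mktemp)
printf '%s\n' 'the start.' 'at the end' 'Done.' '' 'go!' > "$lines"

# ^the matches "the" only at the beginning of a line.
starts=$(grep -n '^the' "$lines")
# \.$ matches a literal period at the end of a line (the dot must be escaped).
ends_dot=$(grep -n '\.$' "$lines")
# ^$ matches a line with nothing between its start and its end.
blank=$(grep -n '^$' "$lines")

printf '%s\n' "$starts" "$ends_dot" "$blank"
rm -f "$lines"
```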

9. Find strings of the form g??d: four characters that begin with g and end with d

# grep -n 'g..d' regular_express.txt

10. o* matches the empty string (no characters at all) or one or more o characters, so grep -n 'o*' regular_express.txt prints every line.
11. oo* matches one o followed by zero or more o characters (the same as o+), so grep -n 'oo*' regular_express.txt prints every line that contains o, oo, ooo, and so on.
12. 'goo*g' matches gog, goog, gooog, and so on.

# grep -n 'goo*g' regular_express.txt

13. Find the lines containing a string of the form g...g
Note: . matches any single character, so .* matches zero or more of any character

# grep -n 'g.*g' regular_express.txt
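The difference between o*, oo*, and .* is easy to see on a handful of invented strings. The sketch below is only an illustration of the rules described above.

```shell
# Hypothetical test strings, one per line.
g=$(mktemp)
printf '%s\n' 'gg' 'gog' 'goog' 'gooog' 'going' 'god' > "$g"

# goo*g: g, one o, zero or more further o characters, then g.
star=$(grep 'goo*g' "$g")
# g.*g: g, any run of characters (possibly empty), then g.
dotstar=$(grep 'g.*g' "$g")
# o* can match the empty string, so every line counts as a match.
all=$(grep -c 'o*' "$g")

printf '%s\n' "$star" '---' "$dotstar" '---' "$all"
rm -f "$g"
```

Here 'gg' and 'going' match only the second pattern, because .* accepts an empty run or any characters between the two g characters.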

14. Find the lines that contain digits

# grep -n '[0-9][0-9]*' regular_express.txt

or # grep -n '[0-9]' regular_express.txt

15. Find strings containing two consecutive o characters
Note: in basic regular expressions the interval braces must be written as \{ and \}; an unescaped { is treated as a literal character

# grep -n 'o\{2\}' regular_express.txt

Find strings that start with g, contain 2 to 5 o characters, and then end with g:

# grep -n 'go\{2,5\}g' regular_express.txt

Find strings that start with g, contain 2 or more o characters, and then end with g:

# grep -n 'go\{2,\}g' regular_express.txt
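The interval forms can be checked against strings with a known number of o characters. This is a sketch with made-up input; the last command assumes a grep that supports -E, where the braces are written without backslashes.

```shell
# Strings with 1, 2, 5, and 6 o characters between the two g's.
o=$(mktemp)
printf '%s\n' 'gog' 'goog' 'gooooog' 'goooooog' > "$o"

# o\{2\}: two consecutive o characters anywhere in the line.
two=$(grep 'o\{2\}' "$o")
# go\{2,5\}g: between 2 and 5 o characters enclosed by g ... g.
range=$(grep 'go\{2,5\}g' "$o")
# go\{2,\}g: 2 or more o characters enclosed by g ... g.
open=$(grep 'go\{2,\}g' "$o")
# Same range in extended syntax, where the braces are not escaped.
ere=$(grep -E 'go{2,5}g' "$o")

printf '%s\n' "$range"
rm -f "$o"
```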

Summary:
^word matches the string (word) at the beginning of a line
word$ matches the string (word) at the end of a line
. matches exactly one arbitrary character
\ is the escape character; it removes the special meaning of the special character that follows it
* repeats the preceding RE (regular expression) character zero or more times
[list] matches any one character from the list
[n1-n2] matches any one character in the given range, such as [0-9], [a-z], or [A-Z]
[^list] matches any one character that is not in the list or range; for example, [^0-9] matches a non-digit character and [^A-Z] matches a non-uppercase character
\{n,m\} matches n to m repetitions of the preceding RE character
\{n,\} matches n or more repetitions of the preceding RE character
egrep summary:
+ repeats the preceding RE character one or more times
Example: egrep 'go+d' regular_express.txt
Matches (god), (good), (goood), and so on; o+ means one or more o
? repeats the preceding RE character zero or one time
Example: egrep 'go?d' regular_express.txt
Matches (gd) or (god); o? means no o or exactly one o
Note: the combined results of 'go+d' and 'go?d' under egrep equal the result of 'go*d' under grep
| matches any one of several alternative strings (or)
Example: egrep 'gd|good|dog' regular_express.txt
Matches (gd), (good), or (dog)
() groups part of a pattern
Example: egrep 'g(la|oo)d' regular_express.txt
Matches (glad) or (good)
()+ repeats a group one or more times
Example: echo 'AxyzxyzxyzxyzxyzC' | egrep 'A(xyz)+C'
Matches a string that begins with A, ends with C, and has one or more 'xyz' groups in between.
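The extended operators summarized above can be exercised on a five-word list. The sketch uses grep -E rather than egrep, since recent GNU grep releases print an obsolescence warning for egrep; the word list is invented for the example.

```shell
# Hypothetical word list.
w=$(mktemp)
printf '%s\n' 'gd' 'god' 'good' 'glad' 'dog' > "$w"

plus=$(grep -E 'go+d' "$w")         # one or more o between g and d
opt=$(grep -E 'go?d' "$w")          # zero or one o between g and d
alt=$(grep -E 'gd|good|dog' "$w")   # any of three alternatives
grp=$(grep -E 'g(la|oo)d' "$w")     # g, then "la" or "oo", then d
# A repeated group: A, one or more "xyz", then C.
rep=$(echo 'AxyzxyzxyzC' | grep -E 'A(xyz)+C')

printf '%s\n' "$plus" '---' "$opt" '---' "$alt" '---' "$grp" '---' "$rep"
rm -f "$w"
```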

Sed:

Delete and insert lines:

1. List the contents of /etc/passwd with line numbers, deleting lines 2-5 from the output

# nl /etc/passwd | sed '2,5d'

Note: sed '...' is shorthand for sed -e '...'; the script is enclosed in single quotes
Likewise, delete only line 2:

# nl /etc/passwd | sed '2d'

Likewise, delete from line 3 through the last line:

# nl /etc/passwd | sed '3,$d'

2. Add a line containing test after line 2

# nl /etc/passwd | sed '2a test'

Add a line containing test before line 2:

# nl /etc/passwd | sed '2i test'
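The delete, append, and insert commands can be tried without touching /etc/passwd. The sketch below uses a six-line stand-in file. It writes a and i in the portable form, with a backslash and a newline before the text; the one-line form such as '2a test' is a GNU sed extension.

```shell
# Six-line stand-in for the numbered /etc/passwd listing.
f=$(mktemp)
printf '%s\n' one two three four five six > "$f"

# Delete lines 2 through 5.
deleted=$(sed '2,5d' "$f")
# Delete from line 3 to the last line ($).
to_end=$(sed '3,$d' "$f")
# Append a line after line 2.
appended=$(sed '2a\
test' "$f")
# Insert a line before line 2.
inserted=$(sed '2i\
test' "$f")

printf '%s\n' "$appended"
rm -f "$f"
```

None of these commands changes the input file; sed only writes the edited text to standard output.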

Add two lines containing test after line 2:

# nl /etc/passwd | sed '2a test \
> test'

Replace lines:

3. Replace lines 2-5 with the text No 2-5 number

# nl /etc/passwd | sed '2,5c No 2-5 number'

4. List lines 5-7 of /etc/passwd

# nl /etc/passwd | sed -n '5,7p'
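The change (c) and print (p) commands can be sketched the same way on stand-in data. The file contents here are assumptions for the example, and c is written in the portable backslash-newline form.

```shell
# Eight-line stand-in file.
f=$(mktemp)
printf '%s\n' l1 l2 l3 l4 l5 l6 l7 l8 > "$f"

# Replace the whole range 2-5 with a single line of text.
changed=$(sed '2,5c\
No 2-5 number' "$f")
# -n suppresses the default output, so only the lines named by p are printed.
printed=$(sed -n '5,7p' "$f")

printf '%s\n' "$changed" '---' "$printed"
rm -f "$f"
```

Without -n, the p command would print lines 5-7 twice, because sed also prints every line by default.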

Replace string:

sed 's/string to be replaced/new string/g'

1. Get the line that contains the local IP address

# /sbin/ifconfig eth0 | grep 'inet addr'

Remove the part before the IP address:

# /sbin/ifconfig eth0 | grep 'inet addr' | sed 's/^.*addr://g'

Remove the part after the IP address:

# /sbin/ifconfig eth0 | grep 'inet addr' | sed 's/^.*addr://g' | sed 's/Bcast:.*$//g'
-------------------
192.168.100.74
-------------------
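Because eth0 and the legacy "inet addr:" output format may not exist on a given machine (newer ifconfig versions print "inet" without "addr:"), the two substitutions can be tried on a canned line instead. The sample line below is made up in the old format; only the 192.168.100.74 address comes from the output shown above.

```shell
# A sample line in the legacy ifconfig format (broadcast and mask are invented).
line='          inet addr:192.168.100.74  Bcast:192.168.100.255  Mask:255.255.255.0'

# First strip everything up to and including "addr:",
# then strip everything from the spaces before "Bcast:" to the end of the line.
ip=$(printf '%s\n' "$line" | sed 's/^.*addr://' | sed 's/ *Bcast:.*$//')

echo "$ip"
```

The ' *' before Bcast also removes the spaces that would otherwise trail the address.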

2. Use grep to extract the lines that contain the keyword man

# cat /etc/man.config | grep 'man'

Delete the comment lines:

# cat /etc/man.config | grep 'man' | sed 's/^#.*$//g'

Delete the blank lines:

# cat /etc/man.config | grep 'man' | sed 's/^#.*$//g' | sed '/^$/d'

3. Use sed to replace the period at the end of each line in regular_express.txt with !
Note: the -i option modifies the file in place instead of printing to standard output

# sed -i 's/\.$/\!/g' regular_express.txt

4. Use sed to append the line #This is a test at the end of the file
Note: $ addresses the last line, and a appends text after it

# sed -i '$a #This is a test' regular_express.txt
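Since -i rewrites the file itself, it is safer to experiment on a scratch copy. The following sketch uses a temporary file with two made-up lines and the -i.bak form, which keeps a backup next to the edited file and is accepted by both GNU and BSD sed.

```shell
# Scratch file, so the real regular_express.txt is left untouched.
work=$(mktemp)
printf '%s\n' 'This window is clear.' 'Oh! My god!' > "$work"

# Replace a trailing period with "!" in place; the previous version is kept as "$work.bak".
sed -i.bak 's/\.$/!/' "$work"
# Append a line after the last line ($), again in place.
sed -i.bak '$a\
# This is a test' "$work"

result=$(cat "$work")
printf '%s\n' "$result"
rm -f "$work" "$work.bak"
```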

Change SELinux from enforcing to disabled in its configuration file (this assumes the SELINUX= setting is on line 6 of the file):

# sed -i '6,6c SELINUX=disabled' /etc/selinux/config

Extended regular expressions:

With basic grep, removing blank lines and comment lines takes two commands:

# grep -v '^$' regular_express.txt | grep -v '^#'

With extended regular expressions, one command is enough:

# egrep -v '^$|^#' regular_express.txt
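A small sketch can confirm that the two-step and one-step filters give the same result. The configuration-style lines below are invented, and grep -E is used as the equivalent of egrep.

```shell
# Hypothetical config-like file with a comment, a blank line, and an indented "#".
conf=$(mktemp)
printf '%s\n' '# comment' '' 'keep me' '  # indented' 'also keep' > "$conf"

# Basic syntax: two passes, one per pattern.
two_step=$(grep -v '^$' "$conf" | grep -v '^#')
# Extended syntax: one pass, with | joining the two patterns.
one_step=$(grep -E -v '^$|^#' "$conf")

printf '%s\n' "$one_step"
rm -f "$conf"
```

The indented line survives both filters because ^# only matches a # in the first column.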

1. + repeats the preceding RE character one or more times

For example: egrep -n 'go+d' regular_express.txt
Basic equivalent: grep -n 'goo*d' regular_express.txt

2. ? repeats the preceding RE character zero or one time

For example: egrep -n 'go?d' regular_express.txt

3. | matches any one of several alternative strings

For example: egrep -n 'gd|good' regular_express.txt

4. () groups part of a pattern

For example: egrep -n 'g(la|oo)d' regular_express.txt
That is, it searches for the two strings glad and good

5. ()+ repeats a group one or more times

For example: echo 'AxyzxyzxyzxyzC' | egrep 'A(xyz)+C'

That is, it finds a string that begins with A, ends with C, and has one or more 'xyz' groups in between

Awk:

1. Use last to show the five most recent login records

# last -n 5

Extract the account and the login IP, separated by a tab:

# last -n 5 | awk '{print $1 "\t" $3}'

Note: $1 is the first field, with fields separated by spaces or tabs; $2 is the second, and so on.
$0 represents the entire line

# last -n 5 | awk '{print $1 "\t lines: " NR "\t columns: " NF}'

Note: NF is the number of fields in the current line ($0)
NR is the number of the line that awk is currently processing
FS is the field separator; the default is whitespace
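The built-in variables can be observed on two canned lines instead of real last output. The account names and addresses below are made up for the sketch.

```shell
# Two stand-in records with three and two whitespace-separated fields.
report=$(printf 'alice pts/0 10.0.0.5\nbob tty1\n' |
  awk '{print $1 "\t" NR "\t" NF}')

# Each output line holds: first field, record number, field count.
printf '%s\n' "$report"
```

NR keeps counting across the whole input, while NF is recomputed for every line.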

2. In /etc/passwd the fields are separated by :. List the entries whose third field is less than 10, showing only the account name and the third field

# cat /etc/passwd | awk '{FS=":"} $3<10 {print $1 "\t" $3}'

Note: the first line is not processed correctly, because FS=":" is assigned only after the first line has already been read and split, so it takes effect from the second line onward.
To have it apply to the first line as well, use the BEGIN keyword:

# cat /etc/passwd | awk 'BEGIN {FS=":"} $3<10 {print $1 "\t" $3}'
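The effect of setting FS too late can be reproduced on three passwd-style lines. The entries below are illustrative stand-ins, not read from a real /etc/passwd; the -F: option is shown as a common alternative to the BEGIN block.

```shell
# Three stand-in entries in /etc/passwd format.
passwd_sample='root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/bash'

# FS set in a main-block action: the first line has already been split on whitespace.
late=$(printf '%s\n' "$passwd_sample" | awk '{FS=":"} $3<10 {print $1 "\t" $3}')
# FS set in BEGIN: in effect before the first line is read.
early=$(printf '%s\n' "$passwd_sample" | awk 'BEGIN {FS=":"} $3<10 {print $1 "\t" $3}')
# -F: sets the separator from the command line, with the same effect as BEGIN.
with_F=$(printf '%s\n' "$passwd_sample" | awk -F: '$3<10 {print $1 "\t" $3}')

printf '%s\n' 'late:' "$late" 'early:' "$early"
```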

diff:
Compare the differences between two directories:

# diff /etc/rc3.d/ /etc/rc5.d/
-------------------
Only in /etc/rc3.d/: K30spice-vdagentd
Only in /etc/rc5.d/: S70spice-vdagentd
-------------------
-------------------

Example:
1. Count TCP connections by state

# netstat -na | awk '/^tcp/ {++S[$NF]} END {for (a in S) print a, S[a]}'
/^tcp/

Selects the lines that begin with tcp; "^" is the regular-expression anchor for the start of a line.
S[]
Defines an array named S. awk arrays are associative, so the subscript can be any string, such as a connection state; arrays filled by functions such as split() start at subscript 1 rather than 0.
NF
The number of fields in the current record, separated by whitespace by default; for a typical tcp line of netstat output, NF equals 6
$NF
Represents the value of the last field in a line; for such a record, $NF is $6, the value of the 6th field, i.e. SYN_RECV or TIME_WAIT.
S[$NF]
Represents an array element; for a record in the TIME_WAIT state, it is S[TIME_WAIT], the number of connections in that state
++S[$NF]
Increments the element by one; for a record in the TIME_WAIT state, it adds one to the connection count in S[TIME_WAIT]
The end result is that the S array holds the final count for each state
Example: S[TIME_WAIT]=final count S[ESTABLISHED]=final count
END
for (key in S)
Iterates over the S[] array
print key, "\t", S[key]
Prints each key and value of the array, separated by a \t tab for readability.
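The counting idiom can be tested on canned input, so it does not depend on netstat being installed. The sample lines below are invented, and the output is piped through sort because awk does not guarantee the iteration order of for (a in S).

```shell
# Stand-in for `netstat -na` output: a header, three tcp lines, one udp line.
netstat_sample='Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 10.0.0.1:22 10.0.0.9:5000 ESTABLISHED
tcp 0 0 10.0.0.1:80 10.0.0.7:5001 TIME_WAIT
tcp 0 0 10.0.0.1:80 10.0.0.8:5002 TIME_WAIT
udp 0 0 0.0.0.0:68 0.0.0.0:*'

# For every line starting with tcp, count by the last field, then print the totals.
counts=$(printf '%s\n' "$netstat_sample" |
  awk '/^tcp/ {++S[$NF]} END {for (a in S) print a, S[a]}' | sort)

printf '%s\n' "$counts"
```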


This article is an English version of an article originally written in Chinese and published on aliyun.com.
