Shell Regular Expression Learning notes

Last Update:2017-01-18 Source: Internet

Author: User

A regular expression arranges special characters into a pattern that is used to search for, replace, or delete one or more strings of text. Put simply, a regular expression is an "expression" for processing strings. A regular expression is not a tool in itself but a standard basis for string processing: to handle strings with one, you need a program that supports regular expressions, such as vi, sed, or awk.

I. What is a regular expression?

A regular expression is a grammar rule that describes how characters are arranged and matched. It is mainly used for splitting, matching, searching, and replacing strings.

II. Regular expressions and wildcard characters

1. Regular expressions

Regular expressions match strings inside a file's contents, and the match is a "contains" match: a line matches if any part of it fits the pattern. Commands such as grep, awk, and sed support regular expressions.

2. Regular expression meta-characters

Regular expressions are built from metacharacters. For a reference, see: http://www.cnblogs.com/refine1017/p/5011522.html

3. Wildcard characters

Wildcards match file names, and the match is an "exact" match: the whole name must fit the pattern. Commands such as ls, find, and cp do not take regular expressions for file names by default, so file names are matched with the shell's own wildcard characters.

4. Wildcard characters include

* matches any sequence of characters, including none

? matches exactly one character

[] matches any one of the characters in the brackets
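
The difference between the two kinds of matching can be tried out with a short sketch. The file names and the scratch directory below are made up for illustration, and the locale is pinned to C only so that the wildcard results come back in a predictable order.

```shell
# Assumption: a fixed locale so that glob results sort predictably.
LC_ALL=C; export LC_ALL

# Work in a scratch directory so the demo does not touch real files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a.txt ab.txt abc.txt b.log

# Wildcards: the whole file name must fit the pattern.
star_matches=$(echo *.txt)        # * : any run of characters
question_matches=$(echo a?.txt)   # ? : exactly one character
bracket_matches=$(echo [ab].*)    # []: one character from the set

# Regular expression: a line matches if the pattern occurs anywhere in it.
regex_matches=$(printf '%s\n' a.txt ab.txt abc.txt b.log | grep 'b')

echo "$star_matches"
echo "$question_matches"
echo "$bracket_matches"
echo "$regex_matches"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

Here the wildcard a?.txt selects only ab.txt, while the regular expression b selects every name that contains a "b" somewhere.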

III. cut command

The cut command cuts bytes, characters, and fields from each line of the file and writes the bytes, characters, and fields to the standard output.

1. Common parameters

-b: Select by byte position. Byte positions ignore multibyte character boundaries unless the -n flag is also specified.
-c: Select by character position.
-d: Set a custom field delimiter; the default is tab.
-f: Select the fields (columns) to display; typically used together with -d.
-n: Do not split multibyte characters. Used only with the -b flag.

2. Example 1: Print one column of a tab-separated file

[root@localhost shell]# cat student.txt
ID      Name    Gender  Mark
1       ming    F       85
2       zhang   F       70
3       wang    M       75
4       li      M       90
[root@localhost shell]# cut -f 4 student.txt
Mark
85
70
75
90

3. Example 2: Print one column of a CSV file

[root@localhost shell]# cat student.csv
id,name,gender,mark
1,ming,f,85
2,zhang,f,70
3,wang,m,75
4,li,m,90
[root@localhost shell]# cut -d "," -f 4 student.csv
mark
85
70
75
90

4. Example 3: Print the Nth character of a string

[root@localhost shell]# echo "ABCdef" | cut -c 3
C

5. Example 4: Intercepting a text in a Chinese character

[Root@localhost shell]# echo "Shell Programming" | CUT-NB 1
S
[root@localhost shell]# echo "Shell Programming" | CUT-NB 2
h
[root@localhost shell]# echo "Shell Programming" |  CUT-NB 3
e
[root@localhost shell]# echo "Shell Programming" | CUT-NB 4
L
[root@localhost shell]# echo "Shell programming" | CUT-NB 5
l
[root@localhost shell]# echo "Shell Programming" | CUT-NB 8
Series
[root@localhost shell]# echo "Shell programming " | CUT-NB 11

IV. printf command

1. Command format

printf 'output type and output format' content to output

2. Output type

%ns: Output a string. n is a number giving the field width in characters; if n is omitted, the whole string is output as is.

%ni: Output an integer. n is a number giving the field width in digits; if n is omitted, the whole number is output as is.

%m.nf: Output a floating-point number. m is the total field width and n is the number of decimal places. For example, %8.2f outputs a field 8 characters wide with 2 decimal places; the decimal point takes one position, leaving 5 for the integer part.
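
The effect of the width numbers can be seen in a minimal sketch. The values and the square brackets are chosen only to make the padding visible, and the C locale is an assumption made so that the decimal separator is a period.

```shell
# Assumption: force "." as the decimal separator.
LC_ALL=C; export LC_ALL

padded_string=$(printf '[%8s]' shell)     # string right-aligned in 8 columns
padded_int=$(printf '[%5i]' 42)           # integer right-aligned in 5 columns
padded_float=$(printf '[%8.2f]' 3.14159)  # 8 columns in total, 2 decimals

echo "$padded_string"   # [   shell]
echo "$padded_int"      # [   42]
echo "$padded_float"    # [    3.14]
```

When the value is shorter than the width, it is padded with spaces on the left; a value longer than the width is not truncated.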

3. Output format

\a: Output an alert (bell) sound

\b: Output a backspace

\f: Form feed (clears the screen on some terminals)

\n: Newline

\r: Carriage return

\t: Horizontal tab

\v: Vertical tab

4. Example

[root@localhost ~]# printf '%i%s%i%s%i\n' 1 ' + ' 2 ' = ' 3
1 + 2 = 3
[root@localhost ~]# printf '%i-%i-%i %i:%i:%i\n' 2015 12 3 21 56 30
2015-12-3 21:56:30

V. awk command

1. Command format

awk 'condition1 {action1} condition2 {action2} ...' filename

Conditions: usually relational expressions, such as x > 10. An action runs only for the lines whose condition is true.

Actions: formatted output and flow-control statements.
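
A short sketch of condition/action pairs follows. The sample data and the pass/retry labels are hypothetical and exist only for illustration; NR is awk's built-in record (line) number.

```shell
# Build a small tab-separated sample (hypothetical data for illustration).
sample=$(printf 'ID\tName\tMark\n1\tann\t85\n2\tbob\t70\n3\tcat\t90\n')

# One condition: NR > 1 skips the header, $3 >= 80 keeps only high marks.
high_marks=$(printf '%s\n' "$sample" | awk 'NR > 1 && $3 >= 80 {print $2}')

# Two condition/action pairs in one program; each line is tested against both.
labels=$(printf '%s\n' "$sample" |
  awk 'NR > 1 && $3 >= 80 {print $2 ": pass"} NR > 1 && $3 < 80 {print $2 ": retry"}')

echo "$high_marks"
echo "$labels"
```

The first command prints ann and cat; the second prints one labelled line per student.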

2. Example 1: Extract columns from a tab-separated file

[root@localhost shell]# cat student.txt
ID      Name    Gender  Mark
1       ming    F       85
2       zhang   F       70
3       wang    M       75
4       li      M       90
[root@localhost shell]# awk '{print $1 "\t" $4}' student.txt
ID      Mark
1       85
2       70
3       75
4       90

3. Example 2: Get disk utilization

[root@localhost shell]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2        18G  2.4G   14G  15% /
/dev/sda1       289M   16M  258M   6% /boot
tmpfs           411M     0  411M   0% /dev/shm
[root@localhost shell]# df -h | grep "sda1" | awk '{print $5}'
6%

VI. sed command

sed is a lightweight stream editor that is included on almost all UNIX platforms, including Linux. It is mainly used to select, replace, delete, and add data.

1. Command format

sed [options] '[action]' filename

2. Options

-n: By default sed prints every line to the screen. With this option, only the lines explicitly processed by a sed command (for example with p) are printed.

-e: Allows several sed editing commands to be applied to the input data.

-i: Writes the changes directly to the file that was read instead of printing the result to the screen.

3. Action

a: Append; add one or more lines after the current line.

c: Line replacement; replace the original line with the string that follows c.

i: Insert; add one or more lines before the current line.

d: Delete; delete the specified lines.

p: Print; output the specified lines.

s: String substitution; replace one string with another. The format is "line-range s/old string/new string/g" (similar to the replacement format in vim).

4. Example

[root@localhost shell]# cat student.txt
ID      Name    Gender  Mark
1       ming    F       85
2       zhang   F       70
3       wang    M       75
4       li      M       90

# test the -n option
[root@localhost shell]# sed -n '2p' student.txt
1       ming    F       85

# test single-line deletion
[root@localhost shell]# sed '2d' student.txt
ID      Name    Gender  Mark
2       zhang   F       70
3       wang    M       75
4       li      M       90

# test multiple-line deletion
[root@localhost shell]# sed '2,4d' student.txt
ID      Name    Gender  Mark
4       li      M       90

# test append
[root@localhost shell]# sed '2a test append' student.txt
ID      Name    Gender  Mark
1       ming    F       85
test append
2       zhang   F       70
3       wang    M       75
4       li      M       90

# test insert
[root@localhost shell]# sed '2i test insert' student.txt
ID      Name    Gender  Mark
test insert
1       ming    F       85
2       zhang   F       70
3       wang    M       75
4       li      M       90

# test line replacement
[root@localhost shell]# sed '2c test replace' student.txt
ID      Name    Gender  Mark
test replace
2       zhang   F       70
3       wang    M       75
4       li      M       90

# test content replacement
[root@localhost shell]# sed '2s/ming/replace/g' student.txt
ID      Name    Gender  Mark
1       replace F       85
2       zhang   F       70
3       wang    M       75
4       li      M       90
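
The -n and -e options can also be tried on piped input, without creating a file. The following is only a sketch: the three sample lines and the replacement name "wu" are made up for illustration.

```shell
# Hypothetical sample lines, fed to sed through a pipe instead of a file.
data=$(printf '1 ming F 85\n2 zhang F 70\n3 wang M 75\n')

# -n with p: print only the line that the command selects.
second_line=$(printf '%s\n' "$data" | sed -n '2p')

# -e twice: delete line 1, then replace "wang" with "wu" on what remains.
edited=$(printf '%s\n' "$data" | sed -e '1d' -e 's/wang/wu/g')

echo "$second_line"
echo "$edited"
```

Because nothing here uses -i, the input is never modified; sed only writes the edited copy to standard output.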

Here are some examples of simple regular expressions; working through them helps in mastering basic regular expressions:

helloworld matches the 10 letters helloworld anywhere in a line
^helloworld matches the 10 letters helloworld at the beginning of a line
helloworld$ matches the 10 letters helloworld at the end of a line
^helloworld$ matches a line that consists only of the 10 letters helloworld
[Hh]elloworld matches Helloworld or helloworld
hello.world matches the 5 letters hello, followed by any single character, followed by world
hello*world matches hell, followed by zero or more "o" characters, followed by world
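
These patterns can be checked with grep. The sketch below counts how many of six hypothetical sample lines each pattern matches; count_matches is a helper defined here only for the demonstration.

```shell
# Six hypothetical sample lines, one per line of input.
lines=$(printf '%s\n' 'helloworld' 'say helloworld' 'Helloworld' \
  'hello-world' 'hellworld' 'helloooworld')

# Helper (defined for this sketch): count the lines matching a pattern.
count_matches() { printf '%s\n' "$lines" | grep -c "$1"; }

anywhere=$(count_matches 'helloworld')        # 2: lines 1 and 2
at_start=$(count_matches '^helloworld')       # 1: line 1
at_end=$(count_matches 'helloworld$')         # 2: lines 1 and 2
whole_line=$(count_matches '^helloworld$')    # 1: line 1
either_case=$(count_matches '[Hh]elloworld')  # 3: lines 1, 2 and 3
any_char=$(count_matches 'hello.world')       # 1: hello-world
repeat_o=$(count_matches 'hello*world')       # 4: zero or more "o" after hell

echo "$anywhere $at_start $at_end $whole_line $either_case $any_char $repeat_o"
```

Note that hello*world matches hellworld and helloooworld but not hello-world, because "*" repeats the preceding "o" rather than standing for an arbitrary character.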

In the patterns so far, "." matches any single character and "*" matches zero or more repetitions of the preceding character. To match a character repeated a specific number of times, use "{}". In the basic regular expressions that grep uses by default, "{" and "}" must be escaped with the escape character "\", for example:
[kouyang@kouyang kouyang]# grep -n 'o\{2\}' hello.txt
In the hello.txt file, find the lines where two consecutive "o" characters appear.

[kouyang@kouyang kouyang]# grep -n 'go\{2,5\}g' hello.txt
In the hello.txt file, find the lines that contain a "g", followed by 2 to 5 "o" characters, followed by another "g".
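
The repetition counts can be verified on a few made-up words instead of hello.txt. This is a sketch under that assumption; the last command uses grep -E (extended regular expressions), where the braces are written without backslashes.

```shell
# Hypothetical words with 1, 2, 3 and 6 "o" characters between the two "g"s.
words=$(printf '%s\n' gog goog gooog goooooog)

# Lines containing at least two consecutive "o" characters.
two_o=$(printf '%s\n' "$words" | grep -c 'o\{2\}')

# Lines containing "g", then 2 to 5 "o" characters, then "g".
bounded=$(printf '%s\n' "$words" | grep -c 'go\{2,5\}g')

# The same interval in extended syntax: no backslashes before the braces.
extended=$(printf '%s\n' "$words" | grep -E -c 'go{2,5}g')

echo "$two_o $bounded $extended"
```

The word with six "o" characters still matches o\{2\}, since two consecutive "o" characters occur inside it, but it does not match the bounded pattern because the run between the two "g"s is too long.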

This article is an English version of an article originally written in Chinese on aliyun.com.