Regular expressions are patterns built from special characters that are used to search for, replace, or delete text in one or more lines. Simply put, a regular expression is an "expression" used for processing strings. Regular expressions are not a tool in themselves but a standard notation for string processing; to process strings with them, you need a program that supports regular expressions, such as vi, sed, or awk.
I. What is a regular expression?
A regular expression is a syntax rule that describes how characters are arranged and which patterns should match. It is mainly used to split, match, search, and replace strings.
II. Regular expressions and wildcards
1. Regular expressions
Regular expressions match strings inside a file, and the match is a "contains" match: a line matches if any part of it fits the pattern. Commands such as grep, awk, and sed support regular expressions.
2. Regular expression meta-characters
Regular expressions are built from metacharacters. For a reference, see: http://www.cnblogs.com/refine1017/p/5011522.html
3. Wildcard characters
Wildcards match file names, and the match is an "exact match": the pattern must match the whole name. Commands such as ls, cp, and find (with its -name test) do not use regular expressions here, so file names are matched with shell-style wildcards.
4. Wildcard characters include
* matches any string of zero or more characters
? matches any single character
[] matches any one of the characters inside the brackets
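The three wildcards can be tried out in a scratch directory. This is only a minimal sketch: the file names (abc, abcd, 012, 0abc) and the variable names are made up for illustration.

```shell
# Create a scratch directory with a few hypothetical file names.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch abc abcd 012 0abc

# "?" matches exactly one character, so ??? matches three-character names only.
three_chars=$(echo ???)

# "*" matches any string, including the empty one.
starts_with_abc=$(echo abc*)

# "[]" matches one character from the set; here, a leading digit.
starts_with_digit=$(echo [0-9]*)

echo "$three_chars"
echo "$starts_with_abc"
echo "$starts_with_digit"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

Because the match is exact, ??? does not match abcd, whereas a regular expression such as ... would match any line containing at least three characters.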
III. The cut command
The cut command extracts bytes, characters, or fields from each line of a file and writes them to standard output.
1. Common parameters
-b: Select by byte. Byte positions ignore multibyte character boundaries unless the -n flag is also specified.
-c: Select by character.
-d: Custom field delimiter; the default is tab.
-f: Select fields; used with -d to specify which fields to display.
-n: Do not split multibyte characters. Used only with the -b flag.
2. Example 1: Print a column of a tab-separated file
[root@localhost shell]# cat student.txt
ID      Name    Gender  Mark
1       Ming    F       85
2       Zhang   F       70
3       Wang    M       75
4       Li      M       90
[root@localhost shell]# cut -f 4 student.txt
Mark
85
70
75
90
3. Example 2: Print a column of a CSV file
[root@localhost shell]# cat student.csv
id,name,gender,mark
1,ming,f,85
2,zhang,f,70
3,wang,m,75
4,li,m,90
[root@localhost shell]# cut -d "," -f 4 student.csv
mark
85
70
75
90
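The -f option also accepts a list or a range of fields. The following is a small sketch under that assumption; the inline sample rows only imitate student.csv and the variable names are hypothetical.

```shell
# Hypothetical sample rows in the style of student.csv.
csv='id,name,gender,mark
1,ming,f,85
2,zhang,f,70
4,li,m,90'

# A comma-separated list selects several fields: here fields 2 and 4.
name_and_mark=$(printf '%s\n' "$csv" | cut -d ',' -f 2,4)

# A range such as 2-4 selects fields 2 through 4.
all_but_id=$(printf '%s\n' "$csv" | cut -d ',' -f 2-4)

printf '%s\n' "$name_and_mark"
printf '%s\n' "$all_but_id"
```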
4. Example 3: Print a single character of a string
[root@localhost shell]# echo "ABCdef" | cut -c 3
C
5. Example 4: Intercepting a text in a Chinese character
[Root@localhost shell]# echo "Shell Programming" | CUT-NB 1
S
[root@localhost shell]# echo "Shell Programming" | CUT-NB 2
h
[root@localhost shell]# echo "Shell Programming" | CUT-NB 3
e
[root@localhost shell]# echo "Shell Programming" | CUT-NB 4
L
[root@localhost shell]# echo "Shell programming" | CUT-NB 5
l
[root@localhost shell]# echo "Shell Programming" | CUT-NB 8
Series
[root@localhost shell]# echo "Shell programming " | CUT-NB 11
IV. The printf command
1. Command format
printf 'output-type output-format' output-content
2. Output type
%ns: Output a string. n is a number giving the minimum field width in characters; if n is omitted, the string is printed as-is.
%ni: Output an integer. n is the minimum field width in digits; if n is omitted, the number is printed as-is.
%m.nf: Output a floating-point number. m is the total field width and n is the number of decimal places. For example, %8.2f prints a field 8 characters wide: 2 decimal places, 1 character for the decimal point, and 5 for the integer part.
3. Output format
\a: Alert (bell)
\b: Backspace
\f: Form feed
\n: Newline
\r: Carriage return
\t: Horizontal tab
\v: Vertical tab
4. Example
[root@localhost ~]# printf '%i %s %i %s %i\n' 1 '+' 2 '=' 3
1 + 2 = 3
[root@localhost ~]# printf '%i-%i-%i %i:%i:%i\n' 2015 12 3 21 56 30
2015-12-3 21:56:30
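The effect of the width numbers is easier to see when the field is wrapped in brackets. This is a minimal sketch; the sample values and variable names are made up, and the C locale is forced so that "." is the decimal point.

```shell
# Width 8 with 2 decimals: the value is right-aligned in an 8-character field.
padded_float=$(LC_ALL=C printf '[%8.2f]' 3.14159)

# Width 6 for a string: shorter strings are padded on the left with spaces.
padded_string=$(printf '[%6s]' abc)

# Width 4 for an integer.
padded_int=$(printf '[%4i]' 42)

printf '%s\n%s\n%s\n' "$padded_float" "$padded_string" "$padded_int"
```

The width is a minimum: a value longer than the field is printed in full rather than truncated.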
V. awk command
1. Command format
awk 'condition1 {action1} condition2 {action2} ...' filename
Conditions: usually relational expressions, such as x > 10.
Actions: formatted output and flow-control statements.
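A condition and an action can be combined as in the following sketch. The inline sample data only imitates student.txt, and the threshold of 80 and the variable names are assumptions chosen for illustration.

```shell
# Hypothetical tab-separated sample in the style of student.txt.
data=$(printf 'ID\tName\tMark\n1\tMing\t85\n2\tZhang\t70\n4\tLi\t90\n')

# Condition: skip the header (NR > 1) and keep rows whose 3rd field is >= 80.
# Action: print the name and the mark, separated by a tab.
high_marks=$(printf '%s\n' "$data" | awk 'NR > 1 && $3 >= 80 {print $2 "\t" $3}')

printf '%s\n' "$high_marks"
```

When the condition is omitted, the action runs for every line.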
2. Example 1: Extract columns from a tab-separated file
[root@localhost shell]# cat student.txt
ID      Name    Gender  Mark
1       Ming    F       85
2       Zhang   F       70
3       Wang    M       75
4       Li      M       90
[root@localhost shell]# awk '{print $1 "\t" $4}' student.txt
ID      Mark
1       85
2       70
3       75
4       90
3. Example 2: Getting disk utilization
[root@localhost shell]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2        18G  2.4G   14G  15% /
/dev/sda1       289M   16M  258M   6% /boot
tmpfs           411M     0  411M   0% /dev/shm
[root@localhost shell]# df -h | grep "sda1" | awk '{print $5}'
6%
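The grep step can also be folded into awk as a pattern condition. The sketch below assumes that utilization is the fifth column (Use%) and runs on a canned copy of the df output so that it works on any machine; the variable names are hypothetical.

```shell
# Canned df -h output, so the sketch does not depend on the local disks.
df_output='Filesystem Size Used Avail Use% Mounted on
/dev/sda2 18G 2.4G 14G 15% /
/dev/sda1 289M 16M 258M 6% /boot
tmpfs 411M 0 411M 0% /dev/shm'

# The /sda1/ pattern plays the role of grep; $5 is the Use% column.
usage=$(printf '%s\n' "$df_output" | awk '/sda1/ {print $5}')

# Strip the trailing "%" to get a bare number for numeric comparisons.
usage_number=${usage%"%"}

echo "$usage_number"
```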
VI. The sed command
sed is a lightweight stream editor included on almost all UNIX platforms, including Linux. It is mainly used to select, replace, delete, and add data.
1. Command format
sed [options] '[action]' filename
2. Options
-n: By default sed prints all of the data to the screen; with this option, only the lines processed by the sed command are printed.
-e: Allows multiple sed commands to be applied to the input data.
-i: Writes the result directly back to the file being read instead of printing it to the screen.
3. Action
a: Append; add one or more lines after the current line.
c: Line replacement; replace the original line with the string that follows c.
i: Insert; add one or more lines before the current line.
d: Delete; delete the specified lines.
p: Print; output the specified lines.
s: String substitution; replace one string with another. The format is "[line range]s/old string/new string/g" (similar to the substitution format in vim).
4. Example
[root@localhost shell]# cat student.txt
ID      Name    Gender  Mark
1       Ming    F       85
2       Zhang   F       70
3       Wang    M       75
4       Li      M       90
# Test the -n option
[root@localhost shell]# sed -n '2p' student.txt
1       Ming    F       85
# Test single-line deletion
[root@localhost shell]# sed '2d' student.txt
ID      Name    Gender  Mark
2       Zhang   F       70
3       Wang    M       75
4       Li      M       90
# Test multi-line deletion
[root@localhost shell]# sed '2,4d' student.txt
ID      Name    Gender  Mark
4       Li      M       90
# Test append
[root@localhost shell]# sed '2a test append' student.txt
ID      Name    Gender  Mark
1       Ming    F       85
test append
2       Zhang   F       70
3       Wang    M       75
4       Li      M       90
# Test insert
[root@localhost shell]# sed '2i test insert' student.txt
ID      Name    Gender  Mark
test insert
1       Ming    F       85
2       Zhang   F       70
3       Wang    M       75
4       Li      M       90
# Test line replacement
[root@localhost shell]# sed '2c test replace' student.txt
ID      Name    Gender  Mark
test replace
2       Zhang   F       70
3       Wang    M       75
4       Li      M       90
# Test content replacement
[root@localhost shell]# sed '2s/Ming/replace/g' student.txt
ID      Name    Gender  Mark
1       replace F       85
2       Zhang   F       70
3       Wang    M       75
4       Li      M       90
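The -e option and a line-range substitution can be sketched as follows. The inline sample only imitates student.txt (space-separated here for brevity), and the variable names are hypothetical.

```shell
# Hypothetical sample in the style of student.txt.
data=$(printf 'ID Name Mark\n1 Ming 85\n2 Zhang 70\n3 Wang 75\n')

# -n suppresses the default output; each -e adds one command.
# Together they print only lines 2 and 4.
picked=$(printf '%s\n' "$data" | sed -n -e '2p' -e '4p')

# Restrict a substitution to a line range: only lines 2 to 3 are changed.
replaced=$(printf '%s\n' "$data" | sed '2,3s/ /:/g')

printf '%s\n' "$picked"
printf '%s\n' "$replaced"
```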
Here are some simple regular expressions; working through them is a good way to master the basics:
helloworld matches the 10 letters helloworld anywhere in a line
^helloworld matches the 10 letters helloworld at the beginning of a line
helloworld$ matches the 10 letters helloworld at the end of a line
^helloworld$ matches a line that contains only the 10 letters helloworld
[Hh]elloworld matches Helloworld or helloworld
hello.world matches the 5 letters hello, plus any single character, plus world
hello*world matches hell, followed by zero or more "o" characters, followed by world
In the examples above, "." matches any single character and "*" matches zero or more repetitions of the preceding character. To match a specific number of repetitions, use "{}". In the basic regular expressions that grep uses by default, the braces must be escaped with "\", for example:
[kouyang@kouyang kouyang]# grep -n 'o\{2\}' hello.txt
Finds the lines in hello.txt that contain two consecutive "o" characters.
[kouyang@kouyang kouyang]# grep -n 'go\{2,5\}g' hello.txt
Finds the lines in hello.txt that contain a "g", followed by 2 to 5 "o" characters, followed by another "g".
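The anchors, "*", and the interval syntax can be checked against a few sample lines. This is only a sketch: the contents of hello.txt are not shown in the text, so the lines and variable names below are made up.

```shell
# Hypothetical sample lines standing in for hello.txt.
sample='helloworld
say helloworld now
hellworld
google
gog
goooooog'

# ^ and $ together: only the line that is exactly "helloworld".
exact=$(printf '%s\n' "$sample" | grep '^helloworld$')

# "*" repeats the preceding "o" zero or more times, so "hellworld" matches too.
# -c counts the matching lines.
star_count=$(printf '%s\n' "$sample" | grep -c 'hello*world')

# \{2,5\}: a "g", then 2 to 5 "o" characters, then another "g".
interval=$(printf '%s\n' "$sample" | grep 'go\{2,5\}g')

printf '%s\n' "$exact" "$star_count" "$interval"
```

The line gog has too few "o" characters and goooooog has too many, so only google satisfies the interval.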