Note: The code and images in this document are from sed & awk (second edition).
Text operations
sed is a "non-interactive", character-stream-oriented editor, and awk is a programming language built around pattern matching.
A typical example of this is converting data to a formatted report.
Understanding the basic operations of sed and awk
Example one: file1.txt
John Daggett, 341 King Road, Plymouth MA
Alice Ford, East Broadway, Richmond VA
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
Terry Kalkas, 402 Lans Road, Beaver Falls PA
Eric Adams, Post Road, Sudbury MA
Hubert Sims, 328A Brook Road, Roanoke VA
Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
Sal Carpenter, 6th Street, Boston MA
Replacing MA with Massachusetts
$ sed 's/MA/Massachusetts/' file1.txt
Using multiple commands
$ sed 's/ MA/, Massachusetts/; s/ PA/, Pennsylvania/' file1.txt
Or:
$ sed -e 's/ MA/, Massachusetts/' -e 's/ PA/, Pennsylvania/' file1.txt
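A minimal sketch to check that the semicolon form and the -e form behave the same. The temporary file and the variable names are made up for illustration, and the patterns assume the state codes are uppercase and preceded by a space, as in file1.txt.

```shell
# Two sample lines taken from file1.txt, stored in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'John Daggett, 341 King Road, Plymouth MA' \
              'Terry Kalkas, 402 Lans Road, Beaver Falls PA' > "$tmp"

# Form 1: both commands in one script, separated by a semicolon.
out_semi=$(sed 's/ MA/, Massachusetts/; s/ PA/, Pennsylvania/' "$tmp")
# Form 2: one -e option per command.
out_e=$(sed -e 's/ MA/, Massachusetts/' -e 's/ PA/, Pennsylvania/' "$tmp")

printf '%s\n' "$out_semi"
rm -f "$tmp"
```

Both variables should hold the same two lines, ending in ", Massachusetts" and ", Pennsylvania".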
Script: sedsrc
s/ MA/, Massachusetts/
s/ PA/, Pennsylvania/
s/ CA/, California/
s/ VA/, Virginia/
s/ OK/, Oklahoma/
Using script files
$ sed -f sedsrc file1.txt
Save output
$ sed -f sedsrc file1.txt > Newfile.txt
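The script-file workflow can be tried end to end with the sketch below. It works in a throwaway directory (an assumption made so that nothing in the current directory is overwritten) and uses only two of the script lines and two of the data lines.

```shell
# Work in a temporary directory so existing files are left alone.
workdir=$(mktemp -d)

# A shortened version of the sedsrc script.
cat > "$workdir/sedsrc" <<'EOF'
s/ MA/, Massachusetts/
s/ CA/, California/
EOF

# Two lines from file1.txt.
printf '%s\n' 'Amy Wilde, 334 Bayshore Pkwy, Mountain View CA' \
              'Sal Carpenter, 6th Street, Boston MA' > "$workdir/file1.txt"

# Apply the script and save the output to a new file.
sed -f "$workdir/sedsrc" "$workdir/file1.txt" > "$workdir/Newfile.txt"
result=$(cat "$workdir/Newfile.txt")
printf '%s\n' "$result"
rm -rf "$workdir"
```

The input file itself is not modified; only the redirected copy contains the expanded state names.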
Suppress automatic printing of input lines
$ sed -n 's/MA/Massachusetts/p' file1.txt
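As a small illustration of how -n and the p flag work together, the sketch below feeds two lines from file1.txt through sed. The variable names are arbitrary.

```shell
input='Eric Adams, Post Road, Sudbury MA
Hubert Sims, 328A Brook Road, Roanoke VA'

# -n turns off the default output; the p flag prints a line only
# when the substitution on that line succeeded.
out=$(printf '%s\n' "$input" | sed -n 's/MA/Massachusetts/p')
printf '%s\n' "$out"
```

Only the Sudbury line is printed, because it is the only one on which the substitution was made.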
Common error messages
Quotation marks do not match
s/src/dst is missing the final "/"
Using awk
Using script files
$ awk -f script files
Print the first field of each line of the input file
$ awk '{ print $1 }' file1.txt
Print each line that matches the pattern
$ awk '/MA/' file1.txt
Limit output to only the first field of each matching record
$ awk '/MA/ { print $1 }' file1.txt
Changing the field separator
$ awk -F, '/MA/ { print $1 }' file1.txt
Use multiple commands, separated by semicolons
$ awk -F, '{ print $1; print $2; print $3 }' file1.txt
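The effect of the comma separator can be seen on a single record. This is only a sketch; the variable names are made up, and the input is the first line of file1.txt.

```shell
line='John Daggett, 341 King Road, Plymouth MA'

# With -F, the record is split on commas, so $1 is the name.
name=$(printf '%s\n' "$line" | awk -F, '/MA/ { print $1 }')

# Three print statements produce three output lines, one per field.
fields=$(printf '%s\n' "$line" | awk -F, '{ print $1; print $2; print $3 }')

printf '%s\n' "$name" "$fields"
```

Note that the second and third fields keep the space that followed each comma, since only the comma itself is treated as the separator.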
Common error messages
The procedure is not enclosed in curly braces {}.
The instructions are not surrounded by single quotes.
The regular expression is not enclosed in slashes (//).
Option  Description
-f      Filename of script follows.
-F      Change field separator.
-v      var=value follows.
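A short sketch of the -v option combined with -F. The awk variable name state is invented for this example; the two input lines come from file1.txt.

```shell
# -v assigns the awk variable "state" before the program starts.
# The pattern then tests whether the third comma-separated field
# contains that value.
out=$(printf '%s\n' 'Alice Ford, East Broadway, Richmond VA' \
                    'Eric Adams, Post Road, Sudbury MA' |
      awk -F, -v state='VA' '$3 ~ state { print $1 }')
printf '%s\n' "$out"
```

Changing the value passed with -v selects a different state without editing the awk program.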
Three: Understanding regular expression syntax
Expressions (pattern matching)
An arithmetic expression:
1+2  3*5  1+2*3  (1+2)
A specific pattern:
ABC ADC AEC ...
ab abb abbb abbbb ...
A regular expression describes a pattern or sequence of characters
The matching process of regular expressions
Metacharacters
. Matches any single character except newline; in awk, it can also match newline.
* Matches any number (including zero) of occurrences of the single character that precedes it.
[...] Matches any one of the characters enclosed in the square brackets; a leading ^ negates the match, and - indicates a range of characters.
^ As the first character of a regular expression, matches the beginning of the line. In awk, it matches the beginning of a string, even if the string contains embedded newlines.
$ As the last character of a regular expression, matches the end of the line. In awk, it matches the end of a string, even if the string contains embedded newlines.
\{n,m\} Matches between n and m occurrences of the single character that precedes it; \{n\} matches exactly n occurrences, and \{n,\} matches at least n occurrences.
\ Escapes the special character that follows.
Extended metacharacters (egrep and awk)
+ Matches one or more occurrences of the preceding regular expression.
? Matches zero or one occurrence of the preceding regular expression.
| Matches either the preceding or the following regular expression (alternation).
() Groups regular expressions.
{n,m} Matches n to m occurrences of the preceding regular expression; {n} matches exactly n occurrences, and {n,} matches at least n occurrences. Most awk implementations do not support this; it is available in POSIX egrep and POSIX awk.
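The + and ? operators can be compared on a small word list. The sketch uses grep -E, the POSIX spelling of egrep; the word list and variable names are made up.

```shell
words='color
colour
colouur
banana'

# ? allows zero or one "u" before the final "r".
optional_u=$(printf '%s\n' "$words" | grep -E '^colou?r$')

# + requires one or more "u" before the final "r".
repeated_u=$(printf '%s\n' "$words" | grep -E '^colou+r$')

printf '%s\n' "$optional_u" '--' "$repeated_u"
```

"colour" satisfies both patterns, "color" only the first, and "colouur" only the second.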
Three steps to writing a regular expression:
1 Know what you want to match and how it appears in the text.
2 Write a pattern that describes what to match.
3 Test the pattern to see what it matches.
Results of pattern matching:
Hits
These are the lines I want to match.
Misses
These are the lines I don't want to match.
Omissions
These are the lines that I failed to match but wanted to match.
False alarms
These are the lines that I didn't want to match, but that matched anyway.
Character class
[Ww]hat
\.H[12345]
The range of characters
[a-z]
[0-9]
[Cc]hapter[1-9]
[-+*/]
[0-1][0-9][-/][0-3][0-9][-/][0-9][0-9]
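To see what the date pattern actually accepts, it can be tested against a few sample strings. The samples below are invented; the third one is included on purpose to show a false alarm.

```shell
# The date pattern, written without spaces between the bracket expressions.
pattern='[0-1][0-9][-/][0-3][0-9][-/][0-9][0-9]'

samples='11-12-98
03/27/95
19-39-00
1998-11-12'

matched=$(printf '%s\n' "$samples" | grep "$pattern")
printf '%s\n' "$matched"
```

The first two samples are hits. "19-39-00" also matches even though month 19 and day 39 do not exist, because each bracket expression only restricts one digit. "1998-11-12" does not match, since no part of it has the two-digit year at the end.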
Exclude character classes
[^0-9]
Repeated occurrences of the character
10
50
100
500
1000
5000
[15]0*
[15]00*
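The difference between the two patterns shows up when a bare 1 or 5 is added to the list. In this sketch, grep -x is used so that the whole line has to match; the variable names are arbitrary.

```shell
numbers='1
5
10
50
100
5000'

# [15]0* : a 1 or 5 followed by zero or more zeros.
loose=$(printf '%s\n' "$numbers" | grep -x '[15]0*')

# [15]00* : a 1 or 5 followed by at least one zero.
strict=$(printf '%s\n' "$numbers" | grep -x '[15]00*')

printf '%s\n' "$strict"
```

The first pattern accepts all six lines, while the second rejects the single digits 1 and 5.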
The span of a character
* and \{n,m\}
Matching of phone numbers
[0-9]\{3\}-[0-9]\{7,8\}
Grouping operations
compan(y|ies)
Note: Most versions of sed and grep cannot use parentheses () for grouping in this way, but egrep and all versions of awk can.
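A quick check of the grouped alternation in awk, using three invented input lines:

```shell
text='one company
two companies
good companion'

# The group (y|ies) must follow "compan" directly.
out=$(printf '%s\n' "$text" | awk '/compan(y|ies)/')
printf '%s\n' "$out"
```

"companion" is not printed, because "compan" is followed there by "ion" rather than by "y" or "ies".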
Four: Writing sed scripts
Pattern space
$ sed -e 's/pig/cow/' -e 's/cow/horse/'
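The sketch below shows why the order of commands matters: each command works on the pattern space as the previous command left it. The two input lines are simply the words pig and cow.

```shell
# Original order: "pig" first becomes "cow", and the second command
# then turns that "cow" into "horse".
both_horse=$(printf '%s\n' pig cow | sed -e 's/pig/cow/' -e 's/cow/horse/')

# Reversed order: "cow" is changed before any "pig" has become "cow".
swapped=$(printf '%s\n' pig cow | sed -e 's/cow/horse/' -e 's/pig/cow/')

printf '%s\n' "$both_horse" '--' "$swapped"
```

With the original order both lines end up as "horse"; with the commands reversed, the output is "cow" followed by "horse".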
A global perspective on addressing
By default, sed applies each command to every input line. A command can specify zero, one, or two addresses, and each address is a regular expression describing a pattern, a line number, or a line addressing symbol.
Example: file2.txt
.TS
beijing,cn
.TE
shanghai,cn
guangzhou,cn
shenyang,cn
$ sed '/beijing/s/cn/china/' file2.txt
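A pattern address limits a command to the lines that match it. The following sketch assumes lowercase city names in the data and runs the same command on two lines:

```shell
# Only the line that matches /beijing/ is handed to the s command.
out=$(printf '%s\n' 'beijing,cn' 'shanghai,cn' | sed '/beijing/s/cn/china/')
printf '%s\n' "$out"
```

The shanghai line also contains "cn", but it passes through unchanged because it does not match the address.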
Delete all lines
d
Delete only the first line
1d
Delete the last line by using the addressing symbol $
$d
Delete empty lines; regular expressions must be enclosed in slashes (//)
/^$/d
Delete tbl input between the .TS and .TE tags
/^\.TS/,/^\.TE/d
Delete all lines from the fifth line to the end
5,$d
Mixing a line address and a pattern address
$ sed '1,/^$/d' file2.txt
Delete all lines except those in the addressed range
1,5!d
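The address forms above can be compared side by side on one small input. The seven-line sample (with an empty third line) and the variable names are made up for this sketch.

```shell
# Seven sample lines; the third one is empty.
input=$(printf '%s\n' a b '' c d e f)

no_first=$(printf '%s\n' "$input" | sed '1d')          # delete line 1
no_last=$(printf '%s\n' "$input" | sed '$d')           # delete the last line
no_blank=$(printf '%s\n' "$input" | sed '/^$/d')       # delete empty lines
after_blank=$(printf '%s\n' "$input" | sed '1,/^$/d')  # delete line 1 through the first empty line
first_five=$(printf '%s\n' "$input" | sed '1,5!d')     # delete everything except lines 1 to 5

printf '%s\n' "$after_blank"
```

The ! after the address inverts it, so the last command keeps lines 1 to 5 and deletes the rest.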
Grouping commands
/^\.TS/,/^\.TE/{
s/cn/china/
s/beijing/bj/
}
$ sed '2,3{s/cn/china/;s/a/b/}' file.txt
Two substitutions over the same range can be enclosed in curly braces, separated by a semicolon.
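The grouped form can be tried on a shortened copy of the example data. This sketch assumes uppercase .TS and .TE markers and lowercase city names.

```shell
data='.TS
beijing,cn
.TE
shanghai,cn'

# Both substitutions apply only to lines between .TS and .TE.
out=$(printf '%s\n' "$data" | sed '/^\.TS/,/^\.TE/{
s/cn/china/
s/beijing/bj/
}')
printf '%s\n' "$out"
```

The beijing line becomes "bj,china", while the shanghai line lies outside the addressed range and is left untouched.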
Five: Basic sed commands
The syntax of sed commands
[address]command
The line address is optional for any command. It can be a pattern (a regular expression enclosed in slashes), a line number, or a line addressing symbol. Most sed commands accept two addresses separated by a comma, and some commands accept only a single line address.
Commands can also be grouped with curly braces. The first command can be placed on the same line as the opening brace, but the closing brace must be on its own line.
Substitute
[address]s/pattern/replacement/flags
The flags are:
n A number from 1 to 512, indicating that the nth occurrence is replaced
g Global change
p Print the pattern space contents
w file Write to a file
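The numeric flag and the g flag are easiest to compare on a line with several matches. The sample string is invented for this sketch.

```shell
line='a-b-c-d'

first_only=$(printf '%s\n' "$line" | sed 's/-/+/')    # no flag: first occurrence only
second_only=$(printf '%s\n' "$line" | sed 's/-/+/2')  # numeric flag: second occurrence only
every=$(printf '%s\n' "$line" | sed 's/-/+/g')        # g flag: every occurrence

printf '%s\n' "$first_only" "$second_only" "$every"
```

The three results are "a+b-c-d", "a-b+c-d", and "a+b+c+d".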
The following characters have a special meaning in the replacement section:
& Replaced by the text matched by the regular expression
\n Back-reference to the nth substring saved with \( and \)
$ cat test1
first:second
one:two
$ sed 's/\(.*\):\(.*\)/\2:\1/' test1
second:first
two:one
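For comparison, here is a sketch of the & character in a replacement; the input line is made up.

```shell
# & stands for whatever the pattern matched, here the run of digits.
out=$(printf '%s\n' 'see section 12' | sed 's/[0-9][0-9]*/(&)/')
printf '%s\n' "$out"
```

The matched number is reinserted inside parentheses, giving "see section (12)".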
Delete
[address]d
Deletes the contents of the pattern space and also changes the control flow of the script: after this command runs, no further commands are executed on the "empty" pattern space. The delete command causes a new input line to be read.
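The change in control flow can be observed by putting a command after d. In this sketch, -n disables automatic printing, so a line appears only if the explicit p command is reached; the input lines are invented.

```shell
# For "drop me", d ends the cycle immediately, so p is never executed.
# For "keep me", the address does not match and p prints the line.
out=$(printf '%s\n' 'keep me' 'drop me' | sed -n -e '/drop/d' -e 'p')
printf '%s\n' "$out"
```

Only "keep me" is printed, which shows that the commands following d were skipped for the deleted line.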