Handling any type of data in a shell script with sed and gawk
Automatically processing the text in a text file ...
=====
The sed editor:
Also known as the stream editor, it is exactly the opposite of a normal interactive text editor. In an interactive editor such as vim, you use keyboard commands to interactively insert, delete, or replace text in the data. The stream editor edits a stream of data based on a set of rules that you supply ahead of time, before the editor processes the data ...
The sed editor can process data in a data stream based on commands entered on the command line or stored in a command text file. It reads one line of data at a time from the input, matches that data against the supplied editor commands, changes the data in the stream as specified in the commands, and then prints the new data to stdout. After the stream editor has matched all the commands against a line of data, it reads the next line of data and repeats the process ... When the stream editor finishes processing all the data in the stream, it terminates ...
Because the commands are applied line by line, sed needs only one pass through the data stream to make all the changes, so it is much faster than an interactive editor ...
sed command format:
sed options script file
Options
-e script  adds the commands specified in script to the commands run while processing the input
-f file  adds the commands specified in file to the commands run while processing the input
-n  does not produce output for each command, but waits for the print command ...
The script parameter specifies a single command to apply to the stream data. If you need more than one command, you must either use the -e option to specify them on the command line or use the -f option to specify them in a separate file. There are many commands available for processing data.
----------
1. Using sed at the command line
By default, the sed editor applies the specified command to the stdin input stream.
[oh@localhost shell]$ echo 'this is a test' | sed 's/test/sed test/'
this is a sed test
[oh@localhost shell]$
Here I piped the text into sed and used the s command: s/aa/bb/ replaces aa with bb.
[oh@localhost shell]$ cat testfile
this is a test
this is the second line
this is the third line
this is the end line
[oh@localhost shell]$ sed 's/this/that/' testfile
that is a test
that is the second line
that is the third line
that is the end line
[oh@localhost shell]$
It's very fast.
[oh@localhost shell]$ cat testfile
this is a test
this is the second line
this is the third line
this is the end line
[oh@localhost shell]$
You can see that sed does not change the original file ... It only sends the modified data to stdout.
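A minimal sketch of the same check in script form. The temporary file and its two lines are made up for illustration; only the `s/this/that/` command comes from the session above.

```shell
# Create a throwaway sample file (name and contents are illustrative).
tmp=$(mktemp)
printf 'this is a test\nthis is the second line\n' > "$tmp"

# sed writes the edited text to stdout ...
edited=$(sed 's/this/that/' "$tmp")
# ... while the file on disk stays exactly as it was.
original=$(cat "$tmp")

printf 'edited:\n%s\n\noriginal:\n%s\n' "$edited" "$original"
rm -f "$tmp"
```

To keep the edited text, redirect the output of sed to a different file.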
Now I want to use multiple sed commands on the command line:
Use the -e option
sed -e 's/bra/under/;s/asd/er/' file
[oh@localhost shell]$ sed -e 's/this/that/;s/is/are/' testfile
that are a test
that are the second line
that are the third line
that are the end line
[oh@localhost shell]$
The commands are separated by a semicolon, with no space between the end of a command and the semicolon.
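As a rough illustration, the sketch below runs the same two substitutions three ways. Giving each command its own -e option is an alternative spelling that the text above does not show, and the reversed order is there only to show that the commands run in the order given.

```shell
line='this is a test'

# One -e option holding two commands separated by a semicolon.
one_expr=$(printf '%s\n' "$line" | sed -e 's/this/that/;s/is/are/')

# The same two commands given as two separate -e options.
two_exprs=$(printf '%s\n' "$line" | sed -e 's/this/that/' -e 's/is/are/')

# Reversed order: s/is/are/ hits the "is" inside "this" first,
# so s/this/that/ no longer finds anything to replace.
reversed=$(printf '%s\n' "$line" | sed -e 's/is/are/;s/this/that/')

printf '%s\n%s\n%s\n' "$one_expr" "$two_exprs" "$reversed"
```

The first two results are `that are a test`; the reversed one is `thare is a test`.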
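Of course, instead of semicolons you can also use the secondary prompt (>) in the shell and enter one command per line:
[oh@localhost shell]$ sed -e '
> s/this/that/
> s/is/are/
> '
testfile
testfile
You must finish the command on the same line as the closing single quotation mark. Once the bash shell sees that the single quotation mark is closed, it runs the command ...
So the attempt above is wrong: sed was started without a file name, read the word testfile from stdin, and simply echoed it back.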
[oh@localhost shell]$ sed -e '
> s/th/aa/
> s/is/yu/
> ' testfile
aayu is a test
aayu is the second line
aayu is the third line
aayu is the end line
[oh@localhost shell]$
That's all.
=======
Reading editor commands from a file
That is, putting a batch of sed commands in a file.
[oh@localhost shell]$ cat sedd
s/this/that/
s/is/are/
[oh@localhost shell]$
It is that simple: one command per line, with no semicolon at the end of the command.
Then run: sed -f sedd testfile
[oh@localhost shell]$ sed -f sedd testfile
that are a test
that are the second line
that are the third line
that are the end line
[oh@localhost shell]$
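The same idea as a self-contained sketch. It writes both the command file and a small data file into a temporary directory; the directory and the shortened two-line data file are illustrative, while the file names sedd and testfile follow the session above.

```shell
workdir=$(mktemp -d)

# Sample data (two lines only, for brevity).
printf 'this is a test\nthis is the end line\n' > "$workdir/testfile"

# The command file: one sed command per line, no semicolons needed.
printf 's/this/that/\ns/is/are/\n' > "$workdir/sedd"

# -f tells sed to read its commands from the file.
from_file=$(sed -f "$workdir/sedd" "$workdir/testfile")
printf '%s\n' "$from_file"

rm -rf "$workdir"
```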
=========
First, an introduction to gawk
It provides a programming-language-like environment that lets you modify and rearrange the data in a file, and it is more advanced than sed.
gawk is the GNU version of the original awk program from Unix ... It takes stream editing to a new level: instead of just editor commands, it gives you a programming language ...
It is used a lot for generating reports and formatting log files ...
gawk options program file
Options
-F fs  specifies the field separator that delimits the data fields in a line
-f file  specifies the name of the file to read the program from
-v var=value  defines a variable and its default value for use in the gawk program
-mf N  specifies the maximum number of fields to process in the data file
-mr N  specifies the maximum record size in the data file
-W keyword  specifies the compatibility mode or warning level for gawk
----------
To use gawk from the command line:
The program script must be enclosed in curly braces {}, and because the command line expects the script to be a single text string, the braces must also be wrapped in single quotes: '{ }'
For example: gawk '{print "Hello Oh"}'
print is a built-in gawk command.
If you run it just like this, gawk waits for input from stdin until you signal that the stream has ended with EOF (End-of-File):
press Ctrl+D on the keyboard.
[oh@localhost shell]$ gawk '{print "Hello Oh"}'
oh
Hello Oh
hi
Hello Oh
aaaaaa
Hello Oh
[oh@localhost shell]$
---
Using data field variables:
For the data in a text file, gawk automatically assigns a variable to each data element in each line. By default, the variables are assigned as follows:
$0 represents the entire line of text
$1 represents the first data field in the line of text
$2 represents the second data field in the line of text
$n represents the nth data field in the line of text
Fields are separated by the field delimiter ...
That is, when gawk reads a line of text it splits the line into fields at the field delimiter, and the default field delimiter is any whitespace character (such as a space or a tab).
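A small sketch of the field variables in action. The input line mixes several spaces and a tab on purpose, to show that any run of whitespace counts as one delimiter. The `AWK` helper variable is my own addition: it falls back to plain awk when gawk is not installed, on the assumption that both behave the same for something this basic.

```shell
# Use gawk when available, otherwise whatever awk is installed (assumption).
AWK=$(command -v gawk || command -v awk)

# printf expands \t into a real tab character.
whole=$(printf 'this   is\tthe second line\n' | "$AWK" '{print $0}')
first=$(printf 'this   is\tthe second line\n' | "$AWK" '{print $1}')
fourth=$(printf 'this   is\tthe second line\n' | "$AWK" '{print $4}')

printf 'whole line: %s\n' "$whole"    # $0 keeps the original spacing
printf 'first field: %s\n' "$first"   # this
printf 'fourth field: %s\n' "$fourth" # second
```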
[oh@localhost shell]$ cat tf
this is a test
this is the second line
this is the third line
this is the end line
[oh@localhost shell]$ gawk '{print $1}' tf
this
this
this
this
[oh@localhost shell]$ gawk '{print $0}' tf
this is a test
this is the second line
this is the third line
this is the end line
[oh@localhost shell]$
You can see that this prints the chosen field from every line of the file.
gawk -F: '{print $1}' /etc/passwd
The -F option specifies the field delimiter; here it is a colon: -F:
[oh@localhost shell]$ gawk -F: '{print $1}' /etc/passwd
root
bin
daemon
adm
...
nfsnobody
abrt
gdm
tomcat
webalizer
sshd
mysql
tcpdump
oprofile
oh
[oh@localhost shell]$
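To try the -F option without depending on the contents of a real /etc/passwd, here is a sketch that feeds gawk two made-up records in the same colon-separated layout. The users alice and bob are hypothetical, and the `AWK` fallback to plain awk is my own assumption for machines without gawk.

```shell
# Use gawk when available, otherwise whatever awk is installed (assumption).
AWK=$(command -v gawk || command -v awk)

# Two hypothetical records in /etc/passwd layout.
sample='alice:x:1000:1000:Alice:/home/alice:/bin/bash
bob:x:1001:1001:Bob:/home/bob:/sbin/nologin'

# With -F: the colon is the field delimiter, so $1 is the user name
# and $7 is the login shell.
names=$(printf '%s\n' "$sample" | "$AWK" -F: '{print $1}')
shells=$(printf '%s\n' "$sample" | "$AWK" -F: '{print $7}')

printf '%s\n' "$names"
printf '%s\n' "$shells"
```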
Running more than one command:
Separate the commands with semicolons, or use the secondary prompt (>) to enter them one per line
[oh@localhost shell]$ echo "My name is Oh" | gawk '{$4="HHH"; print $0}'
My name is HHH
[oh@localhost shell]$ gawk '{
> $4="ohhh"
> print $0}'
kkdfkds sjfksj sjfsklf fsfls    // I typed this
kkdfkds sjfksj sjfsklf ohhh    // gawk printed this
ksjfkljj jj jj jj    // I typed this
ksjfkljj jj jj ohhh    // gawk printed this
o o o    // I typed this
o o o ohhh    // gawk printed this
[oh@localhost shell]$
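The field assignments above can be checked non-interactively. This is only a sketch: the third case (extra spaces in the input) is my own addition, to show that assigning to a field makes gawk rebuild $0 with single spaces between fields, and the `AWK` fallback to plain awk is an assumption.

```shell
# Use gawk when available, otherwise whatever awk is installed (assumption).
AWK=$(command -v gawk || command -v awk)

# Overwrite an existing fourth field.
replaced=$(printf 'My name is Oh\n' | "$AWK" '{$4="HHH"; print $0}')

# Assign to a fourth field that does not exist yet: it is appended.
extended=$(printf 'o o o\n' | "$AWK" '{$4="ohhh"; print $0}')

# Assigning to any field rebuilds $0 with one space between fields.
rebuilt=$(printf 'a    b\n' | "$AWK" '{$2="c"; print $0}')

printf '%s\n%s\n%s\n' "$replaced" "$extended" "$rebuilt"
```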
-------
Writing the commands in a file:
cat asd:
{
test="Oh Oh"
print $1 test $6
}
Run it with: gawk -F: -f asd /etc/passwd
Inside the program, the variable test is referenced without a $ sign. A pair of curly braces can hold many commands; no semicolons are needed, just put each command on its own line ...
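A self-contained sketch of running a program file with -f. The file name prog.awk, the variable text, its wording, and the sample record are all made up for illustration; the field numbers $1 and $6 assume the /etc/passwd layout (user name and home directory). The `AWK` fallback to plain awk is also an assumption.

```shell
# Use gawk when available, otherwise whatever awk is installed (assumption).
AWK=$(command -v gawk || command -v awk)
workdir=$(mktemp -d)

# The program file: one command per line inside the braces.
# The variable "text" is used without a $ sign.
cat > "$workdir/prog.awk" <<'EOF'
{
text = " lives in "
print $1 text $6
}
EOF

# A hypothetical record in /etc/passwd layout.
report=$(printf 'alice:x:1000:1000:Alice:/home/alice:/bin/bash\n' |
         "$AWK" -F: -f "$workdir/prog.awk")
printf '%s\n' "$report"

rm -rf "$workdir"
```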
-------
Running a script before processing data:
gawk 'BEGIN {print "Hello World"}'
Sometimes you may want to run a script before processing the data, for example to create the header section of a report ... The BEGIN keyword does exactly this.
It forces gawk to execute the program script specified after the BEGIN keyword before it reads any data:
[oh@localhost shell]$ gawk 'BEGIN {print "Hello World"}'
Hello World
[oh@localhost shell]$
gawk prints Hello World and exits right away without waiting for any input, because the BEGIN script runs before any data is read. A gawk command line with only a BEGIN section can do no more than display text;
the script that processes the data has to be written in a second section ...
[oh@localhost shell]$ cat tf
this is a test
this is the second line
this is the third line
this is the end line
[oh@localhost shell]$
gawk 'BEGIN {print "The data4 file contents:"} {print $0}' tf
Each section gets its own {}, but both sections go inside the same pair of single quotes ''
[oh@localhost shell]$ gawk 'BEGIN {print "The data4 file contents:"} {print $0}' tf
The data4 file contents:
this is a test
this is the second line
this is the third line
this is the end line
---------
Since there is a BEGIN keyword, naturally there is also an END keyword ....
END specifies a script that runs after all the data has been processed
[oh@localhost shell]$ gawk 'BEGIN {print "The data4 file contents:"} {print $0} END {print "End of File"}' tf
The data4 file contents:
this is a test
this is the second line
this is the third line
this is the end line
End of File
[oh@localhost shell]$
A small example.
[oh@localhost shell]$ gawk -f script1 /etc/passwd
The latest list of users and shells
UserID Shell
----------
root /bin/bash
bin /sbin/nologin
daemon /sbin/nologin
adm /sbin/nologin
lp /sbin/nologin
sync /bin/sync
shutdown /sbin/shutdown
halt /sbin/halt
mail /sbin/nologin
uucp /sbin/nologin
operator /sbin/nologin
games /sbin/nologin
gopher /sbin/nologin
ftp /sbin/nologin
nobody /sbin/nologin
dbus /sbin/nologin
usbmuxd /sbin/nologin
rpc /sbin/nologin
rtkit /sbin/nologin
avahi-autoipd /sbin/nologin
vcsa /sbin/nologin
apache /sbin/nologin
haldaemon /sbin/nologin
ntp /sbin/nologin
saslauth /sbin/nologin
postfix /sbin/nologin
pulse /sbin/nologin
rpcuser /sbin/nologin
nfsnobody /sbin/nologin
abrt /sbin/nologin
gdm /sbin/nologin
tomcat /sbin/nologin
webalizer /sbin/nologin
sshd /sbin/nologin
mysql /bin/bash
tcpdump /sbin/nologin
oprofile /sbin/nologin
oh /bin/bash
This concludes the listing
[oh@localhost shell]$ cat script1
BEGIN {
print "The latest list of users and shells"
print "UserID Shell"
print "----------"
FS=":"
}
{
print $1 " " $7
}
END {
print "This concludes the listing"
}
[oh@localhost shell]$
Assigning a value to the FS variable inside the script is another way to define the field delimiter ...
As you can see, BEGIN runs only once, before any text is processed ...
END runs only once, after all the text has been processed
The commands in the middle section run once for each line ...
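A compact sketch of the three sections working together on two made-up records (the users alice and bob are hypothetical). It sets FS in the BEGIN section instead of using -F. The `AWK` fallback to plain awk is my own assumption for machines without gawk.

```shell
# Use gawk when available, otherwise whatever awk is installed (assumption).
AWK=$(command -v gawk || command -v awk)

# Two hypothetical records in /etc/passwd layout.
sample='alice:x:1000:1000:Alice:/home/alice:/bin/bash
bob:x:1001:1001:Bob:/home/bob:/sbin/nologin'

listing=$(printf '%s\n' "$sample" | "$AWK" '
BEGIN {
    FS = ":"                 # runs once, before the first line is read
    print "UserID Shell"
}
{
    print $1 " " $7          # runs once for every input line
}
END {
    print "This concludes the listing"   # runs once, after the last line
}')

printf '%s\n' "$listing"
```

The header and footer each appear once, however many records are read.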
----------
Back to the sed editor:
About the substitute command: s is shorthand for the substitute command. But the form used so far is too simple ...
1. Substitution flags:
By default, only the first matching string on each line is replaced ...
[oh@localhost shell]$ echo "aa aa" | sed 's/aa/bb/'
bb aa
[oh@localhost shell]$
To replace the other occurrences, you have to use a substitution flag:
s/pattern/replacement/flags
There are four kinds of flags:
A number: indicates which occurrence of the pattern to replace
g: replaces all occurrences of the pattern
p: prints the line on which the substitution was made
w file: writes the result of the substitution to a file ...
[oh@localhost shell]$ echo "aa aa" | sed 's/aa/bb/2'
aa bb
[oh@localhost shell]$ echo "aa aa" | sed 's/aa/bb/2 1'
sed: -e expression #1, char 11: multiple number options to `s' command
[oh@localhost shell]$ echo "aa aa" | sed 's/aa/bb/2,1'
sed: -e expression #1, char 10: unknown option to `s'
[oh@localhost shell]$ echo "aa aa" | sed 's/aa/bb/2;1'
sed: -e expression #1, char 11: missing command
[oh@localhost shell]$ echo "aa aa" | sed 's/aa/bb/g'
bb bb
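For a side-by-side comparison, this sketch applies no flag, the number flag, and the g flag to the same input. The three-word input is my own, chosen so that the second occurrence and all occurrences give different results.

```shell
text='aa aa aa'

# No flag: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$text" | sed 's/aa/bb/')

# Number flag: only the second match is replaced.
second_only=$(printf '%s\n' "$text" | sed 's/aa/bb/2')

# g flag: every match is replaced.
all=$(printf '%s\n' "$text" | sed 's/aa/bb/g')

printf '%s\n%s\n%s\n' "$first_only" "$second_only" "$all"
```

This prints `bb aa aa`, `aa bb aa`, and `bb bb bb`.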
To output only the lines that were changed, use the p flag together with the sed -n option: -n suppresses the sed editor's normal output, while p prints the lines that were modified ... Together they output only the lines changed by the substitute command.
Normally, the output of sed goes to stdout ... When you use w file, only the lines containing the matching pattern are saved to the specified output file.
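Both flags can be tried in one short sketch. The three input lines and the output file name changed.txt are illustrative; the middle line deliberately contains no match.

```shell
workdir=$(mktemp -d)
input='this is a test
nothing to change here
this is the end line'

# -n turns off the normal output, p prints only the lines
# on which the substitution actually happened.
changed=$(printf '%s\n' "$input" | sed -n 's/this/that/p')

# With the w flag, stdout still receives every line,
# but only the changed lines are written to the file.
full=$(printf '%s\n' "$input" | sed "s/this/that/w $workdir/changed.txt")
saved=$(cat "$workdir/changed.txt")

printf 'p flag:\n%s\n\nstdout with w:\n%s\n\nfile:\n%s\n' \
    "$changed" "$full" "$saved"
rm -rf "$workdir"
```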
2. Replacing awkward characters:
When you want to replace text that contains forward slashes, you run into some trouble ...
Each slash has to be escaped with \
sed 's/\/etc/\/opt/' /etc/passwd
Poor readability ....
So use another character, such as !, in place of / as the string delimiter:
sed 's!/etc!/opt!' /etc/passwd
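A quick sketch confirming that the two spellings do the same thing. It runs on a literal path string rather than on the real /etc/passwd, so the result does not depend on the machine.

```shell
path='/etc/passwd'

# Default / delimiter: every slash in the pattern must be escaped.
escaped=$(printf '%s\n' "$path" | sed 's/\/etc/\/opt/')

# ! as the delimiter: the slashes can be written as they are.
readable=$(printf '%s\n' "$path" | sed 's!/etc!/opt!')

printf '%s\n%s\n' "$escaped" "$readable"
```

Both commands print `/opt/passwd`.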
-------------
Using addresses ...
Line addressing is for when you want a command to apply only to certain lines rather than to every line.
There are two forms of line addressing:
1. A numeric range of lines.
2. A text pattern that filters the lines ...
Both forms use the same format for specifying the address:
[address]command
Or:
address {
command1
command2
command3
}
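To make the two address forms concrete, here is a sketch using standard sed syntax that the text above has not spelled out: a line number or `first,last` range in front of the command, a `/pattern/` in front of the command, and braces to group several commands under one address. The four input lines are illustrative.

```shell
input='this is a test
this is the second line
this is the third line
this is the end line'

# Numeric address: only line 2.
line2=$(printf '%s\n' "$input" | sed '2s/this/that/')

# Numeric range: lines 2 through 3.
range=$(printf '%s\n' "$input" | sed '2,3s/this/that/')

# Text pattern address: only lines containing "end".
pattern=$(printf '%s\n' "$input" | sed '/end/s/this/that/')

# Grouping: both commands apply to line 3 only.
grouped=$(printf '%s\n' "$input" | sed '3{
s/this/that/
s/line/row/
}')

printf '%s\n\n%s\n\n%s\n\n%s\n' "$line2" "$range" "$pattern" "$grouped"
```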