Sed and AWK: Getting Started with Sed

Source: Internet
Author: User

Sed and awk are powerful text-processing tools on the *nix command line. Both are line-oriented: they process text one line at a time. They read from standard input or from files, run the script commands against each line, and print the result to standard output until they reach the end of the file (EOF).


Sed

Sed is a stream editor: it edits and processes an input stream by applying a script of editing commands to it. In effect, it is a scripted, stream-oriented version of ed (a line-oriented editor).
A sed invocation consists of two parts: the command-line options, which control how sed runs, and the editing commands, also known as the script.

Invocation:

sed [OPTIONS] -e 'script' | -f script-file [input-files]

For example:

[alex@alexon:~]$ sed -n -e 'p' speech.txt
With great power comes great responsibility.
The freedom is nothing but a chance to be better.

As you can see, this is equivalent to the cat command.

[alex@alexon:~]$ cat speech.txt
With great power comes great responsibility.
The freedom is nothing but a chance to be better.

Command Line Parameters

For full details, see the sed man page. The main options are:

-n, --quiet, --silent
Do not automatically print the pattern space. In other words, the line being processed is not printed automatically. sed reads each line into the pattern space so that the editing commands can work on it, and by default the contents of the pattern space are printed once the commands have run. Compare the output when this option is omitted:

[alex@alexon:~]$ sed -e 'p' speech.txt
With great power comes great responsibility.
With great power comes great responsibility.
The freedom is nothing but a chance to be better.
The freedom is nothing but a chance to be better.

Each line appears twice: once because the editing command p (print) prints the pattern space, and once more because sed prints the pattern space automatically at the end of each cycle.
With -n (or --quiet, --silent) the automatic print is suppressed, and the output becomes:

[alex@alexon:~]$ sed -n -e 'p' speech.txt
With great power comes great responsibility.
The freedom is nothing but a chance to be better.

-e 'script'
Specifies an editing command, that is, a script for sed to execute. Editing commands mainly cover operations such as pattern matching, text replacement, insertion, and deletion.
This option can be given more than once; sed executes the commands in order from left to right.

[alex@alexon:~]$ sed -e '=' -e 'p' -e 's/great/poor/' speech.txt
1
With great power comes great responsibility.
With poor power comes great responsibility.
2
The freedom is nothing but a chance to be better.
The freedom is nothing but a chance to be better.

Explanation: the first command, '=', prints the line number; the second prints the line; the third performs the substitution, whose result is then shown by the automatic print.

-f script-file
Execute the script in the specified file. That is, the editing commands are placed in a file rather than on the command line, and sed executes the commands it finds there.

-i[SUFFIX], --in-place[=SUFFIX]
Edit the input file in place. By default sed reads a line from the input file, executes the commands, and writes the result to standard output, so the original file is never changed. When you do want to change the original file, this option is required. If a suffix is given, the original file is first backed up under its name plus that suffix, which guards against data loss.
For example:

[alex@alexon:~]$ cat speech.txt
With great power comes great responsibility.
The freedom is nothing but a chance to be better.
[alex@alexon:~]$ sed -i.bak -e 's/great/poor/g' speech.txt
[alex@alexon:~]$ cat speech.txt
With poor power comes poor responsibility.
The freedom is nothing but a chance to be better.
[alex@alexon:~]$ cat speech.txt.bak
With great power comes great responsibility.
The freedom is nothing but a chance to be better.

The command replaces every "great" in the file with "poor" and backs up the original file as speech.txt.bak.
Does this remind you of Perl's powerful one-liners? Perl offers a similar feature:

[alex@alexon:~]$ perl -p -i.bak -e 's/poor/great/g' speech.txt
[alex@alexon:~]$ cat speech.txt
With great power comes great responsibility.
The freedom is nothing but a chance to be better.
[alex@alexon:~]$ cat speech.txt.bak
With poor power comes poor responsibility.
The freedom is nothing but a chance to be better.


The command-line options are only one part of sed. Its core is the editing commands, or script, given with the -e option or read from the script file named by -f.
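The -f option is described above without an example, so here is a minimal sketch of it. The file names edit.sed and speech.txt and the temporary directory are assumptions made for illustration; the script simply reuses the substitution from the earlier examples and adds a delete.

```shell
# Work in a throwaway directory so no real file is touched.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'With great power comes great responsibility.' \
  'The freedom is nothing but a chance to be better.' > "$tmpdir/speech.txt"

# Hypothetical script file: one sed command per line, '#' starts a comment.
cat > "$tmpdir/edit.sed" <<'EOF'
# replace every "great" with "poor"
s/great/poor/g
# delete lines that mention "freedom"
/freedom/d
EOF

# Run the commands from the file instead of passing them with -e.
result=$(sed -f "$tmpdir/edit.sed" "$tmpdir/speech.txt")
echo "$result"

rm -rf "$tmpdir"
```

Keeping the commands in a file avoids shell quoting problems and makes longer scripts easier to read and reuse.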

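Editing command format:

[scope][!] cmd [cmd-args]

Here scope selects the lines the command applies to, and an optional ! inverts the selection so that the command runs on every line outside the scope. For example: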

[alex@alexon:~]$ sed -n -e '1 p' speech.txt
With great power comes great responsibility.

Command Scope

The command scope is also called the address. It specifies which lines an editing command applies to. There are several ways to specify it:

Not specified ---- if no scope is given, the command applies to every line.

[alex@alexon:~]$ sed -n -e 'p' speech.txt
With great power comes great responsibility.
The freedom is nothing but a chance to be better.

By line number ---- N,M means from line N to line M. The special address $ means the last line.
1,3 ---- lines 1 to 3
1,$ ---- line 1 to the last line, that is, all lines

By relative line count ---- N,+M means from line N through line N+M (a GNU sed extension). The example here uses a longer speech.txt with five numbered lines:

[alex@alexon:~]$ sed -n -e '2,+3 p' speech.txt
2. The freedom is nothing but a chance to be better.
3. The tree of liberty must be refreshed from time to time with blood of patroits
4. and tyrants.
5. Life is like a box of chocolates, you never know what you gonna get.


Stepping ---- use the tilde: N~M starts at line N and then matches every Mth line after it (also a GNU sed extension). For example, 1~2 starts at line 1 and matches every second line, that is, lines 1, 3, 5, 7, ...:

[alex@alexon:~]$ sed -n -e '1~2 p' speech.txt
1. With great power comes great responsibility.
3. The tree of liberty must be refreshed from time to time with blood of patroits
5. Life is like a box of chocolates, you never know what you gonna get.
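The numeric address forms above are easy to check against input whose content is its own line number. The sketch below assumes GNU sed (the N,+M and N~M forms are GNU extensions) and uses seq and paste only as helpers; the variable names are made up for illustration.

```shell
# seq 1 10 prints 1..10, one number per line, so each line's text equals its line number.
# paste -s -d ' ' - joins the selected lines with spaces so they are easy to compare.

# N,M: lines 2 through 4
range=$(seq 1 10 | sed -n -e '2,4p' | paste -s -d ' ' -)

# $: the last line only
last=$(seq 1 10 | sed -n -e '$p')

# N,+M: line 2 and the 3 lines after it (GNU extension)
relative=$(seq 1 10 | sed -n -e '2,+3p' | paste -s -d ' ' -)

# N~M: every 2nd line starting at line 1 (GNU extension)
step=$(seq 1 10 | sed -n -e '1~2p' | paste -s -d ' ' -)

echo "range:    $range"
echo "last:     $last"
echo "relative: $relative"
echo "step:     $step"
```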


Pattern Matching

The most powerful way to specify a scope is by pattern matching. The format is:

/pattern1/[,/pattern2/]

If only one pattern is given, the editing command runs on every line that matches it. If two patterns are given, the command runs on the range that starts at the first line matching pattern1 and ends at the next line matching pattern2.

[alex@alexon:~]$ sed -n -e '/great/ p' speech.txt
1. With great power comes great responsibility.
[alex@alexon:~]$ sed -n -e '/great/,/chocolates/ p' speech.txt
1. With great power comes great responsibility.
2. The freedom is nothing but a chance to be better.
3. The tree of liberty must be refreshed from time to time with blood of patroits
4. and tyrants.
5. Life is like a box of chocolates, you never know what you gonna get.
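Pattern addresses also combine with the ! modifier from the command format, which none of the examples so far exercise. The following is a small sketch under stated assumptions: the four-line sample file is a shortened, hypothetical stand-in for speech.txt, and cut and paste are used only to reduce each selected line to its leading number.

```shell
tmpfile=$(mktemp)
printf '%s\n' \
  '1. With great power comes great responsibility.' \
  '2. The freedom is nothing but a chance to be better.' \
  '3. Life is like a box of chocolates.' \
  '4. The end.' > "$tmpfile"

# Pattern range: from the first line matching /freedom/ through the next line matching /chocolates/.
between=$(sed -n -e '/freedom/,/chocolates/p' "$tmpfile" | cut -c1 | paste -s -d ' ' -)

# Negated pattern: print every line that does NOT contain "great".
not_great=$(sed -n -e '/great/!p' "$tmpfile" | cut -c1 | paste -s -d ' ' -)

# Negated line number: delete everything except line 1.
first_only=$(sed -e '1!d' "$tmpfile")

echo "between:    $between"
echo "not_great:  $not_great"
echo "first_only: $first_only"

rm -f "$tmpfile"
```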


Regular Expressions

Pattern matching relies on regular expressions. By default sed uses basic regular expressions, whose syntax differs slightly from the extended syntax used by many other tools. Pass -r (--regexp-extended) on the command line to use extended regular expressions instead.

Anchors and basic characters:

^ ---- beginning of line
$ ---- end of line
. ---- any single character
\b ---- word boundary. A word is a sequence of letters or digits. It can be placed at either end of a word, or at both ends.

Quantifiers

* ---- zero or more times
\+ ---- one or more times
\? ---- zero or one time
\{m\} ---- exactly m times
\{m,n\} ---- m to n times. For example, \{1,5\} means 1 to 5 times (1, 2, 3, 4, or 5 times)

Escape Character

\ ---- escapes a special character

Character Set

[...] ---- any one of the characters listed inside the brackets

Operators

\| ---- alternation (or): ABC\|123 matches ABC or 123
\(...\) ---- grouping, mainly used for back-references
\n ---- the nth group above: /\(123\)\1/ matches 123123
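The operators above can be tried out on short strings. This sketch assumes GNU sed, since \|, \b and -r are GNU features rather than part of the portable basic syntax; the input strings and variable names are invented for illustration.

```shell
# Grouping and back-reference: \(123\)\1 matches "123123".
backref=$(printf '%s\n' '123123 abc' | sed -e 's/\(123\)\1/X/')

# Alternation: replace either "abc" or "123" wherever it occurs.
either=$(printf '%s\n' 'abc 123 xyz' | sed -e 's/abc\|123/#/g')

# Interval: \{1,5\} is greedy, so it consumes five of the seven a's.
interval=$(printf '%s\n' 'aaaaaaa' | sed -e 's/a\{1,5\}/X/')

# Word boundary: only the standalone word "cat" is replaced.
word=$(printf '%s\n' 'cat concatenate cat' | sed -e 's/\bcat\b/dog/g')

# The same kind of pattern in extended syntax (-r): no backslashes on ( ) | +.
extended=$(printf '%s\n' 'abc123abc xyz' | sed -r -e 's/(abc|123)+/#/')

echo "$backref"
echo "$either"
echo "$interval"
echo "$word"
echo "$extended"
```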

Editing Commands

The editing commands are the familiar ones: append, insert, replace, and delete, plus a few others such as print and print line number.

addr1[,addr2] i text ---- insert: insert text before the specified line

[alex@alexon:~]$ sed -e '3 i abcd' speech.txt
1. With great power comes great responsibility.
2. The freedom is nothing but a chance to be better.
abcd
3. The tree of liberty must be refreshed from time to time with blood of patroits
4. and tyrants.
5. Life is like a box of chocolates, you never know what you gonna get.

addr1[,addr2] a text ---- append: add text after the specified line

[alex@alexon:~]$ sed -e '3 a abcd' speech.txt
1. With great power comes great responsibility.
2. The freedom is nothing but a chance to be better.
3. The tree of liberty must be refreshed from time to time with blood of patroits
abcd
4. and tyrants.
5. Life is like a box of chocolates, you never know what you gonna get.

addr1[,addr2] d ---- delete: delete the specified lines

[alex@alexon:~]$ sed -e '3 d' speech.txt
1. With great power comes great responsibility.
2. The freedom is nothing but a chance to be better.
4. and tyrants.
5. Life is like a box of chocolates, you never know what you gonna get.

addr1[,addr2] s/pattern/replace/[opts] ---- substitute: replace pattern with replace in the specified lines

[alex@alexon:~]$ sed -e '1,3 s/great/poor/' speech.txt
1. With poor power comes great responsibility.
2. The freedom is nothing but a chance to be better.
3. The tree of liberty must be refreshed from time to time with blood of patroits
4. and tyrants.
5. Life is like a box of chocolates, you never know what you gonna get.

By default, only the first match of pattern in each line is replaced.
opts controls the replacement behavior:
n ---- (a number) replace only the nth match of pattern in the line
g ---- replace every match of pattern in the line
p ---- print the line if a replacement was made

addr1[,addr2] c text ---- change: replace the specified lines with text

[alex@alexon:~]$ sed -e '1,3 c abcd' speech.txt
abcd
4. and tyrants.
5. Life is like a box of chocolates, you never know what you gonna get.

p ---- print: print the line

= ---- print the line number
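The replacement options of the s command listed above are not demonstrated by the transcripts, so here is a brief sketch of how each one changes the result. The input strings and variable names are invented for illustration.

```shell
line='great great great'

# No option: only the first match in the line is replaced.
first=$(printf '%s\n' "$line" | sed -e 's/great/poor/')

# Numeric option: only the 2nd match is replaced.
second=$(printf '%s\n' "$line" | sed -e 's/great/poor/2')

# g: every match in the line is replaced.
all=$(printf '%s\n' "$line" | sed -e 's/great/poor/g')

# p together with -n: print only the lines on which a replacement happened.
changed=$(printf '%s\n' 'great power' 'no match here' | sed -n -e 's/great/poor/p')

echo "$first"
echo "$second"
echo "$all"
echo "$changed"
```

Combining -n with the p option is a common way to filter and rewrite in one pass, because lines without a match are silently dropped.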


With these commands you can handle most text-processing tasks. sed also has more advanced editing commands, such as those that manipulate the pattern space directly or branch within a script, but they are more complicated and rarely needed.

In short, sed is a stream editor. Its strength is that it can process text line by line under the control of a script; its main functions are deleting, searching, changing, and adding text. It is not a general-purpose programming language, though: it has no named variables or arithmetic, and loops and branches can only be built from low-level labels and jumps. For that reason sed is often used together with AWK, which is much closer to a programming language. The two complement each other and together form a powerful pair of text-processing tools.
