How sed Works

Source: Internet
Author: User

Reading the manual is better than reading a tutorial. If you work on Linux, run 'info sed' and read it from beginning to end; it is well worth the time.
Otherwise you will keep running into one-line commands that baffle you, that you can only memorize by rote, or whose principle you never really understand.
In the Info manual, section 3.1 describes how sed works.
sed uses two data buffers: the active _pattern_ space and the auxiliary _hold_ space. Both are empty at the start.
sed processes the input one line at a time: it reads a line into the _pattern_ space, stripping the trailing newline, and runs the commands on it. When the _pattern_ space is output, the newline is added back automatically.
After a line has been processed, the _pattern_ space is normally emptied (some commands, such as 'D', keep part of it), but the _hold_ space is not emptied between lines.
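A minimal sketch of the cycle described above, assuming a POSIX sed and a made-up three-line input. It borrows the 'h' and 'G' commands (both covered further down) only to show that the _hold_ space survives from one line to the next, and uses a '/\n/' address to show that the trailing newline never reaches the _pattern_ space.

```shell
# Made-up sample input, three lines.
input='one
two
three'

# Line 1: h copies "one" into the hold space.
# Line 3: G appends a newline plus the hold space to "three"; p prints it.
# -n turns off the automatic print at the end of each cycle.
held=$(printf '%s\n' "$input" | sed -n '1h;3{G;p;}')
printf '%s\n' "$held"

# The trailing newline of each input line is stripped before the commands
# run, so a pattern space holding a single line never matches /\n/.
with_newline=$(printf '%s\n' "$input" | sed -n '/\n/p')
printf 'lines containing a newline: [%s]\n' "$with_newline"
```

The first command prints "three" followed by "one": the text saved on line 1 is still in the _hold_ space two cycles later.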
Each command can be given an address that restricts where it runs (a single line, or a range of lines).
So there are two key things to keep in mind: the address and the command.
How to specify an address:
'NUMBER' an exact line number.
'FIRST~STEP' GNU extension. Matches every STEP-th line starting at line FIRST. For example, '1~2' selects the odd-numbered lines, and '2~5' selects every fifth line starting with line 2.
'$' the last line.
'/REGEXP/' lines that match the regular expression (regular expressions themselves are outside the scope of this article). A '/' inside the expression must be escaped as '\/'.
'\%REGEXP%' the same, with a different delimiter so that '/' does not need escaping. Any single character can be used in place of '%'.
'/REGEXP/I' '\%REGEXP%I' GNU extension. Matches the regular expression case-insensitively.
'/REGEXP/M' '\%REGEXP%M' GNU extension. Multiline mode: '^' also matches just after a newline inside the _pattern_ space, and '$' also matches just before one.
'ADDR1,+N' GNU extension. Just what it looks like: ADDR1 and the N lines that follow it.
'ADDR1,~N' GNU extension. From ADDR1 through the next line whose number is a multiple of N.
'!' appended to an address or an address range, negates it: the command runs only on the lines that are NOT selected.
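The address forms above can be tried out as follows. This is a sketch that assumes GNU sed (several of the forms are GNU extensions); 'pick' is a hypothetical helper introduced here only to keep the examples short.

```shell
# pick (hypothetical helper): feeds the numbers 1..10 to "sed -n",
# applies the given script, and joins the printed lines with spaces.
pick() {
  printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | sed -n "$1" | paste -sd ' ' -
}

pick '3p'       # line 3 only
pick '1~2p'     # odd-numbered lines (GNU)
pick '2~5p'     # every fifth line starting at line 2 (GNU)
pick '$p'       # last line
pick '/^1/p'    # lines that start with "1"
pick '\%0%p'    # regexp address with % as the delimiter
pick '4,+2p'    # line 4 and the 2 lines after it (GNU)
pick '4,~3p'    # line 4 through the next multiple of 3 (GNU)
pick '2,9!p'    # every line outside the range 2-9

# Case-insensitive match with the I modifier (GNU).
printf 'Foo\nbar\n' | sed -n '/foo/Ip'
```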
Common commands:
'#' starts a comment. I don't use it, so I won't dwell on it; read the manual if you want the details.
'q [EXIT-CODE]' accepts at most one address (not a range). Prints the current _pattern_ space (unless '-n' is given), then exits. The EXIT-CODE argument is a GNU extension and sets the exit status.
'd' deletes the _pattern_ space and immediately starts the next cycle.
'p' prints the _pattern_ space to standard output. It is typically used together with the '-n' option.
'n' if automatic printing has not been turned off with '-n', prints the _pattern_ space; then, in either case, replaces it with the next line of input. If there is no more input, sed exits without running any more commands.
'{ COMMANDS }' groups a series of commands, separated by ';', so that they all run on the _pattern_ space under one address.
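A short sketch of these commands on a five-line input. 'five' is a hypothetical helper used only for illustration, and the exit-code form of 'q' assumes GNU sed.

```shell
# five (hypothetical helper): prints the lines 1..5.
five() { printf '%s\n' 1 2 3 4 5; }

first_two=$(five | sed '2q')               # print lines 1-2, then quit
without_mid=$(five | sed '2,4d')           # delete lines 2-4
only_second=$(five | sed -n '2p')          # -n plus p prints just line 2
even_lines=$(five | sed -n 'n;p')          # n moves to the next line, p prints it
marked=$(five | sed -n '2,3{s/^/> /;p;}')  # two commands grouped under one address

# q with an exit code (GNU extension): sed quits with status 5.
status=0
five | sed 'q5' >/dev/null || status=$?

printf '%s\n' "$first_two" "$without_mid" "$only_second" \
  "$even_lines" "$marked" "status=$status"
```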
Here are the key commands.
---------------
's' the substitute command.
Syntax: 's/REGEXP/REPLACEMENT/FLAGS'. The '/' delimiters can be replaced by any other single character, such as '#' or '%', which avoids having to escape '/' inside the expression.
How it works: the s command tries to match REGEXP against the _pattern_ space; if the match succeeds, the matched part is replaced with REPLACEMENT.
REGEXP can contain groups written as '\(' ... '\)'.
In REPLACEMENT, '\1' through '\9' refer to those groups, and '&' stands for the whole matched text.
As a GNU extension, REPLACEMENT can also contain the case-conversion sequences '\L', '\l', '\U', '\u' and '\E'. These are not used very often.
'\L' converts the replacement to lowercase until '\U' or '\E' is found.
'\l' converts the next character of the replacement to lowercase.
'\U' converts the replacement to uppercase until '\L' or '\E' is found.
'\u' converts the next character of the replacement to uppercase.
'\E' ends a case conversion started by '\L' or '\U'.
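A few substitutions that exercise groups, '&' and the case-conversion sequences. This is a sketch on a made-up input line; the '\u', '\U' and '\E' forms assume GNU sed.

```shell
line='hello world'

# Swap the two groups.
swapped=$(printf '%s\n' "$line" | sed 's/\(hello\) \(world\)/\2 \1/')
# & stands for the whole match.
bracketed=$(printf '%s\n' "$line" | sed 's/world/[&]/')
# \u uppercases one character, \U uppercases everything that follows (GNU).
cased=$(printf '%s\n' "$line" | sed 's/\(hello\) \(world\)/\u\1 \U\2/')
# \E switches the conversion off again (GNU).
partly=$(printf '%s\n' "$line" | sed 's/\(hello\) \(world\)/\U\1\E \2/')
# A different delimiter avoids escaping the slashes of a path.
path=$(printf '%s\n' '/usr/local/bin' | sed 's#/usr/local#/opt#')

printf '%s\n' "$swapped" "$bracketed" "$cased" "$partly" "$path"
```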
FLAGS:
'g' replaces all matches in the _pattern_ space. Without this flag, only the first match is replaced.
'NUMBER' replaces only the NUMBER-th match in the _pattern_ space. (POSIX does not define what happens when NUMBER is combined with 'g'; in GNU sed the combination replaces the NUMBER-th match and every match after it.)
'w FILENAME' if the substitution was made, writes the resulting _pattern_ space to FILENAME. As a GNU extension, the special names '/dev/stdout' and '/dev/stderr' are supported.
'e' GNU extension. If the substitution was made, the _pattern_ space is executed as a command and then replaced with the output of that command.
'p' if the substitution was made, prints the new _pattern_ space. When 'p' and 'e' are both given, their order matters: 'pe' prints the command before it is executed, while 'ep' prints only the output of the command.
'I', 'i' GNU extension. Matches REGEXP case-insensitively.
'M', 'm' GNU extension. Multiline mode: '^' also matches just after a newline inside the _pattern_ space, and '$' also matches just before one.
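The effect of the flags is easiest to see side by side. The sketch below uses made-up inputs and assumes GNU sed for NUMBER combined with 'g', for 'I' and 'M', and for the '/dev/stdout' special name; the 'e' flag is left out because it runs arbitrary commands.

```shell
s='a a a a'

first=$(printf '%s\n' "$s" | sed 's/a/X/')          # first match only
all=$(printf '%s\n' "$s" | sed 's/a/X/g')           # every match
second=$(printf '%s\n' "$s" | sed 's/a/X/2')        # second match only
from_second=$(printf '%s\n' "$s" | sed 's/a/X/2g')  # second match onward (GNU)

# With -n, p prints only the lines where a substitution was made.
changed=$(printf 'foo\nbar\n' | sed -n 's/o/0/gp')

# I ignores case in the regular expression (GNU).
nocase=$(printf 'Hello\n' | sed 's/hello/bye/I')

# M lets ^ match after the newline that N put into the pattern space (GNU).
multi=$(printf 'a\nb\n' | sed 'N;s/^b/B/M')

# w writes the changed lines to a file; /dev/stdout is a GNU special name.
written=$(printf 'foo\nbar\n' | sed -n 's/foo/FOO/w /dev/stdout')

printf '%s\n' "$first" "$all" "$second" "$from_second" \
  "$changed" "$nocase" "$multi" "$written"
```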
------------------------
'y/SOURCE-CHARS/DEST-CHARS/' replaces each character listed in SOURCE-CHARS with the character at the same position in DEST-CHARS. For example, with 'y/abc/ABC/', a becomes A, b becomes B and c becomes C.
'a\' TEXT queues TEXT to be output at the end of the current cycle, that is, after the line itself. The text is written straight to the output rather than added to the _pattern_ space, so it still appears when automatic printing is turned off with '-n'. Escape sequences in TEXT are processed.
As a GNU extension, whitespace between the 'a' and TEXT is skipped, and TEXT may start on the same line as the command ('a TEXT'). The same applies to the 'i' and 'c' commands.
'i\' TEXT prints TEXT immediately.
'c\' TEXT deletes the _pattern_ space and prints TEXT.
'=' prints the current input line number, followed by a newline.
'l N' prints the _pattern_ space in an unambiguous form. Try it yourself: Chinese text turns into sequences like \232 \245, and the end of each line is marked with '$'. N sets the line-wrap length.
'r FILENAME' queues the contents of FILENAME to be written to the output at the end of the current cycle. A file that does not exist is not an error; it is simply treated as empty. As a GNU extension, '/dev/stdin' is supported.
'w FILENAME' writes the _pattern_ space to FILENAME. As a GNU extension, '/dev/stderr' and '/dev/stdout' are supported. The file is created (if missing) or truncated (if present) before the first input line is read. All 'w' commands that name the same file (including the 'w' flag of a successful 's') write to it without closing and reopening it, which is more efficient.
'N' appends a newline to the _pattern_ space, then appends the next line of input.
'D' if the _pattern_ space contains no newline, behaves like 'd'. Otherwise deletes the _pattern_ space up to and including the first newline and restarts the cycle without reading a new line.
'P' prints the _pattern_ space up to the first newline.
'h' replaces the _hold_ space with the contents of the _pattern_ space.
'H' appends a newline to the _hold_ space, then appends the contents of the _pattern_ space.
'g' replaces the _pattern_ space with the contents of the _hold_ space.
'G' appends a newline to the _pattern_ space, then appends the contents of the _hold_ space.
'x' exchanges the contents of the _pattern_ space and the _hold_ space.
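The commands above are easier to follow with small inputs. This sketch uses made-up data and assumes GNU sed for the one-line 'a TEXT', 'i TEXT' and 'c TEXT' forms.

```shell
upper=$(printf 'abc\n' | sed 'y/abc/ABC/')      # transliterate a->A, b->B, c->C
appended=$(printf 'x\n' | sed 'a after')        # TEXT comes out after the line
inserted=$(printf 'x\n' | sed 'i before')       # TEXT comes out before the line
changed=$(printf 'x\n' | sed 'c replaced')      # the line is replaced by TEXT
count=$(printf 'a\nb\nc\n' | sed -n '$=')       # number of the last line
visible=$(printf 'a\tb\n' | sed -n 'l')         # tab shown as \t, line end as $

# N joins each pair of lines; s then replaces the embedded newline.
pairs=$(printf '1\n2\n3\n4\n' | sed 'N;s/\n/,/')

# N, P and D together: drop consecutive duplicate lines.
uniq_lines=$(printf 'a\na\nb\n' | sed '$!N;/^\(.*\)\n\1$/!P;D')

# Hold space: G appends what was saved so far, h saves the result, so on
# the last line the pattern space holds the whole input in reverse order.
reversed=$(printf '1\n2\n3\n' | sed -n '1!G;h;$p')

# x swaps the two spaces: print the line that precedes a match.
before=$(printf 'a\nb\nc\n' | sed -n '/b/{x;p;x;};h')

printf '%s\n' "$upper" "$appended" "$inserted" "$changed" "$count" \
  "$visible" "$pairs" "$uniq_lines" "$reversed" "$before"
```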
------------------
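': LABEL' defines a label.
'b LABEL' branches unconditionally to LABEL. If LABEL is omitted, it branches to the end of the script, so the next cycle starts.
't LABEL' branches to LABEL only if an 's' command has made a substitution since the last input line was read or since the last 't' branch was taken.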
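Two common loops built from labels, as a sketch on made-up inputs. Each part of the script is passed in its own '-e' fragment so that the label stands alone.

```shell
# Join all input lines into one: N appends the next line, and "$!ba"
# jumps back to label a on every line except the last.
joined=$(printf 'a\nb\nc\n' | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g')

# Repeat a substitution until it stops matching: t jumps back to the
# label only when the preceding s changed something. Each pass inserts
# one more thousands separator, working from the right.
grouped=$(printf '1234567\n' |
  sed -e ':a' -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/' -e 'ta')

printf '%s\n' "$joined" "$grouped"
```

The first loop ends because the branch is skipped on the last line; the second ends because 's' eventually fails, so 't' no longer jumps.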
