"Linux Command Line and Shell Scripting Bible": Advanced sed

Multi-line commands

The sed editor has three commands for processing multi-line text:
1. N: adds the next line of the data stream to create a multi-line group to process
2. D: deletes the first line of a multi-line group
3. P: prints the first line of a multi-line group

The next command

The single-line next command (n) tells sed to move to the next line of the data stream without returning to the top of the script. Normally sed runs every command in the script on a line before it moves to the next line; n changes that flow.

[plain]
$ cat test.txt
line1

line2

line3

line4

Suppose we want to delete the blank line under line1. We can find line1 first and then delete the line that follows it:

[plain]
$ sed -i '/line/{n;d}' test.txt
$ cat test.txt
line1
line2
line3
line4

Not only the line under line1 was deleted; the blank lines under line2 and line3 are gone as well. The address /line/ matches each of those lines, and after n and d sed keeps scanning line by line and applies the commands wherever the address matches. So the address has to match exactly the line we mean. Starting again from the original file:

[plain]
$ sed -i '/line1/{n;d}' test.txt
$ cat test.txt
line1
line2

line3

line4

Merging lines of text

The single-line next command moves the next line of the data stream into sed's workspace (the pattern space), replacing what was there. The multi-line next command (N) appends the next line to the text already in the pattern space; the lines are still separated by a newline.

[plain]
$ cat test.txt
line1
line2
line3
line4
line5

To merge the first two lines:

[plain]
$ sed -i '/line1/{N;s/\n//}' test.txt
$ cat test.txt
line1line2
line3
line4
line5

To merge every pair of lines:

[plain]
$ sed -i '{N;s/\n//}' test.txt

The multi-line delete command

The delete command can be combined with N.
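Since the multi-line commands all depend on what N leaves in the pattern space, it can help to look at that buffer directly. The following is a minimal sketch, assuming a sed that supports the `l` command (GNU sed does); the two sample lines and the variable name `joined` are made up for illustration.

```shell
# N appends the next input line to the pattern space.
# l prints the pattern space unambiguously: an embedded newline
# is shown as \n and the end of the buffer is marked with $.
joined=$(printf 'line1\nline2\n' | sed -n 'N;l')
printf '%s\n' "$joined"
```

The single output line shows both input lines sitting in one buffer with a newline between them, which is the text that d, D, p and P then operate on.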
Note the following result:

[plain]
$ cat test.txt
line1
line2
line3
line4
line5
$ sed 'N;/line3/d' test.txt
line1
line2
line5

Both line3 and line4 were in the pattern space, and d deleted them together. sed also provides D, which deletes only the first line of the pattern space, whereas d clears the whole pattern space.

[plain]
$ sed 'N;/line3/D' test.txt
line1
line2
line4
line5

The multi-line print command

p prints the whole pattern space; P prints only the first line of the pattern space.

[plain]
$ cat test.txt
line1
line2
line3
line4
line5
$ sed -n 'N;/line/p' test.txt
line1
line2
line3
line4
$ sed -n 'N;/line/P' test.txt
line1
line3

P prints only the first line in the pattern space; p prints everything in it. (line5 appears in neither output: on the last cycle N has no further line to read, and with -n GNU sed then exits without printing.)

The hold space

The pattern space is sed's active buffer, but it is not the only place where the sed editor can keep text. sed also has a hold space, which you can use to save lines temporarily while you work on other lines in the pattern space.

sed hold space commands

Command  Description
h        Copy the pattern space to the hold space
H        Append the pattern space to the hold space
g        Copy the hold space to the pattern space
G        Append the hold space to the pattern space
x        Exchange the contents of the pattern space and the hold space

Here is an exercise: take the first two lines of the text and output them in the order line1, line2, line1.

[plain]
$ cat test.txt
line1
line2
line3
line4
line5
line6
$ sed -n '/line1/{h;p;n;p;g;p}' test.txt
line1
line2
line1

sed first finds the first line, copies it to the hold space and prints it. It then reads the second line and prints it, copies the hold space back to the pattern space, and prints the pattern space again. The process can be simplified:

[plain]
$ sed -n '1{h;N;G;p}' test.txt
line1
line2
line1

The address selects the first line. sed copies it to the hold space, reads the second line and appends it to the pattern space, appends the hold space to the pattern space, and prints everything together.
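The same h and G commands can be tried on another small task. The following is a minimal sketch that swaps the first two lines of the input; the three sample lines and the variable name `swapped` are assumptions made for illustration, not part of the original exercise.

```shell
# Swap the first two lines by parking line 1 in the hold space.
#   1{h;d;}  on line 1: copy it to the hold space, then delete it
#            so that nothing is printed yet
#   2G       on line 2: append the held line 1 after line 2
swapped=$(printf 'line1\nline2\nline3\n' | sed -e '1{h;d;}' -e '2G')
printf '%s\n' "$swapped"
```

Line 3 and anything after it pass through unchanged, because neither address matches them.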
Negating a command

The exclamation mark command (!) negates a command, so that a command that would normally take effect does not. (I think this sentence in the book only misleads people.) The man page says:

After the address (or address-range), and before the command, a ! may be inserted, which specifies that the command shall only be executed if the address (or address-range) does not match.

In other words, the command runs only on the lines that the address or range does not select. The man page puts it more clearly. The following does not print the lines that contain 1, 2 or 3:

[plain]
$ cat test.txt
line1
line2
line3
line4
line5
line6
$ sed -n '/[1-3]/!p' test.txt
line4
line5
line6

Now a more complex example: reversing the text.
1) sed reads the first line into the pattern space and copies it to the hold space.
2) sed reads the second line into the pattern space and appends the hold space to the pattern space (the pattern space now holds the second line followed by the first), then copies the pattern space to the hold space.
3) Step 2 repeats for every remaining line, and on the last line the pattern space is printed.

[plain]
$ sed -n '1!G;h;$p' test.txt
line6
line5
line4
line3
line2
line1

This is only an exercise in sed. In practice we can simply use tac:

[plain]
$ tac test.txt
line6
line5
line4
line3
line2
line1

After all, sed has to read every line before it can print anything.

Changing the flow

The sed editor normally runs the commands from the top of the script through to the end. (D is the exception: it forces sed back to the top of the script without reading a new line.)

Branching

The format of the branch command is:

[address]b [label]

As with the exclamation mark, b can be preceded by an address or a range that selects the lines on which the branch is taken, and the label defines where to jump. If there is no label, b jumps to the end of the script.
[plain]
$ sed '2,3b; s/line/Line/; s/Line/lines/' test.txt
lines1
line2
line3
lines4
lines5
lines6

The b in the script above has no label, so on lines 2 and 3 sed jumps straight to the end of the script and skips both substitutions; on every other line both substitutions run. The text-reversing example can also be written with a branch:

[plain]
$ sed -n '1b test; G; :test h; $p' test.txt
line6
line5
line4
line3
line2
line1

This defines a label named test for b. On the first line sed runs h; $p. On every other line it runs G; h; $p.

We can also use b to build a simple loop. Here is an example that removes commas:

[plain]
$ echo "This, is, a, test, to, remove, commas." | sed -n '{:start; s/,//p; b start}'
This is, a, test, to, remove, commas.
This is a, test, to, remove, commas.
This is a test, to, remove, commas.
This is a test to, remove, commas.
This is a test to remove, commas.
This is a test to remove commas.

A label may be defined before the b that uses it. The example is easy to follow: find a comma, delete it, print the line, and jump back to the start of the script to do it again. The problem is that the loop does not stop once all the commas are gone, so we need a way to stop when the job is done. Before repeating, we can check for a comma and loop only if one is left:

[plain]
$ echo "This, is, a, test, to, remove, commas." | sed -n '{:start; s/,//p; /,/b start}'
This is, a, test, to, remove, commas.
This is a, test, to, remove, commas.
This is a test, to, remove, commas.
This is a test to, remove, commas.
This is a test to remove, commas.
This is a test to remove commas.

Now there is no endless loop. That leaves one more question: what if we only want the final result? Then we must not print on every pass, only once the job is finished:

[plain]
$ echo "This, is, a, test, to, remove, commas." | sed -n '{:start; s/,//; /,/b start; p}'
This is a test to remove commas.
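The conditional loop above (a label, some work, and a branch that is taken only while a condition still holds) also combines with N. The following is a minimal sketch, assuming GNU sed; the three sample lines, the label name `a` and the variable name `joined` are made up for illustration.

```shell
# Join every input line into one line, separated by spaces.
#   :a     define the label a
#   N      append the next input line to the pattern space
#   $!ba   if this is not the last line, branch back to a
#   s/\n/ /g   once the loop ends, turn the newlines into spaces
joined=$(printf 'line1\nline2\nline3\n' | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g')
printf '%s\n' "$joined"
```

Here the address $! plays the role that /,/ plays in the comma example: it decides whether the branch is taken, so the loop ends by itself on the last line.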
The test command

The test command is similar to the branch command. The test command (t) jumps to a label based on the result of a substitute command rather than on an address. If no label is given, sed jumps to the end of the script when the test succeeds.

[plain]
$ cat test.txt
line1
line2
line3
line4
line5
line6
$ sed 's/line[45]/Line/; t; s/line[0-9]/line/' test.txt
line
line
line
Line
Line
line

If the first substitution succeeds, the commands after t are skipped; otherwise they run. Here is the comma-removal example again, this time with t:

[plain]
$ echo "This, is, a, test, to, remove, commas." | sed -n '{:start; s/,//; t start; p}'
This is a test to remove commas.

Pattern substitution

The ampersand (&)

Much like a backreference in a regular expression, & stands for the text matched by the pattern of the substitute command.

[plain]
$ cat test.txt
line1
line2
line3
line4
line5
line6
$ sed 's/[0-9]/0&/' test.txt
line01
line02
line03
line04
line05
line06

Substituting individual words

What follows works just like backreferences in Java regular expressions.

[plain]
$ sed 's/\([0-9]\)/0\1/' test.txt
line01
line02
line03
line04
line05
line06

Unlike in Java regular expressions, the parentheses must be escaped to act as a group, which is the opposite of Java, where escaping makes them literal. Next, let's see how to add thousands separators:

[plain]
$ echo "1234567" | sed ':start; s/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/; t start'
1,234,567
$ echo "12345678" | sed ':start; s/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/; t start'
12,345,678
$ echo "123456789" | sed ':start; s/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/; t start'
123,456,789

The code looks rather scary because the parentheses and the braces all have to be escaped. Escaped parentheses form groups, which are referenced with a backslash and a number.
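To see how the two groups divide the number, a single pass of the substitution can be run with each group wrapped in brackets instead of joined by a comma. This is only an illustrative sketch: the bracketed replacement and the variable name `groups` are assumptions, not part of the script above.

```shell
# One pass only (no label, no t): show what \1 and \2 capture.
# The greedy .* takes as many characters as it can, so \2 ends up
# holding the last three digits and \1 everything before them.
groups=$(echo "1234567" | sed 's/\(.*[0-9]\)\([0-9]\{3\}\)/[\1][\2]/')
printf '%s\n' "$groups"
```

This prints [1234][567]. In the real script the first pass therefore produces 1234,567, and the t loop runs the substitution again on what is left in front of the comma.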
Escaped braces act as a quantifier, so \{3\} means exactly three digits. The logic is: starting from the beginning, take the run of digits as the first group and the last three digits as the second group, insert a comma between the two groups, and then loop. Because a comma now sits in front of the last three digits, the next pass can no longer match them, so the second group becomes the 4th, 5th and 6th digits from the end.

Here is how the same thing can be done in Java:

[java]
String reg1 = "\\d(?=(\\d{3})+$)";
System.out.println("123456789".replaceAll(reg1, "$0,"));
System.out.println("12345678".replaceAll(reg1, "$0,"));
System.out.println("1234567".replaceAll(reg1, "$0,"));

Output:
123,456,789
12,345,678
1,234,567

Using sed in scripts

Using wrappers

Wrap a script we wrote earlier so that it is easier to use later:

[plain]
$ cat sed_test
#!/bin/bash
sed -n '1!G;h;$p' "$1"

[plain]
$ cat test.txt
line1
line2
line3
line4
line5
line6
$ sed_test test.txt
line6
line5
line4
line3
line2
line1

Redirecting sed output

By default sed writes its result to STDOUT. In the shell we can use backticks to capture that output in a variable.

Creating sed utilities

Double-spacing lines

[plain]
$ cat test.txt
line1
line2
line3
$ sed '$!G' test.txt
line1

line2

line3

The hold space is empty by default, so appending it to the pattern space with G adds a blank line after each line. The $! keeps the last line from getting one.

Double-spacing files that may contain blank lines

The strategy is to delete the existing blank lines first and then do the same as above.

[plain]
$ sed '/^$/d; $!G' test.txt

Numbering the lines in a file

[plain]
$ cat test.txt
line1
line2
line3
$ sed '=' test.txt | sed 'N; s/\n/ /'
1 line1
2 line2
3 line3

The first sed emits each line number on its own line, and the second replaces the newline between the line number and the line with a space.

Printing the last lines

Printing the last line is very simple:

[plain]
$ sed -n '$p' test.txt

What about printing the last three lines? A "rolling window" is a common way to examine blocks of lines in the pattern space.
Take the last three lines as an example:

[plain]
$ cat test.txt
line1
line2
line3
line4
line5
line6
line7
line8
line9
$ sed ':start; $q; N; 4,$D; b start' test.txt
line7
line8
line9

1. If the current line is the last line, quit.
2. Read the next line.
3. If the current line number is 4 or greater, delete the first line of the pattern space.
4. Repeat.

Here sed reads the first three lines, reads the fourth and deletes the first, reads the fifth and deletes the second, and so on. Note that D runs only when the current line is in the range 4,$. If we want to print every line except the first three, we can do this:

[plain]
$ sed ':start; $q; N; 1,4D; b start' test.txt
line4
line5
line6
line7
line8
line9

Deleting lines

Deleting consecutive blank lines

The key to deleting consecutive blank lines is to find the boundary between a non-empty line and an empty line.

[plain]
$ sed '/./,/^$/!d' test.txt

/./ matches a non-empty line. Lines in the range from a non-empty line through the first empty line after it are kept, and everything else is deleted. Blank lines at the start of the file are deleted as well.

Deleting leading blank lines

[plain]
$ sed '/./,$!d' test.txt

This one is simpler: the lines from the first non-empty line through the last line are kept, and all other lines are deleted.

Deleting trailing blank lines

[plain]
$ sed '{
> :start
> /^\n*$/{$d; N; b start }
> }' test.txt

Note that there are two pairs of braces. The inner pair applies only to the lines its address selects and can be treated as a command group. If the current line is empty, the pattern matches and sed enters the command group: if this is the last line, the pattern space is deleted; otherwise the next line is read into the pattern space and the steps repeat. If the line just read is not empty, the pattern no longer matches, sed does not enter the command group, and it carries on with the next line.

Removing HTML tags

sed has no lazy matching, so we cannot use the .*? form to match tags. The straightforward greedy version below is of course wrong as well:

[plain]
$ sed 's/<.*>//g' test.html

We can simulate lazy matching instead:

[plain]
$ sed 's/<[^>]*>//g' test.html

Finally, to delete the empty lines as well, add one more command:

[plain]
$ sed 's/<[^>]*>//g; /^$/d' test.html
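The difference between the greedy pattern and the simulated lazy one is easy to check on a single line. The following is a minimal sketch; the sample HTML string and the variable names `html`, `greedy` and `simulated` are made up for illustration.

```shell
# One line of HTML with nested tags (illustrative input).
html='<p>Hello <b>world</b></p>'

# Greedy: <.*> runs from the first < to the last > on the line,
# so the text between the tags is removed along with them.
greedy=$(printf '%s\n' "$html" | sed 's/<.*>//g')

# Simulated lazy match: [^>]* cannot cross a >, so each tag is
# matched and removed on its own.
simulated=$(printf '%s\n' "$html" | sed 's/<[^>]*>//g')

printf 'greedy:    [%s]\n' "$greedy"
printf 'simulated: [%s]\n' "$simulated"
```

The greedy version leaves an empty line, while the second version leaves Hello world.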