A few days ago, I saw a post on CSDN at http://bbs.csdn.net/topics/390848841. The original poster asked the following question:
Example:
12345
67890
1234567890
123
4567890
How can we convert the above data into the following?
1234567890
1234567890
1234567890
After reading the replies from several netizens, I found the problem quite interesting and learned something from it. Because some of the replies gave only a solution without any explanation, I will explain a few of the answers as I understand them. (My explanations are not necessarily accurate; corrections are welcome.)
The answers I think are wrong
1. First, the original poster gave his own attempt, which he could not get to work:
sed 's/(?!90)\n//g'
(?!90) is meant as a negative assertion: "not 90". The intent seems to be to delete the line break of every line that does not end with 90. This cannot work. The (?!...) syntax is Perl-style and is not supported by sed's regular expressions. More fundamentally, sed works in single-line mode by default: it reads one line at a time into the pattern space with the trailing newline removed, so \n never matches, and when the line is output the newline is added back automatically.
2. One netizen thought the other answers were too complicated and offered the following:
sed 'N;s/\n//g'
This answer uses sed's multi-line mode, but it still does not achieve the goal. Here is the result of running it:
~/Windeal/shell$ sed 'N;s/\n//g' a.txt
1234567890
1234567890123
4567890
We can see that the sed N command joins the second line to the end of the first, and the fourth line to the end of the third, without regard to our actual rule: the next line should be appended only when the current line does not end with 90. As a result, the 123 on the fourth line is appended to the 1234567890 on the third line, and the output is wrong.
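The first failure can be checked directly. The sketch below recreates the sample data in a temporary file (the variable names `input`, `unchanged` and `original` are only for illustration) and runs the substitution without the unsupported look-ahead. Because sed removes the trailing newline before the script runs, the output should be identical to the input.

```shell
# Recreate the sample data from the question in a temporary file.
# "input" is an illustrative name, not something from the original post.
input=$(mktemp)
printf '%s\n' 12345 67890 1234567890 123 4567890 > "$input"

# In single-line mode the pattern space never contains a newline,
# so this substitution has nothing to match.
unchanged=$(sed 's/\n//g' "$input")
original=$(cat "$input")

if [ "$unchanged" = "$original" ]; then
    printf '%s\n' 'output is identical to the input'
fi

rm -f "$input"
```

Any approach that wants to delete a newline therefore has to bring a second line into the pattern space first, which is what the N command does.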
The answers I think are right
Let's look at two answers I think are right:
1.
sed -e '/^/{:loop /90$/!{N;b loop};s/\n//g}' test.txt > t2.txt
A form that is easier to understand is:
sed '{:myloop /90$/!{N;b myloop};s/\n//g}' a.txt
I did not understand the purpose of /^/ at first; it simply matches every line, so it can be dropped. In this command, myloop is a user-defined label, similar to the labels used by goto in programming languages. N appends the next input line to the pattern space (multi-line mode), and b means branch (equivalent to goto). sed reads one line. If the line does not end with 90, the test /90$/! succeeds and sed enters {N;b myloop}: N appends the next line, then b myloop jumps back to the label, where the test for a trailing 90 is made again. This repeats until the pattern space ends with 90. At that point the pattern space is complete and sed moves on to the next command, the substitution s/\n//g, which replaces every newline in the pattern space with nothing, joining the lines into one. In this way, we achieve our goal.
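The same loop can also be written with one command per line, which makes the control flow easier to follow and does not depend on how a particular sed parses a label that is followed by other commands on the same line. This is a sketch run against a temporary copy of the sample data; the variable names `input` and `joined` are illustrative.

```shell
# Recreate the sample data from the question in a temporary file.
input=$(mktemp)
printf '%s\n' 12345 67890 1234567890 123 4567890 > "$input"

joined=$(sed '
# Label marking the top of the loop.
:myloop
# If the pattern space does not yet end in 90 ...
/90$/!{
# ... append the next input line, then jump back to the label.
N
b myloop
}
# The pattern space now ends in 90: delete the embedded newlines.
s/\n//g
' "$input")

printf '%s\n' "$joined"
rm -f "$input"
```

With the sample data this should print three lines, each 1234567890. Note that the sample ends on a line that ends with 90; if the last line did not, N would reach the end of the input, and what sed does then differs between implementations.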
2.
awk '{if($0~/90$/){print}else{printf("%s",$0)}}' a.txt
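This answer reads each line, which awk makes available as $0 (here each line is really a single field), and then tests if($0~/90$/) to determine whether the line ends with 90. If it does, print outputs the line followed by a newline. If not, printf outputs the line without a trailing newline, so the next line continues on the same output line.
Note the difference between print and printf: print appends a newline (the output record separator) automatically, while printf outputs only what the format string specifies.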
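To make the difference concrete, here is another way to write the same awk logic. It is offered as a sketch, not as one of the answers from the original thread: every line is written with printf (no newline), and print is used only to emit the newline after a line that ends in 90. The variable names `input` and `joined` are illustrative.

```shell
# Recreate the sample data from the question in a temporary file.
input=$(mktemp)
printf '%s\n' 12345 67890 1234567890 123 4567890 > "$input"

# First rule: write every line without a trailing newline.
# Second rule: after a line ending in 90, print an empty string,
# which outputs just the newline that ends the joined line.
joined=$(awk '{ printf "%s", $0 } /90$/ { print "" }' "$input")

printf '%s\n' "$joined"
rm -f "$input"
```

With the sample data this should again produce three lines, each 1234567890.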
Awk and sed: An Example of Multiline Processing