A small detail about FS in awk
I ran into this while working through Effective awk Programming. The shell is an everyday tool at work, but I am not very familiar with it, so I often fail to understand things deeply or to write down the details I overlooked. After all, reading the English documentation once rarely leaves a lasting impression.
Chapter 3 of Effective awk Programming, "Reading Input Files", mentions an interesting phenomenon in its section "Using Regular Expressions to Separate Fields".
echo " a b c d " | awk '{ print $2 }'
echo " a b c d " | awk 'BEGIN { FS = "[ \t\n]+" } { print $2 }'
Do the two commands print the same thing? Before reading this chapter I assumed they did, and that both would print b. In fact, the first command prints b and the second prints a.
The two outputs differ because of how the default FS works. By default, FS is a single space, which awk treats as a special case: leading and trailing whitespace (spaces, tabs, and newlines) is stripped from the record before it is split, and each run of whitespace counts as one separator. When FS is changed to the regular expression [ \t\n]+, the leading and trailing whitespace is no longer stripped. A leading space is then treated as a separator, so $1 is the empty string and a moves to $2.
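The difference is easier to see when the fields are printed inside brackets. The following is a minimal sketch, assuming the sample input has one leading and one trailing space; the variable names line, default_out, and regex_out are only illustrative.

```shell
# Sample record (assumed): one leading space, one trailing space.
line=' a b c d '

# Default FS (a single space): leading and trailing blanks are ignored.
default_out=$(printf '%s\n' "$line" |
    awk '{ printf "$1=[%s] $2=[%s]\n", $1, $2 }')

# Regex FS: the leading blank counts as a separator, so $1 is empty.
regex_out=$(printf '%s\n' "$line" |
    awk 'BEGIN { FS = "[ \t\n]+" } { printf "$1=[%s] $2=[%s]\n", $1, $2 }')

echo "$default_out"   # $1=[a] $2=[b]
echo "$regex_out"     # $1=[] $2=[a]
```

The empty brackets in the second line are the empty first field that appears in front of the leading separator.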
Another interesting phenomenon is that rebuilding the record also strips its leading and trailing whitespace.
Take an input record with one leading space and two trailing spaces. If the only thing the program does before printing $0 is the seemingly meaningless assignment $2 = $2, the leading space disappears, and so do the two trailing spaces. The reason is that assigning to any field makes awk rebuild $0: it joins $1, $2, ..., $NF with OFS (a single space by default). The fields were found by splitting with FS = " ", which ignores leading and trailing spaces and tabs, so the rebuilt string has no whitespace at either end.
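The rebuild can be reproduced with a short experiment. This is a sketch under the assumption that the record has one leading space and two trailing spaces; the variable names line, before, and after are only illustrative. The brackets make the whitespace at both ends visible.

```shell
# Sample record (assumed): one leading space, two trailing spaces.
line=' a b c d  '

# Untouched record: $0 keeps its original whitespace.
before=$(printf '%s\n' "$line" | awk '{ print "[" $0 "]" }')

# After $2 = $2, awk rebuilds $0 as $1 OFS $2 OFS ... OFS $NF.
after=$(printf '%s\n' "$line" | awk '{ $2 = $2; print "[" $0 "]" }')

echo "$before"   # [ a b c d  ]
echo "$after"    # [a b c d]
```

Because the fields are joined with OFS, setting OFS to another value (for example ":") before the assignment would also replace the separators between fields in the rebuilt record.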