Chapter 22: Linux file comparison with the comm command: intersection, symmetric difference, and set difference of text files
The comm command
comm compares two files line by line. Its options adjust the output so that it can perform intersection, symmetric difference, and set difference operations.
Intersection: prints the lines that are common to both files.
Symmetric difference: prints the lines that appear in only one of the two files.
Set difference: prints the lines that appear in one file but not in the other specified file.
Syntax
comm (options) (parameters)
Options
-1: do not display lines that appear only in the first file;
-2: do not display lines that appear only in the second file;
-3: do not display lines that appear in both files.
Parameters
File 1 and file 2: the two files to compare. comm expects both files to be sorted.
Example
[[email protected] comm]# cat aaa.txt
aaa
bbb
ccc
ddd
eee
111
222
[[email protected] comm]# cat bbb.txt
bbb
ccc
aaa
hhh
ttt
jjj
[[email protected] comm]# comm aaa.txt bbb.txt
aaa
                bbb
                ccc
comm: file 2 is not in sorted order
        aaa
ddd
eee
comm: file 1 is not in sorted order
111
222
        hhh
        ttt
        jjj
First output column: lines that appear only in aaa.txt.
Second output column: lines that appear only in bbb.txt.
Third output column: lines that appear in both aaa.txt and bbb.txt. The columns are separated by a tab character (\t).
comm: file 1 is not in sorted order: this warning means that the lines of the file are not in sorted order. comm expects sorted input, which is also why aaa, although present in both files, shows up once in each of the first two columns instead of in the third. For this demonstration the warning can be ignored.
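The effect of sorting can be checked with a small sketch. It recreates the two sample files in a scratch directory (the directory, the .sorted file names, and the variable names are illustrative assumptions, not part of the original example), sorts both, and then runs comm. LC_ALL=C is set so that sort and comm agree on plain byte order.

```shell
# Scratch directory so that no existing files are touched (hypothetical paths).
workdir=$(mktemp -d)

# Recreate the sample files from the example.
printf '%s\n' aaa bbb ccc ddd eee 111 222 > "$workdir/aaa.txt"
printf '%s\n' bbb ccc aaa hhh ttt jjj > "$workdir/bbb.txt"

# Sort both files with the same collation that comm will use.
LC_ALL=C sort "$workdir/aaa.txt" > "$workdir/aaa.sorted"
LC_ALL=C sort "$workdir/bbb.txt" > "$workdir/bbb.sorted"

# Three-column comparison of the sorted copies; no warnings are expected.
sorted_output=$(LC_ALL=C comm "$workdir/aaa.sorted" "$workdir/bbb.sorted")
printf '%s\n' "$sorted_output"

rm -r "$workdir"
```

With sorted input, aaa should land in the third column together with bbb and ccc, and the "not in sorted order" warnings should disappear.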
Intersection:
To print the intersection of two files, you need to delete the first and second columns:
[[email protected] comm]# comm aaa.txt bbb.txt -1 -2
bbb
ccc
Symmetric difference:
To print the lines that are not shared by the two files, you need to delete the third column:
[[email protected] comm]# comm aaa.txt bbb.txt -3
aaa
        aaa
ddd
eee
111
222
        hhh
        ttt
        jjj
[[email protected] comm]# comm aaa.txt bbb.txt -3 | sed 's/^\t//'
comm: file 2 is not in sorted order
comm: file 1 is not in sorted order
aaa
aaa
ddd
eee
111
222
hhh
ttt
jjj
sed 's/^\t//' deletes the tab (\t) at the start of each line, which merges the two columns into one.
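The \t escape in a sed pattern is not understood by every sed implementation. A possible alternative, sketched here on sorted copies of the sample data, is to delete the tabs with tr. This assumes that the lines themselves contain no tab characters; the scratch directory and variable names are illustrative.

```shell
# Scratch directory with sorted copies of the sample data (hypothetical paths).
workdir=$(mktemp -d)
printf '%s\n' aaa bbb ccc ddd eee 111 222 | LC_ALL=C sort > "$workdir/aaa.sorted"
printf '%s\n' bbb ccc aaa hhh ttt jjj | LC_ALL=C sort > "$workdir/bbb.sorted"

# -3 drops the shared column; tr then deletes the tab that marks column 2,
# leaving every remaining line in a single column.
sym_diff=$(LC_ALL=C comm -3 "$workdir/aaa.sorted" "$workdir/bbb.sorted" | tr -d '\t')
printf '%s\n' "$sym_diff"

rm -r "$workdir"
```

Because the input is sorted, aaa is recognized as common to both files and no longer appears in the result.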
Set difference
By removing the unwanted columns, you get the set difference for aaa.txt and for bbb.txt:
Set difference for aaa.txt (lines only in aaa.txt):
[[email protected] comm]# comm aaa.txt bbb.txt -2 -3
aaa
ddd
eee
111
222
Set difference for bbb.txt (lines only in bbb.txt):
[[email protected] comm]# comm aaa.txt bbb.txt -1 -3
aaa
hhh
ttt
jjj
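As a rough cross-check, the intersection and both set differences can be computed on sorted copies of the same data. The sketch below places the options before the file names and combines them (-12, -23, -13), which is the more portable form; the scratch directory and the variable names are assumptions made for illustration.

```shell
# Scratch directory with sorted copies of the sample data (hypothetical paths).
workdir=$(mktemp -d)
printf '%s\n' aaa bbb ccc ddd eee 111 222 | LC_ALL=C sort > "$workdir/aaa.sorted"
printf '%s\n' bbb ccc aaa hhh ttt jjj | LC_ALL=C sort > "$workdir/bbb.sorted"

# Intersection: suppress columns 1 and 2, keep only the shared lines.
common=$(LC_ALL=C comm -12 "$workdir/aaa.sorted" "$workdir/bbb.sorted")

# Set difference for the first file: suppress columns 2 and 3.
only_first=$(LC_ALL=C comm -23 "$workdir/aaa.sorted" "$workdir/bbb.sorted")

# Set difference for the second file: suppress columns 1 and 3.
only_second=$(LC_ALL=C comm -13 "$workdir/aaa.sorted" "$workdir/bbb.sorted")

printf 'in both files:\n%s\n' "$common"
printf 'only in the first file:\n%s\n' "$only_first"
printf 'only in the second file:\n%s\n' "$only_second"

rm -r "$workdir"
```

On sorted input, aaa moves into the intersection and drops out of both differences, unlike in the unsorted runs above.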