A quick search online turned up an approach that is easier than using awk.
Given two files, A.txt and B.txt, where each line is a record (assuming no duplicates within a file), we want the intersection, union, difference, and symmetric difference of the two sets, with each item appearing only once in the output. The intersection is the set of records that appear in both files. The union is the set of records that appear in either file. The difference (A-B) is the set of records that appear in A but not in B. The symmetric difference is the set of records that appear in exactly one of the two files.
Suppose A.txt contains three lines: A, C, B. Suppose B.txt contains four lines: D, E, C, B.
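To follow along, the two sample files can be created as below. This is a minimal sketch that assumes the records are the uppercase letters shown in the outputs and that writing A.txt and B.txt into the current directory is acceptable.

```shell
# Create the two sample files, one record per line.
printf '%s\n' A C B > A.txt
printf '%s\n' D E C B > B.txt
```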
Intersection: sort the two files together and output only the items that appear more than once:
$ sort A.txt B.txt | uniq -d
B
C
Union: sort the two files together and output each duplicated item only once:
$ sort A.txt B.txt | uniq
A
B
C
D
E
Difference (A-B): sort A together with two copies of B, so that every item of B appears at least twice, then output only the items that appear exactly once:
$ sort A.txt B.txt B.txt | uniq -u
A
Symmetric difference: sort the two files together and output only the items that appear exactly once:
$ sort A.txt B.txt | uniq -u
A
D
E
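The recipes above depend on the assumption that neither file contains duplicate lines. If that cannot be guaranteed, one option is to deduplicate each file with sort -u before counting. The following is only a sketch: the function names (both_unique, set_union, set_inter, set_symdiff, set_diff) are hypothetical and not part of the original post.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the original post).
# Each takes two file names and prints one record per line, sorted.

# Print the unique lines of the first file, then those of the second.
both_unique() { sort -u "$1"; sort -u "$2"; }

# Union: every record that appears in either file, once.
set_union()   { sort -u "$1" "$2"; }
# Intersection: records that appear in both deduplicated files.
set_inter()   { both_unique "$1" "$2" | sort | uniq -d; }
# Symmetric difference: records that appear in exactly one file.
set_symdiff() { both_unique "$1" "$2" | sort | uniq -u; }
# A-B: B is counted twice, so none of its records can appear exactly once.
set_diff()    { { sort -u "$1"; cat "$2" "$2"; } | sort | uniq -u; }
```

With the sample files, set_diff A.txt B.txt prints A, while set_diff B.txt A.txt prints D and E, which shows that the difference depends on the order of the arguments.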
Reposted from: http://blog.csdn.net/yinxusen/article/details/7450213