Recently, because of a data migration, some user information had to be re-verified. The data set may be large, but what finally needs confirming is something like each user ID and its corresponding number of points. This leaves us with two text files: a (the old data) and b (the new data). For example:
1101 123
1102 111
1103 145
1104 the
This is a.txt.
b.txt is as follows:
1101 123
1102 the
1103 154
1104 the
These sample files are deliberately the simplest possible illustration of the method. Real data may be much more complex; for instance, an ID that appears in b.txt may not appear in a.txt. The aim here is only to show an advanced use of awk associative arrays in a way that is easy to follow.
We can see that in b.txt the points for users 1102 and 1103 differ from their previous values. How do we handle this in the shell? This is where awk, a powerful text-processing tool, comes in:
Extract the rows whose second column is the same in both files and print the two point values side by side:
awk 'NR==FNR{a[$1]=$2} NR!=FNR{if ($2==a[$1]) print $0,a[$1]}' a.txt b.txt
This produces the following output:
1101 123 123
1104 the the
Here NR and FNR are built-in awk variables. NR is the total number of lines read so far across all input files, and FNR is the number of lines read so far in the current file. While awk is reading the first file the two are equal, so NR==FNR selects the lines of a.txt. Once awk moves on to the second file, FNR restarts at 1 while NR keeps counting, so the lines of b.txt can be selected with either NR!=FNR or NR>FNR. For each line of the first file, a[$1]=$2 stores the second field (the points) in the array a under the key given by the first field (the ID). When the second file is processed, the array is already filled, so each line can be compared against it to decide what to print.
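A minimal sketch of how the two counters behave, followed by a variant of the matching command. The scratch directory, the variable names, and the point values 100 (for 1104) and 120 (for 1102 in b.txt) are assumptions made for illustration, not values from the original data. The variant uses next and an explicit ($1 in a) test, which is one common way to write this and not the only one.

```shell
#!/bin/sh
# Build two small sample files in a scratch directory.
# The values 100 (ID 1104) and 120 (ID 1102 in b.txt) are assumed stand-ins.
tmpdir=$(mktemp -d)
printf '%s\n' '1101 123' '1102 111' '1103 145' '1104 100' > "$tmpdir/a.txt"
printf '%s\n' '1101 123' '1102 120' '1103 154' '1104 100' > "$tmpdir/b.txt"

# Print the file name, NR and FNR for every line read.
# NR keeps counting across files; FNR restarts at 1 when b.txt begins.
counters=$(cd "$tmpdir" && awk '{print FILENAME, NR, FNR}' a.txt b.txt)
printf '%s\n' "$counters"

# First file: store points by ID and skip to the next line.
# Second file: print rows whose ID is known and whose points are unchanged.
same=$(cd "$tmpdir" && awk 'NR==FNR{a[$1]=$2; next} ($1 in a) && $2==a[$1]{print $0, a[$1]}' a.txt b.txt)
printf '%s\n' "$same"

rm -rf "$tmpdir"
```

With next in the first block, the second pattern only ever sees lines from b.txt, so the NR!=FNR test is no longer needed.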
Rows whose second column holds a different number of points, and other similar cases, can be extracted in the same way by changing the condition.
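As a rough sketch of that idea, the condition can be inverted to list users whose points changed, and an ($1 in old) test can flag IDs that exist only in the new file. The sample values 100 and 120, the extra row 1105 90, the array name old, and the "new" marker are all hypothetical and chosen only for this example.

```shell
#!/bin/sh
# Sample files in a scratch directory; 100, 120 and the 1105 row are assumed values.
tmpdir=$(mktemp -d)
printf '%s\n' '1101 123' '1102 111' '1103 145' '1104 100' > "$tmpdir/a.txt"
printf '%s\n' '1101 123' '1102 120' '1103 154' '1104 100' '1105 90' > "$tmpdir/b.txt"

changed=$(cd "$tmpdir" && awk '
    NR==FNR       { old[$1] = $2; next }          # a.txt: remember old points by ID
    !($1 in old)  { print $1, "new", $2; next }   # ID present only in b.txt
    $2 != old[$1] { print $1, old[$1], $2 }       # points differ: ID, old, new
' a.txt b.txt)
printf '%s\n' "$changed"

rm -rf "$tmpdir"
```

Testing ($1 in old) before comparing avoids treating a missing ID as a changed value, and it does not create an empty array entry the way a bare old[$1] reference would.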
Advanced use of awk associative arrays in the shell