Our company is mainly in the text messaging business, so we deal with mobile phone numbers a lot and get all kinds of odd requests. Recently a director handed me a peculiar one: pick out the mobile phone numbers that appear in both of two files. My programming skills and my Excel skills are limited, so I had to find another way to solve it. First, each file has several fields, but both are structured data in the following format:
15994710001,2016/11/3 0:24,53100010
15994710001,2016/11/3 0:24,53100010
15001313373,2016/11/3 3:39,53100010
13937713309,2016/11/3 6:16,53100010
13758943333,2016/11/3 7:19,53100010
13868044333,2016/11/3 8:33,53100010
13500732333,2016/11/3 10:29,53100010
13523072333,2016/11/3 10:30,53100010
15138132777,2016/11/3 10:31,53100010
13960985779,2016/11/3 10:45,53100010
This file has more than 4,000 lines. File 2 has more fields. Part of the content shown here is garbled, which also helps protect personal privacy.
"311-sd10658","2114781676479382330","13703774555","11λp50rit","1","2016/11/3 10:07:43","2016/11/3 10:07:41","0","DELIVRD"
"311-sd10658","2114781676479382330","15920510111","11λp50rit","1","2016/11/3 10:07:43","2016/11/3 10:07:41","0","DELIVRD"
"311-sd10658","2114781676479382330","18319609333","11λp50rit","1","2016/11/3 10:07:43","2016/11/3 10:07:41","0","DELIVRD"
"311-sd10658","2114781676479382330","15221090555","11λp50rit","1","2016/11/3 10:07:43","2016/11/3 10:07:41","0","DELIVRD"
"311-sd10658","2114781676479382330","13905879555","11λp50rit","1","2016/11/3 10:07:43","2016/11/3 10:07:41","0","DELIVRD"
"311-sd10658","2114781676479382330","13818586777","11λp50rit","1","2016/11/3 10:07:43","2016/11/3 10:07:41","0","DELIVRD"
"311-sd10658","2114781676479382330","13916387773","11λp50rit","1","2016/11/3 10:07:43","2016/11/3 10:07:41","0","DELIVRD"
"311-sd10658","2114781676479382330","13882133333","11λp50rit","1","2016/11/3 10:07:43","2016/11/3 10:07:41","0","DELIVRD"
"311-sd10658","2114781676479382330","18200980999","11λp50rit","1","2016/11/3 10:07:43","2016/11/3 10:07:41","0","DELIVRD"
The approach:
Since I only want the numbers that the two files share, I handle this on Linux with a few text-processing tools: first reduce each file to a file that contains only mobile phone numbers, then do the rest of the processing.
You can extract the relevant column with cut or awk. I am not familiar with awk, so I used cut; pay attention to the delimiter and to which column you need.
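As a rough sketch of this extraction step: the snippet below assumes file 1 is comma-separated with the number in column 1, and file 2 is quoted CSV with the number in column 3. The column positions, the sample rows, and the file names file1.csv, file2.csv, phones1.txt and phones2.txt are assumptions made for illustration, not taken from the real files.

```shell
# Hypothetical sample inputs modelled on the two formats shown above.
printf '%s\n' \
  '15994710001,2016/11/3 0:24,53100010' \
  '15001313373,2016/11/3 3:39,53100010' > file1.csv
printf '%s\n' \
  '"311-sd10658","2114781676479382330","13703774555","11λp50rit","1","2016/11/3 10:07:43","2016/11/3 10:07:41","0","DELIVRD"' \
  '"311-sd10658","2114781676479382330","15001313373","11λp50rit","1","2016/11/3 10:07:43","2016/11/3 10:07:41","0","DELIVRD"' > file2.csv

# File 1: comma delimiter, number assumed to be in column 1.
# sort -u also drops repeated numbers.
cut -d, -f1 file1.csv | sort -u > phones1.txt

# File 2: comma delimiter, number assumed to be in column 3.
# tr -d removes the surrounding double quotes.
cut -d, -f3 file2.csv | tr -d '"' | sort -u > phones2.txt
```

After this, phones1.txt and phones2.txt each hold one number per line, which is the shape the later comparison needs. Check the column number against your own data before relying on it.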
You can then compare the two files with grep. diff is another option to try, but the commands below use grep.
1. Find the lines that the two text files have in common
grep -Ff file1 file2
2. Find the lines of file1 that do not appear in file2 (the lines that differ between the two files)
grep -vFf file2 file1
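To see what the two commands do, here is a small sketch on made-up data. The file names phones_a.txt, phones_b.txt, common.txt and only_in_a.txt and the numbers in them are invented for illustration. -F treats every pattern as a fixed string rather than a regular expression, -f reads the patterns from a file (one per line), and -v prints the lines that do not match.

```shell
# Made-up files with one number per line, for illustration only.
printf '%s\n' 15994710001 15001313373 13937713309 > phones_a.txt
printf '%s\n' 15001313373 13703774555 15994710001 > phones_b.txt

# Lines of phones_b.txt that match some line of phones_a.txt:
# the numbers the two files have in common.
grep -Ff phones_a.txt phones_b.txt > common.txt

# Lines of phones_a.txt that match no line of phones_b.txt:
# the numbers that appear only in phones_a.txt.
grep -vFf phones_b.txt phones_a.txt > only_in_a.txt

cat common.txt
cat only_in_a.txt
```

Without -x, a pattern also counts as a match when it is only part of a longer line. If the files still contain other columns, adding -x (grep -Fxf) or extracting the number column first should avoid accidental hits.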
This article is from the "Keep Dreaming" blog; please keep this source link when reposting: http://dreamlinux.blog.51cto.com/9079323/1869844
A record of one data-processing task