1. For a text file with up to 1 million lines, retrieve the 10 values that are duplicated the most times.
Example text:
098
123
234
789
......
234
678
654
123
------ Solution --------------------
Import the data into a database table and let SQL do the statistics. Give it a try.
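One way to sketch the database suggestion without setting up a server is an in-memory SQLite table via PDO (this assumes the pdo_sqlite extension is available; the table name `nums` and the inline sample values are made up for illustration):

```php
<?php
// Sketch of the "import into a table and use SQL" approach,
// using an in-memory SQLite database through PDO.
$pdo = new PDO('sqlite::memory:');
$pdo->exec('CREATE TABLE nums (val TEXT)');

// In practice the values would be loaded from the text file;
// a small inline sample is inserted here instead.
$stmt = $pdo->prepare('INSERT INTO nums (val) VALUES (?)');
foreach (['098', '123', '234', '789', '234', '678', '654', '123'] as $v) {
    $stmt->execute([$v]);
}

// Let SQL do the work: group, count, sort descending, take the top 10.
$rows = $pdo->query(
    'SELECT val, COUNT(*) AS cnt
       FROM nums
   GROUP BY val
   ORDER BY cnt DESC
      LIMIT 10'
)->fetchAll(PDO::FETCH_ASSOC);
print_r($rows);
```

For a real 1-million-line file you would bulk-load inside a transaction and could add an index on `val`, but the `GROUP BY ... ORDER BY ... LIMIT 10` query stays the same.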
------ Solution --------------------
explode // read the file and split it into an array
array_count_values // count how many times each value occurs
arsort // sort by count in descending order and display the result
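The three functions above chain together directly. A minimal sketch on a small inline sample (in the real case the text would come from the file, e.g. via file_get_contents):

```php
<?php
// explode / array_count_values / arsort pipeline on a small sample.
$text = "098\n123\n234\n789\n234\n678\n654\n123";

$values = explode("\n", $text);           // split into an array of lines
$counts = array_count_values($values);    // value => number of occurrences
arsort($counts);                          // sort by count, descending
$top = array_slice($counts, 0, 10, true); // keep the 10 most frequent
print_r($top);
```

Note that array_slice needs its fourth argument (`preserve_keys`) set to true, otherwise the integer keys produced by numeric values would be renumbered.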
------ Solution --------------------
You can process the text in blocks and keep a running tally of the results. If you read it all in at once, the memory probably won't cope...
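The block-wise idea can be sketched as follows: read fixed-size chunks, carry any incomplete trailing line over to the next chunk, and keep only the running counts in memory. The chunk size of 4 KB is arbitrary, and a small sample file is created here so the sketch is self-contained:

```php
<?php
// Block-wise counting: only one chunk plus the count table is ever
// in memory, not the whole file.
$path = tempnam(sys_get_temp_dir(), 'dup');
file_put_contents($path, "098\n123\n234\n789\n234\n678\n654\n123\n");

$counts = [];
$carry  = '';                           // incomplete line from the previous chunk
$fp = fopen($path, 'r');
while (!feof($fp)) {
    $chunk = $carry . fread($fp, 4096); // read a 4 KB block
    $lines = explode("\n", $chunk);
    $carry = array_pop($lines);         // last piece may be cut mid-line
    foreach ($lines as $line) {
        $line = rtrim($line, "\r");
        if ($line !== '') {
            $counts[$line] = ($counts[$line] ?? 0) + 1;
        }
    }
}
if ($carry !== '') {                    // count the final line, if any
    $counts[$carry] = ($counts[$carry] ?? 0) + 1;
}
fclose($fp);
unlink($path);

arsort($counts);                        // most frequent first
print_r(array_slice($counts, 0, 10, true));
```

Memory stays proportional to the number of *distinct* values, which is the only part that cannot be chunked away without a second pass or external sorting.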
------ Solution --------------------
$fp = fopen('file', 'r');
while ($buf = fgets($fp)) {
    $res[$buf]++;
}
fclose($fp);
arsort($res);
$res = array_keys(array_slice($res, 0, 10));
print_r($res);
If half of the 1 million lines are unique, there is not much difference from the following algorithm:
$a = file('file');
$res = array_count_values($a);
arsort($res);
$res = array_keys(array_slice($res, 0, 10));
print_r($res);