There are 10 million text messages (SMS), stored in a text file with one message per line. The file contains duplicates.
Find the 10 most frequently repeated messages within 5 minutes.
Analysis:
The usual approach is to sort the messages first and then scan the sorted list to find the 10 most repeated ones. However, sorting alone costs at least O(n log n).
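As a point of comparison, the sort-based approach can be sketched as follows. This is a minimal sketch, not code from the original article: the function name topRepeatedBySorting and the in-memory vector of lines are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original article): returns up to k
// (message, count) pairs, most frequent first, using the sort-based approach.
std::vector<std::pair<std::string, int> >
topRepeatedBySorting(std::vector<std::string> lines, std::size_t k) {
    // Sorting places identical messages next to each other: O(n log n).
    std::sort(lines.begin(), lines.end());

    // One linear scan collapses each run of equal lines into a count.
    std::vector<std::pair<std::string, int> > counts;
    for (std::size_t i = 0; i < lines.size();) {
        std::size_t j = i;
        while (j < lines.size() && lines[j] == lines[i]) ++j;
        counts.push_back(std::make_pair(lines[i], static_cast<int>(j - i)));
        i = j;
    }

    // Rank the distinct messages by count; ties are broken by message text.
    k = std::min(k, counts.size());
    std::partial_sort(counts.begin(), counts.begin() + k, counts.end(),
        [](const std::pair<std::string, int>& x,
           const std::pair<std::string, int>& y) {
            if (x.second != y.second) return x.second > y.second;
            return x.first < y.first;
        });
    counts.erase(counts.begin() + k, counts.end());
    return counts;
}
```

The initial sort dominates the running time, which is the cost the hash-based approach avoids.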
Instead, you can use a hash table, hash_map<string, int>: read the 10 million messages sequentially, insert each one into the hash table, and count its repetitions, while maintaining a table of at most 10 messages with the highest counts.
A single traversal of the hash table is then enough to find the top 10, so the expected complexity of the algorithm is O(n).
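Before the full program, the idea can be sketched in standard C++. Note the substitutions, which are assumptions and not the article's own code: std::unordered_map stands in for the GNU hash_map, a min-heap of size k stands in for the fixed 10-slot table, and the function name topRepeatedByHashing is hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: counts every message in one pass, then keeps only
// the k largest counts in a min-heap. Returns (count, message) pairs,
// largest count first.
std::vector<std::pair<int, std::string> >
topRepeatedByHashing(const std::vector<std::string>& lines, std::size_t k) {
    // Pass 1: expected O(n) counting with a hash table.
    std::unordered_map<std::string, int> counts;
    for (const std::string& line : lines) ++counts[line];

    // Pass 2: the heap top is always the smallest of the kept counts.
    typedef std::pair<int, std::string> Entry;  // (count, message)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry> > heap;
    for (const auto& kv : counts) {
        if (heap.size() < k) {
            heap.push(Entry(kv.second, kv.first));
        } else if (k > 0 && kv.second > heap.top().first) {
            heap.pop();  // evict the current minimum
            heap.push(Entry(kv.second, kv.first));
        }
    }

    // Popping yields ascending counts, so fill the result from the back.
    std::vector<Entry> result(heap.size());
    for (std::size_t i = result.size(); i-- > 0;) {
        result[i] = heap.top();
        heap.pop();
    }
    return result;
}
```

With k fixed at 10, each heap operation costs a constant amount of work, so the second pass stays linear in the number of distinct messages. Messages tied with the smallest kept count are not guaranteed a slot, since a slot is replaced only by a strictly larger count.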
Implemented as follows:
#include <iostream>
#include <string>
#include <stdint.h>
#include <ext/hash_map>  // GNU extension (deprecated); std::unordered_map is the standard equivalent
using namespace std;

#define HASH __gnu_cxx

// Hash functor for std::string keys (multiplicative hash with a changing multiplier).
struct StrHash {
    uint64_t operator () (const std::string& str) const {
        uint32_t b = 378551;
        uint32_t a = 63689;
        uint64_t hash = 0;
        for (size_t i = 0; i < str.size(); i++) {
            hash = hash * a + str[i];
            a = a * b;
        }
        return hash;
    }
    // Variant that mixes an extra field into the hash (not used in main).
    uint64_t operator () (const std::string& str, uint32_t field) const {
        uint32_t b = 378551;
        uint32_t a = 63689;
        uint64_t hash = 0;
        for (size_t i = 0; i < str.size(); i++) {
            hash = hash * a + str[i];
            a = a * b;
        }
        hash = (hash << 8) + field;
        return hash;
    }
};

// One slot of the top-10 table: a message and its repetition count.
struct NameNum {
    string name;
    int num;
    NameNum() : name(""), num(0) {}
};

int main() {
    HASH::hash_map<string, int, StrHash> names;
    HASH::hash_map<string, int, StrHash>::iterator it;
    NameNum namenum[10];
    string l = "";

    // Pass 1: count how many times each message occurs.
    while (getline(cin, l)) {
        it = names.find(l);
        if (it != names.end()) {
            names[l]++;
        } else {
            names[l] = 1;
        }
    }

    // Pass 2: keep the 10 largest counts; min/minpos track the smallest slot.
    int i = 0;
    int min = 0;
    int minpos = 0;
    for (it = names.begin(); it != names.end(); ++it) {
        if (i < 10) {
            // The table is not full yet: take the entry unconditionally.
            namenum[i].name = it->first;
            namenum[i].num = it->second;
            if (i == 0 || it->second < min) {
                min = it->second;
                minpos = i;
            }
        } else if (it->second > min) {
            // Replace the current minimum, then rescan for the new minimum.
            namenum[minpos].name = it->first;
            namenum[minpos].num = it->second;
            min = namenum[0].num;
            minpos = 0;
            int k = 1;
            while (k < 10) {
                if (namenum[k].num < min) {
                    min = namenum[k].num;
                    minpos = k;
                }
                k++;
            }
        }
        i++;
    }

    // Print the table (in table order, not sorted by count).
    int count = (i < 10) ? i : 10;
    i = 0;
    cout << "MaxLength (string,num):" << endl;
    while (i < count) {
        cout << "(" << namenum[i].name.c_str() << "," << namenum[i].num << ")" << endl;
        i++;
    }
    return 0;
}
Compile with g++ as follows:
g++ main.cpp -o main
The SMS text file is msg.txt.
Run: ./main < msg.txt
The output is:
MaxLength (string,num):
(Little Mother's Square, 4)
(Agricultural machinery Parts maintenance, 5)
(Red-Sheng Supermarket, 6)
(Dragon Creek Hotel, 8)
(Zhang Kee Dumpling Hall, 3)
(Friendship Inn, 3)
(Pearl Communication, 3)
(Jinyuan Hotel, 3)
(Dongting Natural Spring, 2)
(Qingyuan Supermarket, 3)