The problem is as follows:
Http://topic.csdn.net/u/20080604/16/95508149-b7a2-4f90-8797-b8423dae36fc.html
Find the 10 largest floating-point numbers among 1 billion floating-point numbers, and write a high-performance algorithm to do it.
Here is the answer I posted on the forum: "
A very familiar type of problem. Use std::map as a filter. The complexity is O(n * log(k)), where n is 10^9 and k is 10.
The invariant maintained throughout is aMap.size() <= k. The implementation can be written from there."
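For reference, the quoted forum answer could be sketched roughly as follows. This is only an illustration under assumptions: the function name topKWithMap, the use of a std::vector as input, and the value-to-count layout of the map are not from the original thread, which only names std::map and the invariant aMap.size() <= k.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper (not from the original post): keeps the k largest
// values seen so far in a std::map from value to occurrence count.
std::vector<float> topKWithMap(const std::vector<float>& input, std::size_t k)
{
    std::map<float, std::size_t> aMap;  // value -> number of copies kept
    std::size_t kept = 0;               // total values held; kept <= k,
                                        // which implies aMap.size() <= k

    for (float t : input) {
        if (kept < k) {
            // Still filling up: admit every value.
            ++aMap[t];
            ++kept;
        } else if (k > 0 && aMap.begin()->first < t) {
            // t beats the current minimum: evict one copy of it, admit t.
            auto smallest = aMap.begin();
            if (--smallest->second == 0)
                aMap.erase(smallest);
            ++aMap[t];
        }
    }

    // Expand the counts back into a flat list, in ascending order.
    std::vector<float> result;
    for (const auto& entry : aMap)
        result.insert(result.end(), entry.second, entry.first);
    return result;
}
```

Calling topKWithMap on the values 3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5 with k = 3 yields 5, 6, 9. Note that each map node carries a counter next to the value, which is extra storage that a set-based version does not need.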
However, after a while I felt that this answer was flawed, even though its general direction was right.
So I spent quite a lot of time implementing my own answer. The final implementation may surprise you:
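#include <cstdlib>   // rand
#include <iomanip>
#include <iostream>
#include <set>
using namespace std;

// Stand-in for the real input: produces the next value to examine.
float genNext()
{
    return rand() * 1.0 * rand() + rand();
}

int main()
{
    // Holds the 10 largest values seen so far; *begin() is the smallest of them.
    multiset<float> kmax;
    for (int ij = 0; ij < 10; ++ij)
        kmax.insert(genNext());

    // 10 * 100,000,000 = 1 billion values, written as two nested int loops.
    for (int j = 0; j < 10; ++j)
        for (int i = 0; i < 100000000; ++i) {
            float t = genNext();
            // Only a value larger than the current minimum can enter the top 10.
            if (*(kmax.begin()) < t) {
                kmax.insert(t);
                kmax.erase(kmax.begin());  // drop the old minimum; size stays 10
            }
        }

    // Print the result in ascending order.
    multiset<float>::iterator iter = kmax.begin(), end = kmax.end();
    for (; iter != end; ++iter)
        cout << *iter << endl;
    return 0;
}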
I used to do ACM algorithm problems as after-dinner exercise, so coming up with this algorithm took only the blink of an eye.
However, while implementing this small idea, problems came up one after another, and each had to be dealt with:
- Using a map wastes space unnecessarily, even if it is only O(k) with k = 10.
- The input may contain duplicate values, so multiset is the correct container.
- Holding 1 billion inputs in memory is not practical, and generating a test file of several GB is even more absurd, so a generator is used instead: genNext().
- rand() is a time-consuming operation, so genNext() led me to misjudge the algorithm's expected running time.
- 1 billion is a really annoying order of magnitude, so the loop is written in two parts using nested int loops.
- Because of that misjudgment, I also spent time on an implementation using vector plus std::push_heap and std::pop_heap.
- And all the other details...
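The vector plus std::push_heap / std::pop_heap variant mentioned in the list is not shown in this post. A minimal sketch of what such a version might look like is given below; the function name topKWithHeap and the std::vector input are assumptions made for illustration, and this is not the code that was actually written at the time.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not from the original post): keeps the k largest
// values in a min-heap stored inside a std::vector.
std::vector<float> topKWithHeap(const std::vector<float>& input, std::size_t k)
{
    std::vector<float> heap;
    if (k == 0)
        return heap;
    heap.reserve(k);

    // With std::greater as comparator, heap.front() is the smallest element.
    const auto cmp = std::greater<float>();

    for (float t : input) {
        if (heap.size() < k) {
            // Still filling up: admit every value.
            heap.push_back(t);
            std::push_heap(heap.begin(), heap.end(), cmp);
        } else if (heap.front() < t) {
            // Move the minimum to the back, overwrite it with t,
            // then sift t into its place.
            std::pop_heap(heap.begin(), heap.end(), cmp);
            heap.back() = t;
            std::push_heap(heap.begin(), heap.end(), cmp);
        }
    }

    std::sort(heap.begin(), heap.end());  // ascending, like the multiset output
    return heap;
}
```

Like the multiset version, this does O(log k) work only for values that beat the current minimum; every other value costs a single comparison. The heap lives in one contiguous buffer of k floats, whereas the multiset allocates a separate node per element.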
In retrospect, the only thing I paid attention to was getting the implementation right.
I then also tried out KompoZer to write this blog post. By that point about three hours had passed.
Someone may ask, "How many lines of code can you write in a day?"
The answer is: "With luck, 15 lines."