Main references:
http://www.cnblogs.com/yaozhongxiao/archive/2009/09/23/1572955.html
http://hi.baidu.com/sadawn/blog/item/4fdaee2433b7ed154d088d49.html
Thoughts on solving this problem:
First, note a simple fact: if the large array were already sorted in descending order, its first 10 thousand elements would be the result. So we can provisionally assume that the large array is ordered from largest to smallest and copy its first 10 thousand elements into the result array. Of course, this result array may not actually hold the largest 10 thousand, so each subsequent element is compared with the minimum element of the result array. If every subsequent element is smaller than that minimum, the result array is already the expected result. If a subsequent element is larger than the minimum element of the result array, use it to replace that minimum. Finally, after traversing the whole large array, the result array holds the expected result.
This gives the original code:
template <class T>
void solution_3(T bigarr[], T resarr[])
{
    // Take the first 10 thousand elements as the initial result
    // (memcpy is only safe for trivially copyable T; requires <cstring>)
    memcpy(resarr, bigarr, sizeof(T) * res_arr_size);
    // Whether a replacement occurred in the previous round
    bool bexchanged = true;
    // Index of the smallest element in resarr; declared outside the loop so
    // that it keeps its value on rounds where the search is skipped
    int idx = 0;
    // Traverse the subsequent elements
    for (int i = res_arr_size; i < big_arr_size; ++i)
    {
        // If a replacement occurred in the previous round
        if (bexchanged)
        {
            // Find the smallest element in resarr
            int j;
            for (idx = 0, j = 1; j < res_arr_size; ++j)
            {
                if (resarr[idx] > resarr[j])
                    idx = j;
            }
        }
        // Replace the smallest element if this element is greater than it
        if (bigarr[i] > resarr[idx])
        {
            bexchanged = true;
            resarr[idx] = bigarr[i];
        }
        else
            bexchanged = false;
    }
}
The code above uses a Boolean variable bexchanged to mark whether a replacement occurred in the previous round. This is an optimization not mentioned previously: the minimum of resarr only changes when an element has been replaced, so the search for it can be skipped otherwise, which greatly reduces the number of times resarr is scanned. A test of solution_3 takes about 2.0 seconds (up to 32 minutes if bexchanged is not used), which is far less than solution_2.
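To see how much work the flag can save, here is a minimal C++ sketch that counts how many full minimum scans of the result array are performed with and without it. The function name top_k_scan_count and the use of std::vector are assumptions made for illustration; they are not part of the original solution_3, and the actual savings depend on the input data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original article): keeps the k largest
// values of `data` in `res` and returns how many full minimum scans of `res`
// were performed. With use_flag == false the minimum is re-scanned for every
// element, which mimics the version without bexchanged.
std::size_t top_k_scan_count(const std::vector<int>& data, std::size_t k,
                             bool use_flag, std::vector<int>& res)
{
    assert(k > 0 && data.size() >= k);
    res.assign(data.begin(), data.begin() + k);  // first k elements as initial result
    std::size_t scans = 0;
    std::size_t idx = 0;                         // index of the current minimum in res
    bool exchanged = true;                       // forces a scan on the first round
    for (std::size_t i = k; i < data.size(); ++i) {
        if (exchanged || !use_flag) {
            idx = 0;
            for (std::size_t j = 1; j < k; ++j)
                if (res[idx] > res[j]) idx = j;
            ++scans;
        }
        if (data[i] > res[idx]) {
            res[idx] = data[i];                  // replace the minimum
            exchanged = true;
        } else {
            exchanged = false;                   // minimum unchanged, scan can be skipped
        }
    }
    return scans;
}
```

For the input {5, 9, 7, 1, 2, 3, 8, 0, 10} with k = 3, this sketch performs 2 scans with the flag and 6 without it, and both variants end with the same three values.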
A possible improvement:
One idea is to keep these 10 thousand numbers in sorted order, so that each new element only has to be compared with the smallest of them. However, sorting 10 thousand numbers is troublesome, and maintaining the sorted order after every replacement costs even more.
Where are the hot spots? Traversing the 1 billion numbers is definitely unavoidable. What remains is that, among the 10 thousand numbers, we need to find the smallest one.
Better solution:
Do these 10 thousand numbers need to be sorted? Of course not. All we need is the smallest number in this array to compare against, so a min-heap is more suitable here.
template <class T>
void solution_3(T bigarr[], T resarr[])
{
    // Take the first 10 thousand elements as the initial result
    memcpy(resarr, bigarr, sizeof(T) * res_arr_size);
    // Build a min-heap over resarr; the cost of building the heap here is
    // Theta(res_arr_size). The heap is 0-based (root at resarr[0]) to match
    // the memcpy above. build_min_heap and sift_down are assumed helpers
    // that are not shown here.
    build_min_heap(resarr);
    // Traverse the subsequent elements
    for (int i = res_arr_size; i < big_arr_size; ++i)
    {
        // resarr[0] is the root of the min-heap, i.e. the smallest of the 10000
        if (bigarr[i] > resarr[0])
        {
            resarr[0] = bigarr[i];
            // The root was overwritten, so sift_down is needed to restore the
            // heap. It descends at most 14 levels, because a complete binary
            // tree of 10000 nodes has only 14 levels.
            sift_down(resarr, 0);
        }
    }
}
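The heap-based idea can also be written as a small self-contained C++ sketch. The names top_k and the std::vector-based sift_down below are assumptions made for illustration, not the author's actual helpers; the heap is stored 0-based, so the children of index i are 2*i+1 and 2*i+2.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Restore the min-heap property for the subtree rooted at `root`
// (0-based indexing: children of i are 2*i+1 and 2*i+2).
template <class T>
void sift_down(std::vector<T>& heap, std::size_t root)
{
    const std::size_t n = heap.size();
    for (;;) {
        std::size_t child = 2 * root + 1;
        if (child >= n) break;                                        // root is a leaf
        if (child + 1 < n && heap[child + 1] < heap[child]) ++child;  // pick smaller child
        if (!(heap[child] < heap[root])) break;                       // heap property holds
        std::swap(heap[root], heap[child]);
        root = child;
    }
}

// Hypothetical helper: returns the k largest elements of `data`
// in heap order (not sorted).
template <class T>
std::vector<T> top_k(const std::vector<T>& data, std::size_t k)
{
    assert(k > 0 && data.size() >= k);
    std::vector<T> heap(data.begin(), data.begin() + k);
    // Bottom-up heap construction: sift down every internal node.
    for (std::size_t i = k / 2; i-- > 0;)
        sift_down(heap, i);
    for (std::size_t i = k; i < data.size(); ++i) {
        if (data[i] > heap[0]) {   // larger than the current minimum
            heap[0] = data[i];     // overwrite the root ...
            sift_down(heap, 0);    // ... and restore the heap
        }
    }
    return heap;
}
```

Calling top_k on {5, 9, 7, 1, 2, 3, 8, 0, 10} with k = 3 leaves 8, 9 and 10 in the heap, with the smallest of them (8) at index 0.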
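Each replacement now costs at most about 14 levels of sift_down instead of a linear scan over all 10 thousand elements, so the complexity of the algorithm has been greatly improved!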
OK, is it perfect at this point? Not necessarily. If no better algorithm is available, there is still multithreading, platform-specific optimization, and code-level optimization to try.
Can the algorithm itself not be optimized any further? The binary heap is just the conventional solution. The Fibonacci heap, multi-level buckets, and the hot queue are all alternatives to the binary heap.