18.6 Describe an algorithm to find the smallest one million numbers in one billion numbers. Assume that the computer memory can hold all one billion numbers.
This problem asks us to find the smallest 1 million numbers among 1 billion numbers, and it assumes that the computer's memory can hold all 1 billion numbers. There are three solutions: sorting, a heap, and selection rank.
The first is sorting. This method is simple and clear: sort the 1 billion numbers in ascending order and return the first 1 million. The time complexity is O(n log n).
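The sorting approach can be sketched in a few lines. This is only an illustration under stated assumptions: the function name smallest_by_sorting and the parameter m are hypothetical, and the input is assumed to fit in a std::vector<int>.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the m smallest values of nums in ascending order.
// The vector is taken by value, so the caller's data is left untouched.
std::vector<int> smallest_by_sorting(std::vector<int> nums, std::size_t m) {
    assert(m <= nums.size());             // precondition: m cannot exceed the input size
    std::sort(nums.begin(), nums.end());  // O(n log n) comparisons
    nums.resize(m);                       // keep only the first m elements
    return nums;
}
```

Copying the input keeps the sketch side-effect free; with 1 billion numbers one would more likely sort in place.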
The second is the heap approach. We build a max-heap (largest number at the top) and add the first 1 million numbers to it. Then we go through the rest of the numbers: for each one, we add it to the heap and then delete the largest number in the heap. When the traversal is complete, the heap holds the 1 million smallest numbers. The time complexity is O(n log m), where m is the number of values we need to find.
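A minimal sketch of the heap approach follows, using std::priority_queue, which is a max-heap by default. The name smallest_by_heap is hypothetical. As a small variation on the description above, the sketch only touches the heap when the new number is smaller than the current maximum, which yields the same final contents as adding and then deleting the largest.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical helper: returns the m smallest values of nums,
// in descending order (the order in which they leave the max-heap).
std::vector<int> smallest_by_heap(const std::vector<int> &nums, std::size_t m) {
    assert(m <= nums.size());  // precondition: m cannot exceed the input size
    std::vector<int> result;
    if (m == 0) return result;

    // Max-heap seeded with the first m numbers.
    std::priority_queue<int> heap(nums.begin(),
                                  nums.begin() + static_cast<std::ptrdiff_t>(m));

    // For every remaining number, replace the current maximum if the number is smaller.
    for (std::size_t i = m; i < nums.size(); ++i) {
        if (nums[i] < heap.top()) {
            heap.pop();
            heap.push(nums[i]);
        }
    }

    // Drain the heap: the largest of the kept values comes out first.
    result.reserve(m);
    while (!heap.empty()) {
        result.push_back(heap.top());
        heap.pop();
    }
    return result;
}
```

The heap never holds more than m elements, so each push or pop costs O(log m).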
Finally, we look at selection rank. This method can find the i-th smallest (or largest) element of an array in linear time. If the elements are all distinct, we can find the i-th smallest element in expected O(n) time. The algorithm is as follows:
1. Randomly select an element of the array as the pivot, partition the array around it, and record the number of elements to the left of the split.
2. If there are exactly i elements on the left, return the largest element on the left.
3. If the left side has more than i elements, continue calling this method recursively on the left.
4. If the left side has fewer than i elements, call the method recursively on the right, but the rank changes to i - left_size at this point.
See the code below:
#include <algorithm>
#include <climits>
#include <cstdlib>
#include <vector>

// Moves elements <= pivot to the left; returns the last index of the left part.
int partition(std::vector<int> &array, int left, int right, int pivot) {
    while (true) {
        while (left <= right && array[left] <= pivot) ++left;
        while (left <= right && array[right] > pivot) --right;
        if (left > right) return left - 1;
        std::swap(array[left], array[right]);
    }
}

int find_max(std::vector<int> &array, int left, int right) {
    int res = INT_MIN;
    for (int i = left; i <= right; ++i) {
        res = std::max(res, array[i]);
    }
    return res;
}

// rank is zero-based: rank 0 asks for the smallest element in [left, right].
int selection_rank(std::vector<int> &array, int left, int right, int rank) {
    int pivot = array[std::rand() % (right - left + 1) + left];
    int left_end = partition(array, left, right, pivot);
    int left_size = left_end - left + 1;
    if (left_size == rank + 1) return find_max(array, left, left_end);
    else if (rank < left_size) return selection_rank(array, left, left_end, rank);
    else return selection_rank(array, left_end + 1, right, rank - left_size);
}
Once you have found the i-th smallest number, you can traverse the entire array to collect all elements less than or equal to it. When the array has duplicate elements, the algorithm needs some modification, and the running time is no longer guaranteed to be linear. In fact, there are algorithms that handle arrays with duplicates in linear time, but they are more complex; interested readers can look them up.
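For arrays that may contain duplicates, one way to avoid hand-rolling the selection step is the C++ standard library's std::nth_element, which rearranges a range so that everything before a chosen position is no greater than everything after it, in linear time on average. The sketch below swaps in that library routine; it is not the selection_rank code above, and the name smallest_by_nth_element is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the m smallest values of nums in unspecified order.
// Duplicates are fine: exactly m values are returned even when several are equal.
std::vector<int> smallest_by_nth_element(std::vector<int> nums, std::size_t m) {
    assert(m <= nums.size());  // precondition: m cannot exceed the input size
    auto mid = nums.begin() + static_cast<std::ptrdiff_t>(m);
    // After this call, every element in [begin, mid) is <= every element in [mid, end).
    std::nth_element(nums.begin(), mid, nums.end());
    nums.resize(m);            // keep the m smallest, drop the rest
    return nums;
}
```

Because the first m positions already hold the answer after partitioning, no second pass over the array is needed, and ties at the threshold cannot cause more than m values to be returned.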