The TopK problem, that is, finding the K largest items in a data set, is very common. A typical example is finding the 10 most popular keywords among 10 million search records.
Method One:
Sort the data first, then take the first k elements.
Time complexity: O(n log n) + O(k) = O(n log n).
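A minimal sketch of Method One, assuming the data fits in memory. The helper name topk_by_sort is made up for illustration and is not part of the original article.

```python
def topk_by_sort(items, k):
    """Return the k largest items, largest first (hypothetical helper)."""
    if k <= 0:
        return []
    # sorted() copies the input and sorts it in O(n log n);
    # the slice then takes the first k elements in O(k).
    return sorted(items, reverse=True)[:k]


print(topk_by_sort([11, 8, 4, 1, 5, 2, 7, 9], 3))  # [11, 9, 8]
```

The whole input is sorted even though only k items are needed, which is why the other two methods are worth considering when n is large and k is small.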
Method Two:
Min-heap.
Maintain a min-heap with a capacity of k, built from the first k elements. By the min-heap property, the top of the heap is always the smallest of the k elements kept so far. For each remaining element: if it is not larger than the heap top, skip it; if it is larger, replace the heap top with it and heapify to restore the heap order. Each heapify costs O(log k).
Time complexity: O(k + (n-k) log k) = O(n log k).
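The heap approach could be sketched with Python's standard heapq module as follows. The helper name topk_by_heap is hypothetical, and the final sort is only there to make the output easy to read.

```python
import heapq
from itertools import islice


def topk_by_heap(items, k):
    """Return the k largest items, largest first (hypothetical helper)."""
    if k <= 0:
        return []
    it = iter(items)
    heap = list(islice(it, k))  # take the first k items
    heapq.heapify(heap)         # build the min-heap in O(k)
    for x in it:
        # heap[0] is the smallest of the k items kept so far.
        if x > heap[0]:
            # Pop the smallest and push x in one O(log k) step.
            heapq.heapreplace(heap, x)
    # Sorting the k survivors costs an extra O(k log k).
    return sorted(heap, reverse=True)


print(topk_by_heap([11, 8, 4, 1, 5, 2, 7, 9], 3))  # [11, 9, 8]
```

Because only k items are held at any time, this version also works on a stream that is too large to load at once. The standard library offers heapq.nlargest(k, items) for the same job.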
Method Three:
The protagonist of this article: the quick select algorithm. It is very similar to quicksort. The difference is that after each partition, quick select only needs to continue into one side instead of both.
Time complexity: O(n) on average (O(n^2) in the worst case, when the pivots are consistently bad).
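One way to see the linear bound is to assume, as an idealisation, that every pivot splits the remaining data roughly in half and that partitioning n elements costs cn for some constant c. Since only one side is processed further:

```latex
T(n) = T\!\left(\tfrac{n}{2}\right) + cn
     = cn\left(1 + \tfrac{1}{2} + \tfrac{1}{4} + \cdots\right)
     \le 2cn = O(n)
```

Quicksort, by contrast, has to process both sides, giving T(n) = 2T(n/2) + cn = O(n log n).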
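def qselect(a, k):
    # Return the k largest elements of a, in no particular order.
    if k <= 0:
        return []
    if len(a) <= k:
        return list(a)
    pivot = a[-1]
    # Elements strictly greater than the pivot. The pivot itself is kept apart
    # so that every recursive call gets a shorter list, even with duplicates.
    greater = [x for x in a[:-1] if x > pivot]
    if len(greater) >= k:
        return qselect(greater, k)
    # Not enough on the greater side: keep it all, add the pivot,
    # and take the remaining elements from the other side.
    rest = [x for x in a[:-1] if x <= pivot]
    return qselect(rest, k - len(greater) - 1) + [pivot] + greater

for i in range(1, 9):
    print(qselect([11, 8, 4, 1, 5, 2, 7, 9], i))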
Elegant Python: a quick select solution for big data TopK problems