http://www.coder4.com/archives/3844
Requirement 1: given a sequence of length n, find the top K largest elements. Use a min-heap, implemented with the heapq module.
```python
import heapq
import random


class TopkHeap(object):
    def __init__(self, k):
        self.k = k
        self.data = []  # min-heap holding at most k elements

    def Push(self, elem):
        if len(self.data) < self.k:
            heapq.heappush(self.data, elem)
        else:
            # self.data[0] is the smallest of the current top k
            topk_small = self.data[0]
            if elem > topk_small:
                heapq.heapreplace(self.data, elem)

    def TopK(self):
        # Pops every element (ascending order), then reverses; this empties the heap
        return [x for x in reversed([heapq.heappop(self.data) for x in range(len(self.data))])]


if __name__ == "__main__":
    print("Hello")
    # NOTE: the sample size was garbled in the source; 100 is an assumed value
    list_rand = random.sample(range(1000000), 100)
    th = TopkHeap(3)
    for i in list_rand:
        th.Push(i)
    print(th.TopK())
    print(sorted(list_rand, reverse=True)[0:3])
```
With heapq, the above is easy to do.
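As a quick sanity check of the min-heap approach, here is a minimal sketch that compares a hand-rolled top-K against the standard library's `heapq.nlargest`. The helper name `top_k` and the sample data are made up for illustration and are not part of the original post.

```python
import heapq
import random


def top_k(iterable, k):
    """Return the k largest items, largest first, using a size-k min-heap.

    `top_k` is a hypothetical helper for illustration, not part of heapq.
    """
    if k <= 0:
        return []
    heap = []
    for elem in iterable:
        if len(heap) < k:
            heapq.heappush(heap, elem)
        elif elem > heap[0]:
            # heap[0] is the smallest of the current top k; replace it.
            heapq.heapreplace(heap, elem)
    return sorted(heap, reverse=True)


if __name__ == "__main__":
    data = random.sample(range(1000), 50)  # arbitrary illustrative data
    print(top_k(data, 3))
    print(heapq.nlargest(3, data))  # should print the same list
```

For one-off queries `heapq.nlargest` is probably enough on its own; a heap object like the one in this post is mainly useful when elements arrive one at a time.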
Now for the awkward requirement: given a sequence of length n, find the bottom K smallest elements, which calls for a max-heap.
heapq does not provide a Java-like Comparator interface or a comparison-function parameter. The developers explain why here: http://code.activestate.com/lists/python-list/162387/
So people came up with some very clever ideas, see: http://stackoverflow.com/questions/14189540/python-topn-max-heap-use-heapq-or-self-implement
Let me summarize one of the simplest:
Change push(e) to push(-e), and pop() to -pop().
That is, negate each value when it goes into the heap and negate it again when it comes out. The rest of the logic is exactly the same as for TopK. See the code:
```python
import heapq


class BtmkHeap(object):
    def __init__(self, k):
        self.k = k
        self.data = []  # stores negated values

    def Push(self, elem):
        # Negate elem so the min-heap behaves like a max-heap
        elem = -elem
        # Use the same heap algorithm as for TopK
        if len(self.data) < self.k:
            heapq.heappush(self.data, elem)
        else:
            topk_small = self.data[0]
            if elem > topk_small:
                heapq.heapreplace(self.data, elem)

    def BtmK(self):
        # Negate again on the way out to recover the original values
        return sorted([-x for x in self.data])
```
After testing, it works without any problems. This idea is really tricky ...
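One way to run such a test is to compare the negation trick against a plain sort or `heapq.nsmallest`. The sketch below does that with a standalone function. `bottom_k` is a hypothetical name used only here, and it assumes the elements are numbers that support unary minus.

```python
import heapq


def bottom_k(iterable, k):
    """Return the k smallest numbers in ascending order.

    Stores negated values in heapq's min-heap so it behaves like a max-heap.
    `bottom_k` is a hypothetical helper; it only works for values that
    support unary minus (e.g. int, float).
    """
    if k <= 0:
        return []
    heap = []  # holds -x for each kept x
    for elem in iterable:
        neg = -elem
        if len(heap) < k:
            heapq.heappush(heap, neg)
        elif neg > heap[0]:
            # -heap[0] is the largest of the current k smallest; evict it.
            heapq.heapreplace(heap, neg)
    return sorted(-x for x in heap)


if __name__ == "__main__":
    data = [5, 1, 9, 3, 7, 2]
    print(bottom_k(data, 3))
    print(sorted(data)[:3])  # should print the same list
```

Note that negation only helps for numeric values; for strings or other objects a different approach would be needed.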
Python max-heap min-heap