Combining a min-heap with hash maps to solve the dynamic Top-K problem (with a complete heapsort implementation)

Source: Internet
Author: User

There are two common approaches to the Top-K problem: a heap and partitioning. The heap approach uses a priority queue and runs in O(N log K), where N is the total number of elements. The partition approach can call the C++ STL function std::nth_element directly and runs in O(N) on average. Getting the Top K of dynamically updated data is not so easy. Take tracking the 10 most visited URLs in real time: besides maintaining a min-heap of size 10, we need a hash table that records each URL's visit count as it changes, so that we can decide whether the URL should enter the heap. Elements already in the heap may also be evicted. So how do we find a URL's position in the heap? A second hash map from each URL to its heap offset is required. Because elements in the middle of the heap must be adjusted, a priority queue will not do, since it only exposes the top element; the heap has to be implemented by hand on top of an array. The idea is fairly clear, and the WebCounter class below implements it.
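As a point of comparison for the static case, the two approaches mentioned above can be sketched as follows. This is a minimal illustration, not code from the original article: the function names topKHeap and topKPartition are made up for this example, the elements are plain ints, and both functions return the K largest values in descending order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Heap approach: keep a min-heap holding the k largest values seen so far.
// O(N log K) time, O(K) extra space. (Hypothetical helper name.)
std::vector<int> topKHeap(const std::vector<int>& data, std::size_t k) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
    for (int x : data) {
        if (minHeap.size() < k) {
            minHeap.push(x);
        } else if (k > 0 && minHeap.top() < x) {
            minHeap.pop();   // evict the smallest of the current top k
            minHeap.push(x);
        }
    }
    assert(minHeap.size() <= k);
    std::vector<int> result;
    while (!minHeap.empty()) {   // pops come out in ascending order
        result.push_back(minHeap.top());
        minHeap.pop();
    }
    std::reverse(result.begin(), result.end());  // ascending -> descending
    return result;
}

// Partition approach: std::nth_element moves the k largest values into the
// first k slots (in unspecified order), in O(N) time on average.
// (Hypothetical helper name; takes a copy because the input is reordered.)
std::vector<int> topKPartition(std::vector<int> data, std::size_t k) {
    k = std::min(k, data.size());
    std::nth_element(data.begin(),
                     data.begin() + static_cast<std::ptrdiff_t>(k),
                     data.end(), std::greater<int>());
    data.resize(k);
    // Sorting the k survivors is only done to give a predictable output order.
    std::sort(data.begin(), data.end(), std::greater<int>());
    return data;
}
```

Neither function helps once counts change over time: the priority queue cannot reach an element that is not at the top, and re-running the partition after every update costs O(N) each time.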

#include <unordered_map>
#include <utility>
#include <vector>

typedef long URL;  // URL simplified to long

struct Node {
    URL site;
    int cnt;
    Node(URL s, int c) : site(s), cnt(c) {}
    bool operator<(const Node& other) const {  // comparator; note the const
        return cnt < other.cnt;
    }
};

class WebCounter {
private:
    std::unordered_map<URL, int> counterMap;  // URL -> visit count
    std::unordered_map<URL, int> offsetMap;   // URL -> index in minHeap; 0 means "not in the heap"
    std::vector<Node> minHeap;                // 1-based: index 0 is an unused placeholder, which
                                              // lets offset 0 mean "no record" and simplifies
                                              // the parent/child index arithmetic
    int size{0};                              // number of elements currently in the heap
    int K;                                    // top K

    // Swap two heap slots and keep offsetMap in sync with the new positions.
    void swapNodes(int i, int j) {
        std::swap(minHeap[i], minHeap[j]);
        offsetMap[minHeap[i].site] = i;
        offsetMap[minHeap[j].site] = j;
    }

    void shiftUp(int i) {
        while (i > 1 && minHeap[i] < minHeap[i / 2]) {
            swapNodes(i, i / 2);
            i >>= 1;
        }
    }

    void shiftDown(int i) {
        while ((i = i * 2) <= size) {
            if (i + 1 <= size && minHeap[i + 1] < minHeap[i]) {
                ++i;  // pick the smaller child
            }
            if (minHeap[i] < minHeap[i / 2]) {
                swapNodes(i, i / 2);
            } else {
                break;
            }
        }
    }

public:
    WebCounter(int k) : K(k) { minHeap.resize(1, Node(0, 0)); }

    void work(URL url) {
        int curCnt = ++counterMap[url];  // update the count
        if (offsetMap[url] > 0) {        // an offset is recorded, so the URL is already in the Top K
            int i = offsetMap[url];      // fetch its heap index
            minHeap[i].cnt = curCnt;     // keep the heap node's count current
            shiftDown(i);                // the count grew and may now exceed its children
        } else if (size < K) {           // the heap holds fewer than K elements: just add
            minHeap.push_back(Node(url, curCnt));
            offsetMap[url] = ++size;
            shiftUp(size);               // a new URL has count 1, the minimum, so shift up
        } else if (size > 0 && minHeap[1].cnt < curCnt) {
            // The heap is full, so the URL replaces the top if its count is larger.
            // This happens when its count equalled the top's count before the +1.
            offsetMap[minHeap[1].site] = 0;  // the evicted URL is no longer in the heap
            minHeap[1] = Node(url, curCnt);
            offsetMap[url] = 1;
            shiftDown(1);                // the new node sits at the top, so only shift down
        }
    }
};
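One way to sanity-check a class like WebCounter is to compare it against a brute-force reference that keeps only the count map and recomputes the Top K on demand. The sketch below is an assumption-laden illustration rather than part of the original code: NaiveCounter and topCounts are hypothetical names, and the method returns visit counts instead of URLs so that ties between URLs with equal counts do not make the expected answer ambiguous.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

typedef long URL;  // same simplification as in the article: a URL is a long

// Hypothetical brute-force reference, intended for testing only.
class NaiveCounter {
    std::unordered_map<URL, int> counts;  // URL -> visit count

public:
    void work(URL url) { ++counts[url]; }

    // Visit counts of the k most visited URLs, in descending order.
    // Costs O(N log k) per query, where N is the number of distinct URLs.
    std::vector<int> topCounts(std::size_t k) const {
        std::vector<int> all;
        all.reserve(counts.size());
        for (const auto& kv : counts) {
            all.push_back(kv.second);
        }
        k = std::min(k, all.size());
        std::partial_sort(all.begin(),
                          all.begin() + static_cast<std::ptrdiff_t>(k),
                          all.end(), std::greater<int>());
        all.resize(k);
        return all;
    }
};
```

Feeding the same sequence of URLs to both classes and comparing the sorted counts held in the heap with topCounts(K) would exercise every branch of work(). Note that WebCounter as shown has no accessor for its heap contents, so such a comparison would first need one added.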

The code relies on two heap-adjustment functions, shiftDown and shiftUp. For reference, here is a fairly concise but complete set of heapsort-related routines (for a max-heap) built on the same two operations.

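#include <vector>
using std::vector;

// All functions treat a as a 1-based max-heap: a[0] is unused and the
// heap occupies a[1..size]. The vector is passed by reference so that
// the caller sees the changes.

template <class T>
void Swap(T& a, T& b) {
    T t = a;
    a = b;
    b = t;
}

template <class T>
void ShiftUp(vector<T>& a, int i) {
    while (i > 1 && a[i / 2] < a[i]) {
        Swap(a[i], a[i / 2]);
        i >>= 1;
    }
}

template <class T>
void ShiftDown(vector<T>& a, int size, int i) {
    while ((i = i * 2) <= size) {
        if (i + 1 <= size && a[i] < a[i + 1]) {
            ++i;  // pick the larger child
        }
        if (a[i / 2] < a[i]) {
            Swap(a[i], a[i / 2]);
        } else {
            break;
        }
    }
}

// Build a heap from a[1..n] in O(n).
template <class T>
void MakeHeap(vector<T>& a, int n) {
    for (int i = n / 2; i > 0; i--) {
        ShiftDown(a, n, i);
    }
}

// The caller must ensure a has room for index size + 1.
template <class T>
void Insert(vector<T>& a, int& size, T x) {
    a[++size] = x;
    ShiftUp(a, size);
}

// Delete the element at index i by moving the last element into its place.
template <class T>
void Del(vector<T>& a, int& size, int i) {
    a[i] = a[size--];
    if (i > 1 && a[i / 2] < a[i]) {
        ShiftUp(a, i);
    } else {
        ShiftDown(a, size, i);
    }
}

// Sort a[1..n] in ascending order.
template <class T>
void HeapSort(vector<T>& a, int n) {
    MakeHeap(a, n);
    for (int i = n; i > 1; i--) {
        Swap(a[i], a[1]);
        ShiftDown(a, i - 1, 1);
    }
}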
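For comparison, the C++ standard library ships equivalent heap algorithms in <algorithm>. The sketch below shows how heap construction, sorting, insertion and removal of the maximum map onto std::make_heap, std::sort_heap, std::push_heap and std::pop_heap. The wrapper names stlHeapSort, heapInsert and heapPopMax are made up for this illustration. Unlike the hand-written routines, the standard algorithms are 0-based and offer no way to delete or adjust an element at an arbitrary index, which is why the dynamic Top-K solution needs its own heap.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Heapsort via the standard heap algorithms (0-based, max-heap by default).
std::vector<int> stlHeapSort(std::vector<int> a) {
    std::make_heap(a.begin(), a.end());  // build a max-heap in O(n)
    std::sort_heap(a.begin(), a.end());  // turn the heap into an ascending sequence
    return a;
}

// Insert x into a vector that is already a max-heap.
void heapInsert(std::vector<int>& heap, int x) {
    heap.push_back(x);
    std::push_heap(heap.begin(), heap.end());  // sift the new last element up
}

// Remove and return the maximum of a non-empty max-heap.
int heapPopMax(std::vector<int>& heap) {
    assert(!heap.empty());
    std::pop_heap(heap.begin(), heap.end());  // moves the maximum to the back
    int top = heap.back();
    heap.pop_back();
    return top;
}
```

Both versions sort in O(n log n); the standard functions additionally accept a custom comparator as a final argument, for example std::greater<int>() to get a min-heap.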
