Algorithm: Sliding Window Median (Heap)

Source: Internet
Author: User

Today I worked through a heap problem I found online and learned something from it, since I had never solved a heap problem before.

Problem statement

Given an array of n integers and a sliding window of size k, slide the window from left to right across the array and find the median of each window. (If the window contains an even number of elements, return the N/2-th number after sorting the numbers in the window.)

Examples:

For the array [1,2,7,8,5] and a sliding window of size k = 3, return [2,7,7].

Initially the window looks like this:

[| 1,2,7 |, 8,5], which returns the median 2.

The window then slides forward by one:

[1, | 2,7,8 |, 5], which returns the median 7.

The window then slides forward by one again:

[1, 2, | 7,8,5 |], which returns the median 7.

My first idea was brute force: sort the numbers inside each window in ascending order and take the middle one. That solution timed out. Searching online, I found that most solutions use priority queues. The basic idea: use two priority queues as a big top heap (max-heap) and a small top heap (min-heap) (their definitions are not repeated here), then maintain the two queues as the window moves and read off the median.
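For reference, the brute-force idea can be sketched as follows. This is a minimal sketch, not the code that was originally submitted: the class name BruteForceWindowMedian and the method name medians are made up for illustration. It sorts a copy of every window and picks the N/2-th number (1-based), which is index (k - 1) / 2 in the sorted copy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, named only for this illustration.
class BruteForceWindowMedian {
    // Sorts a copy of each window and takes the N/2-th number (1-based),
    // i.e. zero-based index (k - 1) / 2. Cost is roughly O(n * k log k).
    static List<Integer> medians(int[] nums, int k) {
        List<Integer> res = new ArrayList<>();
        if (k <= 0 || nums.length < k) {
            return res;
        }
        for (int start = 0; start + k <= nums.length; start++) {
            int[] window = Arrays.copyOfRange(nums, start, start + k);
            Arrays.sort(window);
            res.add(window[(k - 1) / 2]);
        }
        return res;
    }
}
```

Sorting every window from scratch is what makes this slow for large inputs, but it is handy as a baseline for checking a faster solution on small inputs.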

1. Approach to the problem

(1). Priority Queue

Use one priority queue as the big top heap, holding the smaller half of the numbers currently in the window, and another as the small top heap, holding the larger half. The one remaining number (which is in neither queue) is the median.
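A minimal sketch of this layout in Java, assuming java.util.PriorityQueue is used for both heaps. The class name TwoHeapLayout and the method isOrdered are hypothetical; the field names follow the code later in the article. PriorityQueue is a min-heap by default, so the big top heap (max-heap) is obtained by passing Collections.reverseOrder().

```java
import java.util.Collections;
import java.util.PriorityQueue;

// Hypothetical container, named only for this illustration.
class TwoHeapLayout {
    // Big top heap (max-heap): smaller half of the window, largest on top.
    final PriorityQueue<Integer> maxQueue = new PriorityQueue<>(Collections.reverseOrder());
    // Small top heap (min-heap): larger half of the window, smallest on top.
    final PriorityQueue<Integer> minQueue = new PriorityQueue<>();
    // The one number kept outside both heaps.
    int media;

    // True when every value in maxQueue <= media <= every value in minQueue.
    // Checking only the two heap tops is enough.
    boolean isOrdered() {
        return (maxQueue.isEmpty() || maxQueue.peek() <= media)
            && (minQueue.isEmpty() || media <= minQueue.peek());
    }
}
```

For the window 1 2 3 4 5, for instance, maxQueue would hold 1 and 2, media would be 3, and minQueue would hold 4 and 5.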

(2). Initial Add Data

We assume both queues are empty at initialization. At first there is no data in the window, so the current median is set to the first number that enters the window. Adding data to the heaps therefore starts with the second number.

When we add a number, we compare it with the current median (initially the first number). If it is larger than the current median, it goes into the small top heap; otherwise it goes into the big top heap. After inserting it (into either heap), we compare the sizes of the two heaps. If the big top heap is larger than the small top heap, the current median is put into the small top heap and a value is taken (poll) from the big top heap as the new median. If the size of the small top heap minus 1 is larger than the size of the big top heap, the current median is put into the big top heap and a new median is taken from the small top heap.

Question: why do we adjust as soon as the big top heap is larger than the small top heap, but adjust in the other direction only when the size of the small top heap minus 1 is larger than the size of the big top heap?

Look at the problem statement again: if the window holds an even number of elements, the N/2-th number of the sorted window is returned. For example, with 3 numbers in the window, 1 2 3, the median is 2, which is not in dispute. But with 4 numbers, 1 2 3 4, the problem statement selects the N/2-th number, which is 2. The big top heap then holds 1 and the small top heap holds 3 4, so the small top heap is allowed to be one larger than the big top heap. If the big top heap were allowed to be the larger one instead, the current median would no longer match the required median of the window.

After a number is added, there are two possible outcomes for the median. When the two heaps have equal size, or the size of the small top heap minus the size of the big top heap is 1, the median is unchanged, because the heaps are balanced. (Think of the smaller half of the window as sitting on the left, the larger half on the right, and the median between them.) When the big top heap is larger than the small top heap, the actual median is biased toward the big top heap, so the current median is put into the small top heap and the maximum of the big top heap becomes the new median. Symmetrically, when the small top heap is more than one larger than the big top heap, the current median is put into the big top heap and the minimum of the small top heap becomes the new median. In both cases every number in the big top heap stays no larger than the median and every number in the small top heap stays no smaller than it. In short, after adding a number, the current median must be the actual median of the window.
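The add-then-rebalance rule can be sketched as a small helper that computes the median of one whole window. This is an illustrative sketch under the assumption that the input array is non-empty; the class name InitialAddSketch and the method medianOf are made up, while maxQueue, minQueue and media mirror the names used in the article's code.

```java
import java.util.Collections;
import java.util.PriorityQueue;

// Hypothetical helper, named only for this illustration.
class InitialAddSketch {
    // Median of `values` under the "N/2-th number" rule, built by adding
    // one value at a time and rebalancing after each insertion.
    // Assumes values.length >= 1.
    static int medianOf(int[] values) {
        PriorityQueue<Integer> maxQueue = new PriorityQueue<>(Collections.reverseOrder());
        PriorityQueue<Integer> minQueue = new PriorityQueue<>();
        int media = values[0]; // the first value starts out as the median
        for (int i = 1; i < values.length; i++) {
            // Larger than the median: small top heap; otherwise: big top heap.
            if (media < values[i]) {
                minQueue.offer(values[i]);
            } else {
                maxQueue.offer(values[i]);
            }
            if (maxQueue.size() > minQueue.size()) {
                // Median is biased toward the big top heap.
                minQueue.offer(media);
                media = maxQueue.poll();
            } else if (maxQueue.size() < minQueue.size() - 1) {
                // Small top heap is more than one larger.
                maxQueue.offer(media);
                media = minQueue.poll();
            }
        }
        return media;
    }
}
```

For example, medianOf(new int[] {1, 2, 7}) returns 2 and medianOf(new int[] {1, 2, 3, 4}) also returns 2, matching the N/2-th number rule for an even count.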
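The size rule can be written down as a tiny calculation. This is only a sketch of the arithmetic, with a hypothetical class name HeapSizeRule and method balancedSizes: for a window of n numbers, the median sits at zero-based index (n - 1) / 2 of the sorted window, so that many numbers belong in the big top heap and the rest (apart from the median) belong in the small top heap.

```java
// Hypothetical helper, named only for this illustration.
class HeapSizeRule {
    // For a window of n numbers (n >= 1), returns
    // {size of big top heap, size of small top heap} when balanced.
    static int[] balancedSizes(int n) {
        int below = (n - 1) / 2;   // numbers sorted before the median
        int above = n - 1 - below; // numbers sorted after the median
        return new int[] {below, above};
    }
}
```

With n = 3 both heaps hold one number; with n = 4 the big top heap holds one and the small top heap holds two, which is why the small top heap may be exactly one larger.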

(3). Subsequent additions to the data

First, we add the new number to one of the heaps by comparing it with the current median, as in the initial add step.

After adding one number, we must remove one so that the heaps and the median together hold exactly the numbers of the current window. The number to remove is the first one of the previous window. How do we remove it? Compare it with the current median. If it is greater than the median, it is in the small top heap and is removed from there. If it is less than the median, it is in the big top heap and is removed from there. If it equals the median, compare the sizes of the two heaps: if the small top heap is larger than the big top heap, take a value from the small top heap as the new median; otherwise take a value from the big top heap as the new median.

Question: why do we do this when the removed value is equal to the current median?

After the new number has been added, and before anything is removed, the sizes of the two heaps differ by at most two. The new number went into either the big top heap or the small top heap. If the size of the big top heap >= the size of the small top heap, the actual median is biased toward the big top heap, so we take a value from the big top heap as the new median. Conversely, if the small top heap is larger, we take a value from the small top heap as the new median, for the same reason.
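The removal rule can be sketched as a single helper. This is a hedged illustration rather than the article's exact code: the class name RemoveStepSketch and the method removeOld are made up, and the sketch assumes the new number has already been added, so at least one heap is non-empty when the removed value equals the median.

```java
import java.util.PriorityQueue;

// Hypothetical helper, named only for this illustration.
class RemoveStepSketch {
    // Removes `old` (the number leaving the window) and returns the median
    // afterwards. The heaps may still need rebalancing after this call.
    static int removeOld(PriorityQueue<Integer> maxQueue,
                         PriorityQueue<Integer> minQueue,
                         int media, int old) {
        if (old == media) {
            // The median itself leaves: promote a value from the larger side.
            if (minQueue.size() > maxQueue.size()) {
                return minQueue.poll();
            }
            return maxQueue.poll();
        } else if (old < media) {
            // Smaller than the median, so it lives in the big top heap.
            maxQueue.remove(Integer.valueOf(old));
        } else {
            // Larger than the median, so it lives in the small top heap.
            minQueue.remove(Integer.valueOf(old));
        }
        return media;
    }
}
```

Note that PriorityQueue.remove(Object) scans the queue, so each removal takes time linear in the window size rather than logarithmic.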

(4). Further adjustments

In case (3), a removal can leave the heaps unbalanced. If, before the removal, the size of the small top heap minus the size of the big top heap was 1 and the removed value was in the big top heap, the actual median is biased toward the small top heap, so a further adjustment is needed. Likewise, if the two sizes were equal and the removed value was in the small top heap, the actual median is biased toward the big top heap.
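A minimal sketch of this adjustment, assuming the same two queues and median variable as before. The class name RebalanceSketch and the method rebalance are hypothetical. It moves the median across until the big top heap is no larger than the small top heap and the small top heap is at most one larger.

```java
import java.util.PriorityQueue;

// Hypothetical helper, named only for this illustration.
class RebalanceSketch {
    // Restores the size rule and returns the resulting median.
    static int rebalance(PriorityQueue<Integer> maxQueue,
                         PriorityQueue<Integer> minQueue,
                         int media) {
        // Big top heap too large: shift the median toward it.
        while (maxQueue.size() > minQueue.size()) {
            minQueue.offer(media);
            media = maxQueue.poll();
        }
        // Small top heap more than one larger: shift the other way.
        while (maxQueue.size() < minQueue.size() - 1) {
            maxQueue.offer(media);
            media = minQueue.poll();
        }
        return media;
    }
}
```

For instance, with an empty big top heap, a median of 2 and a small top heap holding 7 and 8, the call returns 7 and leaves 2 in the big top heap and 8 in the small top heap.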

2. Code

With the explanation done, here is the code.

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;
    import java.util.PriorityQueue;

    public class Solution {
        // Ascending order: smallest value on top (small top heap).
        public class MinComparator implements Comparator<Integer> {
            public int compare(Integer a, Integer b) {
                return Integer.compare(a, b);
            }
        }

        // Descending order: largest value on top (big top heap).
        public class MaxComparator implements Comparator<Integer> {
            public int compare(Integer a, Integer b) {
                return Integer.compare(b, a);
            }
        }

        public List<Integer> medianSlidingWindow(int[] nums, int k) {
            List<Integer> res = new ArrayList<Integer>();
            if (k == 0 || nums.length < k) {
                return res;
            }
            PriorityQueue<Integer> maxQueue = new PriorityQueue<>(new MaxComparator()); // big top heap
            PriorityQueue<Integer> minQueue = new PriorityQueue<>(new MinComparator()); // small top heap
            int media = nums[0];
            // the initial add data: nums[0] is already the median, so start at 1
            for (int i = 1; i < k; i++) {
                if (media < nums[i]) {
                    minQueue.offer(nums[i]);
                } else {
                    maxQueue.offer(nums[i]);
                }
                if (maxQueue.size() > minQueue.size()) {
                    minQueue.offer(media);
                    media = maxQueue.poll();
                } else if (maxQueue.size() < minQueue.size() - 1) {
                    maxQueue.offer(media);
                    media = minQueue.poll();
                }
            }
            res.add(media);
            // subsequent additions to the data
            for (int i = k; i < nums.length; i++) {
                if (media < nums[i]) {
                    minQueue.offer(nums[i]);
                } else {
                    maxQueue.offer(nums[i]);
                }
                // remove the first value of the previous window
                int old = nums[i - k];
                if (old == media) {
                    if (minQueue.size() > maxQueue.size()) {
                        media = minQueue.poll();
                    } else {
                        media = maxQueue.poll();
                    }
                } else if (old < media) {
                    maxQueue.remove(old);
                } else {
                    minQueue.remove(old);
                }
                // further adjustments
                while (maxQueue.size() > minQueue.size()) {
                    minQueue.offer(media);
                    media = maxQueue.poll();
                }
                while (maxQueue.size() < minQueue.size() - 1) {
                    maxQueue.offer(media);
                    media = minQueue.poll();
                }
                res.add(media);
            }
            return res;
        }
    }
