Section 45 described the concept and algorithms of the heap, and the previous section described PriorityQueue, Java's heap implementation class. Besides serving as a priority queue, PriorityQueue can also be used to solve other problems. Section 45 mentioned the following two applications:
- Finding the k largest elements: the number of elements is not known in advance, the data volume may be large, and the data may even arrive as a continuous stream, yet we need to know the k largest elements seen so far. Variants of this problem include finding the k smallest elements, the k-th largest element, and the k-th smallest element.
- Finding the median: the median is not the average but the value of the middle element after sorting. Here too the data volume may be large, or the data may arrive as a continuous stream.
In this section, we will discuss how to solve these two problems.
Finding the k largest elements
Basic ideas
A simple idea is to sort all the elements and then take the k largest. Sorting can use the Arrays.sort() method, with an efficiency of O(N*log2(N)). However, if K is small, for example 1 (which is simply the maximum), there is no need to sort all the elements completely.
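The sorting approach can be sketched as follows. This is a minimal illustration, not code from the article: the class name TopKBySort and the method topK are hypothetical, and the sketch assumes an int array and 0 <= k <= data.length.

```java
import java.util.Arrays;

// Hypothetical helper: finds the k largest values by fully sorting a copy of the data.
class TopKBySort {
    // Returns the k largest values in ascending order.
    // Assumes 0 <= k <= data.length.
    static int[] topK(int[] data, int k) {
        int[] copy = Arrays.copyOf(data, data.length); // leave the caller's array untouched
        Arrays.sort(copy);                             // ascending order, O(N*log2(N))
        return Arrays.copyOfRange(copy, copy.length - k, copy.length);
    }

    public static void main(String[] args) {
        int[] data = {100, 1, 2, 5, 6, 7, 34, 9, 3, 4, 5, 8, 1, 0};
        System.out.println(Arrays.toString(topK(data, 5))); // [7, 8, 9, 34, 100]
    }
}
```

The whole input has to be in memory before the sort starts, which is why this approach does not fit streaming data.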
Another simple idea is selection: loop K times, each time picking the maximum from the remaining elements. The efficiency is O(N*K); if K is greater than log2(N), this is worse than a full sort.
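A sketch of the selection approach, under the same assumptions as before (int array, 0 <= k <= data.length). The names TopKBySelection and topK are made up for this illustration.

```java
import java.util.Arrays;

// Hypothetical helper: runs only the first k passes of a selection sort.
class TopKBySelection {
    // Returns the k largest values, largest first.
    // Assumes 0 <= k <= data.length.
    static int[] topK(int[] data, int k) {
        int[] a = Arrays.copyOf(data, data.length);
        for (int i = 0; i < k; i++) {
            // find the maximum among the remaining elements a[i..]
            int maxIdx = i;
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] > a[maxIdx]) {
                    maxIdx = j;
                }
            }
            // move it to position i
            int tmp = a[i];
            a[i] = a[maxIdx];
            a[maxIdx] = tmp;
        }
        return Arrays.copyOf(a, k);
    }

    public static void main(String[] args) {
        int[] data = {100, 1, 2, 5, 6, 7, 34, 9, 3, 4, 5, 8, 1, 0};
        System.out.println(Arrays.toString(topK(data, 5))); // [100, 34, 9, 8, 7]
    }
}
```

Each of the K passes scans up to N elements, which gives the O(N*K) cost mentioned above.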
However, both of these ideas assume that all elements are known in advance rather than added dynamically. What if the number of elements is unknown and they keep arriving?
A basic idea is to maintain an array of length K that holds the K largest elements seen so far. Each time a new element arrives, find the minimum value in the array and compare the new element with it. If the new element is smaller than the minimum, nothing changes; if it is greater, the minimum is replaced with the new element.
This resembles last-place elimination: the new element only needs to be compared with whoever is currently in last place. If it is worse than last place, it does not get in; otherwise it replaces last place.
In this way, the array always holds the K largest elements seen so far, and no matter how much source data there is, the memory overhead is fixed: one array of length K. However, for every new element we have to find the minimum, which takes K comparisons. Can the number of comparisons be reduced?
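The array-based scheme just described might look like the sketch below. The class ArrayTopK and its methods are hypothetical names chosen for illustration, and the sketch assumes int elements and k >= 1.

```java
import java.util.Arrays;

// Hypothetical illustration of the fixed-size array approach.
class ArrayTopK {
    private final int[] top; // holds the largest elements seen so far
    private int size;        // how many slots are filled

    ArrayTopK(int k) {
        top = new int[k]; // assumes k >= 1
    }

    void add(int e) {
        if (size < top.length) { // array not full yet: just store the element
            top[size++] = e;
            return;
        }
        // linear scan for the current minimum: K comparisons per new element
        int minIdx = 0;
        for (int i = 1; i < top.length; i++) {
            if (top[i] < top[minIdx]) {
                minIdx = i;
            }
        }
        if (e > top[minIdx]) { // the new element beats last place
            top[minIdx] = e;
        }
    }

    int[] toArray() {
        return Arrays.copyOf(top, size);
    }

    public static void main(String[] args) {
        ArrayTopK top5 = new ArrayTopK(5);
        for (int e : new int[]{100, 1, 2, 5, 6, 7, 34, 9, 3, 4, 5, 8, 1, 0}) {
            top5.add(e);
        }
        int[] result = top5.toArray();
        Arrays.sort(result); // sorted only to make the printed order predictable
        System.out.println(Arrays.toString(result)); // [7, 8, 9, 34, 100]
    }
}
```

The linear scan inside add is the cost that the heap-based version removes.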
The solution is to maintain these K elements in a min-heap. In a min-heap the root is always the smallest element, so the new element only needs to be compared with the root. If it is smaller than the root, the heap does not change; otherwise, the root is replaced with the new element and the heap is adjusted downward (sift down). The adjustment costs O(log2(K)), so the overall efficiency is O(N*log2(K)), which is efficient and has a low storage cost.
With the min-heap, the K-th largest element is also readily available: it is the root of the heap.
With the idea understood, let's look at the code.
Implementation code
Let's implement a simple TopK class. The code looks like this:

    import java.util.Collection;
    import java.util.PriorityQueue;

    public class TopK<E> {
        private PriorityQueue<E> p; // min-heap holding the current top K elements
        private int k;

        public TopK(int k) {
            this.k = k;
            this.p = new PriorityQueue<>(k);
        }

        public void addAll(Collection<? extends E> c) {
            for (E e : c) {
                add(e);
            }
        }

        @SuppressWarnings("unchecked")
        public void add(E e) {
            if (p.size() < k) {
                p.add(e);
                return;
            }
            Comparable<? super E> head = (Comparable<? super E>) p.peek();
            if (head.compareTo(e) > 0) {
                // less than the minimum value in the top K, nothing changes
                return;
            }
            // the new element replaces the original minimum as one of the top K
            p.poll();
            p.add(e);
        }

        public <T> T[] toArray(T[] a) {
            return p.toArray(a);
        }

        public E getKth() {
            return p.peek();
        }
    }
Let's explain briefly.
TopK internally holds a priority queue and k. The constructor accepts a parameter k and creates the PriorityQueue without a comparator, so the elements are assumed to implement the Comparable interface.
The add method adds elements dynamically. If the number of elements is less than k, the element is added directly. Otherwise it is compared with the minimum value and is added only if it is not smaller than the minimum; before it is added, the original minimum is removed. The addAll method calls add in a loop.
The toArray method returns the current K largest elements, and the getKth method returns the K-th largest element.
Let's look at a usage example:
    TopK<Integer> top5 = new TopK<>(5);
    top5.addAll(Arrays.asList(new Integer[]{
        100, 1, 2, 5, 6, 7, 34, 9, 3, 4, 5, 8, 1, 0}));
    System.out.println(Arrays.toString(top5.toArray(new Integer[0])));
    System.out.println(top5.getKth());

The 5 largest elements are retained. For the input above, the output is:

    [7, 8, 34, 100, 9]
    7
The code is fairly simple, so we won't explain it further.
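The variants mentioned at the start of this section, such as the k smallest elements, follow from the same idea with the ordering reversed. The sketch below is one possible way to do that, not part of the original TopK class: ComparatorTopK, toList and the comparator parameter are hypothetical names, and the class assumes k >= 1. Unlike TopK above, it leaves the heap unchanged when the new element is equal to the root.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical variant of TopK that takes the ordering as a Comparator.
class ComparatorTopK<E> {
    private final PriorityQueue<E> heap; // root is the "weakest" of the kept elements
    private final Comparator<? super E> cmp;
    private final int k;

    ComparatorTopK(int k, Comparator<? super E> cmp) { // assumes k >= 1
        this.k = k;
        this.cmp = cmp;
        this.heap = new PriorityQueue<>(k, cmp);
    }

    void add(E e) {
        if (heap.size() < k) {
            heap.add(e);
            return;
        }
        if (cmp.compare(heap.peek(), e) >= 0) {
            return; // e does not beat the weakest kept element
        }
        heap.poll();
        heap.add(e);
    }

    E getKth() {
        return heap.peek();
    }

    List<E> toList() {
        return new ArrayList<>(heap);
    }

    public static void main(String[] args) {
        // With a reversed comparator the heap keeps the 3 smallest elements.
        ComparatorTopK<Integer> bottom3 =
                new ComparatorTopK<Integer>(3, Collections.<Integer>reverseOrder());
        for (int e : new int[]{100, 1, 2, 5, 6, 7, 34, 9, 3, 4, 5, 8, 1, 0}) {
            bottom3.add(e);
        }
        List<Integer> result = bottom3.toList();
        Collections.sort(result);
        System.out.println(result);           // [0, 1, 1]
        System.out.println(bottom3.getKth()); // 1, the 3rd smallest
    }
}
```

Passing a natural-order comparator instead gives the k largest elements again, so a single class covers both directions.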
Finding the median
Basic ideas
The median is the value of the middle element after sorting. If the number of elements is odd, the median is unambiguous. If it is even, the median can be defined in different ways: the smaller of the two middle elements, the larger one, the average of the two, or either one. Here we assume that either one is acceptable.
A simple idea is to sort the elements and then take the middle value. Sorting can use the Arrays.sort() method, with an efficiency of O(N*log2(N)).
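A minimal sketch of the sort-based median. The names MedianBySort and median are hypothetical; the sketch assumes a non-empty int array and, as one of the permitted choices for an even count, returns the smaller of the two middle elements.

```java
import java.util.Arrays;

// Hypothetical helper: median by sorting a copy of the data.
class MedianBySort {
    // Assumes data.length >= 1.
    // For an even count this returns the lower of the two middle elements.
    static int median(int[] data) {
        int[] copy = Arrays.copyOf(data, data.length);
        Arrays.sort(copy);                  // O(N*log2(N))
        return copy[(copy.length - 1) / 2]; // middle index, rounded down
    }

    public static void main(String[] args) {
        System.out.println(median(new int[]{34, 90, 67, 45, 1})); // 45
        System.out.println(median(new int[]{34, 90, 67, 45}));    // 45 (lower of 45 and 67)
    }
}
```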
However, this also requires that all elements are known in advance rather than added dynamically. If elements keep arriving, how do we get the median of the elements received so far?
We can use two heaps, a max-heap and a min-heap, as follows:
- Suppose the current median is m. The max-heap maintains the elements <= m and the min-heap maintains the elements >= m, but neither heap contains m itself.
- When a new element e arrives, compare e with m: if e <= m, add it to the max-heap; otherwise add it to the min-heap.
- After the second step, if the numbers of elements in the min-heap and the max-heap differ by 2 or more, add m to the heap with fewer elements, then remove the root of the heap with more elements and assign it to m.
Let's walk through an example. Suppose the input elements are:
34, 90, 67, 45, 1
After the first element is entered, m is 34.
When the second element is entered, 90 is greater than 34, so it is added to the min-heap and the median stays the same. The max-heap is empty, m is 34, and the min-heap holds 90.
When the third element is entered, 67 is greater than 34, so it is added to the min-heap. The min-heap now has 2 elements while the max-heap has none, so the median and the heaps need to be adjusted: the existing median 34 is added to the max-heap, and the min-heap root 67 is removed from the min-heap and assigned to m. The max-heap holds 34, m is 67, and the min-heap holds 90.
When the fourth element is entered, 45 is less than 67, so it is added to the max-heap and the median stays the same. The max-heap holds 45 and 34, m is 67, and the min-heap holds 90.
When the fifth element is entered, 1 is less than 67, so it is added to the max-heap. The max-heap now has 3 elements against 1 in the min-heap, so the median and the heaps need to be adjusted: the existing median 67 is added to the min-heap, and the max-heap root 45 is removed from the max-heap and assigned to m. The max-heap holds 34 and 1, m is 45, and the min-heap holds 67 and 90.
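To check the walkthrough mechanically, the sketch below applies the three steps to the example input and records the state after each element. It is a throwaway illustration for Integer values only: the names MedianTrace, trace and sorted are hypothetical, and heap contents are printed in sorted order rather than in internal heap order.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical tracing helper for the two-heap median idea.
class MedianTrace {
    // Returns one description of the state per input element.
    static List<String> trace(int[] input) {
        PriorityQueue<Integer> minP = new PriorityQueue<>();                              // elements >= m
        PriorityQueue<Integer> maxP = new PriorityQueue<>(11, Collections.reverseOrder()); // elements <= m
        Integer m = null;
        List<String> states = new ArrayList<>();
        for (int e : input) {
            if (m == null) {
                m = e; // the first element is the median
            } else {
                if (e <= m) {
                    maxP.add(e);
                } else {
                    minP.add(e);
                }
                // rebalance when the sizes differ by 2 or more
                if (minP.size() - maxP.size() >= 2) {
                    maxP.add(m);
                    m = minP.poll();
                } else if (maxP.size() - minP.size() >= 2) {
                    minP.add(m);
                    m = maxP.poll();
                }
            }
            states.add("max-heap=" + sorted(maxP) + " m=" + m + " min-heap=" + sorted(minP));
        }
        return states;
    }

    private static List<Integer> sorted(Collection<Integer> c) {
        List<Integer> copy = new ArrayList<>(c);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        for (String state : trace(new int[]{34, 90, 67, 45, 1})) {
            System.out.println(state);
        }
        // last line printed: max-heap=[1, 34] m=45 min-heap=[67, 90]
    }
}
```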
Implementation code
With the basic idea understood, let's implement a simple median class, Median. The code looks like this:

    import java.util.Collection;
    import java.util.Collections;
    import java.util.PriorityQueue;

    public class Median<E> {
        private PriorityQueue<E> minP; // min-heap
        private PriorityQueue<E> maxP; // max-heap
        private E m; // current median

        public Median() {
            this.minP = new PriorityQueue<>();
            this.maxP = new PriorityQueue<>(11, Collections.reverseOrder());
        }

        @SuppressWarnings("unchecked")
        private int compare(E e, E m) {
            Comparable<? super E> cmpr = (Comparable<? super E>) e;
            return cmpr.compareTo(m);
        }

        public void add(E e) {
            if (m == null) { // first element
                m = e;
                return;
            }
            if (compare(e, m) <= 0) {
                // less than or equal to the median, add to the max-heap
                maxP.add(e);
            } else {
                minP.add(e);
            }
            if (minP.size() - maxP.size() >= 2) {
                // the min-heap has too many elements, i.e. too many values above the median:
                // add m to the max-heap, then move the min-heap's root into m
                maxP.add(this.m);
                this.m = minP.poll();
            } else if (maxP.size() - minP.size() >= 2) {
                minP.add(this.m);
                this.m = maxP.poll();
            }
        }

        public void addAll(Collection<? extends E> c) {
            for (E e : c) {
                add(e);
            }
        }

        public E getM() {
            return m;
        }
    }
The code follows the idea closely and is fairly simple, so we won't explain it further. Let's look at a usage example:

    Median<Integer> median = new Median<>();
    List<Integer> list = Arrays.asList(new Integer[]{
        34, 90, 67, 45, 1, 4, 5, 6, 7, 9, 10});
    median.addAll(list);
    System.out.println(median.getM());

The output is the median, 9.
Summary
This section introduced two applications of the heap and PriorityQueue: finding the K largest elements and finding the median. We presented the basic ideas and the implementation code. Compared with sorting, using a heap is not only more efficient but also copes with an unknown data volume and with streaming data, and it can give results in real time.
So far we have covered two implementations of Queue: LinkedList and PriorityQueue. The Java container classes include another queue implementation, ArrayDeque, which is based on an array. We know that, in general, inserting into and deleting from an array is relatively inefficient because elements have to be moved, yet ArrayDeque is very efficient, even more so than LinkedList. How is it implemented? Let's discuss it in the next section.
---------------
To be continued. For the latest articles, follow the public account "old Horse Programming", where Lao Ma explores with you the nature of Java programming and computer technology, from introductory to advanced, in plain terms. Original article, all rights reserved.
Thinking Logic of Computer Programs (47) - Applications of the Heap and PriorityQueue