Thinking Logic of Computer Programs (45)
The previous sections introduced the basic container classes in Java. Each container class has a data structure behind it: ArrayList is a dynamic array, LinkedList is a linked list, HashMap/HashSet are hash tables, and TreeMap/TreeSet are red-black trees. This section describes another data structure: the heap.
Introducing the Heap
We mentioned the heap before, where it referred to the memory area that stores dynamically allocated objects, as opposed to the stack. The heap discussed here is a data structure and has nothing to do with memory areas or allocation.
What is the structure of a heap? We will look at that in detail later. Let's first explain what a heap is used for and why it is worth introducing.
A heap can solve many problems efficiently and conveniently, for example:
- Priority queue: the queue implementation class LinkedList introduced previously processes elements in the order in which they were queued. In practice, however, priorities are often required: each time, we should process the highest-priority element currently in the queue, and a high-priority element should be handled first even if it arrives late.
- Find the K largest elements. The number of elements is not known in advance, the data volume may be large, and the data may even keep arriving, yet we need to know the K largest elements seen so far. Variants of this problem include finding the K smallest elements, finding the K-th largest element, and finding the K-th smallest element.
- Find the median. The median is not the average but the value of the element in the middle after sorting. Again, the data volume may be large and the data may keep arriving.
A heap can also be used for sorting, which is called heap sort. However, other sorting algorithms are generally preferred over heap sort, so we will not cover this application.
Java's container library has a class PriorityQueue, which represents a priority queue and is implemented with a heap. We will introduce it in detail in the next section. For the other two problems, later sections will show with code how to solve them efficiently using a heap.
With so many advantages, what exactly is a heap?
Concept of heap
Full Binary Tree
A heap is first of all a binary tree, and specifically a complete binary tree. What is a complete binary tree? Let's first look at a similar concept, the full binary tree.
In a full binary tree, every node except those in the last layer has two children, and the nodes in the last layer are all leaves with no children. For example, the following binary trees are both full binary trees:
A full binary tree is always a complete binary tree, but a complete binary tree does not require the last layer to be full. If the last layer is not full, however, all of its nodes must be packed to the left and be contiguous from left to right, with no gaps in between. For example, the following binary trees are complete binary trees:
The following are not complete binary trees:
Numbering and Array Storage
In a complete binary tree, each node can be given a number that increases consecutively from 1, from top to bottom and from left to right, as shown below:
A complete binary tree has an important property: given any node, the numbers of its parent and children can be computed quickly from its own number. If a node's number is i, its parent's number is i/2 (integer division), its left child's number is 2*i, and its right child's number is 2*i+1. For example, for node 5, the parent is 5/2, that is, 2; the left child is 2*5, that is, 10; and the right child is 2*5+1, that is, 11.
Why is this property important? It allows the logical, conceptual binary tree to be stored conveniently in an array. The index of an element in the array corresponds to the node number, and the parent-child relationships of the tree are maintained implicitly through the index relationships, so they do not need to be stored separately. For example, when the logical binary tree in the figure above is saved to an array, its structure is:
The parent-child relationship is implicit. For example, for the 5th element, 13, the parent is the 2nd element, 15, the left child is the 10th element, 7, and the right child is the 11th element, 4.
This way of storing a binary tree differs from the TreeMap introduced previously. TreeMap has a separate internal class Entry, which holds three references pointing to the parent node, the left child, and the right child respectively.
The advantages of array storage are clear: it saves space and gives efficient access.
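As a minimal sketch of the index relationships described above, the following Java helpers compute parent and child numbers under the 1-based numbering used in this section. The class and method names (HeapIndex, parent, left, right) are illustrative assumptions and do not come from any library.

```java
// Index arithmetic for a complete binary tree stored in an array,
// using the 1-based numbering from the text (array slot 0 is left unused).
// HeapIndex and its method names are hypothetical, for illustration only.
class HeapIndex {
    static int parent(int i) { return i / 2; }      // integer division
    static int left(int i)   { return 2 * i; }
    static int right(int i)  { return 2 * i + 1; }

    public static void main(String[] args) {
        int i = 5;
        System.out.println("parent=" + parent(i)
                + ", left=" + left(i) + ", right=" + right(i));
        // prints: parent=2, left=10, right=11
    }
}
```

If the array is 0-based instead, the same relationships become (i - 1) / 2 for the parent, 2 * i + 1 for the left child, and 2 * i + 2 for the right child.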
Max Heap/Min Heap
Logically a heap is a complete binary tree, while physically it is stored in an array. Beyond these two points, a heap also has an ordering requirement.
We previously introduced the binary search tree. A binary search tree is completely ordered: each node has a definite predecessor and successor, and there cannot be duplicate elements.
Unlike a binary search tree, a heap may contain duplicate elements, and its elements are not completely ordered. There is, however, an ordering requirement between parent and child nodes. Depending on that ordering, heaps are divided into two kinds: the max heap and the min heap.
In a max heap, no node is greater than its parent. Thus each parent is no less than any of its children, the root is the largest of all nodes, and the root of every subtree is the largest node in that subtree.
The min heap is the opposite: no node is smaller than its parent. Thus each parent is no greater than any of its children, the root is the smallest of all nodes, and the root of every subtree is the smallest node in that subtree.
We can see:
Heap Concept Summary
In summary, a heap is logically a complete binary tree with a specific ordering between parent and child nodes. Heaps are divided into max heaps and min heaps: the root of a max heap is the largest element, and the root of a min heap is the smallest. Physically, a heap is stored in an array.
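The ordering requirement can be expressed as a simple check over the array. The following is a minimal sketch, assuming int elements stored 1-based in a[1..size]; HeapCheck and isMinHeap are hypothetical names, and the sample arrays are made up for illustration.

```java
// Checks the min heap ordering requirement on a 1-based array:
// no node may be smaller than its parent. Names are illustrative only.
class HeapCheck {
    // a[1..size] holds the nodes; a[0] is unused.
    static boolean isMinHeap(int[] a, int size) {
        for (int i = 2; i <= size; i++) {
            if (a[i] < a[i / 2]) {
                return false; // child smaller than its parent
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] ok = {0, 1, 3, 2, 3, 5};  // duplicates (3 and 3) are allowed
        int[] bad = {0, 2, 1, 3};       // 1 is smaller than its parent 2
        System.out.println(isMinHeap(ok, 5));   // prints true
        System.out.println(isMinHeap(bad, 3));  // prints false
    }
}
```

For a max heap the comparison is simply reversed.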
Why can this data structure efficiently solve the problems mentioned earlier? Before answering, we first need to look at how to perform basic operations on a heap and how to preserve the heap property during those operations.
Heap Algorithm
Next, let's look at how to perform basic operations on a heap. The algorithms for max heaps and min heaps are similar, so we use the min heap as the example. First, let's see how to add an element.
Add Element
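If the heap is empty, the new element simply becomes the root. Assume a heap already exists and we need to add an element to it. The basic steps are as follows:
1. Add the element at the last position.
2. Compare it with its parent. If it is not smaller than the parent, the heap property holds and the process ends; otherwise swap it with the parent and repeat the comparison upward until the heap property holds or the root is reached.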
Let's look at an example. The following is the initial structure:
Add element 3. After step 1, the structure becomes:
3 is smaller than its parent 8, which violates the min heap property, so it is swapped with the parent, giving:
After the swap, 3 is smaller than its parent 6, so they are swapped as well, giving:
After this swap, 3 is smaller than its parent 4, which is also the root. Swap again, giving:
The adjustment is now complete, and the tree satisfies the heap property again.
From this process we can see that adding an element takes at most as many comparisons and swaps as the height of the tree, that is, about log2(N), where N is the number of nodes.
This process of comparing and swapping from the bottom up until the tree satisfies the heap property again is called siftup.
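The add operation and siftup can be sketched in Java as follows. This is a minimal illustration, not the implementation of any library class: MinHeapAdd, its fields, and its methods are hypothetical, elements are ints stored 1-based, and the sample values are made up so that adding 3 passes the parents 8, 6, and 4 as in the example above.

```java
import java.util.Arrays;

// Minimal min heap supporting add only (hypothetical class for illustration).
class MinHeapAdd {
    int[] a = new int[16]; // nodes live in a[1..size]; a[0] is unused
    int size = 0;

    void add(int value) {
        if (size + 1 == a.length) {
            a = Arrays.copyOf(a, a.length * 2); // grow when full
        }
        a[++size] = value; // step 1: place at the last position
        siftUp(size);      // step 2: restore the heap property
    }

    // Swap the node upward while it is smaller than its parent.
    void siftUp(int i) {
        while (i > 1 && a[i] < a[i / 2]) {
            int tmp = a[i];
            a[i] = a[i / 2];
            a[i / 2] = tmp;
            i /= 2;
        }
    }

    int[] toArray() {
        return Arrays.copyOfRange(a, 1, size + 1);
    }

    public static void main(String[] args) {
        MinHeapAdd heap = new MinHeapAdd();
        for (int v : new int[]{4, 6, 7, 8, 9, 10, 11}) {
            heap.add(v);
        }
        heap.add(3); // swaps with 8, then 6, then the root 4
        System.out.println(Arrays.toString(heap.toArray()));
        // prints: [3, 4, 7, 6, 9, 10, 11, 8]
    }
}
```

The loop in siftUp runs at most once per level of the tree, which matches the log2(N) bound stated above.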
Delete Element from the Head
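In a queue, elements are usually removed from the head. In Java, the priority queue is implemented with a heap, so let's look at how to delete the head of a heap. The basic steps are as follows:
1. Replace the head with the last element and remove the last element.
2. Compare the new head with its smaller child. If it is not greater than that child, the heap property holds and the process ends; otherwise swap it with the smaller child and repeat the comparison downward until the heap property holds or a leaf is reached.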
Let's look at an example. The following is the initial structure:
Perform the first step and replace the head with the last element, giving:
Now the root 16 is greater than its children, so it is swapped with the smaller child 6, giving:
16 is still greater than its children, so it is swapped with the smaller child 8, giving:
At this point the heap property is satisfied again. This process of comparing and swapping from the top down is called siftdown.
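Deleting the head can be sketched in Java as follows. This is a minimal illustration under stated assumptions: MinHeapPoll and its members are hypothetical names, elements are ints stored 1-based, the constructor trusts that its input already satisfies the min heap property, and the sample values are made up so that 16 is swapped with 6 and then 8 as in the example above.

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

// Minimal min heap supporting head removal (hypothetical class for illustration).
class MinHeapPoll {
    int[] a;   // nodes live in a[1..size]; a[0] is unused
    int size;

    // Assumes the given values already satisfy the min heap property.
    MinHeapPoll(int... heapOrdered) {
        size = heapOrdered.length;
        a = new int[size + 1];
        System.arraycopy(heapOrdered, 0, a, 1, size);
    }

    int poll() {
        if (size == 0) {
            throw new NoSuchElementException("heap is empty");
        }
        int head = a[1];
        a[1] = a[size--]; // step 1: move the last element to the head
        siftDown(1);      // step 2: restore the heap property
        return head;
    }

    // Swap the node downward while it is greater than its smaller child.
    void siftDown(int i) {
        while (2 * i <= size) {
            int child = 2 * i;
            if (child + 1 <= size && a[child + 1] < a[child]) {
                child++; // the right child is the smaller one
            }
            if (a[i] <= a[child]) {
                break;
            }
            int tmp = a[i];
            a[i] = a[child];
            a[child] = tmp;
            i = child;
        }
    }

    int[] toArray() {
        return Arrays.copyOfRange(a, 1, size + 1);
    }

    public static void main(String[] args) {
        MinHeapPoll heap = new MinHeapPoll(4, 6, 7, 8, 9, 10, 16);
        System.out.println(heap.poll());                     // prints 4
        System.out.println(Arrays.toString(heap.toArray()));
        // prints: [6, 8, 7, 16, 9, 10]
    }
}
```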
Delete an Element from the Middle
What if a node in the middle needs to be deleted? As with deleting the head, the element to be deleted is first replaced with the last element. Then, if the replacement is greater than one of its children, it must be adjusted downward (siftdown); otherwise, if it is smaller than its parent, it must be adjusted upward (siftup).
Let's look at an example: delete the node with value 21. The first step is shown below:
After the replacement, 6 has no children and is smaller than its parent 12, so the siftup process is executed. The final result is:
Let's look at another example: delete the node with value 9. The first step is shown below:
After the replacement, 11 is greater than its right child 10, so the siftdown process is executed, which swaps 11 with 10.
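Deleting from the middle can be sketched by combining both adjustments. This is a minimal illustration under stated assumptions: MinHeapRemove and its members are hypothetical names, elements are ints stored 1-based, the constructor trusts that its input is already a min heap, and the sample values are made up to reproduce the two cases above (deleting 21 triggers siftup, deleting 9 triggers siftdown).

```java
import java.util.Arrays;

// Minimal min heap supporting removal at a given position
// (hypothetical class for illustration).
class MinHeapRemove {
    int[] a;   // nodes live in a[1..size]; a[0] is unused
    int size;

    // Assumes the given values already satisfy the min heap property.
    MinHeapRemove(int... heapOrdered) {
        size = heapOrdered.length;
        a = new int[size + 1];
        System.arraycopy(heapOrdered, 0, a, 1, size);
    }

    void removeAt(int i) {
        if (i < 1 || i > size) {
            throw new IndexOutOfBoundsException("no node " + i);
        }
        int last = a[size--];
        if (i > size) {
            return; // the deleted node was the last one
        }
        a[i] = last;
        siftDown(i); // moves the element down if it is greater than a child
        siftUp(i);   // does nothing if siftDown already moved the element
    }

    void siftUp(int i) {
        while (i > 1 && a[i] < a[i / 2]) {
            swap(i, i / 2);
            i /= 2;
        }
    }

    void siftDown(int i) {
        while (2 * i <= size) {
            int child = 2 * i;
            if (child + 1 <= size && a[child + 1] < a[child]) {
                child++;
            }
            if (a[i] <= a[child]) {
                break;
            }
            swap(i, child);
            i = child;
        }
    }

    void swap(int x, int y) {
        int tmp = a[x];
        a[x] = a[y];
        a[y] = tmp;
    }

    int[] toArray() {
        return Arrays.copyOfRange(a, 1, size + 1);
    }

    public static void main(String[] args) {
        MinHeapRemove h1 = new MinHeapRemove(2, 12, 4, 15, 21, 5, 6);
        h1.removeAt(5); // deletes 21; 6 replaces it and sifts up past 12
        System.out.println(Arrays.toString(h1.toArray()));
        // prints: [2, 6, 4, 15, 12, 5]

        MinHeapRemove h2 = new MinHeapRemove(3, 9, 5, 12, 10, 7, 11);
        h2.removeAt(2); // deletes 9; 11 replaces it and sifts down past 10
        System.out.println(Arrays.toString(h2.toArray()));
        // prints: [3, 10, 5, 12, 11, 7]
    }
}
```

Calling siftDown and then siftUp is safe because at most one of them moves anything: if the element moved down, the child that took its place was already no smaller than the parent above it.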
Build the initial heap
How can an unordered array be turned into a min heap? The process of turning an ordinary unordered array into a heap is called heapify.
The basic idea is to call siftdown on each node, from the last non-leaf node back to the root, that is, to work from the bottom up. First each smallest subtree is made into a heap; then the left and right subtrees are merged with their parent into a larger heap. Because each subtree is already a heap, the only adjustment needed is to run siftdown on the parent. Merging and adjusting continue in this way until the root is reached. The pseudocode of this algorithm is:
void heapify() {
    for (int i = size / 2; i >= 1; i--) {
        siftdown(i);
    }
}
size is the number of nodes, and node numbers start from 1. size/2 is the number of the last non-leaf node, which is the first node to be processed.
The time complexity of this construction is O(N), where N is the number of nodes; we omit the proof here.
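A runnable version of the pseudocode above might look like the following sketch. It assumes int elements stored 1-based in a[1..size] and a min heap; the class name Heapify, the static method signatures, and the sample array are illustrative assumptions.

```java
import java.util.Arrays;

// Builds a min heap in place from an unordered 1-based array
// (hypothetical class for illustration).
class Heapify {
    // a[1..size] holds the nodes; a[0] is unused.
    static void heapify(int[] a, int size) {
        // Start at the last non-leaf node and work back to the root.
        for (int i = size / 2; i >= 1; i--) {
            siftDown(a, size, i);
        }
    }

    // Swap the node downward while it is greater than its smaller child.
    static void siftDown(int[] a, int size, int i) {
        while (2 * i <= size) {
            int child = 2 * i;
            if (child + 1 <= size && a[child + 1] < a[child]) {
                child++;
            }
            if (a[i] <= a[child]) {
                break;
            }
            int tmp = a[i];
            a[i] = a[child];
            a[child] = tmp;
            i = child;
        }
    }

    public static void main(String[] args) {
        int[] a = {0, 9, 4, 7, 1, 3}; // slot 0 is a placeholder
        heapify(a, 5);
        System.out.println(Arrays.toString(a));
        // prints: [0, 1, 3, 7, 4, 9]
    }
}
```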
Search and Traversal
A heap has no special search algorithm: searching means scanning the array from beginning to end, with O(N) efficiency.
Traversal is similar. The heap is an array, so traversing the heap is traversing the array. The first element is the maximum or minimum, but the elements after it are in no particular order.
Note, however, that if elements are removed from the head one by one, the heap guarantees that the output is ordered.
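The ordered-output behavior can be observed with the PriorityQueue class mentioned earlier. The following is a small sketch; HeapDrain and drainInOrder are hypothetical names, and the input values are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Removes elements from the head of a heap one by one and collects them
// (hypothetical class for illustration).
class HeapDrain {
    static List<Integer> drainInOrder(List<Integer> values) {
        // With natural ordering, PriorityQueue yields its smallest element first.
        PriorityQueue<Integer> heap = new PriorityQueue<>(values);
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            out.add(heap.poll()); // always removes the current smallest element
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drainInOrder(Arrays.asList(5, 1, 4, 1, 3)));
        // prints: [1, 1, 3, 4, 5]
    }
}
```

By contrast, iterating over a PriorityQueue without removing elements is not guaranteed to visit them in any particular order, which corresponds to the array traversal described above.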
Algorithm Summary
The above are the main algorithms for heap operations:
- Adding and deleting elements rely on two key processes to maintain the heap property: siftup and siftdown. Both have O(log2(N)) efficiency.
- Heapify builds a heap from an unordered array by looping from the bottom up, with O(N) efficiency.
- Search and traversal are simply array search and traversal, with O(N) efficiency.
Summary
This section describes the basic concepts and algorithms of the heap data structure.
The heap is a magical data structure. Conceptually it is a tree, but it is stored in an array. Parent and child nodes follow a specific ordering, and the root holds the maximum or minimum value. This makes building, adding, and deleting highly efficient, so the heap can solve many problems efficiently.
But how is the heap implemented in Java? And how exactly are the problems mentioned at the beginning of this section solved? Let's continue exploring in the following sections.
---------------
For more articles, follow the public account "lauma says programming", where ma explores the essence of Java programming and computer technology with you, from beginner to advanced. Original work; all rights reserved.