Introduction to Algorithms, Chapter 6: Heapsort

Source: Internet
Author: User
Heapsort is an in-place sorting algorithm. It also introduces another algorithm design technique: using a data structure, here the heap, to manage information during the execution of the algorithm. The heap data structure is useful not only for sorting but also for building an efficient priority queue.

6.1 Heaps

The (binary) heap data structure is an array object, as shown in Figure 6-1. It can be viewed as a complete binary tree: each node of the tree corresponds to the array element that stores the node's value, and every level of the tree is completely filled except possibly the last. A heap A has two attributes: length[A] is the number of elements in the array, and heap-size[A] is the number of heap elements stored in the array. That is, although A[1..length[A]] may all contain valid values, no element after A[heap-size[A]] belongs to the heap.

 

The root of the tree is A[1]. Given the index i of a node, the indices of its parent PARENT(i), left child LEFT(i), and right child RIGHT(i) can be computed simply:

PARENT(i) = ⌊i/2⌋; LEFT(i) = 2i; RIGHT(i) = 2i + 1.
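As a minimal sketch, the index arithmetic above can be written in Python. The function names mirror the text; keeping the 1-based convention by leaving slot 0 of the list unused is an assumption of this sketch, and the sample array is a hypothetical example.

```python
def parent(i: int) -> int:
    """Index of the parent of node i (1-based); // gives floor(i / 2)."""
    return i // 2


def left(i: int) -> int:
    """Index of the left child of node i (1-based)."""
    return 2 * i


def right(i: int) -> int:
    """Index of the right child of node i (1-based)."""
    return 2 * i + 1


# Hypothetical example heap; slot 0 is a placeholder so that the root is a[1].
a = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

children_of_2 = (a[left(2)], a[right(2)])   # values stored in nodes 4 and 5
parent_of_9 = a[parent(9)]                  # value stored in node 4
```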

 

There are two kinds of binary heap: max-heaps and min-heaps. In a max-heap, every node i other than the root satisfies A[PARENT(i)] >= A[i]. A min-heap is the opposite: A[PARENT(i)] <= A[i]. The heapsort algorithm uses a max-heap. Min-heaps are commonly used to build priority queues.

 

A heap can be viewed as a tree. The height of a node in the heap is the number of edges on the longest simple downward path from that node to a leaf, and the height of the heap is the height of its root. Because a heap of n elements is based on a complete binary tree, its height is Θ(lg n).

 

The rest of this chapter presents some basic procedures and shows how they are used in a sorting algorithm and in a priority-queue data structure.

  • MAX-HEAPIFY, which runs in O(lg n) time
  • BUILD-MAX-HEAP, which runs in linear time and builds a max-heap from an unordered input array
  • HEAPSORT, which runs in O(n lg n) time and sorts an array in place
  • MAX-HEAP-INSERT, HEAP-EXTRACT-MAX, HEAP-INCREASE-KEY, and HEAP-MAXIMUM, which run in O(lg n) time and allow the heap to be used as a priority queue

Exercise:

6.1.7 Prove: when an n-element heap is stored in an array, the leaves are the nodes indexed ⌊n/2⌋ + 1, ⌊n/2⌋ + 2, ..., n.

For any index k >= ⌊n/2⌋ + 1, LEFT(k) = 2k > n and RIGHT(k) = 2k + 1 > n, so node k has no children; that is, all these nodes are leaves. Conversely, for k <= ⌊n/2⌋ we have LEFT(k) = 2k <= n, so node k has at least one child and is not a leaf.

 

 

6.2 Maintaining the heap property

The input to MAX-HEAPIFY is an array A and an index i. It assumes that the binary trees rooted at LEFT(i) and RIGHT(i) are max-heaps, but that A[i] may be smaller than its children, violating the max-heap property. MAX-HEAPIFY lets A[i] sink down in the heap so that the subtree rooted at i becomes a max-heap.

 

MAX-HEAPIFY(A, i)

    l <-- LEFT(i)
    r <-- RIGHT(i)
    if l <= heap-size[A] and A[l] > A[i]
        then largest <-- l
        else largest <-- i
    if r <= heap-size[A] and A[r] > A[largest]
        then largest <-- r
    if largest != i
        then exchange A[i] <--> A[largest]
             MAX-HEAPIFY(A, largest)
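The procedure above translates almost line for line into Python. This is only a sketch under two assumptions that differ from the pseudocode: indices are 0-based (so the children of i are 2i + 1 and 2i + 2), and the heap size is passed as an explicit parameter instead of being an attribute of the array.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Let a[i] sink until the subtree rooted at i is a max-heap.

    Uses 0-based indices. Assumes the subtrees rooted at both children
    of i already satisfy the max-heap property.
    """
    left = 2 * i + 1
    right = 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


# Hypothetical input: only the node at index 1 (value 4) violates the property.
a = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
max_heapify(a, 1, len(a))
```

After the call, the value 4 has moved down two levels and the whole list satisfies the max-heap property.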

 

The running time of MAX-HEAPIFY is the Θ(1) time needed to fix up the relationship among A[i], A[LEFT(i)], and A[RIGHT(i)], plus the time of the recursive call on the subtree rooted at one of the children of i. A child's subtree has at most 2n/3 nodes (the worst case occurs when the bottom level of the tree is exactly half full), so the running time of MAX-HEAPIFY satisfies T(n) <= T(2n/3) + Θ(1). By case 2 of the master theorem, T(n) = O(lg n). Equivalently, MAX-HEAPIFY on a node of height h runs in O(h) time.

 

6.3 Building a heap

We can use MAX-HEAPIFY in a bottom-up manner to turn an array A[1..n] into a max-heap.

 

BUILD-MAX-HEAP(A)

    heap-size[A] <-- length[A]
    for i <-- ⌊length[A]/2⌋ downto 1      ; see exercise 6.1.7
        do MAX-HEAPIFY(A, i)

 

A simple upper bound on the running time is O(n lg n): there are O(n) calls to MAX-HEAPIFY, each costing O(lg n). In fact a tighter bound is possible, because the time MAX-HEAPIFY takes depends on the height of the node it is called on, and most nodes have small height. The tight bound is O(n), so a max-heap can be built from an unordered array in linear time.

PS: For the detailed calculation, see page 76 of the original book.
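The linear bound can be sketched as follows. The derivation relies on two standard facts that are assumed here rather than proved: an n-element heap has height ⌊lg n⌋ and at most ⌈n/2^(h+1)⌉ nodes of height h, and MAX-HEAPIFY on a node of height h costs O(h). The infinite sum is evaluated with the identity Σ h x^h = x/(1-x)^2 at x = 1/2.

```latex
\begin{align*}
T(n) &= \sum_{h=0}^{\lfloor \lg n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h)
      = O\!\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^{h}} \right), \\
\sum_{h=0}^{\infty} \frac{h}{2^{h}} &= \frac{1/2}{(1 - 1/2)^{2}} = 2, \\
T(n) &= O(2n) = O(n).
\end{align*}
```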

 

6.4 The heapsort algorithm

Heapsort begins by using BUILD-MAX-HEAP to build a max-heap on the input array A[1..n]. Because the maximum element of the array is at the root A[1], it can be put into its correct final position by exchanging it with A[n]. If node n is now removed from the heap (by decrementing heap-size[A]), A[1..n-1] is easily made into a max-heap again: the children of the root are still max-heaps, and only the new root may violate the max-heap property, which a single call to MAX-HEAPIFY(A, 1) restores. The algorithm repeats this process as the heap size shrinks from n - 1 down to 2.

 

HEAPSORT(A)

    BUILD-MAX-HEAP(A)
    for i <-- length[A] downto 2
        do exchange A[1] <--> A[i]
           heap-size[A] <-- heap-size[A] - 1
           MAX-HEAPIFY(A, 1)
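Putting the pieces together, a possible Python rendering of heapsort looks like this. It is a sketch under stated assumptions: indices are 0-based, the helper name sift_down is hypothetical, and MAX-HEAPIFY is written as a loop instead of a recursion, which changes nothing about the comparisons it makes.

```python
def sift_down(a: list, i: int, heap_size: int) -> None:
    """Iterative MAX-HEAPIFY on 0-based indices."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < heap_size and a[left] > a[largest]:
            largest = left
        if right < heap_size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_max_heap(a: list) -> None:
    """Heapify every internal node, from the last one up to the root."""
    for i in range(len(a) // 2 - 1, -1, -1):
        sift_down(a, i, len(a))


def heapsort(a: list) -> None:
    """Sort a in place in ascending order."""
    build_max_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]   # move the current maximum to its final slot
        sift_down(a, 0, end)          # the heap now occupies a[0:end]


data = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
heapsort(data)
```

The loop variable end plays the role of heap-size[A]: everything at or beyond it is already in its final sorted position.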

 

Exercise:

6.4.5 Prove: when all elements are distinct, the best-case running time of heapsort is Ω(n lg n).

Analysis: the running time of heapsort has three parts. BUILD-MAX-HEAP takes O(n) time. The exchange of elements is executed n - 1 times, for O(n) time in total. MAX-HEAPIFY is also executed n - 1 times, and each time it is called the root holds a small element, because the new root is always obtained by exchanging the root of the max-heap with A[heap-size[A]], which is a leaf. The running time of each call to MAX-HEAPIFY is therefore taken to be Ω(lg n'), where n' is the current heap size. The best-case running time is thus O(n) + Ω(Σ (k = 2..n) lg k) = Ω(lg n!) = Ω(n lg n).

 

PS: This exercise shows that heapsort is a steady sorting algorithm: as long as the elements are distinct, no special input makes its running time markedly worse or better.

 

 

6.5 Priority queues

Although heapsort is an elegant algorithm, in practice a good implementation of quicksort usually beats it. The heap data structure itself is nonetheless very useful, and one of its most common applications is as an efficient priority queue. As with heaps, there are two kinds of priority queue: max-priority queues and min-priority queues.

 

A priority queue is a data structure for maintaining a set S of elements, each with an associated value called a key. A max-priority queue supports the following operations:

INSERT(S, x): inserts the element x into the set S. This operation can be written as S <-- S ∪ {x}.

MAXIMUM(S): returns the element of S with the largest key.

EXTRACT-MAX(S): removes and returns the element of S with the largest key.

INCREASE-KEY(S, x, k): increases the key of element x to the new value k, where k is assumed to be at least as large as x's current key.

 

One application of max-priority queues is scheduling jobs on a time-shared computer.

 

A min-priority queue supports the operations INSERT, MINIMUM, EXTRACT-MIN, and DECREASE-KEY. It can be used in an event-driven simulator.

 

HEAP-MAXIMUM(A)

    return A[1]

 

HEAP-EXTRACT-MAX(A)

    if heap-size[A] < 1
        then error
    max <-- A[1]
    A[1] <-- A[heap-size[A]]
    heap-size[A] <-- heap-size[A] - 1
    MAX-HEAPIFY(A, 1)
    return max

 

HEAP-INCREASE-KEY(A, i, key)

    if key < A[i]
        then error
    A[i] <-- key
    while i > 1 and A[PARENT(i)] < A[i]
        do exchange A[i] <--> A[PARENT(i)]
           i <-- PARENT(i)

 

 

MAX-HEAP-INSERT(A, key)

    heap-size[A] <-- heap-size[A] + 1
    A[heap-size[A]] <-- -∞
    HEAP-INCREASE-KEY(A, heap-size[A], key)

MAX-HEAP-INSERT first expands the max-heap by adding a new leaf whose key is -∞, then calls HEAP-INCREASE-KEY to set the key of the new node to its correct value and to maintain the max-heap property.
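The four priority-queue procedures can be gathered into one small Python class. This is a sketch, not a reference implementation: the class name MaxPriorityQueue and its method names are hypothetical, indices are 0-based, keys are assumed to be numbers (so that -infinity compares correctly), and the Python list's own length stands in for heap-size[A].

```python
import math


class MaxPriorityQueue:
    """Max-priority queue of numeric keys stored in a 0-based list."""

    def __init__(self) -> None:
        self.a = []

    def maximum(self):
        if not self.a:
            raise IndexError("heap underflow")
        return self.a[0]

    def extract_max(self):
        if not self.a:
            raise IndexError("heap underflow")
        top = self.a[0]
        last = self.a.pop()          # shrink the heap by one element
        if self.a:
            self.a[0] = last         # move the last leaf to the root
            self._sift_down(0)
        return top

    def increase_key(self, i: int, key) -> None:
        if key < self.a[i]:
            raise ValueError("new key is smaller than current key")
        self.a[i] = key
        # Float the key up while it is larger than its parent.
        while i > 0 and self.a[(i - 1) // 2] < self.a[i]:
            p = (i - 1) // 2
            self.a[i], self.a[p] = self.a[p], self.a[i]
            i = p

    def insert(self, key) -> None:
        self.a.append(-math.inf)     # new leaf with key -infinity
        self.increase_key(len(self.a) - 1, key)

    def _sift_down(self, i: int) -> None:
        n = len(self.a)
        while True:
            l, r, largest = 2 * i + 1, 2 * i + 2, i
            if l < n and self.a[l] > self.a[largest]:
                largest = l
            if r < n and self.a[r] > self.a[largest]:
                largest = r
            if largest == i:
                return
            self.a[i], self.a[largest] = self.a[largest], self.a[i]
            i = largest


pq = MaxPriorityQueue()
for k in [5, 3, 17, 10, 84, 19, 6, 22, 9]:
    pq.insert(k)
drained = [pq.extract_max() for _ in range(3)]
```

Here drained holds the three largest keys in decreasing order, and pq.maximum() then reports the largest key that remains.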

 

Exercise:

6.5.8 Give an O(n lg k)-time algorithm to merge k sorted linked lists into one sorted linked list, where n is the total number of elements in all the input lists.

Analysis: first, treat the k linked lists as k nodes, with the value of each list's first element as its key, and build a min-heap of heap-size k. Then repeatedly remove the first element of the list at the root of the heap and append it to the end of the output list. The key at the root has now changed to the list's next element, so run MIN-HEAPIFY on the root; if that list has become empty, remove it from the heap instead, as in HEAP-EXTRACT-MIN. Each of the n elements costs one O(lg k) heap operation, for O(n lg k) in total.
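The merging strategy can be sketched with Python's standard heapq module, which provides a binary min-heap on a list. Two simplifications are assumed here: ordinary Python lists (read through iterators) stand in for the linked lists, and heap entries are (value, list index) pairs so that ties are broken by list index. The function name merge_k_sorted is hypothetical.

```python
import heapq

_EXHAUSTED = object()   # sentinel marking a list with no elements left


def merge_k_sorted(lists):
    """Merge k sorted sequences into one sorted list in O(n lg k) time."""
    iters = [iter(seq) for seq in lists]
    heap = []
    for idx, it in enumerate(iters):
        head = next(it, _EXHAUSTED)
        if head is not _EXHAUSTED:
            heap.append((head, idx))
    heapq.heapify(heap)                          # build the min-heap in O(k)

    merged = []
    while heap:
        value, idx = heap[0]                     # smallest head among the lists
        merged.append(value)
        nxt = next(iters[idx], _EXHAUSTED)
        if nxt is _EXHAUSTED:
            heapq.heappop(heap)                  # that list is used up
        else:
            heapq.heapreplace(heap, (nxt, idx))  # new key at the root, then sift down
    return merged


result = merge_k_sorted([[1, 4, 9], [2, 3, 10], [], [5]])
```

heapq.heapreplace corresponds to the "change the root key, then MIN-HEAPIFY" step of the analysis, and heapq.heappop handles the case of an exhausted list.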

 

 

Problems:

6-1 Building a heap using insertion

The procedure BUILD-MAX-HEAP in section 6.3 can also be implemented by repeatedly calling MAX-HEAP-INSERT to insert the elements into the heap. Consider the following implementation:

BUILD-MAX-HEAP'(A)

    heap-size[A] <-- 1
    for i <-- 2 to length[A]
        do MAX-HEAP-INSERT(A, A[i])

 

a) Do the procedures BUILD-MAX-HEAP and BUILD-MAX-HEAP' always create the same heap when run on the same input array?

No. Counterexample: on the input sequence 1, 2, 3, BUILD-MAX-HEAP produces 3, 2, 1, while BUILD-MAX-HEAP' produces 3, 1, 2.
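The counterexample can be checked mechanically. The following sketch implements both construction methods on 0-based Python lists; the function names are hypothetical, and each function returns a new list so that the two results can be compared side by side.

```python
def build_max_heap(a):
    """Bottom-up construction (section 6.3); returns a new list."""
    a = list(a)
    n = len(a)
    for start in range(n // 2 - 1, -1, -1):
        i = start
        while True:                      # iterative MAX-HEAPIFY
            l, r, largest = 2 * i + 1, 2 * i + 2, i
            if l < n and a[l] > a[largest]:
                largest = l
            if r < n and a[r] > a[largest]:
                largest = r
            if largest == i:
                break
            a[i], a[largest] = a[largest], a[i]
            i = largest
    return a


def build_max_heap_by_insertion(a):
    """Construction by repeated insertion (BUILD-MAX-HEAP'); returns a new list."""
    heap = []
    for key in a:
        heap.append(key)
        i = len(heap) - 1
        # Float the new key up while it is larger than its parent.
        while i > 0 and heap[(i - 1) // 2] < heap[i]:
            p = (i - 1) // 2
            heap[i], heap[p] = heap[p], heap[i]
            i = p
    return heap


bottom_up = build_max_heap([1, 2, 3])
by_insertion = build_max_heap_by_insertion([1, 2, 3])
```

Both results are valid max-heaps, but they arrange the two children of the root differently.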

 

b) Prove: in the worst case, BUILD-MAX-HEAP' requires Θ(n lg n) time.

In the for loop, the call to MAX-HEAP-INSERT that grows the heap to size k takes Θ(lg k) time in the worst case, for k = 2..n; this happens, for example, when the input is in increasing order, so that every new key rises all the way to the root. The total running time is therefore Σ (k = 2..n) lg k = lg n! = Θ(n lg n).

So the time complexity of BUILD-MAX-HEAP' is worse than that of BUILD-MAX-HEAP, and the example in a) shows that the former performs some unnecessary moves.
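The step lg n! = Θ(n lg n) used in part b) can be justified without Stirling's approximation. The sketch below bounds the sum from above by replacing every term with the largest one, and from below by keeping only the terms with k >= ⌈n/2⌉, of which there are at least n/2.

```latex
\begin{align*}
\lg(n!) = \sum_{k=1}^{n} \lg k &\le \sum_{k=1}^{n} \lg n = n \lg n, \\
\lg(n!) = \sum_{k=1}^{n} \lg k &\ge \sum_{k=\lceil n/2 \rceil}^{n} \lg k
  \ge \frac{n}{2} \lg \frac{n}{2} = \frac{n}{2} (\lg n - 1).
\end{align*}
```

Together the two bounds give lg n! = Θ(n lg n).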

 

 

6-3 Young matrices

An m × n Young matrix (called a Young tableau in the book) is an m × n matrix in which the entries of each row are sorted from left to right and the entries of each column are sorted from top to bottom. Some entries of a Young matrix may be ∞, which are treated as nonexistent elements. A Young matrix can therefore hold r <= mn finite numbers.

A) omitted

B) omitted

c) Give an algorithm that implements EXTRACT-MIN on a nonempty m × n Young matrix and runs in O(m + n) time.

Analysis: a Young matrix is in some ways like a skewed min-heap, so EXTRACT-MIN can be implemented along the lines of HEAP-EXTRACT-MIN. There is one difference, though. The size of a Young matrix is fixed and ∞ entries represent the missing elements, whereas a heap is stored linearly and has a heap-size attribute that records the number of valid elements. As a result, a Young matrix has no notion of a "last element" to move to the root. Instead, we can take an approach in the spirit of HEAP-INCREASE-KEY: first take out the element Y[1, 1], then set Y[1, 1] to ∞, and finally restore the Young matrix property with an adjustment procedure. The adjustment follows the idea of MIN-HEAPIFY: exchange Y[1, 1] with the smaller of Y[1, 2] and Y[2, 1]. Writing Y(a, b, c, d) for the submatrix whose top-left entry is [a, b] and whose bottom-right entry is [c, d], the problem then reduces to adjusting either the m × (n-1) submatrix Y(1, 2, m, n) or the (m-1) × n submatrix Y(2, 1, m, n), and so on recursively. Each step moves one row down or one column right, so there are at most m + n - 2 steps and the running time is O(m + n).

First, we prove that the adjustment algorithm is correct. Suppose the matrix satisfies the Young matrix property everywhere except at entry [1, 1]. Following the algorithm above, assume that Y[1, 1] is exchanged with Y[1, 2], which means that originally Y[1, 2] <= Y[2, 1]. After the exchange, the first column is still sorted from top to bottom. The problem is now reduced to adjusting the submatrix Y(1, 2, m, n). Assume that this adjustment finishes and makes Y(1, 2, m, n) a Young matrix. Is the whole matrix then a Young matrix? The key question is whether every row is still sorted from left to right, that is, whether the vector A_k(1, n) (row k, columns 1 to n) is sorted. Because the submatrix Y(1, 2, m, n) is a Young matrix, A_k(2, n) is sorted. During the adjustment an element only ever moves up or to the left, so each position of A_k(2, n) can only receive a larger element than it held before. For k >= 2, Y[k, 1] was no larger than the old Y[k, 2]; for k = 1, the new Y[1, 1] is the old Y[1, 2]. In both cases the first entry of the row is no larger than the new Y[k, 2], so A_k(1, n) is sorted. The case in which Y[1, 1] is exchanged with Y[2, 1] is symmetric. QED.

 

The pseudocode of the algorithm is as follows:

EXTRACT-YOUNG-MIN(Y, m, n)

    x <-- Y[1, 1]
    Y[1, 1] <-- ∞
    YOUNGIFY(Y, 1, 1, m, n)
    return x

 

YOUNGIFY(Y, a, b, m, n)

    minx <-- a
    miny <-- b
    if b + 1 <= n and Y[a, b + 1] < Y[minx, miny]
        then minx <-- a
             miny <-- b + 1
    if a + 1 <= m and Y[a + 1, b] < Y[minx, miny]
        then minx <-- a + 1
             miny <-- b
    if minx != a or miny != b
        then exchange Y[a, b] <--> Y[minx, miny]
             YOUNGIFY(Y, minx, miny, m, n)
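The extraction procedure can be tried out in Python. This is a sketch under a few assumptions: the matrix is a list of m rows of equal length n with 0-based indices, math.inf plays the role of ∞, the dimensions are read from the list instead of being passed as m and n, and the adjustment is written as a loop. Unlike a naive transcription, it checks the row and column bounds before looking at a neighbor. The sample matrix is a hypothetical example.

```python
import math


def youngify(y, a, b):
    """Restore the Young matrix order after y[a][b] was replaced by a larger value."""
    m, n = len(y), len(y[0])
    while True:
        min_a, min_b = a, b
        if b + 1 < n and y[a][b + 1] < y[min_a][min_b]:
            min_a, min_b = a, b + 1          # right neighbor is smaller
        if a + 1 < m and y[a + 1][b] < y[min_a][min_b]:
            min_a, min_b = a + 1, b          # lower neighbor is smaller still
        if (min_a, min_b) == (a, b):
            return
        y[a][b], y[min_a][min_b] = y[min_a][min_b], y[a][b]
        a, b = min_a, min_b


def extract_young_min(y):
    """Remove and return the smallest element, in O(m + n) time."""
    smallest = y[0][0]
    y[0][0] = math.inf
    youngify(y, 0, 0)
    return smallest


INF = math.inf
matrix = [
    [2, 3, 12],
    [4, 8, 16],
    [5, 9, INF],
]
first = extract_young_min(matrix)
```

After the call, first holds the old top-left value and one more ∞ entry has appeared in the bottom-right region of the matrix.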

 

 

Extension 1

d) How do we search for an element x in a Young matrix?

Analysis: the first intuition is to search along the main diagonal, returning if x is found and otherwise stopping at the first position whose value is greater than x, say [a, b]. Then every element of the submatrix Y(1, 1, a-1, b-1) is less than x and every element of the submatrix Y(a, b, m, n) is greater than x, so the problem reduces to searching the two submatrices Y(a, 1, m, b-1) and Y(1, b, a-1, n). The cost of this algorithm should be greater than m + n.

 

In fact, the search can start from the bottom-left corner. First look at position [m, 1]. If Y[m, 1] > x, every element of the last row is greater than x, so the problem reduces to searching the submatrix Y(1, 1, m-1, n). If Y[m, 1] < x, every element of the first column is less than x, so the problem reduces to searching the submatrix Y(1, 2, m, n). Each comparison thus reduces one dimension of the matrix by 1, and the running time of the algorithm is O(m + n).
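The corner-based search is short enough to write out in full. The sketch below assumes a list-of-rows matrix with 0-based indices and math.inf for ∞; the function name young_search and the sample matrix are hypothetical. It starts at the bottom-left corner and discards one row or one column per comparison.

```python
import math


def young_search(y, x):
    """Return the (row, col) position of x in Young matrix y, or None."""
    m, n = len(y), len(y[0])
    row, col = m - 1, 0                 # bottom-left corner
    while row >= 0 and col < n:
        if y[row][col] == x:
            return (row, col)
        if y[row][col] > x:
            row -= 1                    # the rest of this row is larger still
        else:
            col += 1                    # everything above in this column is smaller
    return None


INF = math.inf
matrix = [
    [2, 3, 12],
    [4, 8, 16],
    [5, 9, INF],
]
hit = young_search(matrix, 8)
miss = young_search(matrix, 7)
```

The loop runs at most m + n times, because every iteration either returns or moves one step up or one step right.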

 

Extension 2

Can a search in a heap take advantage of the heap structure? I have not found a way to do this.

 

 

 

 
