Introduction to Algorithms, Chapter 2: Getting Started

Last Update:2018-12-05 Source: Internet

Author: User


Contents

Divide and conquer
Analyzing divide-and-conquer algorithms

This chapter introduces a framework used throughout the book; the design and analysis of algorithms in later chapters take place within it. It first examines how insertion sort solves the sorting problem and defines a "pseudocode" for describing algorithms. Once the algorithm is described, it is proved correct and its running time is analyzed. A notation is introduced for expressing how the running time grows with the number of items to be sorted. The chapter then introduces the divide-and-conquer approach to algorithm design, uses it to design an algorithm called merge sort, and analyzes the running time of merge sort.

Insertion Sort

The sorting problem is defined as follows:

Input: a sequence of $n$ numbers $\{a_1, a_2, \ldots, a_n\}$.

Output: a permutation $\{a'_1, a'_2, \ldots, a'_n\}$ of the input sequence such that $a'_1 \le a'_2 \le \cdots \le a'_n$.

The pseudocode for insertion sort takes an array A as its parameter. The input numbers are sorted in place: they are rearranged within the array A, with at most a constant number of them stored outside the array at any time.

INSERTION-SORT(A)
    for j <- 2 to length[A]
        do key <- A[j]
           i <- j - 1
           while i > 0 and A[i] > key
               do A[i + 1] <- A[i]
                  i <- i - 1
           A[i + 1] <- key
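The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the book's code: Python lists are 0-based, so the pseudocode's j = 2 becomes index 1 and the test i > 0 becomes i >= 0. The function name insertion_sort is chosen here for illustration.

```python
def insertion_sort(a):
    """Sort the list a in place in nondecreasing order and return it."""
    for j in range(1, len(a)):        # pseudocode: j = 2 .. length[A]
        key = a[j]
        i = j - 1
        # Shift every element of a[0..j-1] that is greater than key one slot right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key                # drop key into the freed slot
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```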

Loop invariants and the correctness of insertion sort

A loop invariant is mainly a tool for understanding why an algorithm is correct. Three things must be shown about a loop invariant:

Initialization: it is true before the first iteration of the loop.

Maintenance: if it is true before an iteration of the loop, it remains true before the next iteration.

Termination: when the loop ends, the invariant gives a useful property that helps show the algorithm is correct.

PS: For a for statement, "before the first iteration" means after the initial assignment and the first condition check, and "before the next iteration" means after the increment and the condition check. One iteration is the code executed between one condition check and the next (for the first iteration, this also includes the initialization).

A loop invariant works much like mathematical induction.

We now prove the sorting algorithm correct using the invariant of the outer loop: at the start of each iteration, A[1..j-1] consists of the elements originally in positions 1 through j-1, in sorted order.

Initialization: before the first iteration, j = 2, so A[1..j-1] contains only the single element A[1], which has not been moved; the invariant holds trivially.

Maintenance: the loop body inserts A[j] into its proper position in A[1..j-1] and j then increases by 1 (the invariant of the inner loop is not discussed here), so the invariant still holds.

Termination: the loop ends when j = length[A] + 1. Substituting this value into the invariant shows that A[1..length[A]] holds the original elements in sorted order, which is exactly the correctness of the algorithm.

PS: the loop's termination condition is combined with the invariant to prove the algorithm correct.
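The outer-loop invariant can also be checked mechanically while the algorithm runs. The sketch below is only an illustration under stated assumptions: it uses 0-based Python lists, the name insertion_sort_checked is hypothetical, and Python's built-in sorted serves as an independent oracle for "sorted order".

```python
def insertion_sort_checked(a):
    """Insertion sort that asserts the outer-loop invariant at every iteration."""
    original = list(a)                # snapshot of the input, used only for checking
    for j in range(1, len(a)):
        # Invariant (0-based): a[0..j-1] holds the first j original elements, sorted.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    # Termination: j has passed the last index, so the invariant covers the whole list.
    assert a == sorted(original)
    return a


insertion_sort_checked([31, 41, 59, 26, 41, 58])  # runs without an AssertionError
```

An assertion failure would point to the iteration at which the invariant first breaks; a run without failures is evidence for the invariant, not a proof.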

A loop invariant can also be used to prove the inner loop correct. The purpose of the inner loop is to find a value i with 0 <= i <= j-1 such that placing key in A[i+1] leaves A[1..j] sorted. We can think of A[j] as empty after "key <- A[j]" executes, so that A[1..j] holds only j-1 elements. Likewise, "A[i+1] <- A[i]" leaves A[i] empty. The invariant is: (1) A[1..i] is sorted; (2) A[i+2..j] is sorted, all of its elements are greater than key, and none of them is less than any element of A[1..i]; (3) A[i+1] is a free slot.

Initialization: i = j-1 and A[j] is empty. By the outer loop's invariant, condition (1) holds; A[i+2..j] contains no elements, so (2) holds as well; (3) is immediate.

Maintenance: the loop body copies A[i] into A[i+1] and decreases i by 1. Condition (1) clearly still holds. Before the body executes, A[i] > key and A[i] is the largest element of A[1..i], so condition (2) still holds afterwards; condition (3) clearly holds.

Termination: if the loop ends with i = 0, then A[1] is the free slot and A[2..j] is sorted with all of its elements greater than key (condition 2), so placing key in A[1] makes A[1..j] sorted. If it ends because A[i] <= key, then A[1..i] is sorted, so key is no smaller than any element of A[1..i]; and because A[i+2..j] is sorted with all of its elements greater than key, placing key in A[i+1] makes A[1..j] sorted.

PS: because the inner loop is only an auxiliary step, its invariant is rather abstract and obscure. A procedure this simple and clear does not really need a loop-invariant proof; the proof is given here only as practice in using loop invariants.

Algorithm Analysis

Analyzing an algorithm means predicting the resources it requires. Occasionally resources such as memory, communication bandwidth, or computer hardware are of primary concern, but most often it is computational time that we want to measure. The running time of an algorithm on a particular input is the number of primitive operations executed.

To analyze an algorithm we need a model of the implementation technology, including a model of its resources and their costs. The book uses a generic one-processor, random-access machine (RAM) model of computation. The RAM model contains the instructions commonly found in real computers, and each instruction takes a constant amount of time. It is also assumed that each word in the RAM model has a bounded length. Exponentiation $2^n$ can be treated as a constant-time operation when $n$ is small. The RAM model does not account for the memory hierarchy; it does not model caches or virtual memory.

In general, the time an algorithm takes grows with the size of the input, so the running time of a program is usually expressed as a function of its input size. The notion of input size depends on the problem. For many problems the most natural measure is the number of items in the input. For others, such as multiplying two integers, the best measure is the total number of bits needed to represent the input in binary. Sometimes two numbers describe the input size better than one: if the input is a graph, the input size can be given by the numbers of vertices and edges.

Assume that line $i$ of the pseudocode takes a constant time $c_i$ each time it executes. Counting how many times each line of insertion sort executes gives an expression for the running time of the algorithm (for details, see pages 14-15 of the original book).

How many times the inner loop of insertion sort executes depends on how ordered the input is. In the best case (the input is already sorted), the body of the inner loop never executes and the running time can be written as $an + b$, a linear function of $n$. In the worst case (the input is in reverse order), the running time can be written as $an^2 + bn + c$, a quadratic function of $n$.
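The gap between the two cases can be observed by counting how often the body of the inner loop runs. The sketch below is an illustration only; the helper name count_shifts and the choice n = 8 are assumptions made for the example.

```python
def count_shifts(values):
    """Run insertion sort on a copy of values; return how many times the inner loop body runs."""
    a = list(values)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            shifts += 1
        a[i + 1] = key
    return shifts


n = 8
best = count_shifts(range(1, n + 1))    # already sorted input
worst = count_shifts(range(n, 0, -1))   # reverse-sorted input
print(best, worst)                      # 0 28
```

For reverse-sorted input the count is n(n-1)/2, which is where the quadratic term in the worst-case running time comes from.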

We usually concentrate on the worst-case running time, for several reasons. The worst case is an upper bound on the running time for any input, so knowing it guarantees that the algorithm never takes longer. For some algorithms the worst case occurs fairly often: in searching, for example, it typically occurs whenever the item sought is absent. And the "average case" is often roughly as bad as the worst case. For insertion sort, suppose that at each insertion half of the elements of A[1..j-1] are greater than A[j]; the running time is still a quadratic function of n.

To simplify the analysis we make one more abstraction: the rate of growth, or order of growth, of the running time. We consider only the leading term of the running-time expression, here $n^2$. For small inputs, judging efficiency by order of growth alone can be misleading, but when $n$ is large enough an algorithm whose running time grows as $n^2$ runs faster than one whose running time grows as $n^3$.
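A small numeric illustration of this point, under purely hypothetical cost functions: suppose one algorithm costs 100 n^2 steps and another n^3 steps. The constant 100 and both function names are assumptions chosen for the example, not measurements.

```python
def cost_quadratic(n):
    """Hypothetical cost of a quadratic algorithm with a large constant factor."""
    return 100 * n * n


def cost_cubic(n):
    """Hypothetical cost of a cubic algorithm with constant factor 1."""
    return n ** 3


for n in (10, 100, 1000):
    print(n, cost_quadratic(n), cost_cubic(n))
# 10 10000 1000
# 100 1000000 1000000
# 1000 100000000 1000000000
```

Below n = 100 the cubic algorithm is cheaper in this example; beyond it the quadratic one wins, and the gap keeps widening.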

Algorithm Design

There are many ways to design algorithms. Insertion sort uses an incremental approach: having sorted the subarray A[1..j-1], it inserts A[j] to produce the sorted subarray A[1..j]. This chapter introduces another approach, divide and conquer.

Divide and conquer

Many algorithms are recursive in structure: to solve a given problem, they call themselves recursively one or more times on closely related subproblems. Such algorithms typically follow a divide-and-conquer strategy: divide the original problem into several smaller subproblems with the same structure as the original, solve the subproblems recursively, and then combine their solutions into a solution to the original problem.

Each level of the recursion has three steps:

Divide: break the problem into a number of subproblems.

Conquer: solve the subproblems recursively. If a subproblem is small enough, solve it directly.

Combine: merge the subproblem solutions into a solution to the original problem.

Following this pattern, merge sort works as follows:

Divide: split the n-element sequence into two subsequences of n/2 elements each.

Conquer: sort the two subsequences recursively using merge sort.

Combine: merge the two sorted subsequences to produce the sorted result.

Pseudocode for merge sort is given below. The auxiliary procedure MERGE(A, p, q, r) merges the sorted subarrays A[p..q] and A[q+1..r] into a single sorted subarray A[p..r]:

MERGE(A, p, q, r)
    n1 <- q - p + 1
    n2 <- r - q
    create arrays L[1..n1 + 1] and R[1..n2 + 1]
    for i <- 1 to n1
        do L[i] <- A[p + i - 1]
    for i <- 1 to n2
        do R[i] <- A[q + i]
    L[n1 + 1] <- infinity (sentinel)
    R[n2 + 1] <- infinity (sentinel)
    i <- 1
    j <- 1
    for k <- p to r
        do if L[i] <= R[j]
              then A[k] <- L[i]
                   i <- i + 1
              else A[k] <- R[j]
                   j <- j + 1

MERGE-SORT(A, p, r)
    if p < r
        then q <- floor((p + r) / 2)
             MERGE-SORT(A, p, q)
             MERGE-SORT(A, q + 1, r)
             MERGE(A, p, q, r)

PS: The intent of MERGE and MERGE-SORT is clear at a glance. Even so, the code in an authoritative textbook is worth imitating, including its use of sentinel elements.
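The two procedures can be sketched in Python as follows. This is an illustrative translation under stated assumptions: indices are 0-based and inclusive, the elements are numbers, and math.inf plays the role of the sentinel (so the input must not itself contain infinity). The default arguments of merge_sort are a convenience added here, not part of the pseudocode.

```python
import math


def merge(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] (inclusive bounds) into sorted a[p..r]."""
    left = a[p:q + 1] + [math.inf]        # sentinel marks the end of the left run
    right = a[q + 1:r + 1] + [math.inf]   # sentinel marks the end of the right run
    i = j = 0
    for k in range(p, r + 1):
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


def merge_sort(a, p=0, r=None):
    """Sort a[p..r] in place and return a; by default sorts the whole list."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = (p + r) // 2                  # floor of the midpoint
        merge_sort(a, p, q)
        merge_sort(a, q + 1, r)
        merge(a, p, q, r)
    return a


print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]
```

Because of the sentinels, the merge loop never has to test whether either run is exhausted: it performs exactly r - p + 1 iterations.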

Analyzing divide-and-conquer algorithms

The running time of an algorithm that calls itself recursively can be described by a recurrence. For a divide-and-conquer algorithm the recurrence follows from the three steps of the basic pattern. Suppose the original problem has size n and is divided into a subproblems, each of size 1/b of the original. Note that a and b are sometimes equal, but in many cases they are not. The running time can then be written as:

$T(n) = aT(n/b) + D(n) + C(n)$, where $T(n)$ is a constant, $\Theta(1)$, when $n$ is small enough, $D(n)$ is the time needed to divide the problem, and $C(n)$ is the time needed to combine the subproblem solutions.

For merge sort, $a = 2$, $b = 2$, $D(n) = \Theta(1)$, and $C(n) = \Theta(n)$.

The recurrence therefore becomes $T(n) = 2T(n/2) + \Theta(1) + \Theta(n) = 2T(n/2) + \Theta(n)$.

The master theorem of Chapter 4 shows that $T(n) = \Theta(n \lg n)$. This can also be shown with a recursion tree; see pages 21-22 of the original book.

PS: when n is not even, the two subproblems are not exactly the same size. Here we assume that n is a power of 2, so that every level of the recursion divides evenly; Chapter 4 shows that this assumption does not affect the analysis.
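The recurrence can be evaluated numerically for powers of 2 to see the n lg n growth. The sketch below fixes the hidden constants by assumption: T(1) = 1 and the combined divide and merge cost is exactly n. With these choices the closed form is n lg n + n, which the loop compares against.

```python
def t(n):
    """Evaluate T(1) = 1, T(n) = 2*T(n/2) + n for n a power of 2 (assumed constants)."""
    return 1 if n == 1 else 2 * t(n // 2) + n


for k in range(6):
    n = 2 ** k
    print(n, t(n), n * k + n)   # the last two columns agree: T(n) = n lg n + n
```

For example, T(4) = 2 * T(2) + 4 = 2 * 4 + 4 = 12, and 4 * lg 4 + 4 = 12.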

Exercise

2-4 Inversions

Let A[1..n] be an array of n distinct numbers. If i < j and A[i] > A[j], then the pair (i, j) is called an inversion of A.

(1) List the five inversions of the array {2, 3, 8, 6, 1}.

(2) Among arrays whose elements are drawn from {1, 2, ..., n}, which one has the most inversions?

(3) What is the relationship between the running time of insertion sort and the number of inversions in the input array?

(4) Give an algorithm that determines the number of inversions in any permutation of n elements in $\Theta(n \lg n)$ worst-case time. (Hint: modify merge sort.)

Answer:

(1) omitted

(2) The array in reverse sorted order, {n, n-1, ..., 1}.

(3) The running time of insertion sort grows linearly with the number of inversions in the input array. From the pseudocode, the number of steps depends mainly on how many times the inner loop shifts an element, and each shift reduces the number of inversions in the array by exactly one; when sorting ends, the number of inversions is zero.
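The claim in (3) can be checked on the array from part (1) with a brute-force sketch. Both function names are hypothetical, and the quadratic enumeration of pairs is used only as a reference, not as the algorithm asked for in part (4). Inversions are reported with 1-based indices to match the problem statement.

```python
def inversions(a):
    """Return every inversion of a as a 1-based index pair (i, j)."""
    n = len(a)
    return [(i + 1, j + 1)
            for i in range(n)
            for j in range(i + 1, n)
            if a[i] > a[j]]


def insertion_sort_shifts(values):
    """Return how many element shifts insertion sort performs on values."""
    a = list(values)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]   # each shift removes exactly one inversion
            i -= 1
            shifts += 1
        a[i + 1] = key
    return shifts


data = [2, 3, 8, 6, 1]
print(inversions(data))             # [(1, 5), (2, 5), (3, 4), (3, 5), (4, 5)]
print(insertion_sort_shifts(data))  # 5
```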

(4) Apply divide and conquer: if we split the array into two subsequences, count the inversions within each, and then count the inversions between elements of the two subsequences, the sum is the number of inversions in the whole array. A first attempt:

Divide: split the problem into two arrays of size n/2.

Conquer: count the inversions in each half. If a subproblem has size 1 or 2, solve it directly.

Combine: although the inversion counts of the two halves are known, the number of inversions between the two halves is not easy to obtain. Comparing the elements pairwise makes the combine step take $\Theta(n^2)$ time, and divide and conquer gains nothing.

Reconsider the combine step. If both subsequences are already sorted at that point, the number of inversions between them can be obtained by modifying the MERGE procedure of merge sort. Each time MERGE chooses between the first elements of the two subsequences: if it takes the first element of the first subsequence, the count is unchanged, because that element forms no inversion with the elements remaining in the second subsequence; if it takes the first element of the second subsequence, the count increases by the number of elements remaining in the first subsequence, because that element forms an inversion with each of them, and MERGE eliminates these inversions. The divide-and-conquer algorithm is therefore redesigned as follows:

Divide: split the problem into two arrays of size n/2.

Conquer: merge sort each half recursively, accumulating the number of inversions that the sort eliminates. If a subproblem has size 1 or 2, solve it directly.

Combine: merge the two sorted halves with MERGE, adding to the count during the merge as described above.

PS: when first approaching part (4) with divide and conquer, the role of sorting is not obvious. Analyzing the combine step shows that solving the subproblems must also produce sorted output as a "side effect". Such side effects deserve attention when applying divide and conquer.
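The redesigned algorithm can be sketched in Python as follows. It is one possible realization, not the book's solution: it returns a new sorted list rather than sorting in place, it uses explicit bounds checks instead of sentinels, and the name count_inversions is chosen here for illustration.

```python
def count_inversions(a):
    """Return (sorted copy of a, number of inversions in a) in O(n lg n) time."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, inv_left = count_inversions(a[:mid])     # inversions inside the left half
    right, inv_right = count_inversions(a[mid:])   # inversions inside the right half
    merged = []
    cross = 0                                      # inversions between the halves
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])                 # no new inversion
            i += 1
        else:
            merged.append(right[j])
            cross += len(left) - i                 # right[j] precedes all remaining left elements
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inv_left + inv_right + cross


print(count_inversions([2, 3, 8, 6, 1]))  # ([1, 2, 3, 6, 8], 5)
```

The count of 5 matches the five inversions asked for in part (1).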

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
