Common techniques of efficient algorithms (Introduction to Algorithms)

Source: Internet
Author: User

Efficient algorithms are often built from a few simple techniques, such as divide and conquer, randomization, and recursion.
Here we introduce two more sophisticated techniques: dynamic programming and greedy algorithms.

When designing an algorithm for a complex problem, the first approach to consider is divide and conquer. It embodies a rather philosophical idea: keep decomposing a complex problem until the pieces are small enough to solve easily, then combine the sub-solutions to obtain the solution to the original problem. This is the typical recursive pattern. Divide and conquer requires that the subproblems be independent, that is, they share no common subproblems and do not overlap.
If the subproblems do overlap, divide and conquer is inefficient, because the same subproblems are solved repeatedly and the algorithm becomes redundant; in that case, dynamic programming should be considered.

The essence of dynamic programming is divide-and-conquer thinking plus the elimination of redundancy. It breaks a problem instance into smaller, similar subproblems and stores their solutions to avoid recomputing them, and it is used to solve optimization problems. To produce a globally optimal solution, the solutions of all subproblems must be considered. Picture the recursion tree: each level needs the sub-solutions from the levels below, so a large number of subproblems overlap. That is when dynamic programming is called for.

For some optimization problems, the locally optimal choice turns out to be globally optimal (a special case of the general situation). Dynamic programming still works in such cases, just as a general-purpose algorithm works on a particular instance, but it is inefficient.
For these special cases you can use a greedy algorithm. A greedy choice may depend on all the choices made so far, but it is independent of the choices yet to be made and of the subproblems. Greedy algorithms can therefore work top-down, making one greedy (locally optimal) choice after another to reach the global optimum.

Dynamic programming, in contrast, is essentially bottom-up: its optimal solution depends on the subproblem solutions, so the subproblems must be solved first. You can write the code either top-down or bottom-up, but the actual computation always proceeds from the bottom up.

Dynamic Programming

Dynamic programming is used when subproblems overlap. It saves the subproblem solutions to avoid computing any subproblem twice, which greatly improves the algorithm's efficiency; in this sense, dynamic programming is an optimization technique that trades space for time.
An optimization problem is suitable for dynamic programming when it has two properties: optimal substructure and overlapping subproblems.

For dynamic programming, the most important and most difficult step is finding the optimal substructure; everything else is comparatively simple.
In other words, we divide the problem into subproblems and show that, once optimal solutions to the subproblems are available, the globally optimal solution can be obtained by combining them or by a simple selection among them.
Introduction to Algorithms gives steps for discovering optimal substructure, but personally I do not think those steps really make the subproblems easier to find.
Honestly, finding the right subproblem is something of an art ...

The running time of a dynamic programming algorithm depends on the product of two factors: the total number of subproblems and the number of choices per subproblem.

If that is still unclear, let's work through a few examples of dynamic programming.

I. Matrix Chain Multiplication
For a matrix chain product A * B * C * D * E * F * G, the multiplication order greatly affects efficiency, so finding the optimal order is very valuable.
Optimal substructure:
Consider the chain (1 ... n) and split it into two sub-chains (1 ... k) and (k+1 ... n), where 1 <= k < n. If we know the optimal solution of each sub-chain, can we derive the global optimum?
Yes: let k range over 1 to n-1, compute the cost of each split, and take the k with the lowest total cost; that gives the globally optimal solution.

Running time: there are O(n^2) subproblems in total, and each subproblem faces up to n choices (the values of k above), so the running time is O(n^3).

After finding the optimal substructure, the rest is simple. For this first example, the complete process is given.
Let the chain contain n matrices. For 1 <= i <= j <= n, let M(i, j) denote the minimum cost of multiplying matrices i through j. Then
M(i, j) = min over i <= k < j of ( M(i, k) + M(k+1, j) + cost of multiplying the two resulting sub-products )
A dynamic programming problem can be solved either bottom-up or top-down.
Let's use the matrix chain problem to illustrate both approaches.
Bottom-up approach:
This algorithm is given in Introduction to Algorithms.

MATRIX-CHAIN-ORDER(p)
    // Two auxiliary tables:
    //   m[i, j] records the minimum cost of multiplying matrices i through j
    //   s[i, j] records the value of k that achieves that minimum
    n = length(p) - 1
    for l = 2 to n                  // l is the sub-chain length; for n = 6, l runs from 2 to 6
        for i = 1 to n - l + 1      // traverse all sub-chains of length l; for n = 6 and l = 5 these are 1..5 and 2..6
            j = i + l - 1           // end of the sub-chain
            m[i, j] = ∞
            for k = i to j - 1      // try every split point k and keep the best
                q = m[i, k] + m[k+1, j] + p[i-1] * p[k] * p[j]
                if q < m[i, j]
                    m[i, j] = q
                    s[i, j] = k
    return m and s

This is the bottom-up approach: the nested loops start from the shortest sub-chains and save the values of m(i, j) that will be reused later. When the iterations finish, the optimal split points can be read back from s[i, j].
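The bottom-up pseudocode above translates almost line for line into runnable code; here is a minimal Python sketch (the function name and the 1-based table layout are my own choices):

```python
def matrix_chain_order(p):
    """Bottom-up matrix chain multiplication.

    p[i-1] x p[i] is the dimension of matrix i, so a chain of n
    matrices is described by a list p of length n + 1.
    Returns (m, s): m[i][j] is the minimum scalar-multiplication
    cost for matrices i..j, and s[i][j] is the split k achieving it.
    """
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):           # l = sub-chain length
        for i in range(1, n - l + 2):   # i = start of the sub-chain
            j = i + l - 1               # j = end of the sub-chain
            m[i][j] = float("inf")
            for k in range(i, j):       # try every split point
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s
```

For the six-matrix example in Introduction to Algorithms, p = [30, 35, 15, 5, 10, 20, 25], this yields a minimum cost m[1][6] of 15125.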

Top-down approach: Introduction to Algorithms calls this memoization.
I wrote the following version myself.

    Create global tables m[i, j] and s[i, j], initialized to 0

MATRIX-CHAIN(p, i, j)
    if i == j                       // a single matrix costs nothing
        return 0
    if m[i, j] > 0                  // already computed; no need to repeat the work
        return m[i, j]
    m[i, j] = ∞
    for k = i to j - 1              // try splitting the chain at every k
        q = MATRIX-CHAIN(p, i, k) + MATRIX-CHAIN(p, k+1, j) + p[i-1] * p[k] * p[j]
        if q < m[i, j]
            m[i, j] = q
            s[i, j] = k
    return m[i, j]

This is the top-down approach; the initial call is MATRIX-CHAIN(p, 1, length(p) - 1).
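The same recurrence can be memoized with a cache instead of explicit global tables; a minimal Python sketch (the names are illustrative):

```python
import functools

def matrix_chain_memo(p):
    """Top-down memoized matrix chain multiplication; the cache
    plays the role of the global m[i, j] table."""
    @functools.lru_cache(maxsize=None)
    def cost(i, j):
        if i == j:                      # a single matrix costs nothing
            return 0
        # try every split point k; cached subresults are reused
        return min(cost(i, k) + cost(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))
    return cost(1, len(p) - 1)
```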

II. Assembly Line Scheduling

Colonel Motors produces cars in a factory with two assembly lines. After a chassis enters a line, components are added to it at each station, and the finished car leaves at the end of the line. Each line has n stations, numbered j = 1, 2, ..., n; S(i, j) denotes station j on line i (i is 1 or 2). Station S(1, j) on line 1 performs the same function as station S(2, j) on line 2, but the stations were built at different times with different technologies, so the time required differs from station to station, even between stations in the same position on the two lines. The assembly time at each station is a(i, j), the time to enter line i is e(i), and the time to exit line i is x(i). Normally the time to move a chassis from one station to the next on the same line is negligible, but occasionally, for example for a rush order, a partly assembled chassis is moved from a station on one line to the next station on the other line; moving the chassis away from line i after station S(i, j) takes time t(i, j). The problem is to choose which stations from line 1 and which from line 2 to use so that the total time through the factory is minimized.
The textbook description is convoluted; a diagram would explain it better ...

Optimal substructure:
We want the optimal route through n stations that minimizes the total time. The subproblem here is easy to find: simply shrink the scale and consider n-1 stations.
Suppose we know the optimal routes through the first n-1 stations; can we find the optimal route through n stations? Yes. If an optimal route A through n stations did not contain an optimal route through the first n-1 stations, we could substitute an optimal (n-1)-station route B for A's first n-1 stations and do better, contradicting A's optimality. So once the optimal routes through n-1 stations are known, the global optimum follows by considering the step from station n-1 to station n.

Running time: there are n subproblems in total, and each faces two choices, so the running time is O(n).
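The two-choice recurrence can be coded in a few lines; here is a minimal Python sketch (the parameter layout is my own assumption, with the two lines indexed 0 and 1):

```python
def fastest_way(a, e, t, x):
    """Assembly line scheduling by dynamic programming.

    a[i][j]: assembly time at station j on line i (i in {0, 1})
    e[i]:    time to enter line i
    t[i][j]: transfer time when leaving line i after station j
    x[i]:    time to exit line i
    Returns the minimum total time through the factory.
    """
    n = len(a[0])
    f0 = e[0] + a[0][0]                 # fastest time through station 0 on line 0
    f1 = e[1] + a[1][0]                 # fastest time through station 0 on line 1
    for j in range(1, n):
        # stay on the same line, or transfer over from the other line
        f0, f1 = (min(f0, f1 + t[1][j - 1]) + a[0][j],
                  min(f1, f0 + t[0][j - 1]) + a[1][j])
    return min(f0 + x[0], f1 + x[1])
```

On the example instance from Introduction to Algorithms this returns 38.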

III. Unweighted Shortest Path and Unweighted Longest Simple Path

Given a directed graph G = (V, E) and nodes u, v in V:
Unweighted shortest path: find a path from u to v containing the fewest edges.
Unweighted longest simple path: find a simple path from u to v containing the most edges. It must be simple; otherwise the path could contain a cycle and traverse it any number of times.

The unweighted shortest path:
Optimal substructure:
The idea in Introduction to Algorithms: for nodes u and v with u != v, a shortest path contains some intermediate node w. If the optimal distance u -> w is p1 and the optimal distance w -> v is p2, then the shortest path through w has length p1 + p2. By considering every possible intermediate node w, we find the global optimum.
My own way of thinking about it:
There are many routes from Beijing to Guangzhou; which is the fastest?
The trip from Beijing to Guangzhou passes through cities 1 ... n, and Guangzhou is directly connected to k cities C(r), 1 <= r <= k.
Let F(i, j) denote the shortest path from city i to city j. Then
F(1, n) = min over 1 <= r <= k of ( F(1, C(r)) + F(C(r), n) )
This idea is layered: first assume we know the optimal route from Beijing to every city connected to Guangzhou, then pick the best last stop to obtain the global optimum.
Running time: it should be O(n^3).
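One way to realize the intermediate-node recurrence in code is a triple loop over candidate intermediate nodes, in the style of Floyd-Warshall with every edge weighted 1, matching the O(n^3) estimate above. This is only a sketch; for a single pair of nodes, breadth-first search would be faster in practice:

```python
def unweighted_shortest_paths(n, edges):
    """All-pairs fewest-edge distances via the intermediate-node
    recurrence: d[i][j] = min(d[i][j], d[i][w] + d[w][j]).

    n:     number of vertices, labelled 0 .. n-1
    edges: iterable of directed (u, v) pairs
    """
    INF = float("inf")
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v in edges:
        d[u][v] = 1                     # each edge counts as one step
    for w in range(n):                  # allow w as an intermediate node
        for i in range(n):
            for j in range(n):
                if d[i][w] + d[w][j] < d[i][j]:
                    d[i][j] = d[i][w] + d[w][j]
    return d
```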

The unweighted longest simple path:
This problem is an example of why dynamic programming sometimes cannot be used.
Introduction to Algorithms again uses the phrase "subproblem independence" here, which is misleading: it is not the same concept as in the earlier section.
There, independence was about whether the subproblems overlap.
Here it means the subproblems conflict with each other. Bluntly, the problem cannot be assembled from subproblem solutions, because the subproblems depend on one another and share resources: once one sub-path has used a node, the other may no longer use it.
So dynamic programming cannot be used, and in fact neither can divide and conquer; this problem simply cannot be decomposed ...

IV. Longest Common Subsequence

Problem description: a subsequence of a character sequence is obtained by deleting any number of characters (possibly none or all) from the given sequence; the remaining characters need not be consecutive. Given a sequence X = "x0, x1, ..., xm-1", a sequence Y = "y0, y1, ..., yk-1" is a subsequence of X if there exists a strictly increasing index sequence <i0, i1, ..., ik-1> of X such that for every j = 0, 1, ..., k-1 we have x(ij) = yj. For example, with X = "abcbdab", the sequence Y = "bcdb" is a subsequence of X.
This is used mostly in biology, for comparing DNA sequences.

Optimal substructure:

Suppose X = {x1, ..., xm} and Y = {y1, ..., yn} have a longest common subsequence Z = {z1, ..., zk}.
The idea is to shrink the problem to build the substructure:
1) If xm = yn, then xm = yn = zk, and appending that character to the longest common subsequence of {x1, ..., xm-1} and {y1, ..., yn-1} gives the global solution.
2) If xm != yn, then the longer of LCS({x1, ..., xm}, {y1, ..., yn-1}) and LCS({x1, ..., xm-1}, {y1, ..., yn}) is the globally optimal solution.

Running time: there are m * n subproblems, each with a constant number of choices, so it should be O(mn).
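The two cases above translate directly into the standard table-filling algorithm; a minimal Python sketch:

```python
def lcs_length(x, y):
    """Length of the longest common subsequence of x and y,
    filled bottom-up in O(m * n) time and space."""
    m, n = len(x), len(y)
    # c[i][j] = LCS length of x[:i] and y[:j]
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1            # case 1: last characters match
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])  # case 2: drop one or the other
    return c[m][n]
```

For the example above, lcs_length("abcbdab", "bcdb") is 4.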

V. Optimal Binary Search Tree

Problem description: given an ordered key sequence K = {k1 < k2 < k3 < ... < kn} and search probabilities P = {p1, p2, p3, ..., pn}, construct a binary search tree T that minimizes the total cost of searching for all elements.
How is the total cost computed? Multiply each node's depth by the probability of searching for it, then sum over all nodes. Nodes with high probability should therefore sit as close to the root as possible, while the binary search tree property is maintained.
The example in the book is translation: translating English into French requires constant lookups in a binary search tree. With an ordinary balanced binary tree the efficiency can be poor; a very frequent word such as "the" might sit deep in the tree, costing log n per lookup. That is wasteful, which is why this data structure exists.

Optimal substructure:

This is similar to the matrix chain subproblem. In the globally optimal search tree, any node may be the root. When node kk is the root, the problem divides into finding optimal trees for (k1 ... k(k-1)) and (k(k+1) ... kn). So by considering every node as a candidate root we find the global optimum.
Running time: it should be O(n^3).
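A simplified Python sketch of this recurrence (my own simplification: no dummy keys for failed searches, and cost counted as probability times depth, with the root at depth 1):

```python
def optimal_bst_cost(p):
    """Expected search cost of an optimal binary search tree over
    keys 1..n with search probabilities p; O(n^3) overall.
    """
    n = len(p)
    # e[i][j]: optimal cost for keys i..j; w[i][j]: probability sum
    # (1-based indices; empty ranges such as e[i][i-1] stay 0)
    e = [[0.0] * (n + 2) for _ in range(n + 2)]
    w = [[0.0] * (n + 2) for _ in range(n + 2)]
    for l in range(1, n + 1):           # l = number of keys in the subtree
        for i in range(1, n - l + 2):
            j = i + l - 1
            w[i][j] = w[i][j - 1] + p[j - 1]
            # every key r in i..j is a candidate root; adding w[i][j]
            # accounts for each key moving one level deeper
            e[i][j] = w[i][j] + min(e[i][r - 1] + e[r + 1][j]
                                    for r in range(i, j + 1))
    return e[1][n]
```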

To sum up, the hardest part of dynamic programming is finding the optimal substructure. The examples above illustrate two styles of substructure: one, typified by matrix chain multiplication and the optimal binary search tree, splits the problem set at a chosen point; the other, typified by assembly line scheduling and the longest common subsequence, gradually shrinks the problem set.

Greedy Algorithm

A greedy algorithm makes, at every step, the choice that looks best in the current state, in the hope that this sequence of locally optimal choices produces a globally optimal result. For example, a traveler who always heads for the nearest next city is using a greedy algorithm.

For most problems, the greedy method does not find an optimal solution (there are exceptions, such as finding a minimum spanning tree in a graph or constructing a Huffman code), because it generally does not examine all possible solutions. The greedy method tends to commit to decisions too early to reach the optimum; for example, no known greedy method for graph coloring, and none for any NP-complete problem, can guarantee an optimal solution. The advantages of the greedy method are that it is easy to design and that in many cases it achieves a good approximate solution.

As mentioned above, greedy algorithms handle a special case of general optimization problems. That is, any problem solvable by a greedy algorithm can also be solved by dynamic programming, but not the other way around. Two examples follow.

I. Activity Selection
Problem description: given a set E = {1, 2, ..., n} of n activities that all require the same resource, such as a lecture hall, and only one activity can use the resource at a time. Each activity i has a start time si and a finish time fi with si < fi. If selected, activity i occupies the resource during the half-open interval [si, fi). Activities i and j are compatible if the intervals [si, fi) and [sj, fj) do not overlap, that is, if si >= fj or sj >= fi. The task is to select a maximum-size subset of mutually compatible activities.

First let's try dynamic programming. The optimal substructure: if a largest compatible set contains activity k, the problem divides into two subproblems, over the activities in (1 ... k) and those in (k+1 ... n). If we know the optimal solutions of both subproblems, the global optimum follows by considering every possible k, where k may be any activity from 1 to n.

In fact, k need not range over all n activities; only the activity with the earliest finish time matters. One can prove that some largest compatible subset contains the earliest-finishing activity: if a largest subset did not contain it, we could swap it in for that subset's first activity and still have an optimal solution. So the greedy method is a special case: where dynamic programming considers n choices, greedy considers only the single currently best choice. Moreover, dynamic programming must consult subproblem solutions before choosing, so it works bottom-up, solving subproblems first; a greedy algorithm can work top-down, deciding purely on the current situation.

Activity selection is easy to solve with a greedy algorithm: build a min-heap keyed on finish time, repeatedly pop the activity with the earliest finish time, check whether it is compatible with the activities already chosen, and if so add it to the maximum compatible set.
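A minimal Python sketch of this greedy strategy; sorting by finish time is used here instead of a min-heap, which is equivalent for this purpose:

```python
def select_activities(activities):
    """Greedy activity selection: repeatedly take the compatible
    activity that finishes earliest.

    activities: list of (start, finish) half-open intervals [s, f)
    Returns a maximum-size set of mutually compatible activities.
    """
    chosen = []
    last_finish = float("-inf")
    for s, f in sorted(activities, key=lambda act: act[1]):  # earliest finish first
        if s >= last_finish:            # compatible with everything chosen so far
            chosen.append((s, f))
            last_finish = f
    return chosen
```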

II. Knapsack Problems
0-1 knapsack problem: given n items and a knapsack, item i has weight wi and value vi, and the knapsack has capacity C. Which items should be packed to maximize the total value carried in the knapsack? For each item i there are only two options, pack it or leave it; an item cannot be packed more than once, nor can only part of it be packed.

Fractional knapsack problem: similar to the 0-1 knapsack, except that when item i is selected, a fraction of it may be packed rather than the whole item.

This pair of problems is an excellent illustration of when to use a greedy algorithm and when to use dynamic programming.
When you can prove that the locally optimal choice is globally optimal, a greedy algorithm solves the optimization problem, as in the activity selection problem above.
For the 0-1 knapsack, you might try a greedy rule such as packing the most valuable item first, but that is not necessarily optimal, because it can waste space. When deciding whether to put an item into the knapsack, we must compare the solution of the subproblem that includes the item with the solution of the subproblem that excludes it: typical dynamic programming.
The fractional knapsack has no such difficulty: the greedy rule of repeatedly taking the item with the highest value per unit weight works, and local optimality implies global optimality.
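A minimal Python sketch of the fractional knapsack greedy (the (weight, value) item representation is my own choice):

```python
def fractional_knapsack(items, capacity):
    """Greedy fractional knapsack: take items in decreasing order of
    value per unit weight, splitting the last item if needed.

    items: list of (weight, value) pairs
    Returns the maximum total value that fits in the knapsack.
    """
    total = 0.0
    # highest value per unit weight first: the greedy choice
    for w, v in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        if capacity <= 0:
            break
        take = min(w, capacity)         # all of the item, or the part that fits
        total += v * (take / w)
        capacity -= take
    return total
```

For items of (weight, value) = (10, 60), (20, 100), (30, 120) and capacity 50, the greedy result is 240.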

III. Huffman Coding

Huffman coding is a widely used and effective data compression technique. Fixed-length codes assign the same number of bits to every character, as ASCII does with 8 bits; Huffman coding instead assigns fewer bits to high-frequency characters and more bits to low-frequency ones, which compresses the data. See the example below.

Variable-length codes rely on prefix codes, in which no codeword is a prefix of any other. Huffman designed a greedy algorithm that constructs an optimal prefix code, called a Huffman code. The algorithm is very simple: always merge the two lowest-frequency nodes into a subtree. That is the greedy step. Given the Huffman tree, decoding walks the tree from the top down, while the code itself is built from the bottom up.
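A minimal Python sketch of the merging process, using a binary heap to pop the two lowest-frequency nodes each round (the tree representation is my own choice):

```python
import heapq

def huffman_codes(freq):
    """Build a Huffman code for the given symbol frequencies.

    freq: dict mapping symbol -> frequency
    Returns a dict mapping symbol -> bit string.
    """
    # heap entries: (frequency, tiebreaker, tree), where a tree is
    # either a bare symbol or a (left, right) pair of subtrees
    heap = [(f, i, sym) for i, (sym, f) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:                # greedy step: merge the two smallest
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (t1, t2)))
        count += 1
    codes = {}
    def walk(tree, prefix):             # read codewords off the tree, top down
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"  # degenerate single-symbol alphabet
    walk(heap[0][2], "")
    return codes
```

For the textbook frequencies a:45, b:13, c:12, d:16, e:9, f:5, the character 'a' gets a 1-bit code and the total encoded length is 224 bits per 100 characters.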
