Preface:
I have been reading Introduction to Algorithms recently. Algorithms have always been a hard nut to crack and a headache for me, especially graph algorithms. After a few days of hard thinking I cannot say that everything is clear, but I have made some small gains, and after organizing my thoughts a little I wrote this article. So the following content is a summary of my thinking while reading Introduction to Algorithms.
Let's get down to business. The topic I want to talk about today is the single-source shortest path problem. For example, if you want to know how to get from Wuhan to Beijing, you can pick up your phone and open Google Maps: enter the starting point Wuhan and the destination Beijing, and Google will tell you the best route. Setting aside the differences between means of transportation (an airplane would presumably fly more or less straight along the line segment between the two points), we treat every city on the map as a point and every road from one city to another as a line. The question of how to get from Wuhan to Beijing then becomes how to reach the destination through so many different lines. So how can we find the shortest route between two points on such a complex map?
Unfortunately, there is no simple way to find only the shortest route from point A to point B on the map. We usually have to find the shortest routes from A to all other places in order to determine the shortest route from A to B. This is a headache: why should we have to work out how to get from Wuhan to Guangzhou when all we want is to go from Wuhan to Beijing? So the current single-source shortest path algorithms look rather clumsy in this situation. (In fact, I have exaggerated the problem here; simple pruning lets us avoid a large amount of this repeated computation.)
Thoughts:
The most common algorithm for solving the single-source shortest path problem is Dijkstra's algorithm, but before that I want to talk about the Bellman-Ford algorithm. I think of Bellman-Ford as the general method and of Dijkstra's algorithm as a refinement of it, and taking them in this order also helps us think about whether the same problem can be solved better.
Before getting to the main topic, we first give a formal definition of the single-source shortest path problem: given a weighted directed graph G = (V, E), a weight function ω: E → R assigns every edge (u, v) ∈ E a real value ω(u, v). Compute the shortest paths from a given source s to the other vertices of the graph.
We also extend the data structure a little. Besides the edges from vertex v to its adjacent vertices, each vertex v stores information about a path P = <s, v_0, v_1, ..., v_k, v> from the source s to v: the predecessor v_k of v on P and the path length of P, written π[v] and d[v] respectively, that is, π[v] = v_k and d[v] = ω(P). To recover the shortest path from the source s to vertex v, we only need to follow the predecessors recursively from v until we arrive at s. We use δ(s, v) to denote the shortest path length from the source s to vertex v.
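Following the predecessors back to the source can be sketched in a few lines of Python. This is only an illustration: the dictionary `pred` stands in for π, and the four-vertex table below is made up, not taken from the figure in the book.

```python
def reconstruct_path(pred, s, v):
    """Follow predecessor links from v back to s and return the path s..v.

    pred maps each vertex to its predecessor (the role of π in the text);
    the source maps to None. Returns None if v cannot be traced back to s.
    """
    path = []
    current = v
    while current is not None:
        path.append(current)
        if current == s:
            return path[::-1]  # collected backwards, so reverse it
        current = pred.get(current)
    return None


# Hypothetical predecessor table: s -> b -> a -> c
pred = {"s": None, "b": "s", "a": "b", "c": "a"}
print(reconstruct_path(pred, "s", "c"))  # ['s', 'b', 'a', 'c']
```

A vertex that never received a predecessor simply yields None, which matches the convention that its path length is ∞.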
In the figure, the source is s, the shaded bold edges are the edges that lie on shortest paths, and the number inside each vertex is its shortest path length.
For example, the shortest path from s to z is P = <s, t, y, x, z>, and its shortest path length is δ(s, z) = 11.
The Bellman-Ford algorithm is based on dynamic programming. According to Introduction to Algorithms, a problem suited to dynamic programming has two features: optimal substructure and overlapping subproblems. We can show that the shortest path problem has both.
The cut-and-paste technique proves that the shortest path problem has optimal substructure.
A subpath of a shortest path is also a shortest path: assume P = <v_1, v_2, ..., v_n> is a shortest path from v_1 to v_n. For any i, j with 1 ≤ i ≤ j ≤ n, the subpath of P from vertex v_i to vertex v_j is also a shortest path from v_i to v_j.
We break the shortest path P down into v_1 → v_i → v_j → v_n and let Q be its part from v_i to v_j. If some other path Q' from v_i to v_j were shorter than Q, we could replace Q with Q' in P and construct a shorter path from v_1 to v_n. This contradicts the premise that P is a shortest path from v_1 to v_n, so the claim holds. The overlapping subproblems are easier to see: if we want the shortest path from v_1 to v_k, we also need to find, in turn, the shortest paths from v_1 to v_2, v_3, ..., v_{k-1}, so the same subproblems keep coming up during the solution.
Before explaining the Bellman-Ford algorithm in detail, we should first introduce the triangle inequality. It is precisely by repeatedly checking whether the triangle inequality holds that we can find the shortest path from the source to each vertex in the graph.
Triangle inequality: for any edge (u, v) ∈ E, δ(s, v) ≤ δ(s, u) + ω(u, v).
We can easily prove the triangle inequality. Let P be a shortest path from the source s to vertex v. If there were another path P' that reaches v through vertex u and the edge (u, v), and P' were shorter than P, this would contradict the premise that P is a shortest path, so the inequality must hold. While the algorithm runs, we keep working toward the triangle inequality holding on every edge by repeatedly updating d[v] and π[v] for the vertices v in V until the shortest paths are found. We call this step relaxation (relax), and its pseudocode is as follows:
Relax(u, v, ω) {
    if d[v] > d[u] + ω(u, v) then
        d[v] ← d[u] + ω(u, v)
        π[v] ← u
}
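As a rough Python counterpart of the pseudocode, the sketch below keeps d and π in two dictionaries. The names `d` and `pred`, and the return value that reports whether anything changed, are my own choices for illustration, not something the book prescribes.

```python
import math


def relax(u, v, w, d, pred):
    """Relax edge (u, v) of weight w; return True if d[v] was improved."""
    if d[u] + w < d[v]:
        d[v] = d[u] + w
        pred[v] = u
        return True
    return False


# Tiny made-up example: one edge s -> a of weight 4.
d = {"s": 0, "a": math.inf}
pred = {"s": None, "a": None}
relax("s", "a", 4, d, pred)
print(d["a"], pred["a"])  # 4 s
```

Using math.inf for "not reached yet" works because inf plus any finite weight is still inf, so an unreached vertex never improves its neighbors.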
Bellman-Ford algorithm:
Now for the Bellman-Ford algorithm itself. We know that the shortest path from the source s to any vertex in the graph has at most |V| - 1 edges (where |V| is the number of vertices in the graph); otherwise the path would contain a cycle. The algorithm works in terms of the number of edges on a shortest path: starting from the source s, it builds, bottom-up, the shortest paths that use 1, 2, ..., |V| - 1 edges. At the end, the path from the source s to any vertex in the graph is a shortest path. This does not seem hard to understand. Next we use mathematical induction to prove that the algorithm is correct.
We do induction on n, the number of edges on a shortest path starting from the source s.
Base step: when n = 1, the paths from the source s to its adjacent vertices are the shortest paths that use one edge. For any vertex v in the graph other than the source, either v is adjacent to the source and its path length is the weight ω(s, v) of that edge, or its path length is ∞, meaning that v cannot be reached from s with a single edge.
Induction step: assume that when n = k, we have constructed the shortest paths starting from the source s that use 1, 2, ..., k edges. Take such a path P = <s, v_1, v_2, ..., v_k>, where d[v_i] = δ(s, v_i) for i ≤ k. If the shortest path from s to vertex v_{k+1} is P' = <s, v_1, v_2, ..., v_k, v_{k+1}>, then after relaxing the edge (v_k, v_{k+1}) we have d[v_{k+1}] = d[v_k] + ω(v_k, v_{k+1}) = δ(s, v_{k+1}). If this were not the case, d[v_k] could not equal δ(s, v_k), which contradicts the premise that P is a shortest path from s to v_k. Therefore, when n = k + 1, the paths from the source s that use k + 1 edges are also shortest paths.
To sum up, once the shortest paths with up to |V| - 1 edges have been constructed, we have established the shortest path from the source s to every vertex v in the graph.
Or, put more generally: for each vertex v that is reachable from s there is a shortest path P from s to v, and its length is δ(s, v) = min{ω(P) : P is a path from s to v}. If v is not reachable, its path length is ∞. If we make sure, in the order in which the vertices appear on P, that each subpath starting at s is a shortest path, we end up with the shortest path to v.
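The rounds of relaxation described above can be sketched as follows. This is a minimal Python version, assuming the graph is given as a list of (u, v, weight) triples and has no negative-weight cycle reachable from the source; the function name and the small test graph are made up for illustration.

```python
import math


def bellman_ford(vertices, edges, s):
    """Run |V| - 1 rounds, relaxing every edge in each round.

    edges is a list of (u, v, w) triples. Returns (d, pred).
    Assumes no negative-weight cycle is reachable from s.
    """
    d = {v: math.inf for v in vertices}
    pred = {v: None for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if d[u] + w < d[v]:  # the relax step
                d[v] = d[u] + w
                pred[v] = u
    return d, pred


# Made-up graph with one negative edge; the edges are listed in an
# unhelpful order on purpose, so several rounds are really needed.
vertices = ["s", "a", "b", "c"]
edges = [("a", "c", 2), ("b", "a", -3), ("s", "a", 4), ("s", "b", 5)]
d, pred = bellman_ford(vertices, edges, "s")
print(d)  # {'s': 0, 'a': 2, 'b': 5, 'c': 4}
```

Here the best route to a goes through b (5 - 3 = 2) rather than over the direct edge of weight 4, and c only gets its final value in the third round.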
However, when I watched the MIT lecture videos, I noticed that we often do not need to run all |V| - 1 rounds to determine the single-source shortest paths. On the one hand, the shortest paths of the vertices in the graph may all use fewer than |V| - 1 edges. On the other hand, in round i the algorithm does not relax only the i-th edge of the path from s to v_i; it relaxes every edge. As a result, in the same round we may establish not only the shortest path from s to v_i but also the step from v_i to v_j, and therefore the shortest path from s to v_j. The rounds after that then have no real effect (they are just extra, redundant time overhead).
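One common way to exploit this observation is to stop as soon as a full round changes nothing. The sketch below is my own illustration of that idea, not code from the book or the lectures; it also counts the rounds that actually improved something.

```python
import math


def bellman_ford_early_exit(vertices, edges, s):
    """Bellman-Ford that stops after the first round with no improvement.

    Returns (d, rounds), where rounds counts the rounds that changed d.
    Assumes no negative-weight cycle is reachable from s.
    """
    d = {v: math.inf for v in vertices}
    d[s] = 0
    rounds = 0
    for _ in range(len(vertices) - 1):
        changed = False
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
                changed = True
        if not changed:
            break  # every edge already satisfies the triangle inequality
        rounds += 1
    return d, rounds


# A made-up chain s -> a -> b -> c -> e with the edges listed in path order.
vertices = ["s", "a", "b", "c", "e"]
edges = [("s", "a", 1), ("a", "b", 1), ("b", "c", 1), ("c", "e", 1)]
d, rounds = bellman_ford_early_exit(vertices, edges, "s")
print(d["e"], rounds)  # 4 1
```

With five vertices the plain algorithm would run four rounds, but because the edges happen to be listed in path order, one round already settles every vertex.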
Looking at it another way, suppose we could determine in advance the order of the vertices on the shortest path from s to any vertex v, that is, for the shortest path P = <s, v_1, v_2, ..., v_k> we know exactly which vertices v_1, v_2, ..., v_k are. Then the shortest path from the source s to any vertex in the graph can be obtained by relaxing the edges in that vertex order. This is like doing a job: if you know the order of steps that saves the most time and you follow it, you finish the job in the least time. All we need to establish such an order is a single topological sort, which requires the graph to have no cycles. For such a directed acyclic graph, the O(VE) time complexity of the Bellman-Ford algorithm can be reduced to O(V + E).
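The topological-order idea can be sketched like this. It is only valid for a directed acyclic graph, and the sketch assumes an adjacency-list dictionary in which every vertex appears as a key; the ordering is computed with Kahn's algorithm, which is one of several ways to do a topological sort.

```python
import math
from collections import deque


def dag_shortest_paths(graph, s):
    """Shortest paths from s in a DAG: topological sort, then one pass.

    graph maps each vertex to a list of (neighbor, weight) pairs.
    Raises ValueError if the graph contains a cycle.
    """
    indegree = {u: 0 for u in graph}
    for u in graph:
        for v, _ in graph[u]:
            indegree[v] += 1
    queue = deque(u for u in graph if indegree[u] == 0)
    order = []
    while queue:  # Kahn's algorithm
        u = queue.popleft()
        order.append(u)
        for v, _ in graph[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != len(graph):
        raise ValueError("graph has a cycle, so no topological order exists")

    d = {u: math.inf for u in graph}
    d[s] = 0
    for u in order:  # each edge is relaxed exactly once
        for v, w in graph[u]:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    return d


# Made-up DAG with a negative edge b -> a.
graph = {"s": [("a", 4), ("b", 5)], "a": [("c", 2)], "b": [("a", -3)], "c": []}
print(dag_shortest_paths(graph, "s"))  # {'s': 0, 'a': 2, 'b': 5, 'c': 4}
```

Both the sort and the relaxation pass touch every vertex and edge once, which is where the O(V + E) bound comes from.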
Dijkstra algorithm:
Now it is time for the star of the show: Dijkstra's algorithm. As the most widely used single-source shortest path algorithm, it is not only efficient but also elegant in its program structure. Its drawback is that it uses a greedy strategy: to keep the greedy-choice property, the algorithm requires every edge weight in the given weighted directed graph to be at least 0. If the graph contains an edge with a negative weight, Dijkstra's algorithm cannot guarantee a correct result. However, if we look at real life, we find that negative numbers almost never appear in problems of this kind. For example, the distance along a road or a street cannot be negative. Even if we look harder, it is difficult to come up with a real-world model that needs negative-weight edges, although they exist in theory. So in practical applications Dijkstra's algorithm handles the single-source shortest path problem comfortably.
Dijkstra's algorithm maintains a dynamic set S and keeps adding to it the vertices whose shortest paths have been determined. When S contains all vertices in the graph, the algorithm ends. As mentioned above, the algorithm is greedy: every time we pick a vertex outside the set S, we pick the one with the minimum current path length and add it to S. By always picking the vertex with the minimum current path length, we expect the path from the source to each vertex to be a shortest path when the algorithm ends. We again use mathematical induction to prove this.
We do induction on n, the number of vertices in the set S.
Base step: when n = 1, the only vertex in S must be the source s, because during initialization we set the path length d[s] of the source to 0 and the path length of every other vertex to ∞. Because there is no negative-weight edge, δ(s, s) = 0, so the shortest path of the source s has indeed been established.
Induction step: assume that when n = k, we have established the shortest paths of the k vertices in S. We then scan the remaining vertices that are not in S (that is, the set V - S) and select the vertex v with the minimum path length.
If the path length of vertex v is d[v] = ∞, no vertex in the set S can reach v. Because v has the smallest path length in the set V - S, the path length of every vertex in V - S is ∞, and we can assert that none of the vertices in V - S is reachable from the source s. For every v' ∈ V - S, δ(s, v') = ∞, so we have established the shortest path of vertex v. When n = k + 1, the conclusion still holds.
If the path length of vertex v is d[v] < ∞, some path leads from s through the set S to v. Take a shortest path P = <s, v_0, ..., v_k, v> from s to v and split it into two parts. Let P' = <s, v_0, ..., v_i>, i ≤ k, be the part that stays inside S, that is, every vertex on P' belongs to S; the rest, Q' = <v_{i+1}, ..., v_k, v>, starts at the first vertex of P that lies in the set V - S. Because v is the vertex with the smallest path length in V - S, it satisfies d[v] ≤ d[v_{i+1}]. By the induction hypothesis d[v_i] = δ(s, v_i), and the edge (v_i, v_{i+1}) was relaxed when v_i joined S, so d[v_{i+1}] = δ(s, v_{i+1}). Because there is no negative-weight edge in the graph, following P gives d[v] ≥ δ(s, v) ≥ δ(s, v_{i+1}) = d[v_{i+1}]. Combining the two, we get d[v] = d[v_{i+1}] = δ(s, v) and ω(Q') = 0, so Q' is certainly a shortest path from v_{i+1} to v. So whether we look at v or at v_{i+1} (in fact they may well be the same vertex), their shortest paths have been established. In particular, we have established the shortest path of vertex v, so the conclusion still holds when n = k + 1.
To sum up, when the set S contains all vertices in the graph, the shortest path from the source s to every vertex v in V has been established.
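The procedure proved above can be sketched directly, with the set S kept as a Python set and the minimum found by a plain linear scan. This is a deliberately simple O(V^2) illustration under the assumption that all weights are non-negative; the function name and graph are made up.

```python
import math


def dijkstra_simple(graph, s):
    """Dijkstra with a linear scan for the minimum; weights must be >= 0.

    graph maps each vertex to a list of (neighbor, weight) pairs.
    Returns (d, pred).
    """
    d = {u: math.inf for u in graph}
    pred = {u: None for u in graph}
    d[s] = 0
    done = set()  # plays the role of the set S in the text
    while len(done) < len(graph):
        # greedy choice: the vertex outside S with the smallest d
        u = min((x for x in graph if x not in done), key=lambda x: d[x])
        done.add(u)
        for v, w in graph[u]:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
                pred[v] = u
    return d, pred


# Made-up graph with non-negative weights.
graph = {
    "s": [("a", 4), ("b", 1)],
    "b": [("a", 2), ("c", 5)],
    "a": [("c", 1)],
    "c": [],
}
d, pred = dijkstra_simple(graph, "s")
print(d)  # {'s': 0, 'b': 1, 'a': 3, 'c': 4}
```

The vertices enter the set in the order s, b, a, c, which is exactly the order of increasing shortest path length.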
Or, put more generally: why does picking the vertex with the minimum current path length every time guarantee a global optimum? In the final analysis, it is because the graph has no negative-weight edge. For the vertices in the set V - S, selecting the vertex v with the smallest path length means that no shorter path to v can exist; otherwise some edge on that path would have to have negative weight. Most improvements to Dijkstra's algorithm start from how the vertex with the minimum path length is selected. With a priority queue implemented as a binary min-heap, the running time is O((V + E) lg V); with a Fibonacci heap as the priority queue it improves to O(V lg V + E). Beyond the priority queue, there is little left to optimize in the structure of the original Dijkstra algorithm.
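A binary-heap version can be sketched with Python's standard heapq module. Note one liberty taken here: heapq has no decrease-key operation, so this sketch pushes a new entry whenever a distance improves and skips stale entries when they are popped. That is a common substitute, not the textbook formulation, and it again assumes non-negative weights.

```python
import heapq
import math


def dijkstra_heap(graph, s):
    """Dijkstra with a binary min-heap; weights must be >= 0.

    graph maps each vertex to a list of (neighbor, weight) pairs.
    Returns the dictionary d of shortest path lengths from s.
    """
    d = {u: math.inf for u in graph}
    d[s] = 0
    heap = [(0, s)]  # entries are (path length, vertex)
    while heap:
        dist_u, u = heapq.heappop(heap)
        if dist_u > d[u]:
            continue  # stale entry: u was already settled with a smaller d
        for v, w in graph[u]:
            if dist_u + w < d[v]:
                d[v] = dist_u + w
                heapq.heappush(heap, (d[v], v))
    return d


# Same made-up graph as one would use for the simple version.
graph = {
    "s": [("a", 4), ("b", 1)],
    "b": [("a", 2), ("c", 5)],
    "a": [("c", 1)],
    "c": [],
}
print(dijkstra_heap(graph, "s"))  # {'s': 0, 'b': 1, 'a': 3, 'c': 4}
```

Each edge can cause at most one push, so the heap never holds more than about E entries and every heap operation costs O(lg E), which is O(lg V).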
In the analysis above we have been avoiding negative-weight cycles. If the given graph contains a negative-weight cycle, we can keep going around that cycle and obtain an arbitrarily small path length. In that case no shortest path exists and its length can only be written as -∞, so the single-source shortest path problem becomes meaningless.
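A standard way to detect this situation is to run the |V| - 1 rounds of Bellman-Ford and then try one more pass: if any edge can still be relaxed, a negative-weight cycle must be reachable from the source. The sketch below illustrates that check; the function name and both small graphs are made up.

```python
import math


def has_negative_cycle(vertices, edges, s):
    """Return True if a negative-weight cycle is reachable from s.

    edges is a list of (u, v, w) triples.
    """
    d = {v: math.inf for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    # Without a negative cycle, |V| - 1 rounds are enough, so nothing
    # should improve any more; an improvement reveals a negative cycle.
    return any(d[u] + w < d[v] for u, v, w in edges)


vertices = ["s", "a", "b"]
# The cycle a -> b -> a has total weight -2 + 1 = -1.
print(has_negative_cycle(vertices, [("s", "a", 1), ("a", "b", -2), ("b", "a", 1)], "s"))  # True
# Here the same cycle has total weight -2 + 3 = 1.
print(has_negative_cycle(vertices, [("s", "a", 1), ("a", "b", -2), ("b", "a", 3)], "s"))  # False
```

A negative edge on its own is harmless for Bellman-Ford; only a cycle whose total weight is negative makes the shortest path undefined.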
Postscript:
In fact, most of the content above follows the original book, Introduction to Algorithms, and it may even contain deviations and omissions. My idea was simply to record my thinking process and clarify my own thoughts. After all, only when you can explain a complicated thing clearly have you probably understood it yourself. Originally I also wanted to paste the code of a concrete implementation. However, while writing I found that if I did so, the article would focus on explaining the code rather than on my own thinking (and besides, my code would probably not be very good). My goal is very simple: I hope to understand the solutions to the single-source shortest path problem thoroughly, which will lay a foundation for my future as a programmer.