Original post; please contact the author before reposting!
Foreword: The Software Elite Challenge (a.k.a. algorithm outsourcing) ended a few days ago. This was actually my second time taking part. Last year we only made the top 64 (the top 32 was the cutoff for the semifinal round) and ended up with a hat (thanks to the girl who mailed it over!). This year we took it a bit more seriously and were lucky enough to break into the leaderboard. (The team in 17th place is ours! Yeru is a great force!)
So, back to the point: let's first look at the preliminary round problem.
Requirements of the preliminary round problem
Given a directed graph G (vertex set V, edge set E) and a subset of vertices V', find a path in G from the start node to the end node that passes through every node in V' and visits each node at most once. Within the specified time limit, the lower the weight of the path found, the higher the score. (The start and end nodes are not in V'.)
Of course, the organizers provided a dedicated SDK (although in practice it is rather troublesome to use well).
Why is this problem a variant of the Hamiltonian path problem? First, look at the definition of the Hamiltonian path problem: (Baidu Encyclopedia) (Wikipedia)
One of the sub-problems here is to find a Hamiltonian path over V', which is itself NP-complete. A direct brute-force solution takes exponential time. Dynamic programming with state compression (bitmask DP) improves the time complexity of the Hamiltonian path problem, but it is still exponential. Either way, as the graph grows it becomes almost impossible to reliably obtain the required path within 10 seconds.
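As a rough illustration of the state-compression DP mentioned above (not the code we submitted), here is a minimal Java sketch. The class name HamiltonianDp, the method name, and the adjacency-matrix input are assumptions made for this example. It runs in O(2^n * n^2) time, which is why it only helps for small node counts.

```java
import java.util.Arrays;

/** Bitmask ("state compression") DP for the cheapest Hamiltonian path. Illustrative sketch only. */
class HamiltonianDp {
    static final int INF = Integer.MAX_VALUE / 2;

    /**
     * cost[i][j] is the weight of edge i -> j, or INF when the edge is absent.
     * Returns the cheapest cost of a path from start to end that visits every
     * vertex exactly once, or INF when no such path exists.
     */
    static int shortestHamiltonianPath(int[][] cost, int start, int end) {
        int n = cost.length;
        int full = (1 << n) - 1;
        // dp[mask][v]: cheapest path that begins at start, covers exactly the
        // vertices in mask, and currently ends at v.
        int[][] dp = new int[1 << n][n];
        for (int[] row : dp) Arrays.fill(row, INF);
        dp[1 << start][start] = 0;
        for (int mask = 0; mask <= full; mask++) {
            for (int v = 0; v < n; v++) {
                if (dp[mask][v] >= INF) continue;            // unreachable state
                for (int w = 0; w < n; w++) {
                    if ((mask & (1 << w)) != 0 || cost[v][w] >= INF) continue;
                    int next = mask | (1 << w);
                    dp[next][w] = Math.min(dp[next][w], dp[mask][v] + cost[v][w]);
                }
            }
        }
        return dp[full][end];
    }
}
```

The table alone has 2^n * n entries, so this approach stops being practical well before the graph sizes used in the contest.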
So our approach to the problem went through the following stages:
DFS/BFS with pruning -> genetic algorithm + DFS -> linear programming
The problem-solving process
Phase I - DFS/BFS with pruning
In fact, on small instances, brute-force search with pruning can even outperform more complex algorithms.
The recursive idea is very simple:
RecTraversalGraph(u):
  For every unvisited vertex 'v' adjacent to 'u'
    Mark 'v' as visited
    RecTraversalGraph(v)
    Mark 'v' as unvisited again
  End for
Of course, traversing the graph directly is hugely expensive, so we added some pruning conditions:
1. When the traversal reaches the end node, check whether the path satisfies the problem requirements; if it does, record it, otherwise prune.
2. When the cost of the current partial path exceeds the minimum cost recorded so far, prune.
3. When the cost of a single edge being traversed exceeds the minimum cost recorded so far, delete that edge from the adjacency list.
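The recursion and the first two pruning rules can be sketched in Java roughly as follows. This is a simplified illustration rather than our contest code: the class PrunedDfs and its fields are hypothetical names, edges are given as {from, to, cost} triples, the must-visit list is assumed to contain no duplicates, and rule 3 (deleting overly expensive edges) is left out to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Depth-first search with pruning for the cheapest path through all required nodes. */
class PrunedDfs {
    private final List<List<int[]>> adj = new ArrayList<>(); // adj.get(u) holds {v, cost} pairs
    private final boolean[] required;
    private final boolean[] visited;
    private final int requiredTotal;
    private final Deque<Integer> path = new ArrayDeque<>();
    int bestCost = Integer.MAX_VALUE;   // minimum cost recorded so far
    List<Integer> bestPath = null;      // stays null when no valid path exists

    PrunedDfs(int n, int[][] edges, int[] mustVisit) {
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) adj.get(e[0]).add(new int[] {e[1], e[2]});
        required = new boolean[n];
        for (int v : mustVisit) required[v] = true;
        requiredTotal = mustVisit.length;
        visited = new boolean[n];
    }

    void solve(int start, int end) {
        visited[start] = true;
        path.addLast(start);
        dfs(start, end, 0, 0);
    }

    private void dfs(int u, int end, int cost, int seen) {
        if (cost >= bestCost) return;            // rule 2: already no better than the record
        if (u == end) {                          // rule 1: validate at the end node
            if (seen == requiredTotal) {
                bestCost = cost;
                bestPath = new ArrayList<>(path);
            }
            return;                              // never continue past the end node
        }
        for (int[] e : adj.get(u)) {
            int v = e[0];
            if (visited[v]) continue;
            visited[v] = true;                   // mark
            path.addLast(v);
            dfs(v, end, cost + e[1], seen + (required[v] ? 1 : 0));
            path.removeLast();
            visited[v] = false;                  // unmark on backtrack
        }
    }
}
```

After calling solve(start, end), bestCost and bestPath hold the cheapest valid path found.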
In addition to pruning, we also need to do some simple compression of the graph:
1. Merge each vertex of degree 1 with the vertex it is connected to. This should be easy to understand, so I won't elaborate.
2. Among parallel edges that share the same start and end vertices, keep only the lowest-cost edge.
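The second compression step is simple enough to show in a few lines. The sketch below is only an illustration: the class EdgeDedup and the {from, to, cost} edge layout are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Removes parallel edges, keeping the cheapest edge for each (from, to) pair. */
class EdgeDedup {
    /** Each edge is {from, to, cost}. The result preserves first-seen order of the pairs. */
    static List<int[]> keepCheapest(List<int[]> edges) {
        Map<Long, int[]> best = new LinkedHashMap<>();
        for (int[] e : edges) {
            // Pack the ordered pair (from, to) into one 64-bit key.
            long key = ((long) e[0] << 32) | (e[1] & 0xffffffffL);
            int[] old = best.get(key);
            if (old == null || e[2] < old[2]) best.put(key, e);
        }
        return new ArrayList<>(best.values());
    }
}
```

Because the graph is directed, the edge 0 -> 1 and the edge 1 -> 0 get different keys and are both kept.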
Of course, the submission result was disappointing: only the first 5 test cases were solved. It seems these introductory cases have no more than 100 nodes, but this was to be expected.
Phase II - Genetic algorithm + DFS
Here our algorithm draws on a few papers on solving the TSP with genetic algorithms. When we decided to use a genetic algorithm, the official judging environment was still dual-core, so I chose a genetic algorithm, which parallelizes naturally, to attack the problem, and then used DFS to find a feasible/optimal solution.
Our genetic algorithm approach to this problem is as follows:
Before starting, we need an SPFA implementation whose parameters can specify vertices to be excluded from the graph; we did this in Java with an overridden Object.clone().
For every vertex 'v' in V'
  Compute the single-source shortest paths from 'v'
End for
This gives us a |V| * |V'| shortest-path cost table and a |V| * |V'| route table (the sequence of nodes each path passes through, essentially a spanning tree).
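A minimal SPFA sketch with node exclusion might look like the following. It is not our original implementation: the class Spfa, the boolean banned array, and the {dist, prev} return layout are assumptions for this example, and edge weights are assumed to be non-negative so that no negative cycle exists.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** SPFA (queue-based Bellman-Ford) that can skip a set of excluded nodes. */
class Spfa {
    static final int INF = Integer.MAX_VALUE / 2;

    /**
     * edges: {from, to, cost} triples; banned[v] == true removes v from the graph.
     * Returns {dist, prev}: dist[v] is the shortest distance from source (INF if
     * unreachable) and prev[v] is the predecessor of v on that path (-1 if none).
     */
    static int[][] run(int n, int[][] edges, int source, boolean[] banned) {
        List<List<int[]>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) adj.get(e[0]).add(new int[] {e[1], e[2]});

        int[] dist = new int[n];
        int[] prev = new int[n];
        Arrays.fill(dist, INF);
        Arrays.fill(prev, -1);
        boolean[] inQueue = new boolean[n];
        Deque<Integer> queue = new ArrayDeque<>();
        dist[source] = 0;
        queue.add(source);
        inQueue[source] = true;
        while (!queue.isEmpty()) {
            int u = queue.poll();
            inQueue[u] = false;
            for (int[] e : adj.get(u)) {
                int v = e[0];
                if (banned[v] || dist[u] + e[1] >= dist[v]) continue;
                dist[v] = dist[u] + e[1];        // relax edge u -> v
                prev[v] = u;
                if (!inQueue[v]) {
                    queue.add(v);
                    inQueue[v] = true;
                }
            }
        }
        return new int[][] {dist, prev};
    }
}
```

Running this once per vertex in V' and stacking the dist and prev rows yields the cost table and the route table described above.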
Then the genetic algorithm starts. It uses three basic operators; for a detailed introduction, please refer to the papers mentioned above.
1. Crossover operation. For the TSP, our crossover is not the general-purpose k-opt but a method called 'sequential crossing'. Roughly, take two path sequences A and B as shown below:
|---A1---|---A2---|---A3---|
|---B1---|---B2---|---B3---|
Pick two random cut points to split A and B into three subsequences each (3*2 in total), exchange A2 and B2, then resolve the resulting duplicates using random numbers to obtain two child sequences.
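One possible reading of this crossover, written as a Java sketch, is shown below. The repair strategy (replacing duplicated genes outside the swapped segment with the missing genes in random order) is my simplification of "resolving the repetition with random numbers", and the class SegmentCrossover and its signature are hypothetical. Both parents are assumed to be permutations of the same node set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

/** Two-cut-point crossover with random duplicate repair. Illustrative sketch only. */
class SegmentCrossover {
    /**
     * Builds one child: keeps a[0..i) and a[j..n), takes b[i..j) as the middle
     * segment, then repairs duplicates so the child is again a permutation.
     * Call it again with a and b swapped to get the second child.
     */
    static int[] child(int[] a, int[] b, int i, int j, Random rnd) {
        int n = a.length;
        int[] c = a.clone();
        Set<Integer> inSegment = new HashSet<>();
        for (int k = i; k < j; k++) {
            c[k] = b[k];
            inSegment.add(b[k]);
        }
        // Genes that were in a's middle segment and no longer appear in the child.
        List<Integer> missing = new ArrayList<>();
        for (int k = i; k < j; k++) {
            if (!inSegment.contains(a[k])) missing.add(a[k]);
        }
        Collections.shuffle(missing, rnd);
        int next = 0;
        for (int k = 0; k < n; k++) {
            if (k >= i && k < j) continue;       // the swapped segment stays as is
            if (inSegment.contains(c[k])) c[k] = missing.get(next++);
        }
        return c;
    }
}
```

The number of duplicated genes outside the segment always equals the number of missing genes, so the repair loop never runs out of replacements.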
2. Point exchange operation
As the name implies, a number of points on a single sequence are swapped according to random numbers, and the mutation yields a child sequence.
3. Segment exchange operation
Similar to the operation above: a number of segments on a single sequence are swapped according to random numbers, and the mutation yields a child sequence.
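The two mutation operators can be sketched as below. This is a simplified illustration with hypothetical names: swapPoints exchanges a single random pair of positions, and swapSegments exchanges two adjacent blocks whose boundaries are passed in by the caller (in a real run they would be drawn at random).

```java
import java.util.Random;

/** Point-exchange and segment-exchange mutations on a node sequence. */
class Mutations {
    /** Point exchange: swaps two randomly chosen positions and returns the mutated copy. */
    static int[] swapPoints(int[] seq, Random rnd) {
        int[] c = seq.clone();
        int i = rnd.nextInt(c.length);
        int j = rnd.nextInt(c.length);
        int t = c[i];
        c[i] = c[j];
        c[j] = t;
        return c;
    }

    /**
     * Segment exchange: swaps the adjacent blocks seq[i..j) and seq[j..k).
     * Requires 0 <= i <= j <= k <= seq.length.
     */
    static int[] swapSegments(int[] seq, int i, int j, int k) {
        int[] c = seq.clone();
        int pos = i;
        for (int p = j; p < k; p++) c[pos++] = seq[p];   // second block moves forward
        for (int p = i; p < j; p++) c[pos++] = seq[p];   // first block follows it
        return c;
    }
}
```

Both operators return a new array, so the parent survives and the worse of the two can be eliminated afterwards.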
Then we enter the iteration loop of the genetic algorithm:
While not approximately converged
  Draw a random number (roulette-wheel style)
  If it falls within the crossover probability: perform crossover and eliminate the inferior individuals
  Else if it falls within the point mutation probability: perform point mutation and eliminate the inferior individuals
  Else if it falls within the segment mutation probability: perform segment mutation and eliminate the inferior individuals
  Else do nothing
End while
It is worth mentioning that, besides plain parallelism, we used Java's java.util.concurrent.Exchanger to let the two parallel populations exchange their best individuals every certain number of iterations.
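As a small illustration of the Exchanger idea, the sketch below runs two threads that each hand over one value (standing in for the best individual's cost) and receive the other's. The class IslandExchange, the one-second timeout, and the use of a plain int instead of a real individual are assumptions made for this example.

```java
import java.util.concurrent.Exchanger;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Two "populations" trading their best value through an Exchanger. */
class IslandExchange {
    /** Returns {value received by population A, value received by population B}. */
    static int[] demo(int bestA, int bestB) {
        Exchanger<Integer> exchanger = new Exchanger<>();
        int[] received = new int[2];
        Thread a = new Thread(() -> received[0] = swap(exchanger, bestA));
        Thread b = new Thread(() -> received[1] = swap(exchanger, bestB));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }

    /** Blocks until the partner arrives; keeps its own value if the partner never shows up. */
    private static int swap(Exchanger<Integer> exchanger, int mine) {
        try {
            return exchanger.exchange(mine, 1, TimeUnit.SECONDS);
        } catch (InterruptedException | TimeoutException e) {
            return mine;
        }
    }
}
```

In a real run the exchange call would sit inside each population's iteration loop and fire every fixed number of generations. The timeout keeps one population from blocking forever once the other has already converged and stopped.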
After the genetic algorithm finishes, we have a set of good candidates, and we then use DFS to find a feasible solution among these known individuals. Because the genetic algorithm above works on the shortest-path cost table computed earlier, the shortest-path segments that make up an individual may overlap. We therefore find the overlapping edges, remove them, and recompute the shortest paths using the SPFA with excluded edges. This resolves the problem of cycles forming. In addition, the number of candidates can be controlled by hand, so the running time of the whole DFS stays under control.
The largest graph in the preliminary round has 600 nodes, and the genetic algorithm takes about 6-7 seconds to converge on it. The genetic algorithm itself performed well, but the trouble showed up in the step that follows it.
Because the test graphs provided by the contest are very sparse, the idea of removing certain edges to find a feasible solution turned out to be naive. Sure enough, although our algorithm kept its running time well under control, in most cases it wrongly concluded that solvable instances had no solution. In cases where the only feasible solution is also the optimal one, our algorithm was simply fragile. Having learned this the hard way, we decided to solve the problem with mathematical modeling + linear programming!
Phase III - Mathematical modeling + linear programming
Our model went through a series of changes. I won't go into the optimization process here, but the final mathematical model is as follows:
1. Definitions
e1 ... en are the edges with indices 1...n, and c1 ... cn are the weights of edges e1 ... en.
The decision variables are 0/1: 0 means the edge is not selected and 1 means it is selected. They are written x1 ... xn.
2. Constraints
For each node in V, the sum of its incoming edge variables is less than or equal to 1: each node has at most one incoming edge.
For each node in V, the sum of its outgoing edge variables is less than or equal to 1: each node has at most one outgoing edge.
For each node in V, the sum of its outgoing edge variables minus the sum of its incoming edge variables equals 0: the number of outgoing edges and the number of incoming edges of each node are equal.
For each node in V', the sum of its outgoing edge variables equals 1: it must have an outgoing edge.
For each node in V', the sum of its incoming edge variables equals 1: it must have an incoming edge.
No-cycle constraint: u_i - u_j + |V| * x_ij <= |V| - 1, with 1 <= u_i <= |V| - 1, where u_i is an auxiliary ordering variable for node i and x_ij is the variable of the edge from i to j.
3. Objective function
MINIMIZE (c1*x1 + c2*x2 + ... + cn*xn)
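For readers who prefer compact notation, the model above can be written roughly as follows. This is my own restatement under some assumptions that the list above does not spell out: x_{ij} denotes the variable of the edge from i to j, s and t denote the start and end nodes, the balance constraint is applied to every node except s and t, the start node has exactly one outgoing and no incoming edge (and the reverse for the end node), and the no-cycle constraint is the Miller-Tucker-Zemlin form applied to edges that do not touch s.

```latex
\begin{align*}
\min \quad & \sum_{(i,j) \in E} c_{ij}\, x_{ij} \\
\text{s.t.} \quad
& \sum_{(j,i) \in E} x_{ji} \le 1, \qquad \sum_{(i,j) \in E} x_{ij} \le 1
  && \forall i \in V \\
& \sum_{(i,j) \in E} x_{ij} - \sum_{(j,i) \in E} x_{ji} = 0
  && \forall i \in V \setminus \{s, t\} \\
& \sum_{(i,j) \in E} x_{ij} = 1, \qquad \sum_{(j,i) \in E} x_{ji} = 1
  && \forall i \in V' \\
& \sum_{(s,j) \in E} x_{sj} = 1, \quad \sum_{(j,s) \in E} x_{js} = 0, \quad
  \sum_{(j,t) \in E} x_{jt} = 1, \quad \sum_{(t,j) \in E} x_{tj} = 0 \\
& u_i - u_j + |V|\, x_{ij} \le |V| - 1
  && \forall (i,j) \in E,\ i \ne s,\ j \ne s \\
& x_{ij} \in \{0, 1\}, \qquad 1 \le u_i \le |V| - 1
  && \forall i \in V \setminus \{s\}
\end{align*}
```

When x_{ij} = 1 the no-cycle constraint forces u_j >= u_i + 1, so the u values strictly increase along any chain of selected edges and no closed cycle can be selected. Strictly speaking this is a mixed-integer program, since the x variables are binary.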
At first we chose the open-source library GLPK to solve the model, but it could not solve the two hardest official cases, which left us with a weak score. We later switched to the open-source SCIP-SoPlex and solved the same model with it. SCIP also provides some very good heuristics, and our score climbed all the way up.
The strategy we finally settled on is this:
For small-scale cases, use DFS + pruning.
For medium-scale cases, use BFS + pruning.
For large-scale cases, use a randomized heuristic search.
Of course, in the end there were also some language-level optimizations, such as array compression and dynamic allocation. This article does not go into implementation details; anyone interested in the specific code and implementation is welcome to leave a message or contact me.
< path algorithm > Hamilton path variant