Using Python to demonstrate dynamic programming on overlapping subproblems

Last Update: 2015-04-04 Source: Internet

Author: User

Dynamic programming is an algorithmic strategy for solving problems that can be defined over a state space. Such problems can be broken down into new subproblems, each with its own parameters. To solve them, we search the state space and evaluate the possible decisions at each step. Because these problems contain many identical states, the technique avoids wasting time on overlapping subproblems: each one is solved only once.

As we will see, it also leads to a lot of recursion, which is usually fun.
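Before turning to the actual problem, here is a minimal sketch of what "overlapping subproblems" means in practice. It is not part of the contest solution: the Fibonacci functions and the `calls` counter are illustrative names chosen here, and `functools.lru_cache` stands in for a hand-written cache.

```python
from functools import lru_cache

# Illustrative counters: how many times each version is actually executed.
calls = {"naive": 0, "memo": 0}


def fib_naive(n):
    """Plain recursion: the same subproblems are solved again and again."""
    calls["naive"] += 1
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_memo(n):
    """Same recursion, but each distinct n is computed only once."""
    calls["memo"] += 1
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)


print(fib_naive(20), calls["naive"])  # 6765 21891
print(fib_memo(20), calls["memo"])    # 6765 21
```

Both versions make the same decisions; the only difference is that the second one remembers the answer for each parameter value it has already seen.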

To illustrate this strategy, I will use a fun problem as an example: Challenge 14 of Tuenti Challenge #4, a recent programming contest.
Train Empire

We are given a board game called Train Empire. In it, you must plan the most efficient route for each train to deliver the wagons waiting at the railway stations. The rules are simple:

Each station has a wagon waiting to be delivered to another station.
Each wagon delivered to its destination rewards the player with points. Wagons can be dropped at any station.
A train runs on a single route only, can carry one wagon at a time, and has limited fuel that lets it cover only a certain distance.

We can redraw the original picture of our problem. To earn the highest score within the fuel limit, we need to know where to load each wagon and where to unload it.

In the picture we can see two train routes: red and blue. The stations are located at coordinate points, so we can easily calculate the distances between them. Each station holds a wagon labelled with its destination and the points we earn for delivering it successfully.
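As a small sketch of that distance calculation, assuming stations are plain (x, y) points and that fuel is measured in the same unit as distance. The coordinates below are made up for illustration; the real ones come from the puzzle input.

```python
import math


def distance(p, q):
    """Straight-line (Euclidean) distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Hypothetical coordinates in km, not taken from the original picture.
station_a = (0, 0)
station_b = (3, 4)

fuel = 3
d = distance(station_a, station_b)
print(d)          # 5.0
print(d <= fuel)  # False: this leg alone would exceed a 3 km budget
```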

Now suppose each train has enough fuel to travel 3 km. The train on the red route can deliver the wagon at station A to its destination E (5 points), while the train on the blue route can deliver wagon C (10 points) and then wagon B (5 points). That gives the maximum score of 20.
State

We call the position of the train, its remaining fuel, and the wagons at each station the state of the problem. Changing these values still gives us the same problem, only with different parameters. Every time we move a train, our problem turns into a different subproblem. To work out the best sequence of moves, we must traverse these states and make decisions based on them. Let's get started.

We will start by defining the train routes. Because the routes are not straight lines, a graph is the best way to represent them.

import math
from decimal import Decimal
from collections import namedtuple, defaultdict


class TrainRoute:

    def __init__(self, start, connections):
        self.start = start
        # Adjacency map: station -> set of neighbouring stations
        self.E = defaultdict(set)
        self.stations = set()
        for u, v in connections:
            # Store both directions: trains can move forward and backward
            self.E[u].add(v)
            self.E[v].add(u)
            self.stations.add(u)
            self.stations.add(v)

    def next_stations(self, u):
        if u not in self.E:
            return
        yield from self.E[u]

    def fuel(self, u, v):
        # Fuel needed is the straight-line distance between the two stations
        x = abs(u.pos[0] - v.pos[0])
        y = abs(u.pos[1] - v.pos[1])
        return Decimal(math.sqrt(x * x + y * y))

The TrainRoute class implements a very basic directed graph. It stores the vertices as a set of stations and the connections between stations in a dictionary. Note that we add both the (u, v) and (v, u) edges, because a train can move forward and backward.

There is one interesting detail in the next_stations method. Here I use a nice Python 3 feature, yield from, which lets a generator delegate to another generator or iterator. Because every station maps to a set of stations, we only need to iterate over that set.
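To see the delegation in isolation, here is a minimal sketch that keeps only the adjacency dictionary and the generator. The helper `build_adjacency` and the use of plain strings as stations are assumptions made for this example, not part of the original class.

```python
from collections import defaultdict


def build_adjacency(connections):
    """Build a station -> neighbours map, storing each edge both ways."""
    adjacency = defaultdict(set)
    for u, v in connections:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def next_stations(adjacency, u):
    """Yield every neighbour of u by delegating to the underlying set."""
    if u not in adjacency:
        return  # unknown station: the generator simply yields nothing
    yield from adjacency[u]


adjacency = build_adjacency([("A", "B"), ("B", "C")])
print(sorted(next_stations(adjacency, "B")))  # ['A', 'C']
print(list(next_stations(adjacency, "Z")))    # []
```

The membership test before the lookup matters with a defaultdict: indexing a missing key would silently create an empty entry for it.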

Let's take a look at the main class:

TrainWagon = namedtuple('TrainWagon', ('dest', 'value'))
TrainStation = namedtuple('TrainStation', ('name', 'pos', 'wagons'))


class TrainEmpire:

    def __init__(self, fuel, stations, routes):
        self.fuel = fuel
        self.stations = self._build_stations(stations)
        self.routes = self._build_routes(routes)

    def _build_stations(self, station_lines):
        # ...

    def _build_routes(self, route_lines):
        # ...

    def maximum_route_score(self, route):

        def score(state):
            return sum(w.value for (w, s) in state.wgs if w.dest == s.name)

        def wagon_choices(state, t):
            # ...

        def delivered(state):
            # ...

        def next_states(state):
            # ...

        def backtrack(state):
            # ...

        # ...

    def maximum_score(self):
        return sum(self.maximum_route_score(r) for r in self.routes)

I have omitted some code, but we can already see some interesting things. The two named tuples help keep our data neat and simple. The main class holds the fuel (the maximum distance our trains can travel), the routes, and the stations. The maximum_score method sums the scores of all routes and serves as the interface for solving the problem. So we have:

A main class that holds the routes and stations and the connections between them.
A station tuple containing a name, a position, and the list of wagons currently at the station.
A wagon tuple with a value and a destination station.
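A short sketch of how these two tuples behave, using made-up data. It assumes the wagons of a station are stored as a tuple rather than a list, so that a station stays hashable and can be placed in sets and used in dictionary keys.

```python
from collections import namedtuple

TrainWagon = namedtuple('TrainWagon', ('dest', 'value'))
TrainStation = namedtuple('TrainStation', ('name', 'pos', 'wagons'))

# Hypothetical data: one wagon worth 5 points, bound for station E.
wagon = TrainWagon(dest='E', value=5)
station_a = TrainStation(name='A', pos=(0, 0), wagons=(wagon,))

print(station_a.name, station_a.wagons[0].value)  # A 5

# Named tuples compare by value, so an equal copy is the same key.
same = TrainStation('A', (0, 0), (TrainWagon('E', 5),))
print(station_a == same)  # True

# They are hashable as long as every field is, so they can live in sets.
seen = {(wagon, station_a)}
print((TrainWagon('E', 5), same) in seen)  # True
```

Value equality plus hashability is what later allows a whole state to be looked up in a dictionary.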

Dynamic programming

I have tried to explain that the key to dynamic programming is to search the state space efficiently and make optimal decisions based on the states already explored. Our state space is defined by the position of the train, its remaining fuel, and the location of each wagon, so we can already express the initial state.

We must now consider every decision at each station. Should we load a wagon and carry it towards its destination? What if we find a more valuable wagon at the next station? Should we go back or keep moving forward? Or should we move on without a wagon at all?

Obviously, the answer to these questions is whichever choice gives us the most points. To find it, we must know the values of the previous and next states for every possible choice. Naturally, we use the score function to compute the value of each state.

def maximum_score(self):
    return sum(self.maximum_route_score(r) for r in self.routes)

State = namedtuple('State', ('s', 'f', 'wgs'))

wgs = set()
for s in route.stations:
    for w in s.wagons:
        wgs.add((w, s))
initial = State(route.start, self.fuel, tuple(wgs))
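To make the value of a state concrete, here is a minimal sketch that applies the score function to two hand-built states. The stations, wagons, and fuel figures are invented for the example and do not come from the contest input.

```python
from collections import namedtuple

TrainWagon = namedtuple('TrainWagon', ('dest', 'value'))
TrainStation = namedtuple('TrainStation', ('name', 'pos', 'wagons'))
State = namedtuple('State', ('s', 'f', 'wgs'))


def score(state):
    """Sum the values of the wagons that sit at their destination."""
    return sum(w.value for (w, s) in state.wgs if w.dest == s.name)


# Hypothetical stations and wagons.
a = TrainStation('A', (0, 0), ())
e = TrainStation('E', (3, 0), ())
to_e = TrainWagon('E', 5)
to_a = TrainWagon('A', 10)

# Before any move: neither wagon is at its destination.
before = State(s=a, f=3, wgs=((to_e, a), (to_a, e)))
# After carrying the first wagon from A to E, using all 3 km of fuel.
after = State(s=e, f=0, wgs=((to_e, e), (to_a, e)))

print(score(before))  # 0
print(score(after))   # 5
```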

From each state there are several options: move to the next station carrying a wagon, or move there without one. Staying put does not lead to a new state, because nothing changes. If the current station holds several wagons, moving each one of them leads to a different state.

def wagon_choices(state, t):
    yield state.wgs # not moving wagons is an option too

    wgs = set(state.wgs)
    # Wagons parked at other stations are unaffected by this move
    other_wagons = {(w, s) for (w, s) in wgs if s != state.s}
    state_wagons = wgs - other_wagons
    for (w, s) in state_wagons:
        # Carry wagon w to station t; the rest stay where they are
        parked = state_wagons - {(w, s)}
        twgs = other_wagons | parked | {(w, t)}
        yield tuple(twgs)

def delivered(state):
    return all(w.dest == s.name for (w, s) in state.wgs)

def next_states(state):
    if delivered(state):
        return
    for s in route.next_stations(state.s):
        f = state.f - route.fuel(state.s, s)
        if f < 0:
            continue  # not enough fuel to reach station s
        for wgs in wagon_choices(state, s):
            yield State(s, f, wgs)

next_states is a generator that takes a state as its argument and yields every state that can be reached from it. Note that it stops when all wagons have been delivered to their destinations, and that it only yields states for which there is enough fuel. The wagon_choices function may look a little complicated, but all it does is yield the possible arrangements of wagons after moving from the current station to the next one: either no wagon moves, or one of the wagons at the current station is carried along.
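The following sketch runs wagon_choices on a tiny hand-made state to show what it yields. For brevity it assumes stations and wagons are plain strings, which is a simplification of the named tuples used in the article.

```python
from collections import namedtuple

State = namedtuple('State', ('s', 'f', 'wgs'))


def wagon_choices(state, t):
    """Yield each possible wagon arrangement after moving to station t."""
    yield state.wgs  # travelling without a wagon
    wgs = set(state.wgs)
    other_wagons = {(w, s) for (w, s) in wgs if s != state.s}
    state_wagons = wgs - other_wagons
    for (w, s) in state_wagons:
        parked = state_wagons - {(w, s)}
        yield tuple(other_wagons | parked | {(w, t)})


# Hypothetical state: the train is at A, where two wagons are parked.
state = State(s='A', f=3, wgs=(('w1', 'A'), ('w2', 'A'), ('w3', 'B')))

# Sort each arrangement so the printed form does not depend on set order.
choices = [sorted(c) for c in wagon_choices(state, 'B')]
for c in choices:
    print(c)
```

Three arrangements come out: the unchanged one, one with w1 moved to B, and one with w2 moved to B. The order of the last two depends on set iteration order, so it is not fixed.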

With this we have everything we need to implement the dynamic programming algorithm. We start searching our decisions from the initial state and pick the best one. And there it is: the initial state has turned into a different state! We are designing a recursive algorithm:

Take a state
Compute our choices
Make the best decision

Each next state then goes through the same steps. Our recursive function stops when the fuel runs out or when all wagons have been delivered to their destinations.

max_score = {}

def backtrack(state):
    # Out of fuel: this is a final state
    if state.f <= 0:
        return state
    choices = []
    for s in next_states(state):
        # Solve each reachable state only once and cache the result
        if s not in max_score:
            max_score[s] = backtrack(s)
        choices.append(max_score[s])
    # No reachable states: nothing left to decide
    if not choices:
        return state
    return max(choices, key=lambda s: score(s))

max_score[initial] = backtrack(initial)
return score(max_score[initial])

One last trick completes the dynamic programming strategy. In the code you can see that I use a max_score dictionary, which caches the result for every state the algorithm visits. This way we never recompute decisions for a state we have already explored.

While searching the state space, we may arrive at the same station several times, and some of those arrivals will have the same remaining fuel and the same wagon positions. It does not matter how the train got there; only the decisions made from that point on affect the result. If we compute each state once and save the result, we never need to search that part of the state space again.

Without this memoization technique we would repeat a large number of identical searches, which usually makes the algorithm too slow to solve the problem efficiently.
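To watch the cache at work, here is a self-contained toy version of the search. It is a simplified sketch, not the contest solution: the three-station line, the names POS, EDGES, best, and stats are all invented here, stations are plain strings, the function returns the best score instead of the best state, and wagons are kept in a frozenset so that equal arrangements always produce the same dictionary key.

```python
from collections import namedtuple

Wagon = namedtuple("Wagon", ("dest", "value"))
State = namedtuple("State", ("s", "f", "wgs"))

# Hypothetical toy route: three stations on a line, 1 km apart.
POS = {"A": 0, "B": 1, "C": 2}
EDGES = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}


def score(state):
    """Points for every wagon that sits at its destination."""
    return sum(w.value for (w, s) in state.wgs if w.dest == s)


def next_states(state):
    """Yield every state reachable with one move of the train."""
    for t in EDGES[state.s]:
        f = state.f - abs(POS[state.s] - POS[t])
        if f < 0:
            continue  # not enough fuel for this leg
        yield State(t, f, state.wgs)  # travel without a wagon
        for (w, s) in state.wgs:
            if s == state.s:  # carry one wagon from here to t
                yield State(t, f, (state.wgs - {(w, s)}) | {(w, t)})


cache = {}
stats = {"hits": 0}


def best(state):
    """Best score reachable from state, computed once per distinct state."""
    if state in cache:
        stats["hits"] += 1
        return cache[state]
    result = max([score(state)] + [best(n) for n in next_states(state)])
    cache[state] = result
    return result


wagons = frozenset({(Wagon("C", 10), "A"), (Wagon("A", 5), "B")})
print(best(State("A", 3, wagons)))  # 10
print(best(State("A", 4, wagons)))  # 15
print(stats["hits"] > 0)            # True: some states were reached twice
```

With 3 km of fuel only the 10-point wagon can be delivered; one more kilometre lets the train come back for the second one. Carrying a wagon to B and straight back to A ends in the same state as making the round trip empty, which is one of the places where the cache saves a repeated search.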
Summary

Train Empire is an excellent example of how dynamic programming makes optimal decisions on a problem with overlapping subproblems. Python's expressiveness lets us implement the ideas easily and write clear and efficient algorithms.

The complete code is in the contest repository.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
