An example of using Python to show how dynamic programming solves overlapping subproblems

Dynamic programming is an algorithmic strategy for problems that define a state space. These problems can be decomposed into new subproblems, each with its own parameters. To solve them, we have to search the state space and evaluate each decision at every step. Because this kind of problem contains many identical states, the technique does not waste time re-solving overlapping subproblems.

As we will see, it also tends to involve a lot of recursion, which is often interesting.
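
Before turning to the actual problem, here is a minimal sketch of what "overlapping subproblems" means in practice. It is not part of the contest solution: it counts grid paths with the same recurrence twice, once naively and once with a cache. The function names and the call counter are made up for this illustration.

```python
from functools import lru_cache

calls = {"naive": 0, "cached": 0}

def paths_naive(r, c):
    """Count paths from (r, c) to an edge of the grid, moving only up or left."""
    calls["naive"] += 1
    if r == 0 or c == 0:
        return 1
    return paths_naive(r - 1, c) + paths_naive(r, c - 1)

@lru_cache(maxsize=None)
def paths_cached(r, c):
    """Same recurrence, but each (r, c) state is solved only once."""
    calls["cached"] += 1
    if r == 0 or c == 0:
        return 1
    return paths_cached(r - 1, c) + paths_cached(r, c - 1)

naive_result = paths_naive(8, 8)
cached_result = paths_cached(8, 8)
```

Both functions return the same value, but the cached version runs its body far fewer times, because every repeated (r, c) state is answered from the cache.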

To illustrate this strategy, I will use a very interesting problem as an example: Challenge 14 of Tuenti Challenge #4, a programming contest I recently took part in.
Train Empire

We are faced with a board game called Train Empire. In this problem, you must plan the most efficient route for a train to deliver the wagons waiting at each station. The rules are simple:

    • Every station has a wagon waiting to be transported to another station.
    • Each wagon delivered to its destination rewards the player with some points. Wagons can be dropped at any station.
    • A train runs only on a single route and can carry one wagon at a time. Its fuel is limited, so it can only travel a certain distance.

We can prettify the original picture of the problem. To get the maximum score within the fuel limit, we need to know where to load each wagon and where to unload it.

As we can see in the picture, we have two train routes: red and blue. Each station sits at a coordinate point, so it is easy to work out the distance between any two of them. Each station has a wagon named after its destination, and we earn the score reward when we deliver it successfully.
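
Assuming the stations sit at 2D coordinates as described, the distance between two of them is the ordinary Euclidean distance. A minimal sketch, with made-up coordinates and a hypothetical helper named `distance`:

```python
import math

def distance(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Made-up coordinates, chosen so the result is easy to check by hand.
d = distance((0, 0), (3, 4))
```

Here `d` is 5.0, the hypotenuse of a 3-4-5 triangle. The fuel needed for a move is this distance.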

Now suppose our trains can travel 3 kilometers. The train on the red route can deliver wagon E to its destination (5 points), and the train on the blue route can deliver wagon C (10 points) and then wagon B (5 points). Together they achieve the maximum score of 20 points.
State Representation

We call the train's position, its remaining fuel, and the table of wagons at each station a problem state. Changing these values yields the same problem with different parameters. Every time we move a train, our problem evolves into a different subproblem. To figure out the best sequence of moves, we have to traverse these states and make decisions based on them. Let's get started.
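
A small sketch of why it helps to make such a state hashable: two states with the same position, fuel, and wagon placement compare equal however they were built, so one can stand in for the other as a dictionary key. The field names `s`, `f` and `wgs` follow the code later in this article; storing the placements in a `frozenset` is an assumption of this sketch (the article's code stores a tuple), and the wagon and station names are made up.

```python
from collections import namedtuple

State = namedtuple("State", ("s", "f", "wgs"))

# Same position, same fuel, same placements listed in a different order.
a = State("B", 2, frozenset({("w1", "A"), ("w2", "C")}))
b = State("B", 2, frozenset({("w2", "C"), ("w1", "A")}))

seen = {a: "solved"}
same_state_found = b in seen
less_fuel_found = State("B", 1, a.wgs) in seen
```

The lookup for `b` succeeds, while the state with less fuel is a different subproblem and is not found.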

We will start by defining the train route. Because the lines are not straight, a graph is the best way to represent them.

import math
from decimal import Decimal
from collections import namedtuple, defaultdict

class TrainRoute:
  def __init__(self, start, connections):
    self.start = start

    # adjacency map: station -> set of directly connected stations
    self.E = defaultdict(set)
    self.stations = set()
    for u, v in connections:
      self.E[u].add(v)
      self.E[v].add(u)
      self.stations.add(u)
      self.stations.add(v)

  def next_stations(self, u):
    if u not in self.E:
      return
    yield from self.E[u]

  def fuel(self, u, v):
    # fuel needed is the straight-line distance between the two stations
    x = abs(u.pos[0] - v.pos[0])
    y = abs(u.pos[1] - v.pos[1])
    return Decimal(math.sqrt(x * x + y * y))

The TrainRoute class implements a very basic directed graph, keeping the vertices (the stations) in a set and the connections between stations in a dictionary. Note that we add both edges (u, v) and (v, u), because trains can move forwards and backwards.
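
The same adjacency structure can be sketched on its own, using plain station names instead of station tuples. The `build_graph` helper and the station names are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

def build_graph(connections):
    """Store each connection in both directions so travel works either way."""
    edges = defaultdict(set)
    for u, v in connections:
        edges[u].add(v)
        edges[v].add(u)
    return edges

graph = build_graph([("A", "B"), ("B", "C")])
neighbors_of_b = sorted(graph["B"])
```

Indexing a `defaultdict` with a missing key inserts that key, which is presumably why next_stations checks membership before reading the dictionary.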

There is one interesting thing in the next_stations method, where I use a cool Python 3 feature: yield from. It allows a generator to delegate to another generator or iterator. Because each station maps to a set of stations, we just have to iterate over that set.
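
To make the delegation concrete, here is a minimal sketch comparing an explicit loop with `yield from`. Both function names and the tiny `edges` dictionary are made up for this example.

```python
def neighbors_loop(edges, u):
    """Explicit loop: yield each neighbour one at a time."""
    for v in edges.get(u, ()):
        yield v

def neighbors_delegate(edges, u):
    """Same behaviour, delegating to the underlying iterable."""
    yield from edges.get(u, ())

edges = {"A": {"B", "C"}}
from_loop = sorted(neighbors_loop(edges, "A"))
from_delegate = sorted(neighbors_delegate(edges, "A"))
unknown_station = list(neighbors_delegate(edges, "Z"))
```

Both generators produce the same stations, and an unknown station simply yields nothing.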

Let's take a look at the main class:

TrainWagon = namedtuple('TrainWagon', ('dest', 'value'))
TrainStation = namedtuple('TrainStation', ('name', 'pos', 'wagons'))

class TrainEmpire:
  def __init__(self, fuel, stations, routes):
    self.fuel = fuel
    self.stations = self._build_stations(stations)
    self.routes = self._build_routes(routes)

  def _build_stations(self, station_lines):
    # ...

  def _build_routes(self, route_lines):
    # ...

  def maximum_route_score(self, route):
    def score(state):
      return sum(w.value for (w, s) in state.wgs if w.dest == s.name)

    def wagon_choices(state, t):
      # ...

    def delivered(state):
      # ...

    def next_states(state):
      # ...

    def backtrack(state):
      # ...

  def maximum_score(self):
    return sum(self.maximum_route_score(r) for r in self.routes)

I've omitted some code, but we can already see some interesting things. The two named tuples help keep our data neat and simple. The main class holds the maximum distance our trains can travel (the fuel), along with the routes and the stations. The maximum_score method computes the sum of the scores of each route and serves as the interface for solving the problem, so we have:

    • A main class that holds the connections between routes and stations
    • A station tuple with a name, a position, and the list of wagons currently there
    • A wagon tuple with a value and a destination station
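
To see how these tuples fit together, here is a small sketch with made-up data. The tuple definitions and the scoring rule follow the code above; the stations, wagons, and the standalone `score` function taking a plain sequence of placements are assumptions for illustration.

```python
from collections import namedtuple

TrainWagon = namedtuple('TrainWagon', ('dest', 'value'))
TrainStation = namedtuple('TrainStation', ('name', 'pos', 'wagons'))

# Made-up data: a wagon bound for B waits at A, and one bound for A waits at B.
wagon_b = TrainWagon(dest='B', value=5)
wagon_a = TrainWagon(dest='A', value=10)
station_a = TrainStation('A', (0, 0), (wagon_b,))
station_b = TrainStation('B', (1, 0), (wagon_a,))

def score(placements):
    """Sum the values of the wagons that sit at their destination station."""
    return sum(w.value for (w, s) in placements if w.dest == s.name)

initial = ((wagon_b, station_a), (wagon_a, station_b))
after_delivery = ((wagon_b, station_b), (wagon_a, station_b))
```

Nothing is delivered in `initial`, so it scores 0. After wagon_b is dropped at station B the score is 5. Keeping the wagons in a tuple rather than a list also keeps the station hashable.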

Dynamic programming

I have tried to explain that the key to dynamic programming is searching the state space efficiently and making optimal decisions based on existing states. Our state space is defined by the train's position, its remaining fuel, and the location of each wagon, so we can already represent the initial state.

We must now consider every decision at every station. Should we load a wagon and deliver it to its destination? What if we find a more valuable wagon at the next station? Should we go back or keep moving forward? Or move without a wagon?

Obviously, the answer to these questions is whichever choice earns us more points. To find it, we have to know the value of the preceding and following states in every possible case. Of course, we use the score function to get the value of each state.

def maximum_score(self):
  return sum(self.maximum_route_score(r) for r in self.routes)

State = namedtuple('State', ('s', 'f', 'wgs'))

# pair every wagon with the station where it currently waits
wgs = set()
for s in route.stations:
  for w in s.wagons:
    wgs.add((w, s))
initial = State(route.start, self.fuel, tuple(wgs))

There are several options from each state: move to the next station with a wagon, or move without one. Staying put does not lead to a new state, because nothing changes. If the current station holds several wagons, moving each one of them leads to a different state.

def wagon_choices(state, t):
  yield state.wgs  # not moving any wagon is an option too

  wgs = set(state.wgs)
  other_wagons = {(w, s) for (w, s) in wgs if s != state.s}
  state_wagons = wgs - other_wagons
  for (w, s) in state_wagons:
    # carry wagon w to station t and leave the rest parked here
    parked = state_wagons - {(w, s)}
    twgs = other_wagons | parked | {(w, t)}
    yield tuple(twgs)

def delivered(state):
  return all(w.dest == s.name for (w, s) in state.wgs)

def next_states(state):
  if delivered(state):
    return
  for s in route.next_stations(state.s):
    f = state.f - route.fuel(state.s, s)
    if f < 0:
      continue
    for wgs in wagon_choices(state, s):
      yield State(s, f, wgs)

next_states is a generator that takes a state and yields every state reachable from it. Note how it stops once all wagons have been delivered to their destinations, and how it only enters states for which enough fuel remains. The wagon_choices function may look a bit complicated, but it simply yields the wagon placements that can result from moving from the current station to the next one.
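
The enumeration of wagon choices can be tried in isolation. The sketch below is a simplified stand-in, not the article's function: wagons and stations are plain strings, the current and next stations are passed explicitly as `here` and `there`, and placements are kept in a `frozenset` so that the order of wagons never matters. All of these are assumptions of the sketch.

```python
def wagon_choices(placements, here, there):
    """Yield every wagon placement after moving from `here` to `there`.

    `placements` is a frozenset of (wagon, station) pairs.
    """
    yield placements  # travel empty: nothing changes
    for wagon, station in placements:
        if station == here:
            # carry exactly this wagon along and leave the rest parked
            yield (placements - {(wagon, station)}) | {(wagon, there)}

start = frozenset({("w1", "A"), ("w2", "A"), ("w3", "B")})
options = list(wagon_choices(start, "A", "B"))
```

With two wagons waiting at A there are three options: move empty, carry w1, or carry w2. The wagon already at B is never touched.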

So we have everything we need to implement the dynamic programming algorithm. We start searching our decisions from the initial state and choose the best one. See! The initial state evolves into different states, which in turn evolve into other states! What we are designing is a recursive algorithm:

    • Take a state
    • Compute our possible decisions
    • Make the best decision

Obviously, each next state goes through the same series of steps. Our recursive function stops when the fuel runs out or when all wagons have been delivered to their destinations.

max_score = {}

def backtrack(state):
  if state.f <= 0:
    return state
  choices = []
  for s in next_states(state):
    # solve each reachable state only once and reuse the cached result
    if s not in max_score:
      max_score[s] = backtrack(s)
    choices.append(max_score[s])
  if not choices:
    return state
  return max(choices, key=lambda s: score(s))

max_score[initial] = backtrack(initial)
return score(max_score[initial])

One last trick completes the dynamic programming strategy: in the code you can see that I use a max_score dictionary, which caches every state the algorithm visits. That way we never re-explore the decisions of a state we have already seen.

While we search the state space, the train may arrive at a station several times, and some of those arrivals may have the same remaining fuel and the same wagon placement. It does not matter how the train got there; only the decisions made from that point on have an effect. If we compute that state once and save the result, we never need to search that subspace again.

Without this memoization technique, we would repeat a lot of exactly the same searches. That often prevents an algorithm from solving the problem efficiently.
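
To tie the pieces together, here is a self-contained sketch of a memoized search on a toy route. It is a simplification and not the contest solution: the three stations lie on a line one unit apart, wagons are plain (destination, value) pairs, placements are kept in a `frozenset`, and the cache stores the best score reachable from each state rather than the best final state. All names and data are made up for this example.

```python
from collections import namedtuple

State = namedtuple("State", ("s", "f", "wgs"))

# Toy route: three stations on a line, one unit apart.
POS = {"A": 0, "B": 1, "C": 2}
EDGES = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
TO_C = ("C", 10)  # wagon bound for C, worth 10 points
TO_A = ("A", 5)   # wagon bound for A, worth 5 points

def score(state):
    """Sum the values of the wagons sitting at their destination."""
    return sum(value for ((dest, value), at) in state.wgs if dest == at)

def next_states(state):
    for t in EDGES[state.s]:
        f = state.f - abs(POS[state.s] - POS[t])
        if f < 0:
            continue
        yield State(t, f, state.wgs)  # move without a wagon
        for wagon, at in state.wgs:
            if at == state.s:  # carry one wagon from the current station
                moved = (state.wgs - {(wagon, at)}) | {(wagon, t)}
                yield State(t, f, moved)

def best_score(initial):
    cache = {}  # state -> best score reachable from that state
    hits = 0    # how many times a cached result was reused

    def solve(state):
        nonlocal hits
        if state in cache:
            hits += 1
            return cache[state]
        best = score(state)  # stopping here is always allowed
        for nxt in next_states(state):
            best = max(best, solve(nxt))
        cache[state] = best
        return best

    return solve(initial), hits

initial = State("A", 4, frozenset({(TO_C, "A"), (TO_A, "C")}))
result, cache_hits = best_score(initial)
```

With 4 units of fuel the train can carry the first wagon from A to C and the second from C back to A, for 15 points. The recursion ends because every move consumes fuel, and `cache_hits` is greater than zero because some states are reached along more than one path.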
Summary

Train Empire is an excellent example of how dynamic programming makes optimal decisions in problems with overlapping subproblems. Python's expressive power once again makes it easy to implement the idea and write clear, efficient algorithms.

The complete code is in the contest repository.
