Using Python and dynamic programming to solve problems with overlapping sub-problems

Source: Internet
Author: User
Dynamic programming is an algorithmic strategy for problems that can be described as a search through a state space. Such a problem can be decomposed into sub-problems, each defined by its own parameters (its state). To solve it, we walk through the state space and evaluate the available decisions at every step. Because many different paths lead to identical states, the technique wastes no time re-solving overlapping sub-problems: each one is solved once and its result is reused.

As we will see, it also involves a lot of recursion, which usually makes things interesting.

To illustrate this strategy, I will use a very interesting problem as an example: challenge 14 of Tuenti Challenge #4, a programming contest I recently took part in.

Train Empire

We are given a board game called Train Empire. In this problem, you have to plan the most efficient route for a train to deliver the wagons waiting at each station. The rules are simple:

    • Each station has a wagon waiting to be transported to another station.
    • Each wagon delivered to its destination rewards the player with some points. Wagons can be dropped at any station.
    • A train runs on a single route only and carries one wagon at a time, and its fuel limits the distance it can travel.

We can redraw the original picture of the problem more clearly. To achieve the maximum score within the fuel limit, we need to know where each wagon is loaded and where it is unloaded.

As the picture shows, there are two train routes: red and blue. The stations are located at coordinate points, so it is easy to compute the distances between them. Each station holds a wagon named after its destination, along with the points we earn when that wagon is delivered successfully.

Now, suppose our trains can run 3 kilometers. The train on the red route can deliver one station's wagon to its destination E (5 points). The train on the blue route can deliver wagon C (10 points) and then wagon B (5 points). Together that gives a maximum score of 20 points.

State representation

We call the combination of the train's location, the train's remaining fuel, and the position of every wagon a state of the problem. Changing these values gives the same problem with different parameters. Every time we move the train, the problem turns into a different sub-problem. To figure out the best sequence of moves, we have to traverse these states and make decisions based on them. Let's get started.

We will start by defining the train routes. Because these routes are not straight lines, a graph is the best representation.

    import math
    from decimal import Decimal
    from collections import namedtuple, defaultdict

    class TrainRoute:
        def __init__(self, start, connections):
            self.start = start
            self.E = defaultdict(set)   # adjacency: station -> set of neighbours
            self.stations = set()
            for u, v in connections:
                # store both directions, the train can go back and forth
                self.E[u].add(v)
                self.E[v].add(u)
                self.stations.add(u)
                self.stations.add(v)

        def next_stations(self, u):
            if u not in self.E:
                return
            yield from self.E[u]

        def fuel(self, u, v):
            # Euclidean distance between the two stations
            x = abs(u.pos[0] - v.pos[0])
            y = abs(u.pos[1] - v.pos[1])
            return Decimal(math.sqrt(x * x + y * y))

The TrainRoute class implements a very basic graph: the vertices (the stations) are kept in a set, and the connections between stations are kept in a dictionary. Note that we add both edges (u, v) and (v, u), because the train can move forward and backward.
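As a rough illustration of how this graph behaves, the sketch below builds a route over three stations. The Station tuple and the coordinates are invented for this example (they are not from the contest input), and the class is repeated only so the snippet runs on its own.

```python
import math
from collections import defaultdict, namedtuple
from decimal import Decimal

# Hypothetical minimal station type: only the fields TrainRoute relies on.
Station = namedtuple('Station', ('name', 'pos'))

class TrainRoute:
    def __init__(self, start, connections):
        self.start = start
        self.E = defaultdict(set)
        self.stations = set()
        for u, v in connections:
            self.E[u].add(v)
            self.E[v].add(u)
            self.stations.add(u)
            self.stations.add(v)

    def next_stations(self, u):
        if u not in self.E:
            return
        yield from self.E[u]

    def fuel(self, u, v):
        x = abs(u.pos[0] - v.pos[0])
        y = abs(u.pos[1] - v.pos[1])
        return Decimal(math.sqrt(x * x + y * y))

# Made-up coordinates, chosen so that the distances are whole numbers.
a = Station('A', (0, 0))
b = Station('B', (3, 4))
c = Station('C', (3, 0))

route = TrainRoute(a, [(a, b), (b, c)])

# B is connected to both A and C because each edge is stored in both directions.
neighbours_of_b = sorted(s.name for s in route.next_stations(b))
# A -> B is the hypotenuse of a 3-4-5 triangle.
distance_ab = route.fuel(a, b)
```

Stations are used as dictionary keys and set members here, so whatever type represents a station has to be hashable; a named tuple of immutable fields satisfies that.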

There is one interesting detail in the next_stations method, where I used a nice Python 3 feature: yield from. It lets a generator delegate to another generator or iterator. Because each station maps to a set of stations, we just need to iterate over that set.
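To make the delegation concrete, here is a small sketch comparing an explicit loop with yield from. The function names and the adjacency dictionary are made up for this illustration.

```python
def neighbours_explicit(adjacency, u):
    # Re-yield every item by hand.
    for v in adjacency.get(u, ()):
        yield v

def neighbours_delegating(adjacency, u):
    # Same behaviour: delegate to the underlying iterable.
    yield from adjacency.get(u, ())

# Hypothetical adjacency data, using lists so the order is predictable.
adjacency = {'A': ['B', 'C'], 'B': ['A']}

explicit = list(neighbours_explicit(adjacency, 'A'))
delegating = list(neighbours_delegating(adjacency, 'A'))
```

For a plain iterable, both generators produce the same sequence; yield from is simply the shorter way to write it.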

Let's take a look at the main class:

    TrainWagon = namedtuple('TrainWagon', ('dest', 'value'))
    TrainStation = namedtuple('TrainStation', ('name', 'pos', 'wagons'))

    class TrainEmpire:
        def __init__(self, fuel, stations, routes):
            self.fuel = fuel
            self.stations = self._build_stations(stations)
            self.routes = self._build_routes(routes)

        def _build_stations(self, station_lines):
            # ...

        def _build_routes(self, route_lines):
            # ...

        def maximum_route_score(self, route):
            def score(state):
                # points of every wagon that sits at its destination
                return sum(w.value for (w, s) in state.wgs if w.dest == s.name)

            def wagon_choices(state, t):
                # ...

            def delivered(state):
                # ...

            def next_states(state):
                # ...

            def backtrack(state):
                # ...

            # ...

        def maximum_score(self):
            return sum(self.maximum_route_score(r) for r in self.routes)

I omitted some of the code, but we can already see some interesting things. The two named tuples help keep our data neat and simple. The main class holds the maximum distance our trains can run (the fuel), the routes, and the stations. The maximum_score method sums the scores of the individual routes and serves as the entry point for solving the problem. So we have:

    • A main class that holds the routes and the stations
    • A station tuple with a name, a position, and the list of wagons currently there
    • A wagon tuple with a value and a destination station
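The two tuples listed above might be filled in as follows. The station names, coordinates and point value are invented for illustration, and the score helper is a simplified variant of the score function that takes the (wagon, station) pairs directly instead of a state.

```python
from collections import namedtuple

TrainWagon = namedtuple('TrainWagon', ('dest', 'value'))
TrainStation = namedtuple('TrainStation', ('name', 'pos', 'wagons'))

# Hypothetical data: station A holds one wagon bound for B worth 5 points.
wagon = TrainWagon(dest='B', value=5)
station_a = TrainStation(name='A', pos=(0, 0), wagons=(wagon,))
station_b = TrainStation(name='B', pos=(3, 4), wagons=())

# Assumption: wagons is stored as a tuple so that stations stay hashable
# and can be placed in sets or used as dictionary keys.
seen = {station_a, station_b}

def score(placements):
    # Sum the value of every wagon currently sitting at its destination.
    return sum(w.value for (w, s) in placements if w.dest == s.name)

before = score([(wagon, station_a)])  # wagon still at its origin
after = score([(wagon, station_b)])   # wagon delivered to B
```

Only a wagon parked at its destination contributes points, so moving the wagon from A to B is what raises the score.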

Dynamic programming

I have tried to explain that the key to dynamic programming is searching the state space efficiently and making optimal decisions based on the current state. Our state space is defined by the location of the train, its remaining fuel, and the position of each wagon, so we can already represent the initial state.

We must now consider every decision at each station. Should we load a wagon and take it to its destination? What if there is a more valuable wagon at the next station? Should we go back or move forward? Or should we move without a wagon?

Obviously, the answer to these questions is whichever option earns us more points. To find it, we have to compare the value of the current state with the values of the states that follow it in every possible scenario. Of course, we use the score function to find the value of each state.

    def maximum_score(self):
        return sum(self.maximum_route_score(r) for r in self.routes)

    # s: current station, f: remaining fuel, wgs: (wagon, station) pairs
    State = namedtuple('State', ('s', 'f', 'wgs'))

    wgs = set()
    for s in route.stations:
        for w in s.wagons:
            wgs.add((w, s))
    initial = State(route.start, self.fuel, tuple(wgs))

Each state offers several choices: move to the next station with a wagon, or move without one. Staying still does not produce a new state, because nothing changes. If the current station holds several wagons, moving each of them leads to a different state.

    def wagon_choices(state, t):
        yield state.wgs  # not moving wagons is an option too

        wgs = set(state.wgs)
        other_wagons = {(w, s) for (w, s) in wgs if s != state.s}
        state_wagons = wgs - other_wagons
        for (w, s) in state_wagons:
            # take wagon w to station t, leave the rest where they are
            parked = state_wagons - {(w, s)}
            twgs = other_wagons | parked | {(w, t)}
            yield tuple(twgs)

    def delivered(state):
        return all(w.dest == s.name for (w, s) in state.wgs)

    def next_states(state):
        if delivered(state):
            return
        for s in route.next_stations(state.s):
            f = state.f - route.fuel(state.s, s)
            if f < 0:
                continue  # not enough fuel to reach s
            for wgs in wagon_choices(state, s):
                yield State(s, f, wgs)

next_states is a generator that takes a state and yields all the states reachable from it. Notice how it stops once all wagons have been delivered to their destinations, and how it only yields states for which there is still enough fuel. The wagon_choices function may look a little complicated, but it simply yields every possible wagon arrangement after the move from the current station to the next: one with nothing moved, and one for each wagon at the current station that the train could take along.
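To see what wagon_choices produces, the sketch below runs it on a hand-made state. As a simplification for this example, stations are plain strings rather than station tuples, and the wagons and their values are invented.

```python
from collections import namedtuple

TrainWagon = namedtuple('TrainWagon', ('dest', 'value'))
State = namedtuple('State', ('s', 'f', 'wgs'))

def wagon_choices(state, t):
    yield state.wgs  # move without taking any wagon
    wgs = set(state.wgs)
    other_wagons = {(w, s) for (w, s) in wgs if s != state.s}
    state_wagons = wgs - other_wagons
    for (w, s) in state_wagons:
        parked = state_wagons - {(w, s)}
        twgs = other_wagons | parked | {(w, t)}
        yield tuple(twgs)

# Hypothetical situation: the train is at A with two wagons, a third waits at B.
w1 = TrainWagon('C', 10)
w2 = TrainWagon('B', 5)
w3 = TrainWagon('A', 1)
state = State('A', 3, ((w1, 'A'), (w2, 'A'), (w3, 'B')))

# frozenset makes the arrangements comparable regardless of tuple order.
choices = [frozenset(c) for c in wagon_choices(state, 'B')]
```

With two wagons at the current station, the generator yields three arrangements: nothing moved, w1 taken to B, or w2 taken to B. The wagon already at B is never touched.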

Now we have everything we need to implement the dynamic programming algorithm. We start exploring our decisions from the initial state and pick the best one. The initial state evolves into other states, and those in turn evolve into further states. What we are designing is a recursive algorithm:

    • Take a state
    • Compute the decisions available from it
    • Pick the optimal decision

Each next state goes through the same steps. The recursion stops when the fuel is exhausted or all wagons have been delivered to their destinations.

    max_score = {}  # cache: state -> best final state reachable from it

    def backtrack(state):
        if state.f <= 0:
            return state
        choices = []
        for s in next_states(state):
            if s not in max_score:
                max_score[s] = backtrack(s)
            choices.append(max_score[s])
        if not choices:
            return state
        return max(choices, key=lambda s: score(s))

    max_score[initial] = backtrack(initial)
    return score(max_score[initial])

The last piece needed to complete the dynamic programming strategy: in the code, you can see that I used a max_score dictionary, which caches every state the algorithm visits. This way we never re-explore decisions we have already evaluated.

While we search the state space, the train may reach the same station several times, and some of those visits may happen with the same remaining fuel and the same wagon positions. It does not matter how the train got there; only the decisions made from that point on affect the result. If we compute that state once and save the result, we never need to search that part of the state space again.

Without this memoization technique, we would repeat exactly the same searches many times over, which usually makes the algorithm too slow to solve the problem efficiently.
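The effect of the cache can be measured on a much smaller problem. The sketch below is not the Train Empire solver: it assumes a made-up ring of four stations, one unit of fuel per move, and a reward for the station where the train ends up. It counts how many states each version evaluates.

```python
# Hypothetical toy map: four stations on a ring, every move costs 1 fuel.
NEIGHBOURS = {0: (1, 3), 1: (0, 2), 2: (1, 3), 3: (2, 0)}
REWARD = {0: 0, 1: 2, 2: 7, 3: 3}

calls = {'plain': 0, 'cached': 0}

def best_plain(pos, fuel):
    # Best reward reachable from (pos, fuel), recomputing everything.
    calls['plain'] += 1
    if fuel == 0:
        return REWARD[pos]
    options = [best_plain(n, fuel - 1) for n in NEIGHBOURS[pos]]
    return max([REWARD[pos]] + options)

cache = {}

def best_cached(pos, fuel):
    # Same search, but each (pos, fuel) state is evaluated only once.
    key = (pos, fuel)
    if key in cache:
        return cache[key]
    calls['cached'] += 1
    if fuel == 0:
        result = REWARD[pos]
    else:
        options = [best_cached(n, fuel - 1) for n in NEIGHBOURS[pos]]
        result = max([REWARD[pos]] + options)
    cache[key] = result
    return result

plain_result = best_plain(0, 10)
cached_result = best_cached(0, 10)
```

Both functions return the same answer, but with 10 units of fuel the plain version evaluates 2047 states while the cached one evaluates 21, because the same (station, fuel) pairs are reached along many different paths.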

Summary

Train Empire is an excellent example of how dynamic programming makes optimal decisions in a problem with overlapping sub-problems. Once again, Python's expressive power lets us implement the idea simply and write a clear and efficient algorithm.

The complete code is in the contest repository.
