Optimization can be hard to wrap your head around, so here is a concrete scenario. My dad is in Xi'an, Shaanxi, my mom is in Hefei, Anhui, and my younger brother is in Shenzhen, and we all plan to fly to Beijing for a trip. My family is cost-conscious: everyone will wait for each other at the airport and then share one taxi to my place.
I checked Qunar: there are many flights per day from Xi'an, Hefei, and Shenzhen to Beijing. How do we minimize the total fare plus the time spent waiting for each other at the airport? Assume that one minute of waiting is equivalent to 1 RMB. Our goal is then to minimize

schedulecost = (Dad's fare + Mom's fare + brother's fare) + (Dad's wait minutes + Mom's wait minutes + brother's wait minutes)

where the waiting minutes may be 0 for Dad, Mom, or my brother (whoever lands last waits zero minutes).
First define my family:
people = [('baba', 'XA'), ('mama', 'HF'), ('didi', 'SZ')]

Dad sets out from Xi'an, Mom from Hefei, and my younger brother from Shenzhen.
Next, define the flight information:
flights = {
    ('XA', 'BJ'): [('06:30', '08:30', 600), ('07:45', '10:15', 500), ...],
    ('HF', 'BJ'): [('06:45', '09:30', 900), ('07:25', '10:45', 800), ...],
    ('SZ', 'BJ'): [('06:25', '10:30', 1200), ('08:05', '12:10', 1300), ...],
}
Flights from Xi'an to Beijing: depart 06:30, arrive 08:30, ticket 600 yuan; depart 07:45, arrive 10:15, ticket 500 yuan; ...
Flights from Hefei to Beijing: depart 06:45, arrive 09:30, ticket 900 yuan; depart 07:25, arrive 10:45, ticket 800 yuan; ...
Flights from Shenzhen to Beijing: depart 06:25, arrive 10:30, ticket 1200 yuan; depart 08:05, arrive 12:10, ticket 1300 yuan; ...
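In a real run these flights would not be typed in by hand. As a sketch, assuming a hypothetical schedule.csv with origin,dest,depart,arrive,price rows, the dict could be built like this:

import csv

flights = {}
with open('schedule.csv') as f:  # hypothetical file name and format
    for origin, dest, depart, arrive, price in csv.reader(f):
        flights.setdefault((origin, dest), []).append((depart, arrive, int(price)))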
Now assume the chosen flights are:

sol = [1, 1, 0]

Since the flight lists are indexed from zero, this means Dad takes the second flight from Xi'an to Beijing, Mom takes the second flight from Hefei to Beijing, and my brother takes the first flight from Shenzhen to Beijing.
Now we can calculate the cost of the schedule defined above:
def schedulecost(sol):
    totalprice = 0
    latestarrival = 0
    for d in range(len(sol)):
        # Each person's outbound flight: (departure, arrival, price)
        origin = people[d][1]
        outbound = flights[(origin, 'BJ')][int(sol[d])]
        totalprice += outbound[2]
        # Track the latest arrival time, in minutes since midnight
        if latestarrival < getminutes(outbound[1]):
            latestarrival = getminutes(outbound[1])
    # Everyone waits until the last person has landed
    totalwait = 0
    for d in range(len(sol)):
        origin = people[d][1]
        outbound = flights[(origin, 'BJ')][int(sol[d])]
        totalwait += latestarrival - getminutes(outbound[1])
    return totalprice + totalwait
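The code above calls a getminutes helper that is not defined in this post; it converts an 'HH:MM' string into minutes since midnight. A minimal sketch that matches how it is used:

import time

def getminutes(t):
    # '10:45' -> 645 (minutes since midnight)
    x = time.strptime(t, '%H:%M')
    return x.tm_hour * 60 + x.tm_min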
Easy enough to follow: schedulecost adds up the total fare and the total waiting time. latestarrival is the arrival time, in minutes, of the last flight to land, and each person's wait is the gap between latestarrival and their own arrival.
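As a sanity check, here is sol = [1, 1, 0] worked out by hand, using the flights listed above (indices 0 and 1 are exactly the two listed flights per route):

# Dad:     07:45 -> 10:15, 500 yuan
# Mom:     07:25 -> 10:45, 800 yuan
# Brother: 06:25 -> 10:30, 1200 yuan
# latestarrival = 10:45 = 645 minutes
# Waits: Dad 645 - 615 = 30, Mom 0, Brother 645 - 630 = 15
print(schedulecost([1, 1, 0]))  # (500 + 800 + 1200) + (30 + 0 + 15) = 2545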
Our goal is to find the combination sol that minimizes schedulecost.
I have simplified the problem to a minimum here; in reality there are many flights per day, and the number of combinations easily reaches hundreds of millions.
If schedulecost is at all expensive to evaluate, brute-forcing the lowest-cost schedule becomes impractical. Let's look at several possible approaches:
(1) Random search
The simplest approach is to evaluate a random sample of solutions, say 1000 of them, and take the minimum as the final result. The algorithm could not be simpler, and its weakness is just as obvious.
In the implementation below, the first parameter describes the range of each component of a solution. For example, if there are 100 flights a day from Xi'an to Beijing, 80 from Hefei to Beijing, and 150 from Shenzhen to Beijing, then (with zero-based indices and random.randint's inclusive endpoints) domain is defined as domain = [(0, 99), (0, 79), (0, 149)]. The second parameter is the schedulecost function defined above.
import random

def randomoptimize(domain, costf):
    best = 999999999
    bestr = None
    for i in range(1000):
        # Create a random solution
        r = [float(random.randint(domain[j][0], domain[j][1])) for j in range(len(domain))]
        # Get the cost
        cost = costf(r)
        # Compare it to the best one so far
        if cost < best:
            best = cost
            bestr = r
    return bestr
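For real data the domain need not be hard-coded; once the "..." placeholders in flights are replaced with actual entries, it can be derived from the dict itself. A small usage sketch:

# One (min_index, max_index) pair per family member
domain = [(0, len(flights[(origin, 'BJ')]) - 1) for name, origin in people]

s = randomoptimize(domain, schedulecost)
print(s, schedulecost(s))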
A thousand guesses cover only a tiny fraction of the solution space, so running it several times tends to produce similarly mediocre answers.
The deeper problem with random search is that it leaps around blindly: each evaluation is independent of all the previous ones, so the algorithm never exploits the good solutions it has already found. Let's look at some algorithms that improve on this:
(2) Hill climbing
Hill climbing starts from a random solution and searches for better solutions among its neighbors. In this example, the neighbors of the current schedule are all schedules in which exactly one person takes the flight just before or just after their current one. We compute the cost of every neighboring schedule, and the cheapest one becomes the new current schedule. This repeats until no neighbor improves the cost:
def hillclimb(domain, costf):
    # Create a random solution
    sol = [random.randint(domain[i][0], domain[i][1]) for i in range(len(domain))]
    # Main loop
    while 1:
        # Create the list of neighboring solutions: one step away in each direction
        neighbors = []
        for j in range(len(domain)):
            if sol[j] > domain[j][0]:
                neighbors.append(sol[0:j] + [sol[j] - 1] + sol[j + 1:])
            if sol[j] < domain[j][1]:
                neighbors.append(sol[0:j] + [sol[j] + 1] + sol[j + 1:])
        # See what the best solution among the neighbors is
        current = costf(sol)
        best = current
        for j in range(len(neighbors)):
            cost = costf(neighbors[j])
            if cost < best:
                best = cost
                sol = neighbors[j]
        # If there's no improvement, we've reached the bottom
        if best == current:
            break
    return sol
The name comes from the picture of walking down a slope until you reach the bottom of a valley (for minimization it is really valley descent). The method is fast and its results are generally better than random search, but it has a built-in assumption: that the optimal solution lies in the same valley as the random starting point, so the initial random solution affects the final result. If the valley you start in is not the lowest of all the valleys, hill climbing gets stuck there. In other words, it finds a local optimum, not necessarily the global one.
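A standard remedy, not covered above, is random-restart hill climbing: run hillclimb from several different random starting points and keep the best result. A minimal sketch:

def hillclimb_restarts(domain, costf, restarts=10):
    # Restarting from several random points reduces (but does not
    # eliminate) the chance of ending up in a poor local optimum.
    best_sol, best_cost = None, float('inf')
    for _ in range(restarts):
        sol = hillclimb(domain, costf)
        cost = costf(sol)
        if cost < best_cost:
            best_sol, best_cost = sol, cost
    return best_sol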
(3) Simulated Annealing
Annealing is the process of heating an alloy and then cooling it slowly: a large number of excited atoms jump around at first, then gradually settle into a low-energy state.
Sometimes you have to pass through a worse solution on the way to the best one. Simulated annealing starts from a random solution and keeps a variable representing the temperature, which starts high and slowly decreases. In each iteration the algorithm randomly picks one number in the solution and nudges it in some direction. The difference from hill climbing is that hill climbing always moves toward lower cost, while annealing sometimes does not: with a certain probability it accepts a worse solution as the new current one. This is precisely how it tries to escape local optima.
Early on, while the temperature is high, a worse solution is accepted with fairly high probability; as the iterations continue and the temperature drops, the algorithm becomes less and less willing to accept worse solutions, until eventually it only accepts improvements.
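The acceptance probability used below is p = e^(-(new_cost - old_cost)/T). A quick illustration of how it shrinks as the temperature drops, for a solution that is 100 units worse:

import math

for T in (10000.0, 1000.0, 100.0, 10.0):
    p = math.exp(-100.0 / T)  # cost difference of 100
    print(T, round(p, 3))     # 0.99, 0.905, 0.368, 0.0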
def annealingoptimize(domain, costf, T=10000.0, cool=0.95, step=1):
    # Initialize the values randomly
    vec = [float(random.randint(domain[i][0], domain[i][1])) for i in range(len(domain))]
    while T > 0.1:
        # Choose one of the indices
        i = random.randint(0, len(domain) - 1)
        # Choose a direction to change it
        dir = random.randint(-step, step)
        # Create a new list with one of the values changed
        vecb = vec[:]
        vecb[i] += dir
        if vecb[i] < domain[i][0]:
            vecb[i] = domain[i][0]
        elif vecb[i] > domain[i][1]:
            vecb[i] = domain[i][1]
        # Calculate the current cost and the new cost
        ea = costf(vec)
        eb = costf(vecb)
        p = pow(math.e, -(eb - ea) / T)
        # Accept if it's better, or if it makes the probability cutoff
        if eb < ea or random.random() < p:
            vec = vecb
        # Decrease the temperature
        T = T * cool
    return vec
T is the initial temperature, cool is the cooling rate, and step is the maximum size of each random change.
(4) Genetic Algorithm
This algorithm starts by randomly generating a set of solutions, which we call the population. At each step of the optimization, the cost function is computed for the entire population, giving a ranked list of solutions. From this ranking the next generation is built: first, the solutions at the top of the current ranking are copied straight into the new population (this is elitism); the rest of the new population consists of variants derived from those best solutions.
There are two ways to produce variants. One is mutation: make a small, random change to one of the best solutions; in this example, we randomly pick one number in the solution and nudge it up or down by a step. The other is called crossover, or breeding: pick two of the best solutions and combine them in some way. A simple way to implement crossover here is to take a random number of leading elements from one solution and the remaining elements from the other, as the code below shows.
A new population, usually the same size as the old one, is built by mutating and breeding the best solutions, and then the whole process repeats: the new population is ranked and used to build yet another one. The process ends when the specified number of iterations is reached, or when the result has not improved for several consecutive generations.
def geneticoptimize(domain, costf, popsize=50, step=1, mutprob=0.2, elite=0.2, maxiter=100):
    # Mutation operation: nudge one element up or down by step
    def mutate(vec):
        i = random.randint(0, len(domain) - 1)
        if random.random() < 0.5 and vec[i] > domain[i][0]:
            return vec[0:i] + [vec[i] - step] + vec[i + 1:]
        elif vec[i] < domain[i][1]:
            return vec[0:i] + [vec[i] + step] + vec[i + 1:]
        return vec  # nothing to change at this index; return unchanged

    # Crossover operation: take the head of r1 and the tail of r2
    def crossover(r1, r2):
        i = random.randint(1, len(domain) - 2)
        return r1[0:i] + r2[i:]

    # Build the initial population
    pop = []
    for i in range(popsize):
        vec = [random.randint(domain[j][0], domain[j][1]) for j in range(len(domain))]
        pop.append(vec)

    # How many winners survive from each generation?
    topelite = int(elite * popsize)

    # Main loop
    for i in range(maxiter):
        scores = [(costf(v), v) for v in pop]
        scores.sort()
        ranked = [v for (s, v) in scores]
        # Start with the pure winners
        pop = ranked[0:topelite]
        # Add mutated and bred forms of the winners
        while len(pop) < popsize:
            if random.random() < mutprob:
                # Mutation
                c = random.randint(0, topelite - 1)
                pop.append(mutate(ranked[c]))
            else:
                # Crossover
                c1 = random.randint(0, topelite - 1)
                c2 = random.randint(0, topelite - 1)
                pop.append(crossover(ranked[c1], ranked[c2]))
        # Print the current best score
        print(scores[0][0])
    return scores[0][1]
popsize is the population size; mutprob is the probability that a new member is produced by mutation rather than crossover; elite is the fraction of the population that is considered good enough to pass into the next generation; maxiter is the number of generations to run.
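Assuming the flights data and domain from above, all four optimizers share the same calling convention, so a quick comparison run might look like this (a sketch; the results vary from run to run):

for name, opt in [('random', randomoptimize),
                  ('hillclimb', hillclimb),
                  ('annealing', annealingoptimize),
                  ('genetic', geneticoptimize)]:
    s = opt(domain, schedulecost)
    print(name, s, schedulecost(s))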
The code is all quite simple. Next time I'll pull some real flight data and give it a try...