By hisky
These are my notes from studying intelligent algorithms. I am posting them for everyone to read. If I have misunderstood anything, please point it out; thanks in advance. ^_^
Introduction
In engineering practice one often meets "novel" algorithms or theories such as simulated annealing, genetic algorithms, tabu search, and neural networks. These algorithms or theories share some common traits (for example, they simulate natural processes) and are collectively called "intelligent algorithms". They are very useful for solving complex engineering problems.
What do these algorithms mean? First, here is a vivid metaphor for local search, simulated annealing, genetic algorithms, and tabu search:
To find the highest mountain on earth, a group of ambitious rabbits set out to look for a method.
1. The rabbits keep jumping to a spot higher than where they are now. They find the highest peak nearby, but that peak is not necessarily Mount Everest. This is local search: it cannot guarantee that a local optimum is the global optimum.
2. The rabbit is drunk. It jumps around at random for a long time, sometimes going uphill and sometimes stepping down onto flat ground. But it gradually sobers up and jumps toward the highest point. This is simulated annealing.
3. The rabbits take amnesia pills, are launched into space, and land at random places on the earth. They do not know what their mission is. However, if every few years the rabbits at low altitudes are killed off, the prolific survivors will find Mount Everest by themselves. This is the genetic algorithm.
4. The rabbits know that one rabbit alone is weak. They tell each other which mountains have already been searched, and leave one rabbit on every searched mountain as a marker. Then they work out a strategy for where to search next. This is tabu search.
Overview of Intelligent Optimization Algorithms
Intelligent optimization algorithms are generally used to solve optimization problems. Optimization problems fall into two classes: (1) function optimization, which seeks the values of the independent variables that minimize a function, and (2) combinatorial optimization, which seeks the solution in a solution space that minimizes the objective function. Typical combinatorial optimization problems include the traveling salesman problem (TSP), scheduling problems, the knapsack problem, and the bin packing problem.
There are many kinds of optimization algorithms. Classic algorithms include linear programming and dynamic programming. Improved local search algorithms include hill climbing and steepest descent. Simulated annealing, genetic algorithms, and tabu search, which this article describes, are called guided search methods, while neural networks and chaotic search belong to the dynamic evolution methods of systems.
When optimization ideas are discussed, the neighborhood function is often mentioned. Its role is to specify how to obtain a new solution (or a new set of solutions) from the current one. How it is implemented depends on the specific problem.
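As a minimal sketch of what a neighborhood function can look like, assume the solution is a tour for the traveling salesman problem stored as a Python list of city indices. The three moves below (interchange, insertion, reversal) are common choices; the function names are made up for this illustration.

```python
import random

def swap_move(tour, i, j):
    """Interchange: exchange the cities at positions i and j."""
    new = list(tour)
    new[i], new[j] = new[j], new[i]
    return new

def insert_move(tour, i, j):
    """Insertion: remove the city at position i and reinsert it at position j."""
    new = list(tour)
    city = new.pop(i)
    new.insert(j, city)
    return new

def reverse_move(tour, i, j):
    """Reversal: reverse the segment between positions i and j (inclusive)."""
    i, j = min(i, j), max(i, j)
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def random_neighbor(tour, rng=random):
    """Pick two distinct positions and one of the three moves at random."""
    i, j = rng.sample(range(len(tour)), 2)
    move = rng.choice([swap_move, insert_move, reverse_move])
    return move(tour, i, j)

tour = [0, 1, 2, 3, 4]
print(swap_move(tour, 1, 3))     # [0, 3, 2, 1, 4]
print(insert_move(tour, 0, 3))   # [1, 2, 3, 0, 4]
print(reverse_move(tour, 1, 3))  # [0, 3, 2, 1, 4]
```

Every move returns a new list and leaves the current tour untouched, so a search algorithm can compare the two before deciding which to keep.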
In general, local search uses the neighborhood function greedily: whenever a solution better than the current one is found, the current one is discarded and the new one is kept. But it can only reach a "local optimum". In other words, the rabbit may climb a nearby peak and feel that the whole world lies beneath it, without ever finding Mount Everest. Simulated annealing, genetic algorithms, tabu search, and neural networks each improve on local search from a different angle and with a different strategy, in order to get closer to the "global optimum".
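The rabbit's problem can be shown with a few lines of code. This is only a sketch under simple assumptions: the "landscape" is a made-up list of heights, the neighborhood is one step left or right, and the search maximizes height.

```python
def hill_climb(f, start, neighbors):
    """Greedy local search (maximization): move to the best neighbor until none is better."""
    current = start
    while True:
        best = max(neighbors(current), key=f)
        if f(best) <= f(current):
            return current  # no neighbor improves: a local optimum
        current = best

# A one-dimensional landscape with a small hill (index 2) and a higher one (index 7).
heights = [1, 3, 5, 2, 1, 4, 7, 9, 6, 0]

def height(i):
    return heights[i]

def neighbors(i):
    """Neighborhood function: the positions one step to the left and right."""
    return [j for j in (i - 1, i + 1) if 0 <= j < len(heights)]

print(hill_climb(height, 0, neighbors))  # 2: stuck on the small hill
print(hill_climb(height, 4, neighbors))  # 7: reaches the highest point
```

The result depends entirely on where the rabbit starts, which is exactly the weakness the other algorithms try to remove.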
Simulated Annealing (SA)
The simulated annealing algorithm rests on the similarity between the annealing of solid matter and combinatorial optimization problems. When a material is heated, the Brownian motion of its particles intensifies, and once it reaches a certain intensity the solid turns into a liquid. If it is then annealed (cooled slowly), the thermal motion of the particles weakens, they gradually become ordered, and the material finally reaches a stable state.
Unlike local search, the final result of simulated annealing no longer depends on the initial point. It introduces an acceptance probability p. If the objective value f(pn) of the new point pn is better, then p = 1, meaning the new point is selected. Otherwise p is a function of the objective value f(pc) of the current point pc, the objective value f(pn) of the new point, and a control parameter called the "temperature" T. In other words, simulated annealing does not greedily look for a better point every time as local search does; points with a worse objective value may also be accepted. As the algorithm runs, the temperature T gradually falls, and the algorithm ends at some low temperature at which the system no longer accepts any change.
The hallmark of simulated annealing is that, besides accepting improvements in the objective function, it also accepts a limited amount of deterioration. When T is large it accepts large deteriorations; as T becomes small it accepts only small ones; when T is 0 it accepts none at all. This lets simulated annealing, in contrast to local search, escape local minima while keeping the generality and simplicity of local search.
In physical terms: first heat the material so that the molecules collide with one another and become disordered, which raises the internal energy; then lower the temperature. In the end the molecules are more ordered, and the internal energy is lower, than before heating. It is just like the rabbit: being drunk, it ignores the nearby peaks and hops around in big circles in a daze, which makes it more likely to find Everest.
It is worth noting that when T is 0, simulated annealing becomes a special case of local search.
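To make the role of T concrete, here is a small sketch of the acceptance probability, assuming a maximization problem and the exponential (Metropolis-style) form exp((f(pn) - f(pc)) / T). The function name is invented for this example.

```python
import math

def acceptance_probability(f_current, f_new, temperature):
    """Probability of moving from the current point to the new one (maximization)."""
    if f_new >= f_current:
        return 1.0                      # improvements are always accepted
    if temperature <= 0:
        return 0.0                      # T = 0: no deterioration is accepted
    return math.exp((f_new - f_current) / temperature)

# The same deterioration (from 5.0 down to 4.0) at falling temperatures.
for T in (10.0, 1.0, 0.1, 0.0):
    print(T, round(acceptance_probability(5.0, 4.0, T), 4))
# prints 0.9048, 0.3679, 0.0 and 0.0 for the four temperatures
```

At T = 10 the worse point is accepted about nine times out of ten; at T = 0 it is never accepted, and the rule reduces to greedy local search.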
Pseudocode for simulated annealing:
Procedure simulated annealing
Begin
  t := 0;
  Initialize temperature T;
  Select a current string vc at random;
  Evaluate vc;
  Repeat
    Repeat
      Select a new string vn in the neighborhood of vc; (1)
      If f(vc) < f(vn)
        Then vc := vn
      Else if random[0, 1] < exp((f(vn) - f(vc)) / T) (2)
        Then vc := vn;
    Until (termination-condition) (3)
    T := g(T, t); (4)
    t := t + 1;
  Until (stop-criterion) (5)
End;
In the procedure above, the key components are (1) the state generation function, (2) the state acceptance function, (3) the sampling stability criterion, (4) the cooling function, and (5) the annealing termination criterion (three functions and two criteria for short). They directly affect the optimization result. Experiments show that the initial solution has little effect on the final result, but the higher the initial temperature, the greater the probability of obtaining a high-quality solution, so a fairly high initial temperature should be chosen.
Strategies for choosing the key components above:
(1) State generation function: candidate solutions are determined by the neighborhood function of the current solution and can be generated by operations such as interchange, insertion, and reversal. A new solution is then picked from the candidates according to a probability distribution, such as a uniform or a normal (Gaussian) distribution.
(2) State acceptance function: this step is the most critical one, yet experiments show that the exact form of the acceptance function has little effect on the final result. So SA generally uses min[1, exp((f(vn) - f(vc)) / T)].
(3) Sampling stability criterion: common choices are to check whether the mean of the objective function has become stable, to check whether the objective value has changed little over several consecutive steps, or to sample for a fixed number of steps.
(4) Cooling function: classical SA requires the temperature to fall on a prescribed schedule (a logarithmic one such as T_k = a / log(k + k0)), but the temperature then drops very slowly. Fast SA generally uses a quicker schedule such as T_k = a / (1 + k). In current practice a geometric schedule T_(k+1) = c * T_k is often used, where the ratio c may itself change during the run.
(5) Annealing termination criterion: common choices are to set a final temperature, to set a number of iterations, to stop when the best value found has not changed for several consecutive steps, or to check whether the system entropy has become stable.
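The five components can be put together in a short runnable sketch that follows the structure of the pseudocode above. The choices here are assumptions made for illustration, not the only options: geometric cooling with ratio `alpha`, a fixed number of samples per temperature, a final temperature as the stop criterion, and a toy landscape of heights to maximize.

```python
import math
import random

def simulated_annealing(f, start, neighbor, t0=10.0, t_end=1e-3,
                        alpha=0.95, samples_per_temp=50, seed=0):
    """Maximize f; returns the best solution seen during the run."""
    rng = random.Random(seed)
    vc, best = start, start
    T = t0
    while T > t_end:                              # (5) termination: final temperature
        for _ in range(samples_per_temp):         # (3) fixed number of samples
            vn = neighbor(vc, rng)                # (1) state generation
            delta = f(vn) - f(vc)
            # (2) acceptance: always take improvements, sometimes take worse points
            if delta > 0 or rng.random() < math.exp(delta / T):
                vc = vn
                if f(vc) > f(best):
                    best = vc
        T *= alpha                                # (4) geometric cooling
    return best

heights = [1, 3, 5, 2, 1, 4, 7, 9, 6, 0]

def height(i):
    return heights[i]

def step(i, rng):
    """Move one position left or right, staying inside the landscape."""
    j = i + rng.choice((-1, 1))
    return min(max(j, 0), len(heights) - 1)

print(simulated_annealing(height, 0, step))
```

Started at index 0, greedy local search would stop on the small hill at index 2; at high temperature the annealing run wanders across the whole landscape, so it can reach the higher hill at index 7. Keeping track of `best` separately is a common practical addition that the pseudocode does not show.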
To obtain a better solution, the algorithm usually uses slow cooling, many samples per temperature, and a low final temperature, which makes the running time long. This is the biggest drawback of simulated annealing. Even people get things done slowly when drunk, let alone rabbits.
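The cost of slow cooling can be made concrete by counting temperature updates. The sketch below compares three schedules that are often quoted for SA (logarithmic, "fast", and geometric). The exact forms and the constants (`t0`, `ratio`, the target temperature) are assumptions chosen for illustration.

```python
import math

def logarithmic(t0, k):
    """Logarithmic schedule: T_k = t0 / log(k + e), so that T_0 = t0."""
    return t0 / math.log(k + math.e)

def fast(t0, k):
    """Fast schedule: T_k = t0 / (1 + k)."""
    return t0 / (1 + k)

def geometric(t0, k, ratio=0.95):
    """Geometric schedule: T_k = t0 * ratio**k."""
    return t0 * ratio ** k

def steps_to_reach(schedule, t0, target, limit=10**6):
    """Number of cooling steps until the temperature first drops to the target."""
    for k in range(limit):
        if schedule(t0, k) <= target:
            return k
    return None

print(steps_to_reach(fast, 10.0, 1.0))         # 9
print(steps_to_reach(geometric, 10.0, 1.0))    # 45
print(steps_to_reach(logarithmic, 10.0, 1.0))  # about 22,000
```

Cooling from 10 to 1 takes a handful of steps on the faster schedules and tens of thousands on the logarithmic one, and every step also costs a full round of sampling.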
Genetic Algorithm (GA)
"Survival of the fittest" is the basic idea of evolution, and genetic algorithms simulate what nature does. Genetic algorithms work well for optimization. Seen as a highly idealized simulation of a natural process, they reveal their elegance, even though the struggle for survival is cruel.
A genetic algorithm takes all the individuals in a population as its objects and uses randomization to guide an efficient search of an encoded parameter space. Selection, crossover, and mutation make up its genetic operators; parameter encoding, initial population setup, fitness function design, genetic operator design, and control parameter settings make up its core content. As a global optimization search algorithm, the genetic algorithm is simple, general, robust, well suited to parallel processing, efficient, and practical. It has been widely applied in many fields with good results and has gradually become one of the important intelligent algorithms.
Pseudocode for the genetic algorithm:
Procedure genetic algorithm
Begin
  Initialize a group and evaluate the fitness value; (1)
  While not convergent (2)
  Begin
    Select; (3)
    If random[0, 1] < pc then
      Crossover; (4)
    If random[0, 1] < pm then
      Mutation; (5)
  End;
End
The procedure above has five important steps:
(1) Encoding and initial population: before searching, GA represents the solutions of the solution space as genotype string data in the genetic space; different combinations of these strings make up different points. Then N initial strings are generated at random. Each string is called an individual, and the N individuals form a population. GA starts iterating from these N strings.
For example, in the traveling salesman problem you can encode the path the salesman travels, or the matrix of the whole graph. The encoding depends on how the problem is described and should make the problem easier to solve. The initial population size should also be chosen with care. If it is too small, the benefit of crossover is not apparent and the algorithm performs poorly (in evolution, the numerous mice have the edge over the few tigers); if it is too large, the computation becomes too heavy.
(2) Convergence check: test whether the convergence criterion is met, which controls when the algorithm ends. You can judge by how close the fitness is to that of the optimal solution, or by the number of iterations.
(3) Fitness evaluation and selection: the fitness function measures how good an individual (a solution) is. Fitness should also be evaluated at the start of the program for later comparison. The fitness function is defined differently for different problems. Selection is based on fitness. Its purpose is to pick good individuals from the current population so that they get the chance to act as parents and produce the next generation. The principle is that an individual with higher fitness has a higher probability of contributing one or more offspring to the next generation. Selection implements Darwin's principle of survival of the fittest.
(4) Crossover: crossover is performed with the crossover probability pc. It is the most important genetic operator in a genetic algorithm. Crossover produces a new generation of individuals that combine the characteristics of their parents. It embodies the idea of information exchange.
You can pick a single point and exchange, insert, or reverse the segments there, or pick several points at random for crossover. If the crossover probability is too high, the population is renewed quickly but highly fit individuals are easily swamped; if it is too low, the search stagnates.
(5) Mutation: mutation is performed with the mutation probability pm. Mutation first picks an individual from the population at random, then, with a certain probability, randomly changes the value of one position in that individual's string. As in the biological world, the probability of mutation in GA is very low. Mutation gives new individuals a chance to arise.
Mutation prevents the evolutionary stagnation caused by the loss of useful genes. A fairly low mutation probability keeps the genes changing; if it is too high, the algorithm degenerates into random search. Imagine how dreadful it would be if every generation in the biological world were wildly different from the one before.
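The five steps can be sketched in runnable form. Everything specific here is an assumption made for illustration: individuals are bit strings, selection is fitness-proportional (roulette wheel), crossover is one-point, mutation flips single bits, the run stops after a fixed number of generations, and the population size is assumed to be even. The toy fitness is the number of 1 bits.

```python
import random

def genetic_algorithm(fitness, n_bits, pop_size=30, pc=0.8, pm=0.02,
                      generations=100, seed=0):
    """Maximize fitness over bit strings; returns the best individual seen."""
    rng = random.Random(seed)
    # (1) encoding and initial population: random bit strings
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):                  # (2) stop after a fixed number of generations
        # (3) roulette-wheel selection; the small constant avoids all-zero weights
        weights = [fitness(ind) + 1e-9 for ind in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            a, b = a[:], b[:]                     # copy so parents stay unchanged
            if rng.random() < pc:                 # (4) one-point crossover
                cut = rng.randint(1, n_bits - 1)
                a[cut:], b[cut:] = b[cut:], a[cut:]
            children += [a, b]
        for child in children:                    # (5) bit-flip mutation
            for i in range(n_bits):
                if rng.random() < pm:
                    child[i] ^= 1
        pop = children
        best = max(pop + [best], key=fitness)
    return best

best = genetic_algorithm(sum, 20)
print(best, sum(best))
```

Raising `pm` toward 0.5 turns the run into a random search, and shrinking `pop_size` to a handful of individuals leaves crossover with little to recombine, which matches the warnings in steps (1) and (5).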
Just as natural evolution applies to any species, a genetic algorithm works on an encoding of the variables and does not care whether the function itself is differentiable or continuous, so it is widely applicable. And because it operates on a whole population from the start, it is implicitly parallel and finds the "global optimum" more easily.
Tabu Search (TS)
To find the "global optimum", one should not cling to one particular region. The weakness of local search is that it is too greedy about one local region and its neighborhood, so it cannot see the forest for the trees. Tabu search takes some of the local optima already found and deliberately avoids them (without ruling them out completely), so as to cover more of the search space. When the rabbits find Mount Tai, one of them stays there and the others go looking elsewhere. After a grand tour they compare the peaks they have found, and Mount Everest stands out.
When the rabbits search again, they generally avoid Mount Tai on purpose, because they know it has already been searched and a rabbit is keeping watch there. This is what the "tabu list" means in tabu search. The rabbit left on Mount Tai does not settle there for good. After a certain time it rejoins the party searching for the highest peak, because by then there is plenty of fresh news and Mount Tai, which is after all quite high, deserves to be reconsidered. This time before rejoining is called the "tabu length" in tabu search. Suppose that during the search the rabbit left on Mount Tai has not yet rejoined, but every place found so far is low ground like the North China Plain. The rabbits then have no choice but to consider Mount Tai again. That is, when a place guarded by a rabbit is so good that it beats the "best so far" state, it can be considered regardless of the rabbit left there. This is called the "aspiration criterion". These three concepts are what most distinguish tabu search from ordinary search, and they are the key to tuning the algorithm.
Pseudocode:
Procedure tabu search;
Begin
  Initialize a string vc at random, clear the tabu list;
  cur := vc; best_to_far := vc;
  Repeat
    Select vn, the best string in the neighborhood of cur that is not in the tabu list,
    and va, the best string in the neighborhood of cur that is in the tabu list;
    If va > best_to_far then {va is a string in the tabu list}
    Begin
      cur := va;
      Let va take the place of the oldest string in the tabu list;
      best_to_far := va;
    End else
    Begin
      cur := vn;
      Let vn take the place of the oldest string in the tabu list;
      If vn > best_to_far then best_to_far := vn;
    End;
  Until (termination-condition);
End;
The procedure above has several key points:
(1) Tabu object: you can take the current value (cur) as the tabu object and put it in the tabu list, or you can put every value on the same "contour line" as cur into the tabu list.
(2) Tabu length and tabu list size: to reduce computation, neither should be too large. But if the tabu length is too small the search tends to cycle, and if the tabu list is too small the search tends to get trapped in a "local optimum".
(3) In the procedure above, best_to_far is assigned directly when the aspiration criterion releases a tabu string. Sometimes, however, no candidate is better than best_to_far and every candidate is tabu, which is a "deadlock". In that case the best of the tabu candidates should be released so that the search can continue.
(4) Termination criteria: these are similar to those of simulated annealing and genetic algorithms. Common ones are: stop after a given number of iterations; stop when the distance from an estimated optimal solution falls below a certain range; stop when the best solution found has not changed for several consecutive steps.
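Here is a minimal runnable sketch of these ideas, under assumptions chosen to keep it short: the tabu objects are the visited positions themselves, the tabu list is a fixed-length queue, the search maximizes height on a made-up one-dimensional landscape, and it simply stops if every neighbor is tabu rather than releasing one.

```python
from collections import deque

def tabu_search(f, start, neighbors, tabu_length=3, iterations=50):
    """Maximize f; returns the best solution seen."""
    current = best = start
    tabu = deque([start], maxlen=tabu_length)   # tabu list of recently visited solutions
    for _ in range(iterations):
        candidates = [
            v for v in neighbors(current)
            # aspiration criterion: a tabu solution is allowed if it beats the best so far
            if v not in tabu or f(v) > f(best)
        ]
        if not candidates:
            break                               # every neighbor is tabu: stop
        current = max(candidates, key=f)        # best admissible neighbor, even if worse
        tabu.append(current)                    # the oldest entry drops out automatically
        if f(current) > f(best):
            best = current
    return best

heights = [1, 3, 5, 2, 1, 4, 7, 9, 6, 0]

def height(i):
    return heights[i]

def neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(heights)]

print(tabu_search(height, 0, neighbors))                  # 7
print(tabu_search(height, 0, neighbors, tabu_length=1))   # 2
```

With a tabu length of 3 the search is pushed off the small hill and walks on to the highest point. With a tabu length of 1 it bounces between positions 1 and 2 until the iteration limit, which is the cycling problem described in point (2).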
Tabu search simulates the human thinking process itself. By making some local optima taboo (in other words, by remembering them), it accepts some worse solutions and so escapes from local search.
Artificial Neural Network (ANN)
As the name suggests, neural networks simulate the human brain. The structure, composition, and function of their neurons imitate the brain, but only roughly, and they are far from perfect. Unlike a von Neumann machine, neural network computation is non-digital, non-exact, and highly parallel, and it is capable of self-learning.
In the life sciences, nerve cells are called neurons and are the basic unit of the whole nervous system. Each nerve cell is like an arm with a hand. The palm, which contains the nucleus, is called the cell body; the fingers are called dendrites and are the input pathways for information; the arm is called the axon and is the output pathway. Neurons are interconnected and pass signals to one another. The signals change a neuron's potential, and once the potential rises above a certain value the neuron fires, sending an electrical signal out along its axon.
To imitate biological nerves on a computer, an artificial neural network needs three elements: (1) a formal definition of the artificial neuron; (2) the way the artificial neurons are connected, that is, the network structure; and (3) a definition of the connection strength between artificial neurons.
Historically, the first artificial neural network model is the M-P model, which is very simple. In commonly used notation it reads:
$$x_i(t+1) = H\Big(\sum_j w_{ij}\, x_j(t) - \theta_i\Big)$$
Here $x_i(t)$ is the state of neuron i at time t, where 1 means excited and 0 means inhibited; $w_{ij}$ is the connection strength between neurons i and j; $\theta_i$ is the threshold of neuron i, which the weighted input must exceed for the neuron to fire; and $H$ is the unit step function ($H(z) = 1$ for $z > 0$, otherwise 0).
This is the simplest neuron model, yet it is already very powerful: its inventors, McCulloch and Pitts, proved that, leaving speed and implementation complexity aside, it can do anything a present-day digital computer can do.
The M-P model above is only a single-layer network. Thinking of it as classifying the points of a plane, an M-P network can only split the plane into two half-planes; it cannot pick out one particular region. The solution is the "multi-layer feedforward network".
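A small sketch shows both the power and the limit of a single M-P unit. The weights and thresholds below are hand-picked assumptions, and the unit fires when the weighted sum strictly exceeds its threshold. AND and OR each need one unit; XOR cannot be done by one unit, because no single line separates its two classes, but two layers manage it.

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts unit: output 1 if the weighted input sum exceeds the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0

def AND(a, b):
    return mp_neuron([a, b], [1, 1], 1.5)

def OR(a, b):
    return mp_neuron([a, b], [1, 1], 0.5)

def XOR(a, b):
    """Two layers: hidden units OR and AND feed an output unit with weights +1 and -1."""
    hidden = [OR(a, b), AND(a, b)]
    return mp_neuron(hidden, [1, -1], 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b), XOR(a, b))
```

The output unit of `XOR` fires only when OR is on and AND is off, which carves out the region between two parallel lines in the plane.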
Figure 2
Figure 2 shows a multi-layer feedforward network. The bottom layer is called the input layer and the top layer the output layer. Each intermediate layer receives all the outputs of the layer before it, processes them, and passes the result to the next layer. There are no connections between neurons within a layer, and no direct connection between the input and output layers. Connections are one-way only, with no feedback. Such a network is called a "multi-layer feedforward network". After data is fed in, it is weighted layer by layer and the final result is output.
Figure 3
As Figure 3 shows, the capability of a multi-layer network can be illustrated by the regions it can separate: a single-layer network can only divide the plane into two parts, a two-layer network can carve out any convex region, and a network with more layers can carve out any region.
For such a network to have suitable weights, it must be trained so that it adjusts them by itself. One method is called back propagation (BP). Its basic idea is to examine the difference between the final output and the ideal output and adjust the weights accordingly; the adjustment starts at the output layer and works back through the intermediate layers to the input layer.
So neural networks solve problems by learning. Learning does not change the structure or the working of an individual neuron, and the properties of a single neuron have no direct link to the problem being solved. What learning does is change the connection strengths between neurons according to their excitatory and inhibitory relationships. The information from every training sample ends up spread across all the weights of the network.
The BP algorithm includes examining the difference between the output and the ideal output. If this difference is w, the aim of adjusting the weights is to make w as small as possible, which brings us back to the "minimization" problem discussed earlier. The usual BP algorithm uses local search, such as steepest descent or Newton's method. To obtain a global optimum, simulated annealing or a genetic algorithm can be used instead. A feedforward network that uses simulated annealing as its learning method is generally called a "Boltzmann network", which is a kind of stochastic neural network.
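The backward flow of the error can be sketched for a tiny network. The setup is an assumption made for illustration: two inputs, two hidden units, one output, sigmoid units, squared error on a single sample, and plain steepest descent with a made-up learning rate.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(net, x):
    """net = (W1, b1, W2, b2); returns the hidden activations and the output."""
    W1, b1, W2, b2 = net
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hi for w, hi in zip(W2, h)) + b2)
    return h, y

def loss(net, x, target):
    _, y = forward(net, x)
    return 0.5 * (y - target) ** 2

def backprop(net, x, target):
    """Gradients (dW1, db1, dW2, db2) of the squared error for one sample."""
    W1, b1, W2, b2 = net
    h, y = forward(net, x)
    delta_out = (y - target) * y * (1 - y)          # error signal at the output layer
    dW2 = [delta_out * hi for hi in h]
    # propagate the error signal back to the hidden layer
    delta_hid = [delta_out * w * hi * (1 - hi) for w, hi in zip(W2, h)]
    dW1 = [[d * xi for xi in x] for d in delta_hid]
    return dW1, delta_hid, dW2, delta_out

def train_step(net, x, target, lr=0.5):
    """One steepest-descent update: move every weight against its gradient."""
    W1, b1, W2, b2 = net
    dW1, db1, dW2, db2 = backprop(net, x, target)
    W1 = [[w - lr * g for w, g in zip(row, grow)] for row, grow in zip(W1, dW1)]
    b1 = [b - lr * g for b, g in zip(b1, db1)]
    W2 = [w - lr * g for w, g in zip(W2, dW2)]
    return W1, b1, W2, b2 - lr * db2

rng = random.Random(0)
net = ([[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)],
       [0.0, 0.0],
       [rng.uniform(-1, 1) for _ in range(2)],
       0.0)
x, target = [1.0, 0.0], 1.0
before = loss(net, x, target)
for _ in range(100):
    net = train_step(net, x, target)
print(before, loss(net, x, target))   # the error on this sample shrinks
```

Because each update only follows the local slope, this is a local search in weight space: it can settle in a local minimum of the error, which is why the global methods above are sometimes used instead.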
The BP learning process needs definite values to serve as the ideal output, much as a pupil learns under a teacher's supervision. Can an artificial neural network learn without supervision?
Just as a free market brings in competition without central control, there is a learning method called "unsupervised competitive learning". Several neurons that receive input from input neuron i compete; only one of them ends up with output 1 and the rest with 0. The losing neurons are adjusted in a direction that makes them more competitive, so that they may win a later competition.
Artificial neural networks also include feedback networks, such as the Hopfield network. Signals between its neurons travel in both directions, and an energy function is introduced: as the neurons influence one another, the value of the energy function keeps falling until the network settles on a low-energy solution. The idea is similar to simulated annealing.
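The falling energy can be watched directly. The sketch below assumes the feedback network meant here is a Hopfield network, with states of +1 and -1, Hebbian weights for one stored pattern, no self-connections, and asynchronous updates; the pattern itself is made up.

```python
def train_hopfield(patterns):
    """Hebbian weights for patterns of +1/-1 values; no self-connections."""
    n = len(patterns[0])
    W = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j]
    return W

def energy(W, s):
    n = len(s)
    return -0.5 * sum(W[i][j] * s[i] * s[j] for i in range(n) for j in range(n))

def recall(W, state, sweeps=3):
    """Asynchronous updates: each unit in turn takes the sign of its weighted input."""
    s = list(state)
    energies = [energy(W, s)]
    for _ in range(sweeps):
        for i in range(len(s)):
            h = sum(W[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if h >= 0 else -1
        energies.append(energy(W, s))
    return s, energies

stored = [1, -1, 1, -1, 1, 1, -1, -1]
W = train_hopfield([stored])
noisy = [-1, -1, 1, -1, 1, 1, -1, 1]      # first and last entries flipped
result, energies = recall(W, noisy)
print(result == stored)   # True
print(energies)           # [-4.0, -28.0, -28.0, -28.0]
```

The energy never rises from one sweep to the next, and the stored pattern sits at the bottom of the energy valley. Unlike simulated annealing, these updates are deterministic, so the network cannot climb out of a valley once it is in one.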
When an artificial neural network is used as an algorithm, its accuracy and speed have little to do with how the software is implemented; what matters is its continual learning. This idea is already very different from the von Neumann model.
Summary
Simulated annealing, genetic algorithms, tabu search, and neural networks each have their own strengths in seeking the global optimum, and they share one feature: they all simulate natural processes. Simulated annealing comes from the annealing of solid matter in physics; genetic algorithms borrow the evolutionary idea of survival of the fittest; tabu search simulates the memory process of human intelligence; and neural networks directly simulate the human brain.
They are also closely related. For example, simulated annealing and genetic algorithms offer neural networks better learning algorithms. Combining them so that each makes up for the others' weaknesses can improve performance.
These intelligent algorithms differ from ordinary programs that compute exactly on the basis of the Turing machine. Artificial neural networks in particular are a new understanding of the computer model that breaks out of the von Neumann framework, and computers designed along these lines have broad prospects.