Status of machine learning:
1. China's traditional industries are not yet ready to adopt artificial intelligence technology, and many of them do not treat it as a strategic priority;
2. For enterprises that do set up an artificial intelligence strategy, the shortage of talent is the main constraint;
3. In this field, especially in robotics, there is still a large gap with developed countries.
We can make effective judgments because we have accumulated a great deal of experience, and by drawing on that experience we can make effective decisions in new situations.
1. Basic Concepts
Machine learning is the study of how machines can simulate human learning activities: acquiring new knowledge and new skills, and identifying existing knowledge. The machine here is the computer. - Excerpted from Zai Zixing, "Principles, Algorithms and Applications of Intelligent Systems", p. 150.
The data used for learning must be well labeled. Before learning, the data is usually divided into three parts: the training set, the validation set and the test set. The training set is used to learn the parameters, the validation set is used to adjust the chosen settings, and the test set is used to evaluate how well the learning worked.
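The three-way split described above can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function name `split_dataset`, the 60/20/20 proportions and the toy data are all assumptions made for the example.

```python
import random


def split_dataset(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle labeled samples and split them into train/validation/test lists.

    The fractions are illustrative defaults; whatever is left after the
    training and validation parts becomes the test set.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a test set")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical labeled data: (feature, label) pairs.
data = [(x, x % 2) for x in range(10)]
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))
```

Shuffling before splitting avoids a split that follows the original ordering of the data; the test set should only be touched once, for the final evaluation.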
2. Basic Model
2.1 Conditional Probability Distribution
Conditional probability is the probability that event A occurs given that another event B has already occurred, written P(A|B). If there are only two events A and B, then P(A|B) = P(AB) / P(B), provided P(B) > 0.
In image problems, for example, the probability distribution over each pixel is very large, so it is learned by training on a large amount of data.
2.2 Bayes Formula
Anyone who has studied probability theory knows the formula for conditional probability: P(AB) = P(A) P(B|A) = P(B) P(A|B). That is, the probability that events A and B occur together equals the probability of A multiplied by the probability of B given A. The Bayesian formula follows from it: P(B|A) = P(A|B) P(B) / P(A). In other words, given P(A|B), P(A) and P(B), we can calculate P(B|A).
Suppose B is a probability space {B1, B2, ..., Bn} consisting of mutually exclusive events that together cover all cases. Then P(A) can be expanded with the full probability formula: P(A) = P(A|B1) P(B1) + P(A|B2) P(B2) + ... + P(A|Bn) P(Bn). The Bayesian formula then becomes: P(Bi|A) = P(A|Bi) P(Bi) / (P(A|B1) P(B1) + P(A|B2) P(B2) + ... + P(A|Bn) P(Bn)). P(Bi|A) is usually called the posterior probability, P(A|Bi) the likelihood, and P(Bi) the prior probability, also called the base probability.
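As a small numeric illustration of the Bayesian formula over events B1..Bn, the sketch below computes every P(Bi|A) from the priors P(Bi) and the conditionals P(A|Bi). The function name `posterior` and the numbers are made up for the example.

```python
def posterior(priors, likelihoods):
    """Return [P(Bi|A)] given priors [P(Bi)] and likelihoods [P(A|Bi)].

    Assumes the Bi are mutually exclusive and cover all cases,
    so the priors must sum to 1.
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1")
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(A|Bi) P(Bi)
    p_a = sum(joint)  # denominator: the expansion of P(A)
    if p_a == 0:
        raise ValueError("P(A) is zero, so the posterior is undefined")
    return [j / p_a for j in joint]


# Made-up numbers: three causes B1..B3 and one observed event A.
priors = [0.5, 0.3, 0.2]
likelihoods = [0.01, 0.02, 0.05]
post = posterior(priors, likelihoods)
print(post)
```

In this example B3 has the smallest prior, but because A is most likely under B3, its posterior ends up the largest of the three.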
The Bayesian formula looks simple, but it is widely used across the natural sciences, and the theory behind it contains profound ideas.
2.3 Full Probability Formula
The full probability formula (the law of total probability) is an important formula in probability theory. It turns the problem of finding the probability of a complex event A into a sum of the probabilities of simpler events occurring in different situations. Statement: if the events B1, B2, B3, ..., Bn form a complete set of events, that is, they are pairwise incompatible, their union is the whole sample space, and each P(Bi) is greater than 0, then for any event A, P(A) = P(A|B1) P(B1) + P(A|B2) P(B2) + ... + P(A|Bn) P(Bn).
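A worked example may make the formula concrete. The two-urn setup below is hypothetical, and exact fractions are used so the arithmetic can be checked by hand: 1/2 * 3/5 + 1/2 * 1/4 = 17/40.

```python
from fractions import Fraction


def total_probability(priors, conditionals):
    """P(A) = sum of P(A|Bi) P(Bi) over a complete set of events Bi."""
    if sum(priors) != 1:
        raise ValueError("the Bi must form a complete set: priors must sum to 1")
    return sum(p * c for p, c in zip(priors, conditionals))


# Hypothetical setup: one of two urns is picked with equal probability.
# Urn 1 holds 3 red balls out of 5, urn 2 holds 1 red ball out of 4.
priors = [Fraction(1, 2), Fraction(1, 2)]         # P(B1), P(B2)
red_given_urn = [Fraction(3, 5), Fraction(1, 4)]  # P(A|B1), P(A|B2)
p_red = total_probability(priors, red_given_urn)
print(p_red)
```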
3. Expert System
In an expert system, the core problems are the representation, acquisition and application of knowledge. Knowledge acquisition is carried out by various machine learning strategies. The classical ones include inductive learning, learning by instruction, case-based learning and learning by insight, alongside non-classical soft computing methods. The main difference between an expert system and a general application is that an expert system keeps the problem-solving knowledge of its application domain in a separate knowledge base, which can be updated, pruned and refined at any time.
The main functions of an expert system should include knowledge storage, descriptive ability, reasoning ability, problem explanation, learning ability and interactive ability.
3.1 Neural network expert system
The most common artificial neural network is composed of three layers of units: an input layer, which is connected to a hidden layer, which in turn is connected to an output layer. The steps to construct an artificial neural network for a particular task are as follows:
1. Choose an appropriate representation of the problem, so that the outputs of the units correspond to solutions of the problem.
2. Construct an energy function whose minimum corresponds to the optimal solution of the problem.
3. Determine appropriate connection weights and error criteria from the energy function.
4. Use a learning strategy to dynamically adjust the weights, errors and other parameters, so that the final network is a good model for solving the given problem.
For a specific task, some training instances are first provided to the neural network. By repeatedly comparing the actual output with the expected output and feeding corrections back in time, the weight matrix of the network is driven toward a set of optimal values, at which point the training task is accomplished.
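The three-layer structure and the idea of an energy function can be sketched in a few lines. The choices here are assumptions for illustration only: sigmoid units, a 2-2-1 layout, random initial weights, the XOR truth table as training instances, and the sum of squared errors standing in for the energy function.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Input layer -> hidden layer -> output layer.

    Each weight row holds the incoming weights of one unit;
    the last entry of the row is that unit's bias.
    """
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0]))) for row in w_out]
    return hidden, output


def energy(samples, w_hidden, w_out):
    """Assumed energy function: sum of squared differences between
    actual and expected outputs. Its minimum is the best fit."""
    total = 0.0
    for x, target in samples:
        _, out = forward(x, w_hidden, w_out)
        total += sum((o - t) ** 2 for o, t in zip(out, target))
    return total


rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # 2 inputs + bias -> 2 hidden units
w_out = [[rng.uniform(-1, 1) for _ in range(3)]]                       # 2 hidden + bias -> 1 output unit
samples = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
           ([1.0, 0.0], [1.0]), ([1.0, 1.0], [0.0])]
print(energy(samples, w_hidden, w_out))
```

With untrained random weights the energy is far from its minimum; a learning strategy then adjusts `w_hidden` and `w_out` to reduce it.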
3.2 Evolutionary Neural System
The genetic evolution method is an adaptive, probabilistic, global optimization search method that simulates the heredity and evolution of organisms in their natural environment. It involves two mechanisms: 1. heredity and mutation; 2. selection and evolution. Its basic operators are selection, crossover and mutation.
For the neural network in a given neural expert system, a genetic algorithm can optimize two aspects: the weight matrix and the network structure. The steps are as follows:
1. Encode the weight matrix genetically: a group of weights is encoded as one chromosome, and the weights feeding into the same neuron can be grouped together for genetic optimization. Assign initial values to the weight matrix.
2. Define a fitness function that evaluates chromosome performance. Its value corresponds to the performance of the neural network: the reciprocal of the sum of squared errors. For a given chromosome, each weight it contains is assigned to the corresponding connection of a new neural network. The network is tested on the training examples and the sum of squared errors is computed; the smaller the sum, the fitter the chromosome. The genetic algorithm searches for the chromosome with the smallest sum of squared errors.
3. Select the genetic operators, that is, choose specific crossover and mutation strategies, and apply them to the whole population of chromosomes.
4. Define the population size and parameters: the maximum number of neural networks represented by different weight matrices, together with the crossover probability, the mutation probability and the maximum number of iterations.
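The four steps above can be sketched as a small genetic algorithm over the weights of a fixed 2-2-1 network. Everything specific here is an assumption for illustration: the XOR training set, roulette-wheel selection, one-point crossover, Gaussian mutation, keeping the best chromosome in each generation, and all parameter values. Only the weights are evolved, not the network structure, and weights are not grouped per neuron.

```python
import math
import random

SAMPLES = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]  # illustrative training set
N_WEIGHTS = 9  # 2-2-1 network: 2 * (2 + 1) hidden weights + (2 + 1) output weights


def sigmoid(x):
    x = max(-60.0, min(60.0, x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))


def network_output(chrom, x):
    """Step 1: a chromosome is the flat list of all network weights."""
    h1 = sigmoid(chrom[0] * x[0] + chrom[1] * x[1] + chrom[2])
    h2 = sigmoid(chrom[3] * x[0] + chrom[4] * x[1] + chrom[5])
    return sigmoid(chrom[6] * h1 + chrom[7] * h2 + chrom[8])


def sse(chrom):
    return sum((network_output(chrom, x) - t) ** 2 for x, t in SAMPLES)


def fitness(chrom):
    """Step 2: reciprocal of the sum of squared errors."""
    return 1.0 / (sse(chrom) + 1e-12)


def evolve(pop_size=30, p_cross=0.7, p_mut=0.1, max_iter=200, seed=0):
    """Steps 3 and 4: apply the operators to the whole population."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(N_WEIGHTS)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(max_iter):
        weights = [fitness(c) for c in pop]
        new_pop = [best[:]]  # keep the best chromosome so fitness never gets worse
        while len(new_pop) < pop_size:
            a, b = rng.choices(pop, weights=weights, k=2)  # roulette-wheel selection
            child = a[:]
            if rng.random() < p_cross:                     # one-point crossover
                cut = rng.randrange(1, N_WEIGHTS)
                child = a[:cut] + b[cut:]
            child = [w + rng.gauss(0, 0.5) if rng.random() < p_mut else w
                     for w in child]                       # Gaussian mutation
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=fitness)
    return best


best = evolve()
print(sse(best))
```

Because the best chromosome is always carried over, the smallest sum of squared errors in the population cannot increase from one generation to the next.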
4. Neural Network Learning
Taking the backpropagation (BP) network and the Hopfield network as examples, this section discusses how neural networks learn through training.
4.1 Learning Based on the Backpropagation (BP) Network
The backpropagation algorithm is a simple method for computing how a change in a single weight changes the performance of the network. The BP algorithm propagates backward from the output nodes to the first hidden layer, correcting each weight according to its contribution to the total error. The learning process of the BP algorithm consists of forward propagation and backward propagation. In forward propagation, the input information is processed layer by layer, from the input layer through the hidden layers to the output layer. If the output layer does not produce the expected output, backward propagation begins: the error signal is sent back along the original connection paths, and the weights of the neurons in each layer are modified so that the error signal is minimized.
The BP network is a typical feedforward network.
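The forward and backward passes can be sketched for a network with one hidden layer. This is a minimal illustration under stated assumptions: sigmoid units, a squared-error measure, per-sample gradient descent, and the XOR truth table as training data; the function name `train_bp` and all parameter values are made up.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_bp(samples, n_hidden=3, lr=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer network and return the weights and
    the sum of squared errors recorded in each epoch."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Each row holds the incoming weights of one unit; the last entry is its bias.
    w_h = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    errors = []
    for _ in range(epochs):
        total = 0.0
        for x, t in samples:
            # Forward propagation: input layer -> hidden layer -> output layer.
            xb = list(x) + [1.0]
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_h]
            hb = h + [1.0]
            y = sigmoid(sum(w * v for w, v in zip(w_o, hb)))
            total += (y - t) ** 2
            # Backward propagation: error signal at the output unit,
            # then sent back along the connections to the hidden units.
            delta_o = (y - t) * y * (1 - y)
            delta_h = [delta_o * w_o[j] * h[j] * (1 - h[j]) for j in range(n_hidden)]
            # Weight correction, layer by layer.
            w_o = [w - lr * delta_o * v for w, v in zip(w_o, hb)]
            for j in range(n_hidden):
                w_h[j] = [w - lr * delta_h[j] * v for w, v in zip(w_h[j], xb)]
        errors.append(total)
    return w_h, w_o, errors


xor_samples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
w_h, w_o, errors = train_bp(xor_samples)
print(errors[0], errors[-1])
```

The hidden-layer error signals are computed from the output weights before those weights are updated, which is what makes the pass a propagation of the same error backward through the network.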
4.2 Learning Based on the Hopfield Network
The Hopfield network is a dynamic feedback system.
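To show what "dynamic feedback" means in practice, here is a minimal sketch of a discrete Hopfield network. The details are assumptions taken from the usual textbook formulation rather than from this text: states of +1/-1, Hebbian weight storage, and asynchronous updates in which each unit's output is fed back as input until the state stops changing.

```python
def train_hopfield(patterns):
    """Hebbian storage: w[i][j] accumulates p[i] * p[j]; no self-connections."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, state, max_sweeps=10):
    """Feed the state back through the network until it is stable."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):  # asynchronous update: one unit at a time
            s = sum(w[i][j] * state[j] for j in range(n))
            new = 1 if s >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:  # stable state reached
            break
    return state


stored = [1, -1, 1, -1, 1, -1]
w = train_hopfield([stored])
noisy = [1, -1, 1, -1, 1, 1]  # the last element is flipped
print(recall(w, noisy))       # recovers the stored pattern
```

Unlike the feedforward BP network, there is no separate output layer here: the network's state at one step is its input at the next, and recall ends when the state reaches a fixed point.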