An artificial neural network (ANN) is a mathematical model for information processing inspired by the structure of synaptic connections in the brain: a large number of nodes (neurons) are connected to form a network. To process information, a neural network usually needs to be trained; training is the network's learning process. Training adjusts the connection weights between nodes, which determines the network's classification function, and the trained network can then be used for tasks such as object recognition.
At present there are hundreds of different neural network models, such as BP networks, RBF neural networks, Hopfield networks, stochastic neural networks, and competitive neural networks. However, neural networks still have shortcomings such as slow convergence, heavy computation, long training time, and poor interpretability.
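As a minimal illustration of what a node and its connection weights are, here is a sketch in Python/NumPy; the inputs, weights, and bias below are made-up values, and training is what would adjust w and b:

import numpy as np

# A minimal sketch of a single artificial neuron: inputs are combined
# through connection weights and passed through an activation function.
# The inputs, weights, and bias are illustrative values, not learned ones.

def neuron(x, w, b):
    """Weighted sum of inputs followed by a sigmoid activation."""
    z = np.dot(w, x) + b
    return 1.0 / (1.0 + np.exp(-z))      # sigmoid activation

x = np.array([0.5, -1.0, 2.0])           # example input signals
w = np.array([0.8, 0.2, -0.4])           # connection weights (what training adjusts)
b = 0.1                                  # bias term

print(neuron(x, w, b))                   # output of this one node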
Deep learning and neural networks
Establish and simulate neural networks that analyze and learn like the human brain, interpreting data by imitating the brain's mechanisms;
Combine lower-layer features into higher-level representations of attribute classes or features, revealing a distributed feature representation of the data;
Reasons why neural networks once fell out of favor:
Easy to overfit; parameters are hard to tune and require skill;
Training is slow;
With few layers (<= 3), results are no better than those of other methods;
Deep Learning Framework:
Like a neural network, a deep network has a layered structure: the system is a multilayer network consisting of an input layer, hidden layers, and an output layer.
Only nodes in adjacent layers are connected; nodes in the same layer or in non-adjacent layers are not directly connected to each other;
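A minimal NumPy sketch of this layered structure (the layer sizes and random initial weights are illustrative assumptions): each weight matrix links one layer only to the next, so there are no same-layer or cross-layer connections.

import numpy as np

# Layered structure: input layer -> hidden layers -> output layer.
# Each weight matrix connects one layer only to the adjacent next layer.

rng = np.random.default_rng(0)

layer_sizes = [4, 8, 8, 3]                      # input, two hidden layers, output
weights = [rng.standard_normal((m, n)) * 0.1    # weights[i] connects layer i to layer i+1
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    a = x
    for W, b in zip(weights, biases):
        a = np.tanh(a @ W + b)                  # one nonlinear operation per layer
    return a

print(forward(rng.standard_normal(4)))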
Deep Learning Training Process:
Two steps:
Train the network one layer at a time;
Tuning: make the high-level representation r generated upward from the original representation x, and the reconstruction x' generated downward from r, as consistent as possible (the reconstruction should match the original input), as sketched after this list;
Training process:
Bottom-up unsupervised learning (train layer by layer, starting from the bottom and moving to the top);
Top-down supervised learning (train with labeled data, propagating errors from the top downward to fine-tune the whole network)
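Below is a hedged sketch of this two-step process using toy linear "autoencoder" layers; the data, layer sizes, learning rate, and epoch count are illustrative assumptions, not a prescribed recipe.

import numpy as np

# Step 1 (bottom-up, unsupervised): each layer learns to reconstruct its input,
# pushing the downward reconstruction x' toward the original x.
# Step 2 (top-down, supervised): the pretrained weights initialize a network
# that is then fine-tuned with labeled data (e.g. by backpropagation).

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))         # unlabeled data for pretraining

def pretrain_layer(X, hidden, epochs=100, lr=0.01):
    """Train one layer, without labels, to reconstruct its own input."""
    n = X.shape[1]
    W_enc = rng.standard_normal((n, hidden)) * 0.1   # upward:   x -> r
    W_dec = rng.standard_normal((hidden, n)) * 0.1   # downward: r -> x'
    for _ in range(epochs):
        R = X @ W_enc                      # high-level representation r
        X_rec = R @ W_dec                  # reconstruction x'
        err = X_rec - X                    # want x' to match x
        W_dec -= lr * R.T @ err / len(X)
        W_enc -= lr * X.T @ (err @ W_dec.T) / len(X)
    return W_enc

# Bottom-up: pretrain two layers, each on the previous layer's representation.
W1 = pretrain_layer(X, hidden=10)
H1 = X @ W1
W2 = pretrain_layer(H1, hidden=5)

# Top-down: W1 and W2 would now initialize a supervised network for fine-tuning.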
Deep belief networks (DBN): unsupervised
Deep learning is a learning approach based on a multilayer neural network architecture;
Combining deep learning with semi-supervised and unsupervised methods can improve performance;
Deep learning refers to training a deep architecture and adjusting its internal parameters to complete machine learning tasks.
A deep architecture is built from many layers of nonlinear operations, for example a neural network with many hidden layers;
BP algorithm
Iterative algorithm
Randomly set the initial weights, compute the current network output, and adjust the parameters of the earlier layers according to the difference between the current output and the label, until convergence (a minimal sketch follows);
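A hedged sketch of this idea on a toy XOR problem; the network size, learning rate, and iteration count are illustrative choices, not part of the original text.

import numpy as np

# BP sketch: random initial weights, compute the current output, compare it
# with the label, and push the error back to adjust earlier layers.

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)       # XOR labels

W1 = rng.standard_normal((2, 4)); b1 = np.zeros(4)    # random initial values
W2 = rng.standard_normal((4, 1)); b2 = np.zeros(1)

sigmoid = lambda z: 1 / (1 + np.exp(-z))
lr = 0.5

for step in range(5000):
    # forward pass: current network output
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # backward pass: error at the output, propagated to the earlier layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.round(3))   # should approach the XOR labels [0, 1, 1, 0]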
Problems
The gradient becomes increasingly sparse: going downward from the top layer, the error-correction signal gets smaller and smaller;
Convergence to a local optimum, especially when starting far from the optimal region (which random initialization makes likely);
Generally it can only be trained with labeled data, but most data is unlabeled;
KNN (fully supervised)
The k-nearest neighbors algorithm finds, by a distance measure, the k stored samples closest to an unknown sample x and assigns x to the class to which most of those k samples belong. KNN is a lazy learning method: it simply stores the samples and defers all computation until a sample actually needs to be classified. If the sample set is large or complex, this can cause significant computational overhead, so the method is unsuitable for applications with strict real-time requirements.
Advantages: simple and effective; low cost of retraining; computation time and space grow linearly with the size of the training set; well suited to automatic classification of class domains with large sample sizes;
Disadvantages: a lazy learning method (very little is actually learned up front); the output is hard to interpret;
A significant deficiency appears when the classes are unbalanced: if one class has a very large sample size and the others are very small, a new sample's k nearest neighbors may be dominated by the large class. The algorithm only counts the "nearest" neighbors, so when one class dominates in number, its samples can win the vote even if they are not actually close to the target sample; class size rather than closeness drives the result (this can be improved by distance-weighted voting; see the sketch after this list);
Large computational cost (a common remedy is to pre-edit the known sample points in advance, removing samples that contribute little to classification);
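Below is a hedged sketch of k-nearest neighbors with the distance-weighted voting mentioned above: each of the k neighbors votes with weight 1/distance, so a distant sample from a large class counts for less than a nearby one. The data, the value of k, and the weighting scheme are illustrative assumptions.

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    dists = np.linalg.norm(X_train - x, axis=1)        # distance to every stored sample
    nearest = np.argsort(dists)[:k]                     # indices of the k closest samples
    votes = Counter()
    for i in nearest:
        votes[y_train[i]] += 1.0 / (dists[i] + 1e-9)    # distance-weighted vote
    return votes.most_common(1)[0][0]

X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])
y_train = np.array([0, 0, 1, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.1, 0.9])))   # expected: class 0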
Support Vector Machine
Pros: works well with small samples, improves generalization ability, handles high-dimensional problems, handles nonlinear problems, and avoids the neural network problems of structure selection and local minima;
Cons: sensitive to missing data; there is no universal solution for nonlinear problems, and the kernel function must be chosen carefully;
On the basis of binary classifiers, multi-class classifiers are constructed according to combination principles:
One-vs-one (OvO): a binary classifier is trained between every pair of classes, and a sample is assigned to the class that receives the most votes, which avoids leaving unclassifiable regions;
One-vs-rest (OvR): one binary classifier is trained for each class against all the remaining classes.
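A hedged sketch of the two combination schemes using scikit-learn (assuming it is available): OvO trains a binary SVM for every pair of classes and votes, while OvR trains one binary SVM per class against all the rest. The iris data and the RBF kernel are illustrative choices.

from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)                        # 3 classes

ovo = OneVsOneClassifier(SVC(kernel="rbf")).fit(X, y)    # 3*(3-1)/2 = 3 pairwise classifiers
ovr = OneVsRestClassifier(SVC(kernel="rbf")).fit(X, y)   # 3 one-vs-rest classifiers

print(len(ovo.estimators_), len(ovr.estimators_))        # number of binary classifiers: 3, 3
print(ovo.predict(X[:5]), ovr.predict(X[:5]))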
Summary
Machine learning, as discussed here, is essentially classification, with algorithms including SVM, BP neural networks, KNN, and so on.
First initialize the weights (randomly) and specify the actual inputs, the desired outputs, and an activation function (a kernel function for SVM); then build the network (for example with the BP algorithm), possibly with multiple layers, and repeatedly correct and adjust the weights (the BP algorithm includes this feedback step) until a set of weights is found whose error on the training data is within the allowed range; the trained network can then also produce outputs for new inputs.
Machine learning is a learning process: algorithms and programs let the machine produce good outputs for the input training data, and the learning effect is then judged on test data; the BP workflow also includes generalization (testing on new data) and recall (testing on the training data).
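A hedged sketch of the recall vs. generalization check described above, using scikit-learn (an assumption): fit a small BP-style network on training data, then score it on the training data (recall) and on held-out test data (generalization). The dataset, network size, and iteration count are illustrative.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
net.fit(X_tr, y_tr)                                   # weights adjusted by backpropagation

print("recall (training data):", net.score(X_tr, y_tr))
print("generalization (test data):", net.score(X_te, y_te))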