When thinking about artificial neural networks, three basic knowledge points should come to mind: the neuron model, the neural network structure, and the learning algorithm. There are many kinds of neural networks, but they are all classified according to these basic points. If you keep these three threads in mind while learning, you can reason by analogy and form a clear overall view of neural networks. In this post, we summarize these three basic knowledge points to guide neural network learning and deepen the understanding of neural networks.
I. Neuron model
An artificial neural network (ANN) is a computational structure based on modern neurobiological research; it simulates biological processes and reflects certain characteristics of the human brain. It is not a faithful depiction of the human nervous system, but only an abstraction, simplification, and simulation of it. As introduced earlier for biological neural networks, neurons and their synapses are the basic building blocks of a neural network, so simulating a biological neural network begins with simulating biological neurons. In artificial neural networks, neurons are often called "processing units", or "nodes" when viewed from the network perspective. An artificial neuron is a formal description of a biological neuron: it abstracts the information processing of the biological neuron, describes it in mathematical language, simulates the structure and function of the biological neuron, and expresses them with model diagrams.
1. Neuron Modeling
Many neuron models have been proposed. The earliest and most influential is the M-P model, proposed in 1943 by psychologist W. McCulloch and mathematician W. Pitts on the basis of their analysis and summary of the basic characteristics of neurons. After continuous improvement, this model is now widely used as the standard form of the neuron model. Regarding the information processing mechanism of neurons, the model makes the following six simplifying assumptions:
(1) Each neuron is a multi-input, single-output information processing unit;
(2) Neuron inputs are divided into two types: excitatory and inhibitory;
(3) Neurons have spatial integration characteristics and threshold characteristics;
(4) There is a fixed time lag between neuron input and output, determined mainly by the synaptic delay;
(5) Time integration and the refractory period are ignored;
(6) The neuron itself is time-invariant, i.e. its synaptic delay and synaptic strength are constant.
Based on the above six assumptions, the neuron model can be drawn as a diagram (omitted here): weighted inputs feed an integration (summation ∑) whose result is passed through a mapping f. In essence, the model is the composition of the integration ∑ and the mapping f.
2. Mathematical model of neurons
From the neuron model, the mathematical model of the neuron can be written as a weighted sum of the inputs compared against a threshold and passed through the mapping f (the original formula image is omitted).
If the threshold is absorbed into the summation ∑ as an extra input x0 = -1 with weight w0 equal to the threshold, a simplified form is obtained; both forms are written out below.
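The formula images from the original post are not reproduced here. Written out in LaTeX, the standard M-P form that the description above refers to (with inputs $x_i$, weights $w_{ij}$, and threshold $T_j$ for neuron $j$; the notation is assumed, since the images are unavailable) is:

```latex
% Output of neuron j with the threshold written explicitly
o_j = f\Big(\sum_{i=1}^{n} w_{ij}\,x_i - T_j\Big)

% Simplified form: absorb the threshold as an extra input
% x_0 = -1 with weight w_{0j} = T_j
o_j = f\Big(\sum_{i=0}^{n} w_{ij}\,x_i\Big) = f\big(\boldsymbol{W}_j^{\mathrm{T}}\boldsymbol{X}\big)
```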
3. Activation function of neurons (the mapping f)
The main difference between different mathematical models of neurons lies in the transformation (activation) function used, which gives the neurons different information processing characteristics. The information processing characteristic of the neuron is one of the three factors determining the overall performance of an artificial neural network, so the study of transformation functions is of great significance. The transformation function of a neuron reflects the relationship between the neuron's output and its activation state. The four most commonly used transformation functions are the following.
(1) Threshold type transformation function
This type uses the unit step function; the formula (diagram omitted) is given below.
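The formula image is missing from the extracted text; the unit step form it refers to is commonly written as:

```latex
f(\mathrm{net}) =
\begin{cases}
1, & \mathrm{net} \ge 0 \\
0, & \mathrm{net} < 0
\end{cases}
```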
(2) Nonlinear transformation function
The nonlinear transformation function is a non-decreasing continuous function from the real field R to the closed interval [0, 1], representing a neuron model with a continuous state. The most commonly used nonlinear transformation function is the unipolar sigmoid, also called the S-type function; both the function and its derivative are continuous, which makes it very convenient to work with. The unipolar and bipolar S-type function formulas (diagrams omitted) are given below.
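The formula images are not reproduced; the usual unipolar and bipolar S-type (sigmoid) forms referred to are:

```latex
% Unipolar sigmoid
f(x) = \frac{1}{1 + e^{-x}}

% Bipolar sigmoid
f(x) = \frac{1 - e^{-x}}{1 + e^{-x}}
```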
(3) Piecewise linear transformation function
This function is characterized by a linear relationship between the neuron's input and output within a certain interval. Because it is piecewise linear, it is relatively simple to implement. This kind of function is also called a pseudo-linear function; the formula for the unipolar piecewise linear transformation function (diagram omitted) is given below.
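The formula image is missing; a standard unipolar piecewise linear form (with slope $c$ and saturation point $x_c$, symbols assumed since the image is unavailable) is:

```latex
f(x) =
\begin{cases}
0, & x \le 0 \\
c\,x, & 0 < x \le x_c \\
1, & x > x_c
\end{cases}
```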
(4) Probability-type transformation function
For a neuron model with a probabilistic transformation function, the relationship between input and output is not deterministic; a random function is needed to describe the probability that the output state is 1 or 0. The probability that the neuron output is 1 is given below.
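The formula image is missing; the standard Boltzmann-type form referred to, written with net input $x$ and temperature parameter $T$ (notation assumed), is:

```latex
P(1) = \frac{1}{1 + e^{-x/T}}
```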
where T is called the temperature parameter. Because the output state distribution of neurons with this transformation function resembles the Boltzmann distribution in thermodynamics, this neuron model is also called a thermodynamic model.
II. Neural network structure
A large number of neurons form a large-scale neural network, enabling the processing and storage of complex information and exhibiting a variety of superior characteristics. The power of a neural network is closely related to its large-scale parallel interconnection, nonlinear processing, and the plasticity of its interconnection structure. The neurons must therefore be connected into a network according to certain rules, and the connection weights of each neuron in the network change according to certain rules. A biological neural network is formed by hundreds of millions of interconnected biological neurons, whereas an artificial neural network, limited by the difficulty of physical realization, is a network composed of a relatively small number of neurons connected according to certain rules for simple computation.
The neurons in an artificial neural network are often called nodes or processing units; each node has the same structure, and its actions are synchronized in both time and space.
There are many models of artificial neural networks, which can be classified in different ways. Two common classification methods are by the topology of the network connections and by the direction of information flow within the network.
1. Network topology type
Different connections between neurons give the network different topological structures. According to the connections between neurons, neural network structures can be divided into two major categories.
(1) Hierarchical type
A neural network with a hierarchical structure divides its neurons into layers, such as an input layer, a middle layer (also known as a hidden layer), and an output layer, with the layers connected in sequence, as shown in the figure (omitted).
The neurons in the input layer receive input information from the outside world and pass it to the neurons of the hidden layer. The hidden layer is the internal information processing layer of the neural network and is responsible for transforming the data; depending on the required information transformation capability, it can be designed with one or more layers. The last hidden layer passes its information to the neurons of the output layer, which, after further processing, send the result to the outside world (for example, to an actuator or a display device). The hierarchical network structure has three typical variants.
(A) Simple hierarchical network structure
In this structure, the neurons are arranged in layers; each layer of neurons receives input only from the previous layer and outputs only to the next layer, and there are no connections from a neuron to itself or between neurons within the same layer, as shown in the figure (omitted).
(B) Hierarchical network structure with connections from the output layer to the input layer
In this structure there are connection paths from the output layer back to the input layer, as shown in the figure (omitted); the input-layer neurons can both receive input and perform information processing.
(C) Hierarchical network structure with intra-layer interconnection
In this structure the neurons within the same layer are interconnected, as shown in the figure (omitted). Its characteristic is that lateral interaction is introduced between neurons in the same layer, so that the number of neurons acting simultaneously within a layer can be limited, thereby realizing self-organization within each layer.
(2) Interconnected type structure
In an interconnected network structure, a connection path may exist between any two nodes in the network. According to the degree of interconnection among the nodes, interconnected network structures can be subdivided into three cases.
(A) Fully interconnected type
Each node in the network is connected to all other nodes, as shown in the figure (omitted).
(B) Locally interconnected type
Each node in the network is connected only to its neighboring nodes, as shown in the figure (omitted).
(C) Sparsely connected type
Each node in the network is connected to only a few nodes, which may be relatively far apart.
2. Network information flow types
According to the direction in which information is transmitted inside the neural network, networks can be divided into two types.
(1) Feed-forward network
The name "feedforward" comes from the direction of information processing: from the input layer, through the hidden layers, to the output layer, layer by layer. From the viewpoint of information processing capability, the nodes in the network fall into two kinds: input nodes, which only pass information from the outside to the first hidden layer, and processing nodes, which include the hidden-layer and output-layer nodes. The output of each layer in a feedforward network is the input of the next layer; information processing proceeds layer by layer in one direction, and in general there is no feedback loop, so networks of this kind are easily stacked to build multilayer feedforward networks. A multilayer feedforward network can be represented by a directed acyclic graph. The input layer is usually counted as the first layer of the network, the first hidden layer as the second layer, and so on. Thus a network with a single layer of computing neurons is a two-layer feedforward network (input layer and output layer), and a network with a single hidden layer is a three-layer feedforward network (input layer, hidden layer, and output layer).
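As a minimal sketch of the layer-by-layer information flow described above (NumPy, with the layer sizes, random weights, and sigmoid activation chosen purely for illustration; thresholds/biases are omitted):

```python
import numpy as np

def sigmoid(x):
    """Unipolar S-type (sigmoid) transformation function."""
    return 1.0 / (1.0 + np.exp(-x))

def feedforward(x, weights):
    """Propagate input x through a feedforward network layer by layer.

    weights is a list of weight matrices; each layer's output is the
    next layer's input, and there are no feedback loops.
    """
    a = x
    for W in weights:
        a = sigmoid(W @ a)          # integrate (weighted sum), then map f
    return a

# Example: a three-layer feedforward network (2 inputs, 3 hidden, 1 output)
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 2)), rng.normal(size=(1, 3))]
print(feedforward(np.array([0.5, -1.2]), weights))
```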
(2) Feedback type network
In a feedback network, all nodes have information processing capability, and every node can both receive input from and produce output to the outside world. A simple fully interconnected network is a typical feedback network, which can be represented by a complete undirected graph, as shown in the figure (omitted).
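A minimal sketch of how such a fully interconnected feedback network updates its node states (the symmetric random weights and sign-type activation are illustrative assumptions, in the spirit of the Hopfield network mentioned in the summary, not a reproduction of anything in the original post):

```python
import numpy as np

def feedback_step(s, W):
    """One synchronous update of a fully interconnected feedback network.

    Every node receives the weighted states of all other nodes and
    applies a threshold-type transformation (here: sign).
    """
    return np.sign(W @ s)

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 4))
W = (W + W.T) / 2               # symmetric weights
np.fill_diagonal(W, 0.0)        # no self-connections

s = np.array([1.0, -1.0, 1.0, -1.0])   # initial node states
for _ in range(5):
    s = feedback_step(s, W)
print(s)
```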
The topological structure of a neural network is the second major factor determining its characteristics, which can be summarized as distributed memory and distributed information processing, high interconnectivity, a high degree of parallelism, and structural plasticity.
III. Neural network learning
The functional characteristics of an artificial neural network are determined by its connection topology and by the strengths of its synaptic connections, that is, the connection weights. The complete set of connection weights can be expressed as a matrix W, which reflects the network's stored knowledge about the problem being solved. Through training on samples, a neural network can continuously change its connection weights (and even its topology) so that the network output approaches the desired output. This process is called neural network learning or training, and its essence is the dynamic adjustment of adjustable weights.
The learning method is the third key factor determining the information processing performance of a neural network, so the study of learning occupies an important position in neural network research. The rules for changing the weights are called learning rules or learning algorithms (also known as training rules or training algorithms). For a single processing unit, the algorithm is simple regardless of which learning rule is used; but when a large number of processing units adjust their weights collectively, the network exhibits "intelligent" characteristics, and meaningful information is stored in the adjusted weight matrix.
There are many neural network learning algorithms. According to a widely used classification, they fall into three classes: supervised learning, unsupervised learning, and "indoctrination" (fixed-weight) learning.
Supervised learning (learning with a teacher) is based on error-correction rules. During training, the network must be provided with an input pattern together with the correct output pattern, called the "teacher signal". The actual output of the network is compared with the expected output; whenever they disagree, the weights are adjusted according to the direction and magnitude of the error, so that the next output of the network is closer to the desired result. With supervised learning, the network must learn before it can perform its task: once the network produces the desired output for the various inputs, it has "learned", under the teacher's guidance, the knowledge and rules contained in the training data set, and can be put to work.
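A minimal sketch of the error-correction idea described above, using a single threshold-type neuron and a perceptron-style update (the learning rate, the AND-function training data, and the step activation are illustrative assumptions):

```python
import numpy as np

def step(x):
    """Threshold-type transformation function (unit step)."""
    return np.where(x >= 0, 1.0, 0.0)

def train_supervised(X, d, eta=0.1, epochs=20):
    """Error-correction (supervised) learning for one neuron.

    X: input patterns, one per row, with a leading -1 column so the
       threshold is absorbed into w0 as in the simplified M-P form.
    d: teacher signal (desired outputs).
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, target in zip(X, d):
            y = step(w @ x)                 # actual output
            w += eta * (target - y) * x     # adjust weights by the error
    return w

# Example: learn the logical AND function
X = np.array([[-1, 0, 0], [-1, 0, 1], [-1, 1, 0], [-1, 1, 1]], float)
d = np.array([0, 0, 0, 1], float)
w = train_supervised(X, d)
print(step(X @ w))    # should reproduce the teacher signal
```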
Unsupervised learning (learning without a teacher): during learning, dynamic input information must be continuously provided to the network, and the network, relying on its own internal structure and learning rules, discovers whatever patterns and regularities may exist in the stream of input information and adjusts its weights according to the network's function and the input information. This process is called the self-organization of the network, and its result is that the network automatically classifies patterns belonging to the same class. In this learning mode, the weight adjustment does not depend on an external teacher signal; the evaluation criterion of learning can be considered to be implicit inside the network. In supervised learning, the more external guidance information is provided to the network, the more knowledge it learns and the greater its problem-solving ability. However, sometimes little or no prior information is available about the problem the network is meant to solve; in such cases unsupervised learning is more practical.
"Indoctrination" learning means designing the network to memorize specific examples; later, when input information related to one of these examples is given, the example is recalled. The network weights are not formed by training but by a design procedure. Once designed, the weights are "indoctrinated" into the network in one step and no longer change, so the network's "learning" of the weights is "rote" rather than training-based. Networks that use supervised or unsupervised learning operate in two stages: a training stage and a working stage. The purpose of the training stage is to extract the hidden knowledge and regularities from the training data and store them in the network for use in the working stage.
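As a minimal sketch of unsupervised, self-organizing weight adjustment, here is a simple winner-take-all competitive update (one common rule among several; the initialization, learning rate, and data are illustrative assumptions):

```python
import numpy as np

def competitive_learning(X, n_units=2, eta=0.2, epochs=10):
    """Winner-take-all learning: only the winning unit moves its weight
    vector toward the current input, so the units self-organize into classes."""
    # Initialize each unit's weight vector on one of the training patterns
    W = X[:n_units].copy()
    for _ in range(epochs):
        for x in X:
            winner = np.argmax(W @ x)           # most similar unit wins
            W[winner] += eta * (x - W[winner])  # move winner toward input
    return W

# Two informal clusters of input patterns
X = np.array([[1.0, 0.1], [0.1, 1.0], [0.9, 0.0], [0.0, 0.9]])
W = competitive_learning(X)
print(np.argmax(W @ X.T, axis=0))   # unit assigned to each input pattern
```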
A neuron can be regarded as an adaptive unit whose weights are adjusted according to the input signals it receives, its output signal, and, where applicable, a corresponding supervisory (teacher) signal. The weight adjustment ΔWj(t) at time t is proportional to the product of the input vector X(t) and a learning signal r. The mathematical representation (diagram omitted) is given below.
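The formula image is not reproduced; the general learning-rule form that this description corresponds to in standard textbooks (with learning rate $\eta$ and a learning signal $r$ that may depend on the weights, the input, and the teacher signal $d_j$; notation assumed) is usually written as:

```latex
\Delta \boldsymbol{W}_j(t) = \eta \, r\big(\boldsymbol{W}_j(t), \boldsymbol{X}(t), d_j(t)\big)\, \boldsymbol{X}(t)

\boldsymbol{W}_j(t+1) = \boldsymbol{W}_j(t) + \Delta \boldsymbol{W}_j(t)
```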
Common learning rules are summarized in the following table:
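The original table image is unavailable. As a substitute, here is a typical summary of common learning rules in the standard textbook formulation (with net_j = W_j^T X, teacher signal d_j, and learning rate η; the selection and notation are assumptions, not a reproduction of the original table):
- Hebb rule (unsupervised): learning signal r = f(W_j^T X); a connection is strengthened when input and output are active together.
- Perceptron rule (supervised): r = d_j - sgn(W_j^T X); for threshold-type (binary) outputs.
- Delta rule (supervised): r = (d_j - f(net_j)) f'(net_j); requires a continuous, differentiable transformation function.
- Widrow-Hoff (LMS) rule (supervised): r = d_j - net_j; independent of the transformation function.
- Winner-take-all rule (unsupervised, competitive): only the winning neuron m is updated, ΔW_m = η(X - W_m).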
Summary
Because I did not learn neural networks in a systematic order, but rather chapter by chapter as needed, I was always anxious: I jumped straight to new network structures and new models without first mastering the most important basics, which made learning inefficient. It was only when I hit a bottleneck that I went back to the artificial neural network fundamentals chapter in Han Liqun's "Artificial Neural Network Tutorial", whose basic content is as described above. After reading that chapter, everything seemed to fall into place: looking back at the multilayer perceptron, the BP algorithm, the Hopfield network, simulated annealing, the Boltzmann machine, and self-organizing competitive neural networks, I felt a sense of order. So I wrote this post to summarize, and also in the hope of helping more people grasp the three most basic elements of neural networks: the neuron model, the neural network structure, and neural network learning. With these, one can truly reason by analogy, able both to dive into the details and to step back and see the problem as a whole.
************************
2015-8-13
Less art