Overview and origins
The convolutional neural network was originally inspired by the visual neural mechanism. It is a multilayer perceptron designed to recognize two-dimensional shapes, and it is highly invariant to translation, scaling, skewing, and other common forms of deformation.
In 1962, Hubel and Wiesel introduced the concept of the receptive field through their study of cells in the cat visual cortex. In 1980, the Japanese scholar Fukushima proposed the neocognitron, a model based on the receptive-field concept. It can be regarded as the first implementation of a convolutional neural network, and it was the first application of the receptive-field idea in artificial neural networks.
The neocognitron decomposes a visual pattern into many sub-patterns (features), which are then processed in hierarchically connected feature planes. It attempts to model the visual system so that an object can still be recognized even when it is displaced or slightly deformed. The neocognitron can learn shift-invariant feature detectors from the stimulus patterns and can recognize deformed versions of those patterns. In subsequent work, Fukushima applied the neocognitron mainly to handwritten digit recognition. Since then, researchers have proposed many variants of the convolutional neural network, which have been widely applied to ZIP code recognition (Y. LeCun et al.), license plate recognition, and face recognition.
Characteristics
A convolutional neural network is a special deep neural network model. Its particularity lies in two aspects: first, the connections between its neurons are not fully connected; second, some neurons in the same layer share the same connection weights. This sparsely connected, weight-sharing structure makes the network more similar to a biological neural network, reduces the complexity of the model (important for deep architectures, which are otherwise hard to train), and reduces the number of weights.
Local connections
Think back to the BP (backpropagation) neural network. In a BP network, the nodes of each layer form a one-dimensional arrangement, and each layer is fully connected to the next. Now suppose the connections between adjacent layers are no longer full but only local: this is the simplest one-dimensional convolutional network. Extending this idea to two dimensions gives the convolutional neural network seen in most references. Compare the two cases:
Left: a fully connected network. If the image is 1000x1000 pixels and there are 1,000,000 hidden-layer neurons, each connected to every pixel of the image, there are 1000x1000x1,000,000 = 10^12 connections, i.e. 10^12 weight parameters.
Right: a locally connected network, where each hidden neuron is connected only to a 10x10 window of the input at the corresponding position. Then 1,000,000 hidden neurons need only 1,000,000 x 100 = 10^8 parameters, and the number of weight connections is reduced by four orders of magnitude.
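The arithmetic above can be checked with a short sketch (plain Python, using the figures from the example):

```python
# Parameter counts for the fully connected vs. locally connected
# example above: a 1000x1000 image and 1,000,000 hidden neurons.
image_pixels = 1000 * 1000       # 10^6 input pixels
hidden_neurons = 1_000_000       # 10^6 hidden-layer neurons

# Fully connected: every hidden neuron sees every pixel.
fully_connected = image_pixels * hidden_neurons   # 10^12 weights

# Locally connected: every hidden neuron sees only a 10x10 window.
locally_connected = hidden_neurons * (10 * 10)    # 10^8 weights

print(fully_connected)                        # 1000000000000
print(locally_connected)                      # 100000000
print(fully_connected // locally_connected)   # 10000, i.e. 4 orders of magnitude
```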
Following the forward pass of a BP network, we can easily compute the output of a network node. For example, the net input of the node marked in red equals the sum of the products of the values of the previous-layer neurons connected to it and the weights on the corresponding red connections. Many books call this computation a convolution.
In fact, in digital filtering the filter coefficients are usually symmetric; when they are not, a true convolution requires flipping the kernel before the multiply-accumulate. Do the weights of the network above satisfy this symmetry? In general, no. So calling the operation above a convolution is, strictly speaking, inaccurate; it is really a cross-correlation. This is only a matter of naming, but it can cause some confusion for readers with a signal-processing background when they first encounter convolutional neural networks.
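A minimal NumPy sketch of this distinction, using an arbitrary asymmetric kernel chosen for illustration: `np.convolve` performs true convolution (kernel flipped), while convolving with the reversed kernel is equivalent to cross-correlation. With an asymmetric kernel the two results differ:

```python
import numpy as np

# True convolution flips the kernel before the multiply-accumulate;
# the "convolution" in CNNs is cross-correlation (no flip). The two
# coincide only when the kernel is symmetric.
signal = np.array([1.0, 2.0, 3.0, 4.0])
kernel = np.array([1.0, 0.0, -1.0])   # asymmetric kernel

true_conv = np.convolve(signal, kernel, mode="valid")        # kernel flipped
cross_corr = np.convolve(signal, kernel[::-1], mode="valid") # no effective flip

print(true_conv)    # [2. 2.]
print(cross_corr)   # [-2. -2.]
```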
Weight sharing
Another feature of convolutional neural networks is weight sharing. For example, in the network on the right-hand side of the figure, weight sharing means that all connections drawn with the same red label use the same weight value. Beginners often misunderstand this point.
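As an illustrative sketch (not the exact network in the figure), the following NumPy code slides one shared kernel over an image. Every output position reuses the same small set of weights, which is what weight sharing means in a feature map:

```python
import numpy as np

def conv2d_shared(image, kernel):
    """Cross-correlate `image` with one shared `kernel` (no flip),
    the way a CNN feature map reuses the same weights everywhere."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Every output position uses the *same* kernel weights.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16.0).reshape(4, 4)
kernel = np.ones((2, 2)) / 4.0        # one shared 2x2 averaging kernel
feature_map = conv2d_shared(image, kernel)
print(feature_map.shape)   # (3, 3): one output per window position
print(kernel.size)         # 4 shared parameters in total
```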
What has been described so far is only a single-layer structure. Yann LeCun and colleagues at the former AT&T Shannon Lab built LeNet-5, a character recognition system based on convolutional neural networks. In the 1990s the system was used by banks to recognize handwritten digits.
The structure of CNN
As just mentioned, convolutional networks exhibit local invariance: a high degree of invariance to translation, scaling, tilting, and other common forms of deformation. These properties are learned by the network in a supervised manner.
This network structure is a multilayer perceptron specially designed for recognizing two-dimensional shapes. Its two main features, sparse connectivity and weight sharing, impose the following forms of constraint:
1. Feature extraction. Each neuron receives synaptic inputs from a local receptive field in the previous layer, forcing it to extract local features. Once a feature is extracted, its exact position becomes less important, as long as its position relative to other features is approximately preserved.
2. Feature mapping. Each computational layer of the network is composed of multiple feature maps, each in the form of a plane. The individual neurons in a plane are constrained to share the same set of synaptic weights, which has two beneficial effects: a. translation invariance; b. a reduction in the number of free parameters (achieved through weight sharing).
3. Sub-sampling. Each convolutional layer is followed by a computational layer that performs local averaging and sub-sampling, reducing the resolution of the feature map. This operation reduces the sensitivity of the feature map's output to translation and other forms of deformation.
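The local averaging and sub-sampling step above can be sketched as follows; this minimal NumPy example assumes a non-overlapping 2x2 window, which is a common but not the only choice:

```python
import numpy as np

def subsample2x2(feature_map):
    """Local averaging over non-overlapping 2x2 blocks, halving the
    resolution of the feature map (the sub-sampling layer above)."""
    H, W = feature_map.shape
    # Split into 2x2 blocks, then average within each block.
    return feature_map.reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))

fm = np.array([[1.0, 1.0, 2.0, 2.0],
               [1.0, 1.0, 2.0, 2.0],
               [3.0, 3.0, 4.0, 4.0],
               [3.0, 3.0, 4.0, 4.0]])
print(subsample2x2(fm))
# [[1. 2.]
#  [3. 4.]]
```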
A convolutional neural network is thus a multilayer neural network in which each layer is composed of several two-dimensional planes, and each plane consists of several independent neurons.