Deep Learning (DL) and Convolutional Neural Network (CNN) Learning Notes 01: CNN Basics

Tags: theano

Day 1 of CNN basics. From: Convolutional Neural Networks (LeNet)

  1. The neocognitron.
    The origins of CNN's inspiration have been covered thoroughly in many papers: they trace back to the biologists who discovered the receptive field of cells in the visual cortex. Based on this concept, the neocognitron was proposed. Its main idea is to perceive part of the image (local features) and then, through hierarchically cascaded connections, combine the local features into features of the whole image. Papers worth reading carefully include:
    (1) The first paper on receptive fields: Receptive fields and functional architecture of monkey striate cortex, 1968
    (2) The neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, 1980
    (3) HMAX: Robust object recognition with cortex-like mechanisms, 2007
    (4) Very important, LeNet-5: Gradient-based learning applied to document recognition, 1998
  2. Sparse connectivity.
    Adjacent layers in a CNN are not fully connected; each unit connects only to a local region of the previous layer. This significantly reduces the number of parameters, as the sketch after Figure 1 illustrates.

    Figure 1 Inter-layer connections
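
    A rough illustration of the parameter savings (a minimal sketch; the image and patch sizes are made-up numbers, not from the original post):

      # Compare parameter counts: fully connected vs. locally connected.
      # All sizes below are illustrative only.
      image_pixels = 32 * 32          # input image size
      hidden_units = 30 * 30          # one unit per 3x3 patch location
      patch_size = 3 * 3              # local receptive field

      fully_connected = image_pixels * hidden_units   # every unit sees every pixel
      locally_connected = hidden_units * patch_size   # every unit sees one 3x3 patch

      print(fully_connected)    # 921600
      print(locally_connected)  # 8100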
  3. Weight sharing.
    In Figure 2, each small circle in the first layer corresponds to a neuron applying the same filter: all of these neurons share the same weights and bias. Each neuron convolves the filter with its patch of the input, and together they produce a feature map of layer m (each layer can contain more than one feature map; the feature map here contains 3 hidden units). (The above is my personal understanding; if there are errors, corrections are welcome. I am a beginner, so bear with me.) Gradient descent can still be used to train the shared parameters, with only a small change to the original algorithm: the gradient of a shared weight is simply the sum of the gradients of that weight over every position where it is used, as the sketch after Figure 2 shows.

    Figure 2 Weight Sharing graph
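
    A minimal NumPy sketch of weight sharing (my own illustration, not from the post): one 1-D filter is reused at every position, and the gradient of each shared weight is the sum of its per-position gradients.

      import numpy as np

      x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])      # toy 1-D input
      w = np.array([0.5, -0.5])                     # one shared 2-tap filter

      # Forward pass: the same w is reused at every position (weight sharing).
      h = np.array([w @ x[i:i + 2] for i in range(len(x) - 1)])

      # Backward pass: with upstream gradient g = dL/dh, the gradient of the
      # shared weight is the sum of its gradients over all positions.
      g = np.ones_like(h)                           # dummy upstream gradient
      dw = sum(g[i] * x[i:i + 2] for i in range(len(h)))
      print(h)   # [-0.5 -0.5 -0.5 -0.5]
      print(dw)  # [10. 14.]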
  4. Notation.
    Denote the k-th feature map of a given layer by h^k.
    h^k is computed as h^k_{ij} = tanh((W^k ∗ x)_{ij} + b_k).
    For a richer representation of the data, each hidden layer contains multiple feature maps. Consider the following example:

    Figure 3 An example where layer m−1 contains 4 feature maps
    The CNN in Figure 3 contains two layers: layer m−1 consists of 4 feature maps, and layer m contains 2 feature maps, h^0 and h^1. Each pixel value in h^0 and h^1 is computed from 2×2 patches of every feature map of the previous layer. The weight tensors W^0 and W^1 of h^0 and h^1 are 3-dimensional: the first dimension indexes the feature map of the previous layer, and the last two dimensions are coordinates within the patch. Putting it together, W^{kl}_{ij} denotes the weight connecting each pixel of the k-th feature map of layer m to the pixel at coordinates (i, j) of the l-th feature map of layer m−1. A concrete indexing sketch follows.
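
    To make the indexing concrete, here is a minimal NumPy sketch of h^k_{ij} = tanh((W^k ∗ x)_{ij} + b_k) with explicit loops (my own illustration; the shapes are made up to match Figure 3):

      import numpy as np

      rng = np.random.default_rng(0)
      L, H, W_in = 4, 6, 6          # layer m-1: 4 feature maps of size 6x6
      K, P = 2, 2                   # layer m: 2 feature maps, 2x2 patches
      x = rng.standard_normal((L, H, W_in))
      W = rng.standard_normal((K, L, P, P))   # W[k, l, i, j] as in the text
      b = rng.standard_normal(K)

      h = np.zeros((K, H - P + 1, W_in - P + 1))
      for k in range(K):
          for i in range(h.shape[1]):
              for j in range(h.shape[2]):
                  # written as cross-correlation (no kernel flip) for simplicity
                  h[k, i, j] = np.tanh(np.sum(W[k] * x[:, i:i + P, j:j + P]) + b[k])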
  5. Convolution operation.
    Convolution is implemented with the well-known Python library Theano (its ConvOp). The original tutorial page explains its usage in detail; a minimal sketch is given below.
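
    A minimal Theano sketch along the lines of the cited tutorial (hedged: module paths and entry points vary across Theano versions; older releases expose conv2d via theano.tensor.nnet.conv):

      import numpy as np
      import theano
      import theano.tensor as T
      from theano.tensor.nnet import conv2d

      rng = np.random.RandomState(0)
      x = T.tensor4('x')                       # (batch, channels, height, width)
      w_shp = (2, 3, 9, 9)                     # 2 filters over 3 channels, 9x9 each
      W = theano.shared(rng.randn(*w_shp).astype(theano.config.floatX), name='W')
      b = theano.shared(np.zeros(2, dtype=theano.config.floatX), name='b')

      conv_out = conv2d(x, W)                  # the convolution itself
      out = T.tanh(conv_out + b.dimshuffle('x', 0, 'x', 'x'))
      f = theano.function([x], out)            # compiled feed-forward function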
  6. Max pooling.
    Another very important concept in CNN is max pooling, a form of nonlinear down-sampling (I personally understand its function as similar to dimensionality reduction). Max pooling partitions the input image into non-overlapping blocks and outputs the maximum value of each block. Max pooling is very effective in vision problems for two reasons:
    (1) It reduces the computation of upper layers by eliminating non-maximal values.
    (2) It provides a form of translation invariance. Cascading max pooling after a convolution layer, the input image can be translated by a single pixel in 8 directions. With a 2×2 pooling window, 3 of these 8 translations produce exactly the same pooled output; with a 3×3 window, 5 of the 8 do.
    The tutorial's Theano usage example is not reproduced here, but a minimal sketch follows.
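
    A minimal Theano pooling sketch (hedged: in older Theano versions this lived in theano.tensor.signal.downsample as max_pool_2d, with the window passed as ds rather than ws):

      import theano
      import theano.tensor as T
      from theano.tensor.signal.pool import pool_2d

      x = T.tensor4('x')
      # 2x2 non-overlapping max pooling; ignore_border discards leftover rows/cols
      pooled = pool_2d(x, ws=(2, 2), ignore_border=True, mode='max')
      f = theano.function([x], pooled)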
  7. This completes the basic knowledge of CNN. The next section studies a complete CNN model: LeNet-5.

    Resources
    (1) http://deeplearning.net/tutorial/lenet.html
    (2) Deep Learning study notes series, part 7
    (3) Deep Learning and the official Theano tutorial (Chinese translation), part 4: Convolutional Neural Networks (CNN)

Copyright notice: This is the blogger's original article; my level is limited, and criticism and corrections are welcome. Reprinting is welcome, but please indicate the source. QQ: 371073260
