How to understand weight sharing in convolutional neural networks


The term "weight sharing" was first introduced with the LeNet-5 model: in 1998, LeCun published the LeNet network architecture.

Although the 2012 AlexNet network is now usually cited as the beginning of deep learning, the origin of CNNs can be traced back to LeNet-5, and its design ideas were widely used in convolutional neural network research in the early 2010s. One of those ideas is weight sharing.

In fact, "sharing" means that the whole image is processed with the same convolution kernel parameters. Take a 3*3*1 kernel, for example: its 9 parameters are shared by the entire image, and the kernel's weight coefficients do not change according to position within the image. Put more plainly, a single kernel with fixed internal weights is convolved over the whole picture (of course, each layer of a CNN has more than one kernel; a single one is used here only to simplify the explanation).
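As a minimal sketch of what that means (plain NumPy, written for clarity rather than speed; the function name and sizes are invented for illustration), the loop below slides one 3*3 kernel over an image, using the same 9 weights at every position:

```python
import numpy as np

def conv2d_single_kernel(image, kernel):
    """Slide one fixed kernel over the whole image (stride 1, no padding).

    The same kernel weights are reused at every position -- this reuse
    is exactly what "weight sharing" means.
    """
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # The identical 9 weights are applied at (i, j) as everywhere else.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(28, 28)   # a toy single-channel image
kernel = np.random.rand(3, 3)    # 9 shared parameters in total
print(conv2d_single_kernel(image, kernel).shape)  # (26, 26)
```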

Yes, it really is that simple an operation. In fact, classical image-processing operations such as edge detection and filtering already share one kernel globally in exactly this way. So why take this idea, explain it, and give it a name?
(The following section is my personal understanding; if anything is wrong, corrections are welcome.)
Most of us find the problem obvious in hindsight, but only a pioneer could see it first. LeNet was the first to bring the idea of convolution into a neural network model, and that was pioneering work. Before it, the input to a neural network was a set of hand-extracted features: to predict house prices, for example, we would pick the floor area, the number of bedrooms, and so on as features. Once convolution kernels were brought into neural networks to process images, a natural question arose: what is the input to the network? If it is the pixel values themselves, then every pixel gets its own weight coefficient, which raises two problems:
1. Each layer would have an enormous number of parameters.
2. Raw pixel values used as input features are no different from the features fed to a traditional neural network, and they fail to exploit the local correlation in image space.

The weight-sharing convolution operation solves both problems. Regardless of the image size, you can choose a fixed-size kernel: the largest kernel in LeNet is only 5*5*1, and even in AlexNet the largest is just 11*11*3. Convolution ensures that every pixel is still covered by weight coefficients, but those coefficients are shared by the entire image, which drastically reduces the number of parameters in the convolutional layers. In addition, convolution exploits the local correlation in image space, and features are extracted automatically; this is one of the biggest differences between CNNs and traditional neural networks or classical machine learning.
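To make the reduction concrete, here is a back-of-the-envelope comparison (the layer sizes are made up purely for illustration):

```python
# Hypothetical sizes, chosen only to show the scale of the difference.
h, w = 32, 32            # input image, single channel
hidden = 100             # units in a fully connected hidden layer
n_kernels, k = 100, 5    # conv layer: 100 kernels of size 5x5

fc_params = h * w * hidden        # every pixel gets its own weight per unit
conv_params = n_kernels * k * k   # each kernel's weights shared over the image

print(fc_params)    # 102400
print(conv_params)  # 2500
```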
This is also why convolutional layers tend to have multiple kernels (even dozens or hundreds): because of weight sharing, each kernel can extract only a single kind of feature, so multiple kernels are used to increase the expressive power of the CNN. How many, unfortunately, is a hyperparameter.
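For instance, in a framework such as PyTorch (assuming it is installed; the channel and kernel sizes below are arbitrary), the number of kernels is declared directly as a layer argument:

```python
import torch.nn as nn

# 64 kernels, each 3x3 over 3 input channels: 64 different features,
# yet each kernel's weights are shared across the whole image.
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3)

n_params = sum(p.numel() for p in conv.parameters())
print(n_params)  # 64 * (3*3*3) weights + 64 biases = 1792
```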
