Table of Contents: Part I: Source; Part II: Applications; Part III: Role (dimensionality reduction, dimensionality increase, cross-channel interaction, increased non-linearity); Part IV: Understanding the 1x1 convolution from the perspective of fully connected layers
I. Source: [1312.4400] Network in Network (following an ordinary convolution layer with 1x1 convolutions, each with its own activation function, implements the Network in Network structure).
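As a rough sketch of that structure (the channel numbers and kernel size here are illustrative, not the paper's exact configuration), an mlpconv-style block is simply an ordinary convolution followed by 1x1 convolutions, each with a non-linear activation:

```python
import torch.nn as nn

# Sketch of an mlpconv-style block: a normal convolution followed by
# 1x1 convolutions, each with its own non-linear activation.
mlpconv = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(96, 96, kernel_size=1), nn.ReLU(),  # cross-channel mixing, step 1
    nn.Conv2d(96, 96, kernel_size=1), nn.ReLU(),  # cross-channel mixing, step 2
)
```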
II. Applications: the Inception module in GoogLeNet and the residual module in ResNet
III. Role:
1. Dimensionality reduction (reducing parameters)
Example 1: the Inception 3a module in GoogLeNet
The input feature map is 28x28x192.
The 1x1 convolution branch has 64 output channels.
The 3x3 convolution branch has 128 output channels.
The 5x5 convolution branch has 32 output channels.
Left figure, convolution kernel parameters: 192x(1x1x64) + 192x(3x3x128) + 192x(5x5x32) = 387072
The right figure adds 1x1 convolution layers with 96 and 16 channels respectively before the 3x3 and 5x5 convolution layers, so the convolution kernel parameters become:
192x(1x1x64) + (192x1x1x96 + 96x3x3x128) + (192x1x1x16 + 16x5x5x32) = 157184
At the same time, adding a 1x1 convolution layer after the parallel pooling branch reduces the number of output feature maps (feature map size refers to W and H, over which the shared weights slide; feature map number is the number of channels).
Left figure, number of output feature maps: 64 + 128 + 32 + 192 (the pooling branch leaves the channel count unchanged) = 416 (if every module were like this, the network's output would keep growing)
Right figure, number of output feature maps: 64 + 128 + 32 + 32 (the pooling branch is followed by a 1x1 convolution with 32 channels) = 256
By using 1x1 convolutions for dimensionality reduction, GoogLeNet has a much more compact structure: although it has 22 layers, its parameter count is only about one twelfth that of the 8-layer AlexNet (a large part of the reason, of course, is the removal of the fully connected layers).
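As a sanity check on the arithmetic above, here is a minimal Python sketch that recomputes the kernel parameter counts for the two Inception 3a variants (weights only, biases ignored; the channel numbers are the ones listed above):

```python
def conv_params(in_ch, k, out_ch):
    """Weights in a k x k convolution from in_ch to out_ch channels (no bias)."""
    return in_ch * k * k * out_ch

# Left figure: every branch sees all 192 input channels directly.
naive = (conv_params(192, 1, 64)      # 1x1 branch
         + conv_params(192, 3, 128)   # 3x3 branch
         + conv_params(192, 5, 32))   # 5x5 branch

# Right figure: 1x1 reductions to 96 and 16 channels before the 3x3 and 5x5.
reduced = (conv_params(192, 1, 64)
           + conv_params(192, 1, 96) + conv_params(96, 3, 128)
           + conv_params(192, 1, 16) + conv_params(16, 5, 32))

print(naive, reduced)  # 387072 157184
```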
Example 2: the residual module in ResNet
Suppose the feature map from the previous layer is w*h*256, and the final output again has 256 feature maps.
Left figure, number of multiplications: w*h*256*3*3*256 = 589824*w*h
Right figure, number of multiplications: w*h*256*1*1*64 + w*h*64*3*3*64 + w*h*64*1*1*256 = 69632*w*h; the left side costs about 8.5 times as much as the right. (The 1x1 convolutions achieve dimensionality reduction and cut the parameter count by the same factor.)
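A minimal PyTorch sketch of the two variants (bias-free convolutions; the ReLU and batch-norm layers of the real block are omitted), comparing their parameter counts:

```python
import torch.nn as nn

# Left figure: a plain 3x3 convolution, 256 -> 256 channels.
plain = nn.Conv2d(256, 256, kernel_size=3, padding=1, bias=False)

# Right figure: bottleneck -- 1x1 reduce to 64, 3x3 at 64, 1x1 expand back to 256.
bottleneck = nn.Sequential(
    nn.Conv2d(256, 64, kernel_size=1, bias=False),
    nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False),
    nn.Conv2d(64, 256, kernel_size=1, bias=False),
)

n_plain = sum(p.numel() for p in plain.parameters())       # 589824
n_bneck = sum(p.numel() for p in bottleneck.parameters())  # 69632
print(n_plain / n_bneck)                                   # ~8.47
```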
2. Dimensionality increase (widening the network channels with the fewest parameters)
Example: in the bottleneck above there is a 1*1 convolution not only at the input but also at the output. The 3*3 convolution has 64 output channels; simply adding a 1*1 convolution with 256 filters needs only 64*256 = 16384 parameters to widen the network from 64 channels to four times that, 256 channels.
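A small sketch of just that expansion layer (bias omitted), showing its weight count:

```python
import torch.nn as nn

expand = nn.Conv2d(64, 256, kernel_size=1, bias=False)  # widen 64 -> 256 channels
print(expand.weight.shape)    # torch.Size([256, 64, 1, 1])
print(expand.weight.numel())  # 16384 = 64 * 256
```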
3. Cross-channel information interaction (channel transform)
Example: with a 1*1 convolution kernel, the dimension-reducing and dimension-increasing operations are really linear combinations of information across channels. Appending a 1*1 convolution with 28 channels to a 3*3 convolution with 64 channels is equivalent to a 3*3 convolution with 28 channels: the original 64 channels are linearly recombined across channels into 28 channels, and this is the information interaction between channels.
Note: the linear combination happens only along the channel dimension; W and H are still covered by the shared-weight sliding window.
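A minimal PyTorch sketch verifying that a 1*1 convolution really is just a per-pixel linear combination over channels, i.e. a 28x64 matrix applied at every spatial position (tensor sizes are illustrative):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 28, 28)                    # N, C, H, W
mix = nn.Conv2d(64, 28, kernel_size=1, bias=False)

out_conv = mix(x)

# The same thing done explicitly: multiply the 64-channel vector at every
# pixel by the (28, 64) weight matrix.
w = mix.weight.view(28, 64)                       # drop the 1x1 spatial dims
out_matmul = torch.einsum('oc,nchw->nohw', w, x)

print(torch.allclose(out_conv, out_matmul, atol=1e-5))  # True
```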
4. Increased non-linearity
A 1*1 convolution, together with the non-linear activation function that follows it, adds non-linearity while keeping the feature map scale unchanged (i.e. no loss of resolution), allowing the network to be made deeper.
Note: each filter produces one feature map after convolution; different filters (different weights and biases) produce different feature maps after convolution, extract different features, and correspond to different specialized neurons.
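A tiny sketch of this pattern: stacking 1*1 convolutions with ReLU deepens the network and adds non-linearity without changing the spatial resolution (the channel numbers here are illustrative):

```python
import torch
import torch.nn as nn

deepen = nn.Sequential(
    nn.Conv2d(192, 96, kernel_size=1), nn.ReLU(),
    nn.Conv2d(96, 96, kernel_size=1), nn.ReLU(),
)

x = torch.randn(1, 192, 28, 28)
print(deepen(x).shape)  # torch.Size([1, 96, 28, 28]) -- H and W unchanged
```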
IV. Understanding the 1*1 convolution kernel from the perspective of fully connected layers
It can be viewed as a fully connected layer.
The 6 neurons on the left are a1-a6; after the fully connected layer they become the 5 neurons b1-b5 on the right.
The 6 neurons on the left correspond to an input feature map with channels = 6.
The 5 neurons on the right correspond to the new features produced by a 1*1 convolution with channels = 5.
A w*h*6 input on the left can thus be fully connected, position by position, via 1*1 convolution kernels with 5 output channels.
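This equivalence can be checked directly: a 1*1 convolution from 6 to 5 channels and a fully connected layer with the same weights give identical outputs at every pixel (a sketch with bias disabled and an arbitrary 4x4 spatial size):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(6, 5, kernel_size=1, bias=False)
fc = nn.Linear(6, 5, bias=False)
fc.weight.data = conv.weight.data.view(5, 6)        # share the same weights

x = torch.randn(1, 6, 4, 4)                          # a w*h*6 input (here 4x4x6)

out_conv = conv(x)                                   # (1, 5, 4, 4)
# Apply the fully connected layer independently at each spatial position.
out_fc = fc(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)

print(torch.allclose(out_conv, out_fc, atol=1e-5))   # True
```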
"In Convolutional Nets, there is no such thing as 'fully-connected layers'. There are only convolution layers with 1x1 convolution kernels and a full connection table." -- Yann LeCun
References: One by One [1 x 1] Convolution - counter-intuitively useful / What is the effect of the 1x1 convolution kernel
Understanding of the 1*1 convolution kernel / How to understand the 1*1 convolution in convolutional neural networks