After some thought, I don't believe that the pooling operation is responsible for the translation-invariance property in CNNs. I believe that invariance (at least to translation) is due to the convolution filters (not specifically the pooling) and due to the fully-connected layer.
For instance, let's use Fig. 1 as a reference:
The blue volume represents the input image, while the green and yellow volumes represent the activation volumes output by layer 1 and layer 2 (see CS231n Convolutional Neural Networks for Visual Recognition if you are not familiar with these volumes). At the end, there is a fully-connected layer that is connected to all activation points of the yellow volume.
These volumes are built using a convolution plus a pooling operation. The pooling operation reduces the height and width of these volumes, while the increasing number of filters in each layer increases the volume depth.
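The shape bookkeeping above can be sketched with a toy NumPy implementation. The sizes here are made up for illustration (a 32×32 RGB input, 8 filters of 3×3, and 2×2 pooling); the point is only that convolution plus pooling shrinks height and width while the filter count sets the depth:

```python
import numpy as np

def conv2d(image, filters):
    """Valid 2-D cross-correlation of an (H, W, C_in) image with
    (n_filters, k, k, C_in) filters -> (H-k+1, W-k+1, n_filters)."""
    h, w, _ = image.shape
    n, k, _, _ = filters.shape
    out = np.zeros((h - k + 1, w - k + 1, n))
    for f in range(n):
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                out[i, j, f] = np.sum(image[i:i + k, j:j + k, :] * filters[f])
    return out

def max_pool(volume, size=2):
    """Non-overlapping max pooling over the spatial dimensions."""
    h, w, c = volume.shape
    return (volume[:h - h % size, :w - w % size, :]
            .reshape(h // size, size, w // size, size, c)
            .max(axis=(1, 3)))

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))                  # the "blue" input volume
layer1 = max_pool(conv2d(image, rng.standard_normal((8, 3, 3, 3))))
print(layer1.shape)  # (15, 15, 8): smaller height/width, greater depth
```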
For the sake of the argument, let's suppose that we have very "ludic" filters, as shown in Fig. 2:
- The first layer filters (which would generate the green volume) detect eyes, noses and other basic shapes (in real CNNs, first-layer filters would match lines and very basic textures);
- The second layer filters (which would generate the yellow volume) detect faces, legs and other objects that are aggregations of the first-layer filters. Again, this is just an example: real-life convolution filters may detect objects that have no meaning to humans.
Now suppose that there is a face at one of the corners of the image (represented by a red and a magenta point). The eyes are detected by the first filter, and therefore produce activations in the first slice of the green volume. The same happens for the nose, except that it is detected by the second filter and appears in the second slice. Next, the face filter finds that there are eyes and a nose next to each other, and it generates an activation in the yellow volume (within the same region as the face in the input image). Finally, the fully-connected layer detects that there is a face (and maybe a leg and an arm detected by other filters) and outputs that it has detected a human body.
Now suppose that the face has moved to another corner of the image, as shown in Fig. 3:
The same number of activations occurs in this example; however, they occur in a different region of the green and yellow volumes. Therefore, any activation in the first slice of the yellow volume means that a face is detected, independently of the face's location. The fully-connected layer is then responsible for "translating" a face detection into a human-body detection. In both examples, an activation is received at one of the fully-connected neurons. However, in each example, the activation path inside the FC layer is different, meaning that correct learning at the FC layer is essential to ensure the invariance property.
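The claim that the same activations appear, just in a different region, is the translation *equivariance* of convolution, and it is easy to check numerically. Below, a random 3×3 kernel stands in for the hypothetical "face" filter and a random 5×5 patch stands in for the face; both are made up for the sketch. Placing the patch at two different corners produces the same activation pattern, shifted by the same offset:

```python
import numpy as np

def conv2d_single(image, kernel):
    """Valid 2-D cross-correlation of a 2-D image with a k x k kernel."""
    h, w = image.shape
    k = kernel.shape[0]
    out = np.zeros((h - k + 1, w - k + 1))
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            out[i, j] = np.sum(image[i:i + k, j:j + k] * kernel)
    return out

rng = np.random.default_rng(1)
kernel = rng.standard_normal((3, 3))   # stand-in for the "face" filter
face = rng.standard_normal((5, 5))     # stand-in for the face patch

top_left = np.zeros((12, 12))
top_left[0:5, 0:5] = face              # face at one corner
bottom_right = np.zeros((12, 12))
bottom_right[7:12, 7:12] = face        # same face at the opposite corner

a = conv2d_single(top_left, kernel)
b = conv2d_single(bottom_right, kernel)
# Where the kernel fits fully inside the face, the responses are
# identical, just shifted by (7, 7) along with the face itself:
print(np.allclose(a[0:3, 0:3], b[7:10, 7:10]))  # True
```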
It must be noticed that the pooling operation only "compresses" the activation volumes; if there were no pooling in this example, an activation in the first slice of the yellow volume would still mean a face.
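To make the "compression" point concrete, here is a minimal sketch (sizes are again made up). Max pooling shrinks an 8×8 activation map to 4×4, and as a side effect it also absorbs shifts smaller than the pooling window, since a single activation moved by one pixel can land in the same pooled cell:

```python
import numpy as np

def max_pool2d(volume, size=2):
    """Non-overlapping 2-D max pooling of a 2-D activation map."""
    h, w = volume.shape
    return (volume[:h - h % size, :w - w % size]
            .reshape(h // size, size, w // size, size)
            .max(axis=(1, 3)))

act = np.zeros((8, 8))
act[2, 2] = 1.0            # an activation: "face detected here"
shifted = np.zeros((8, 8))
shifted[3, 3] = 1.0        # same activation, shifted by one pixel

print(max_pool2d(act).shape)                               # (4, 4)
print(np.array_equal(max_pool2d(act), max_pool2d(shifted)))  # True
```

So pooling contributes some local shift tolerance, but the activation-means-face argument above holds with or without it.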
In conclusion, what makes a CNN invariant to object translation is the architecture of the neural network: the convolution filters and the fully-connected layer. Additionally, I believe that if a CNN is trained showing faces only at one corner, then during the learning process the fully-connected layer may become insensitive to faces at the other corners.
Source
https://www.quora.com/How-is-a-convolutional-neural-network-able-to-learn-invariant-features/answer/Jean-Da-Rolt