CS231n Spring Lecture 9 Lecture Notes

Refer to the "Deeplearning.ai Convolutional Neural Networks Week 2 Lecture Notes".

1. AlexNet (Krizhevsky et al., 2012), an 8-layer network.

Learn to calculate the output shape of each layer. For a convolution layer, the output edge length = (input edge length - filter edge length) / stride + 1, and the number of output channels equals the number of filters. The number of channels in each filter equals the number of input channels. The number of parameters in a convolution layer = filter height * filter width * number of input channels * number of filters (plus one bias per filter). A pooling layer has no learnable parameters.
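As a sanity check, here is a small Python sketch of that arithmetic (padding is included as an optional argument, which the formula above omits), applied to AlexNet's first convolution layer: a 227x227x3 input, 96 filters of size 11x11, stride 4.

    def conv_output_size(input_size, filter_size, stride=1, padding=0):
        # Output edge length = (input - filter + 2*padding) / stride + 1
        return (input_size - filter_size + 2 * padding) // stride + 1

    def conv_params(filter_size, in_channels, num_filters, bias=True):
        # Parameters = filter_h * filter_w * input_channels * num_filters (+ biases)
        params = filter_size * filter_size * in_channels * num_filters
        if bias:
            params += num_filters
        return params

    # AlexNet CONV1: 227x227x3 input, 96 filters of 11x11, stride 4, no padding
    print(conv_output_size(227, 11, stride=4))   # -> 55, so the output is 55x55x96
    print(conv_params(11, 3, 96))                # -> 34,944 learnable parameters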

In the architecture diagram, the network is split into two streams that are processed on separate GPUs (AlexNet was trained across two GPUs because of memory limits).

ZFNet (2013) kept the AlexNet architecture (also an 8-layer network) but tuned its hyperparameters, improving the error rate from 16.4% to 11.7%.

2. VGGNet (Simonyan and Zisserman, 2014), a 16- to 19-layer network.

Stacking three 3*3 convolution layers gives the same effective receptive field as a single 7*7 filter. The advantage of the smaller filters is that the network becomes deeper, gaining more nonlinearity while using fewer parameters.
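A quick back-of-the-envelope check of the parameter claim, assuming C input and C output channels at every layer and ignoring biases:

    C = 256
    three_3x3 = 3 * (3 * 3 * C * C)   # three stacked 3x3 conv layers
    one_7x7 = 7 * 7 * C * C           # a single 7x7 conv layer
    print(three_3x3, one_7x7)         # 1,769,472 vs 3,211,264 -> roughly 45% fewer parameters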

3. GoogLeNet (Szegedy et al., 2014)

The Inception module applies several filters in parallel (1*1, 3*3, 5*5, and pooling) and concatenates the results along the depth dimension. The drawback is that the computational cost grows quickly. The solution is to first compress the number of channels with a 1*1 convolution (refer to the "Deeplearning.ai Convolutional Neural Networks Week 2 Lecture Notes").
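Below is a minimal PyTorch-style sketch of such a module; the branch channel counts are illustrative, not the published GoogLeNet values.

    import torch
    import torch.nn as nn

    class InceptionSketch(nn.Module):
        # Parallel 1x1, 3x3, 5x5 and pooling branches whose outputs are
        # concatenated along the channel axis. The 1x1 convolutions placed
        # before the 3x3/5x5 branches compress the channel count to keep
        # the computation manageable.
        def __init__(self, in_ch):
            super().__init__()
            self.branch1 = nn.Conv2d(in_ch, 64, kernel_size=1)
            self.branch3 = nn.Sequential(
                nn.Conv2d(in_ch, 32, kernel_size=1),              # bottleneck
                nn.Conv2d(32, 64, kernel_size=3, padding=1))
            self.branch5 = nn.Sequential(
                nn.Conv2d(in_ch, 16, kernel_size=1),              # bottleneck
                nn.Conv2d(16, 32, kernel_size=5, padding=2))
            self.branch_pool = nn.Sequential(
                nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
                nn.Conv2d(in_ch, 32, kernel_size=1))

        def forward(self, x):
            outs = [self.branch1(x), self.branch3(x),
                    self.branch5(x), self.branch_pool(x)]
            return torch.cat(outs, dim=1)   # stack the results depth-wise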

4. ResNet (He et al., 2015), a 152-layer network.

ResNet addresses the difficulty of optimizing very deep networks: each block learns a residual F(x) and adds it to an identity skip connection, so the block outputs F(x) + x.

For the deeper variants (ResNet-50 and above), a 1*1 convolution layer compresses the number of channels inside each block, similar to GoogLeNet, to improve efficiency.
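A minimal PyTorch-style sketch of such a bottleneck residual block, assuming matching input and output channel counts; batch normalization and shortcut downsampling are omitted for brevity.

    import torch.nn as nn

    class BottleneckSketch(nn.Module):
        # ResNet-style bottleneck block (the variant used in ResNet-50 and
        # deeper): 1x1 conv to shrink channels, 3x3 conv, 1x1 conv to restore
        # them, then an identity skip connection added to the output.
        def __init__(self, channels, bottleneck=64):
            super().__init__()
            self.f = nn.Sequential(
                nn.Conv2d(channels, bottleneck, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(bottleneck, bottleneck, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(bottleneck, channels, kernel_size=1))
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            return self.relu(self.f(x) + x)   # F(x) + x: the residual connection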

5. Comparison of complexity

6. Some other networks

Network in Network (NiN) (Lin et al., 2014): inspired the "bottleneck" 1*1 convolution layers of GoogLeNet and ResNet.

Identity Mappings in Deep Residual Networks (He et al., 2016): improvements to the ResNet block design.

Wide Residual Networks (Zagoruyko et al., 2016): argues that the residual connections are what matter, not extreme depth. Increasing width instead of depth is also more computationally efficient; a 50-layer wide ResNet outperforms the original 152-layer ResNet.

ResNeXt (Xie et al., 2016): also increases width, via multiple parallel pathways within each residual block, an idea very similar to the Inception module.

Deep Networks with Stochastic Depth (Huang et al., 2016): to mitigate vanishing gradients, layers are dropped at random during training; at test time the entire network is used and no layers are dropped.
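A minimal sketch of the idea in PyTorch, assuming f is some residual block; the survival probability and the test-time rescaling follow the paper's description, but the details here are illustrative.

    import torch
    import torch.nn as nn

    class StochasticDepthBlock(nn.Module):
        # During training the residual branch is skipped (identity only) with
        # probability 1 - survival_prob; at test time every block is kept and
        # its output is scaled by the survival probability.
        def __init__(self, f, survival_prob=0.8):
            super().__init__()
            self.f = f
            self.p = survival_prob

        def forward(self, x):
            if self.training:
                if torch.rand(1).item() < self.p:
                    return x + self.f(x)       # block survives this forward pass
                return x                       # block dropped: identity only
            return x + self.p * self.f(x)      # test time: expected contribution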

FractalNet (Larsson et al., 2017): argues that residual connections are not necessary; what matters is effectively transitioning information from shallow to deep layers. Sub-paths are dropped at random during training, and no paths are dropped at test time.

Densely Connected Convolutional Networks (DenseNet) (Huang et al., 2017): to mitigate vanishing gradients, each layer is densely connected to the other layers in its block, receiving the feature maps of all earlier layers.
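A minimal PyTorch-style sketch of a dense block, where each layer takes the concatenation of all earlier feature maps; the growth-rate value here is illustrative.

    import torch
    import torch.nn as nn

    class DenseBlockSketch(nn.Module):
        # Each layer receives the concatenation of all earlier feature maps,
        # so gradients have short paths to every layer. `growth` is the number
        # of channels each layer adds (DenseNet's growth rate).
        def __init__(self, in_ch, growth=12, num_layers=4):
            super().__init__()
            self.layers = nn.ModuleList([
                nn.Conv2d(in_ch + i * growth, growth, kernel_size=3, padding=1)
                for i in range(num_layers)])

        def forward(self, x):
            features = [x]
            for layer in self.layers:
                out = layer(torch.cat(features, dim=1))   # use all previous maps
                features.append(out)
            return torch.cat(features, dim=1)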

SqueezeNet (Iandola et al., 2017): AlexNet-level accuracy with far fewer parameters.

7. Summary

VGG, GoogLeNet, and ResNet are widely used and are available pre-trained in the major off-the-shelf frameworks.

ResNet is currently the best performer and the default choice.

The trend is toward increasingly deep networks.

Much current research focuses on designing the connections between layers in order to improve the propagation of gradients.

The latest research debates depth versus width, as well as whether residual connections are necessary at all.
