VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition (Paper Notes)

Source: Internet
Author: User
Tags: large-scale, image

VGG was developed by Karen Simonyan and Andrew Zisserman of the Visual Geometry Group (VGG) at the University of Oxford; the paper was published in 2014. Paper address: https://arxiv.org/pdf/1409.1556.pdf. As in AlexNet, stacks of convolutional layers are separated by pooling layers, with three FC (fully connected) layers at the end. But where each stage of AlexNet contains only a single convolutional layer, each stage of VGG contains multiple (two or more) convolutional layers. AlexNet uses large filters (up to 11x11), while VGG uses the smallest useful size, 3x3: it achieves better results by shrinking the filters and increasing the number of layers. The following is an interpretation of the paper.

ABSTRACT

This paper studies the effect of convolutional network depth on accuracy in large-scale image recognition. The main contribution is showing that, by using very small (3x3) convolution filters, the network depth can be pushed to 16-19 weight layers. The resulting models won first and second place in the localisation and classification tracks, respectively, of the ILSVRC-2014 ImageNet challenge. The learned representations also generalize well to other datasets.

1 INTRODUCTION

This paper addresses another important aspect of ConvNet architecture design: depth. Many groups tried to improve on AlexNet (proposed in 2012) to obtain better results. ZFNet used a smaller receptive window and a smaller stride (2) in the first convolutional layer; another strategy is to train and test densely over the whole image at multiple scales.

2 convnet Configurations

The work is inspired by Ciresan et al. (2011) and Krizhevsky et al. (2012). For a fair test of the performance gains from depth, all layers of VGGNet are configured according to the same principles.

2.1 ARCHITECTURE

Input: fixed-size 224x224 RGB images. Data preprocessing: subtract the mean RGB value from each pixel. The convolutional layers use small 3x3 filters; in some configurations 1x1 convolutions are used, which can be seen as a linear transformation of the input channels. The convolution stride is set to 1 pixel, and the padding of the 3x3 convolutional layers is set to 1 pixel, so spatial resolution is preserved. Pooling is done by max-pooling over 2x2 windows with stride 2, with 5 pooling layers in total. ReLU is applied after every layer to increase the nonlinear expressive power of the network. Local response normalisation (LRN) is not used: this normalisation does not improve performance on the ILSVRC dataset but leads to more memory consumption and computation time.
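The arithmetic behind these choices can be checked with the standard output-size formula for convolution and pooling layers (a quick sketch, not code from the paper):

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output spatial size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

# A 3x3 convolution with stride 1 and padding 1 preserves resolution:
assert conv_out(224, kernel=3, stride=1, padding=1) == 224

# Each 2x2 max-pool with stride 2 halves the resolution; after the
# five pools, the 224x224 input is reduced to 7x7 before the FC layers.
size = 224
for _ in range(5):
    size = conv_out(size, kernel=2, stride=2)
print(size)  # 7
```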

2.2 CONFIGURATIONS

The paper defines six network configurations (A to E), ranging from 11 to 19 weight layers; they differ only in depth, with all other design choices held fixed.

2.3 DISCUSSION

Unlike AlexNet and ZFNet, VGGNet uses only small convolutions throughout the network. Replacing one large filter with a stack of small filters is advantageous: for example, three 3x3 convolutions can replace a single 7x7 convolution, and because each layer is followed by a ReLU, the stack incorporates three nonlinear rectification layers instead of one, making the decision function more discriminative. Small filters and homogeneous designs had also been used elsewhere, e.g. in the deep networks of Goodfellow et al. and in GoogLeNet.
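The savings can be made concrete with a small sketch (the channel count C is an arbitrary example, not a value fixed by the paper):

```python
def conv_params(k, c_in, c_out):
    """Weight count of one k x k convolutional layer (biases omitted)."""
    return k * k * c_in * c_out

C = 512                                 # example channel width
stacked = 3 * conv_params(3, C, C)      # three stacked 3x3 layers: 27*C^2
single = conv_params(7, C, C)           # one 7x7 layer:            49*C^2
assert stacked < single                 # ~45% fewer parameters for the stack

# The receptive field of n stacked 3x3, stride-1 convolutions is 2n + 1,
# so three of them cover the same 7x7 region as a single 7x7 filter.
rf = 1
for _ in range(3):
    rf += 2                             # each 3x3 layer adds (k - 1) = 2
assert rf == 7
```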

3 CLASSIFICATION FRAMEWORK

3.1 TRAINING

For multi-scale training, the original image is rescaled so that its smallest side equals S (S >= 224), and random 224x224 crops are then extracted from the rescaled image for training.
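The sampling logic of this scale jittering might be sketched as follows (function and parameter names are my own; S is drawn from [256, 512] as in the paper's multi-scale setting):

```python
import random

def sample_train_crop(width, height, s_min=256, s_max=512, crop=224):
    """Scale jittering: rescale so the shortest side is a random S in
    [s_min, s_max], then pick a random crop x crop window location."""
    s = random.randint(s_min, s_max)       # random training scale S
    scale = s / min(width, height)         # isotropic rescale factor
    w, h = round(width * scale), round(height * scale)
    x = random.randint(0, w - crop)        # random crop offset
    y = random.randint(0, h - crop)
    return s, (x, y, x + crop, y + crop)

random.seed(0)
s, box = sample_train_crop(640, 480)
print(s, box)
```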

Training uses mini-batch gradient descent with momentum: batch size 256, momentum 0.9, and weight decay 0.0005.
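A minimal sketch of this update rule, assuming the paper's initial learning rate of 0.01 and treating the weight as a scalar purely for illustration:

```python
def sgd_step(w, grad, velocity, lr=0.01, momentum=0.9, weight_decay=5e-4):
    """One SGD update with momentum and L2 weight decay."""
    g = grad + weight_decay * w           # weight decay adds lambda * w
    velocity = momentum * velocity - lr * g
    return w + velocity, velocity

# A few toy steps on the loss w^2 (gradient 2*w): w shrinks toward 0.
w, v = 1.0, 0.0
for _ in range(5):
    w, v = sgd_step(w, 2 * w, v)
print(w)
```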

Dropout regularisation is applied to the first two fully connected layers, with the dropout ratio set to 0.5.
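As a sketch (not the paper's code), dropout can be implemented as follows; I use the inverted-scaling variant common today, which scales survivors at training time so no rescaling is needed at test time:

```python
import random

def dropout(activations, p=0.5, training=True):
    """Inverted dropout: zero each unit with probability p during
    training and scale the survivors by 1/(1-p)."""
    if not training:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if random.random() < keep else 0.0
            for a in activations]

random.seed(42)
out = dropout([1.0, 2.0, 3.0, 4.0], p=0.5)
print(out)  # roughly half the units zeroed, survivors doubled
```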

3.2 TESTING

At test time, the image is rescaled so that its smallest side equals a pre-defined size Q, which need not equal the training scale S, and the network is then applied over the rescaled image. The paper presents the procedure in detail.

3.3 IMPLEMENTATION DETAILS

This section describes the hardware and software configuration and the training times.

4 CLASSIFICATION EXPERIMENTS

4.1 SINGLE-SCALE EVALUATION

The first experiment shows that the local response normalisation used in the A-LRN network does not improve on model A, so the author does not use normalisation in the deeper architectures (B to E).

Scale jittering of the training data, a form of data augmentation, significantly improves the experimental results.

4.2 MULTI-SCALE EVALUATION

Compared with the single-scale results in Table 3, applying scale jittering at test time (evaluating over several scales) improves classification accuracy, as shown in Table 4.

4.3 MULTI-CROP EVALUATION

Table 5 compares multi-crop evaluation, dense evaluation, and their combination. For a single model, multi-crop evaluation performs slightly better than dense ConvNet evaluation, and combining the two methods improves the result a little further.

4.4 CONVNET FUSION

The softmax outputs of multiple convolutional networks are averaged, fusing several models into a single prediction. The results are shown in Table 6.
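The fusion step amounts to averaging class posteriors; a minimal sketch with made-up logits for two hypothetical 3-class models:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(models_logits):
    """Average the softmax posteriors of several models (ConvNet fusion)."""
    probs = [softmax(l) for l in models_logits]
    n = len(probs)
    return [sum(p[i] for p in probs) / n for i in range(len(probs[0]))]

fused = fuse([[2.0, 1.0, 0.1], [1.5, 1.8, 0.2]])
print(fused, fused.index(max(fused)))  # fused prediction is the argmax
```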

4.5 COMPARISON WITH THE STATE OF THE ART

Compared with the current state-of-the-art models, VGG has a clear advantage over the previous generation of networks from ILSVRC-2012 and ILSVRC-2013. Compared with GoogLeNet, the single VGG model is slightly better, while the 7-network VGG fusion is inferior to GoogLeNet's ensemble.

5 CONCLUSION

This paper shows that deep convolutional neural networks achieve strong results in both accuracy and generalization ability, demonstrating the importance of depth for computer vision problems.

References

https://arxiv.org/pdf/1409.1556.pdf

http://m.blog.csdn.net/muyiyushan/article/details/62895202

