SPPnet paper translation: Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

Source: Internet
Author: User



I have translated the main parts of the SPPnet paper, an important work on object detection. SPPnet's motivation is clear: make the network more flexible about the size of its input. The analysis shows that the convolutional layers do not require a fixed input size; the fixed-size requirement comes entirely from the fully connected layers. Spatial pyramid pooling is therefore used to connect the two. SPPnet's main contribution to detection is that it avoids R-CNN's problems of image warping and repeated computation, which greatly improves recognition speed without loss of accuracy.
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun

Abstract

Existing deep convolutional neural networks (CNNs) require a fixed-size input image (e.g. 224x224). This requirement is artificial and may reduce recognition accuracy for images or sub-images of arbitrary size and scale. In this work, we equip the network with another pooling strategy, "spatial pyramid pooling", to remove this requirement. The new network structure, which we call SPP-net, can generate a fixed-length representation regardless of the size or scale of the input image. Pyramid pooling is also robust to object deformations. With these advantages, SPP-net should in general improve all CNN-based image classification methods. On the ImageNet 2012 dataset, SPP-net improves the accuracy of a variety of CNN architectures despite their different designs. On the Pascal VOC 2007 and Caltech101 datasets, SPP-net achieves state-of-the-art classification results using a single full-image representation and no fine-tuning.

SPP-net is also effective in object detection. With SPP-net, we compute the feature maps from the entire image only once, and then pool features in arbitrary regions (sub-images) to generate fixed-length representations for training the detectors. This method avoids repeatedly computing the convolutional features. When processing test images, our method is 24-102 times faster than the R-CNN method, while achieving better or comparable accuracy on Pascal VOC 2007. In the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014, our methods ranked 2nd in object detection and 3rd in image classification among all 38 teams. This article also describes the improvements made for this competition.
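The pooling idea described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function name `spp_pool`, the pyramid levels `(1, 2, 4)`, the nested-list feature map layout, and the floor/ceil rule for bin edges are all assumptions chosen for illustration. The sketch max-pools each channel into n x n bins per level, so the output length depends only on the number of channels and the levels, not on the map size. The optional `window` argument shows how a sub-image can be pooled from a feature map that was computed once.

```python
def spp_pool(feature_map, levels=(1, 2, 4), window=None):
    """Max-pool a C x H x W feature map (nested lists) into a fixed-length list.

    levels: bins per side at each pyramid level (illustrative default).
    window: optional half-open region (top, left, bottom, right) on the
            feature map; defaults to the whole map.
    """
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    top, left, bottom, right = window if window is not None else (0, 0, height, width)
    h, w = bottom - top, right - left
    if h <= 0 or w <= 0:
        raise ValueError("pooling region must be non-empty")

    pooled = []
    for n in levels:
        for channel in feature_map:
            for i in range(n):
                # Bin edges: floor for the start, ceil for the end, so every
                # bin is non-empty even when the region is smaller than n.
                r0 = top + (i * h) // n
                r1 = top + ((i + 1) * h + n - 1) // n
                for j in range(n):
                    c0 = left + (j * w) // n
                    c1 = left + ((j + 1) * w + n - 1) // n
                    pooled.append(max(channel[r][c]
                                      for r in range(r0, r1)
                                      for c in range(c0, c1)))
    return pooled


# Two single-channel maps of different sizes give vectors of the same length.
small = [[[float(r * 5 + c) for c in range(5)] for r in range(3)]]
large = [[[float(r * 13 + c) for c in range(13)] for r in range(7)]]
print(len(spp_pool(small)), len(spp_pool(large)))
```

Because the output length is the channel count times the total number of bins (1 + 4 + 16 = 21 per channel with the levels assumed here), a fully connected layer placed after this step always sees the same input size. Calling `spp_pool(large, window=(1, 2, 5, 9))` pools a candidate region directly from the shared feature map, which is the step that avoids recomputing convolutional features for each region.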
