Minimalist notes: DeepID-Net: Object Detection with Deformable Part Based Convolutional Neural Networks


Paper address: http://www.ee.cuhk.edu.hk/~xgwang/papers/ouyangZWpami16.pdf

This is a 2017 TPAMI paper from Wang Xiaogang's group at CUHK. It first appeared at CVPR 2015 and was submitted to the journal after more experiments were added, so the comparison experiments use early network models such as AlexNet and GoogLeNet; Faster R-CNN had not yet appeared. I picked this paper because I wanted to see how the deformable part model (DPM) is combined with a CNN.

Core contributions of the paper: 1. A new object detection network architecture; 2. A modified pretraining setup that improves performance; 3. Replacing the max-pooling layer with a def-pooling layer, which is how DPM is combined with the CNN. For the pipeline, see the figure.

The authors argue that it is hard to classify an object from the contents of its box alone at detection time. A small volleyball, for example, can be confused with the texture of the swimming cap on a swimmer's head. In such cases the global information of the whole image is needed: once the volleyball is found to be on a volleyball court and the swimming cap in a pool, detection and classification become more accurate and are no longer misled by local texture.

Many detection networks today are pretrained on a classification task. The paper argues that the two tasks differ considerably: classification should be insensitive to location and scale, while detection is sensitive to both, so classification pretraining cannot simply be carried over mechanically. The paper pretrains on the 1000-class ImageNet Cls-Loc data and then fine-tunes on the 200-class detection dataset, which gives better results.

The paper holds that each channel in a middle layer of the CNN is in effect the response map of one part of an object. This closely resembles the HOG+DPM pipeline, so the authors bring the DPM idea into the CNN and propose the def-pooling layer to carry out the DPM computation. The feature map of channel $c$ is $M_c$, its $(i,j)$-th pixel is $m_c^{(i,j)}$, and the response value is $m_c^{(x,y)}$. The anchor center is at $(x,y)$, the pixels around the anchor are offset by $(\delta_x,\delta_y)$, and the absolute coordinates of an offset pixel are $z_{\delta_x,\delta_y} = (x,y)^T + (\delta_x,\delta_y)^T$. $\phi(\delta_x,\delta_y) =$
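As a rough illustration of the def-pooling idea, the sketch below computes, for every anchor $(x,y)$ of a single channel, the maximum over offsets of the shifted response $m_c^{z_{\delta_x,\delta_y}}$ minus a deformation penalty $\phi(\delta_x,\delta_y)$, in the way DPM scores part placements. The definition of $\phi$ is cut off in these notes, so the quadratic form used here, the weight tuple `a`, the `radius` parameter and the function names `penalty` and `def_pool` are all assumptions made for illustration, not the paper's layer. Stride and the learning of the weights are left out.

```python
def penalty(dx, dy, a):
    # ASSUMPTION: a DPM-style quadratic deformation cost with four
    # hypothetical weights a = (a1, a2, a3, a4); not taken from the notes.
    return a[0] * dx + a[1] * dy + a[2] * dx * dx + a[3] * dy * dy


def def_pool(m, a, radius=1):
    """Toy def-pooling over one channel.

    m      : 2-D list of responses, indexed m[y][x] (the map M_c)
    a      : hypothetical penalty weights, see penalty()
    radius : largest offset |dx|, |dy| searched around each anchor
    """
    h, w = len(m), len(m[0])
    out = [[float("-inf")] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    zy, zx = y + dy, x + dx  # absolute position z of the offset pixel
                    if 0 <= zy < h and 0 <= zx < w:  # skip offsets outside the map
                        score = m[zy][zx] - penalty(dx, dy, a)
                        out[y][x] = max(out[y][x], score)
    return out


# A single strong response in the middle of a 3x3 map.
m = [[0, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]
no_penalty = def_pool(m, (0, 0, 0, 0))
with_penalty = def_pool(m, (0, 0, 0.5, 0.5))
```

With all weights set to zero the penalty vanishes and `def_pool` reduces to ordinary max-pooling over a 3x3 window with stride 1. With non-zero quadratic weights, a strong response contributes less the further it lies from the anchor, which is the behavior that separates this sketch from plain max-pooling.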
