[CVPR 2016] Weakly Supervised Deep Detection Networks: paper notes

Last Update: 2018-04-02 Source: Internet

Author: User


Weakly Supervised Deep Detection Networks, Hakan Bilen, Andrea Vedaldi

https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Bilen_Weakly_Supervised_Deep_CVPR_2016_paper.pdf

Highlight

The weakly supervised detection problem is cast as ranking region proposals: for each class, all proposals are compared against one another to obtain a reasonably correct ranking, which is consistent with how the evaluation metric is computed at test time.

Related work

The MIL strategy results in a non-convex optimization problem; in practice, solvers tend to get stuck in local optima, so the quality of the solution strongly depends on the initialization.

One line of work develops initialization strategies [19, 5, 32, 4]:

One work proposes a self-paced learning strategy.
[5] initializes object locations based on the objectness score.
[4] proposes a multi-fold split of the training data to escape local optima.

Another line of work regularizes the optimization problem [31, 1]:

[31] applies Nesterov's smoothing technique to the latent SVM formulation.
[1] proposes a smoothed version of MIL that softly labels object instances instead of choosing the highest-scoring ones.

Another line of research for WSD is based on identifying the similarity between image parts:

One work proposes a discriminative graph-based algorithm that selects a subset of windows such that each window is connected to its nearest neighbors in positive images.
A follow-up extends this method to discover multiple co-occurring part configurations.
[approx] proposes an iterative technique that applies latent semantic clustering via probabilistic latent semantic analysis (pLSA).
[2] proposes a formulation that jointly learns a discriminative model and enforces the similarity of the selected object regions via a discriminative convex clustering algorithm.

Method

The method used in this paper is simple and easy to understand. It has three main parts:

Feature extraction: the convolutional features and the region proposals are fed into a spatial pyramid pooling layer, which extracts one feature vector per region; these vectors then pass through two FC layers.
Classification: the FC output goes through a softmax over classes, which gives the class scores of each region.
Detection: the FC output also goes through a softmax, but unlike the classification branch the normalization runs over all regions rather than over classes, so regions are compared with one another to find the ones that carry the most information about each class.

The score of a region r for a class c is the product of the classification and detection outputs.
The image-level score for a class is the sum of that class's scores over all regions.
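The two branches and their combination can be sketched in a few lines of NumPy. This is only an illustration of the normalization axes described above, not the authors' code: the function names, the array shapes, and the random logits standing in for the FC outputs are all assumptions.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def two_stream_scores(cls_logits, det_logits):
    """Combine the classification and detection branches.

    cls_logits, det_logits: arrays of shape (num_regions, num_classes),
    hypothetical outputs of the two FC branches for one image.
    Returns (region_scores, image_scores).
    """
    cls_prob = softmax(cls_logits, axis=1)    # normalize over classes, per region
    det_prob = softmax(det_logits, axis=0)    # normalize over regions, per class
    region_scores = cls_prob * det_prob       # score of region r for class c
    image_scores = region_scores.sum(axis=0)  # sum over regions, one score per class
    return region_scores, image_scores

# Toy input: 5 regions, 3 classes (random values, for illustration only).
rng = np.random.default_rng(0)
cls_logits = rng.normal(size=(5, 3))
det_logits = rng.normal(size=(5, 3))
region_scores, image_scores = two_stream_scores(cls_logits, det_logits)
```

Because the detection branch sums to one over regions, each image-level score is a weighted average of per-region class probabilities and therefore stays between 0 and 1, so it can be compared directly with a binary image-level label.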

The last term of the training loss is a regularization term (these notes adapt the paper's notation slightly, since it seemed a bit problematic). Its purpose is to enforce smoothness of the solution by reducing the feature distance between the top-scoring proposal and the proposals around it, i.e. proposals close to the correct one should also get a high score.
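A rough sketch of such a regularization term, for one image and one positive class, might look as follows. It is written from the description above rather than from the paper's formula, so the details are assumptions: the IoU threshold of 0.6, the weighting by the squared top score, and the names `iou` and `spatial_regulariser` are illustrative choices.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def spatial_regulariser(boxes, features, scores, iou_threshold=0.6):
    """Penalty for one image and one positive class.

    boxes: (R, 4) proposals, features: (R, D) region features,
    scores: (R,) region scores for the class.
    iou_threshold and the squared-score weight are assumptions of this sketch.
    """
    top = int(np.argmax(scores))              # top-scoring proposal
    neighbours = iou(boxes[top], boxes) >= iou_threshold
    neighbours[top] = False                   # the top box adds nothing to the sum
    diffs = features[neighbours] - features[top]
    sq_dist = (diffs ** 2).sum(axis=1)        # squared feature distances
    return 0.5 * scores[top] ** 2 * sq_dist.sum()

# Toy example: box 1 overlaps the top box heavily, box 2 does not overlap at all.
boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], dtype=float)
features = np.array([[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
scores = np.array([0.7, 0.2, 0.1])
penalty = spatial_regulariser(boxes, features, scores)
```

In the toy example only the second box is a neighbour of the top box (IoU 0.81), so the penalty depends on the feature distance between those two boxes alone. Note that only the single top-scoring box per class enters the term.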

Experimental results

The paper reports four models that differ in the base network: S (VGG-F), M (VGG-M-1024), L (VGG-VD16), and Ens. (an ensemble of the first three).

Ablation:

Object proposal

Baseline mAP (Selective Search): S 31.1%, M 30.9%, L 24.3%, Ens. 33.3%
Edge Boxes: +0 to +1.2%
Edge Boxes + Edge Boxes score: +1.8 to +5.9%

Spatial regulariser (compared with Edge Boxes + Edge Boxes score): mAP +1.2 to +4.4%

VOC2007

mAP on test: S +2.9%, M +3.3%, L +3.2%, Ens. +7.7% compared with [approx] + context
CorLoc on trainval: S +5.7%, M +7.6%, L +5%, Ens. +9.5% compared with [36]
Classification AP on test: S +7.9% compared with VGG-F, M +6.5% compared with VGG-M-1024, L +0.4% compared with VGG-VD16, Ens. -0.3% compared with VGG-VD16

VOC2010

mAP on test: +8.8% compared with [4]
CorLoc on trainval: +4.5% compared with [4]

Disadvantages

One obvious drawback of this paper is that it considers only one instance of each object class per image (the regulariser constrains only the top-scoring box and the boxes around it), which is also reflected in the failure cases shown in the paper.

