[CVPR 2016] Weakly Supervised Deep Detection Networks paper notes

Weakly Supervised Deep Detection Networks, Hakan Bilen, Andrea Vedaldi

https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Bilen_Weakly_Supervised_Deep_CVPR_2016_paper.pdf

Highlight

    • Weakly supervised detection is cast as a proposal ranking problem: for each class, a relatively correct ranking is obtained by comparing the scores of all proposals against each other, which matches how the evaluation metric is computed at test time.

Related work

The MIL strategy results in a non-convex optimization problem; in practice, solvers tend to get stuck in local optima, such that the quality of the solution strongly depends on the initialization.

  • Developing various initialization strategies [19, 5, 32, 4]
      • [19] propose a self-paced learning strategy.
      • [5] initialize object locations based on the objectness score.
      • [4] propose a multi-fold split of the training data to escape local optima.
  • Regularizing the optimization problem [31, 1]
      • [31] apply Nesterov's smoothing technique to the latent SVM formulation.
      • [1] propose a smoothed version of MIL that softly labels object instances instead of choosing the highest scoring ones.
  • Another line of research in WSD is based on identifying the similarity between image parts.
      • [31] propose a discriminative graph-based algorithm that selects a subset of windows such that each window is connected to its nearest neighbors in positive images.
      • [32] extend this method to discover multiple co-occurring part configurations.
      • [36] propose an iterative technique that applies latent semantic clustering via probabilistic latent semantic analysis (pLSA).
      • [2] propose a formulation that jointly learns a discriminative model and enforces the similarity of the selected object regions via a discriminative convex clustering algorithm.

Method

The method used in this paper is simple and easy to understand. It has three main parts:

    • Features: the convolutional features and the region proposals are fed into a spatial pyramid pooling layer, which extracts a region-level feature vector for each proposal; these vectors then pass through two FC layers.
    • Classification: the output of an FC layer goes through a softmax over classes, which gives the class scores of each region.
    • Detection: the output of an FC layer also goes through a softmax, but here the normalization is over the scores of all regions rather than over classes, so the regions are compared with each other to find the ones that carry the most information about each class.
      • The score of a region r for a class c is the product of the outputs of these two streams.
      • The image-level score for a class is the sum of the scores of all regions for that class.
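
As a rough illustration of the two streams described above, here is a minimal NumPy sketch. The function names (`softmax`, `two_stream_scores`) and the random inputs are my own assumptions for illustration, not code from the paper; `x_cls` and `x_det` stand for the raw outputs of the two FC layers, with one row per region and one column per class.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def two_stream_scores(x_cls, x_det):
    """Combine the two streams (hypothetical helper).

    x_cls, x_det: arrays of shape (num_regions, num_classes).
    Returns the per-region scores and the image-level class scores.
    """
    sigma_cls = softmax(x_cls, axis=1)      # each region: distribution over classes
    sigma_det = softmax(x_det, axis=0)      # each class: distribution over regions
    region_scores = sigma_cls * sigma_det   # element-wise product of the two streams
    image_scores = region_scores.sum(axis=0)  # sum over regions, one score per class
    return region_scores, image_scores

# Toy input: 5 regions, 3 classes (random values, for illustration only).
rng = np.random.default_rng(0)
x_cls = rng.normal(size=(5, 3))
x_det = rng.normal(size=(5, 3))
region_scores, image_scores = two_stream_scores(x_cls, x_det)
print(region_scores.shape, image_scores)
```

The only difference between the two streams in this sketch is the softmax axis. Because the detection stream sums to 1 over regions and the classification stream is at most 1, each image-level score stays between 0 and 1, so it can be treated like a class probability.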

The training loss consists of an image-level classification loss plus a final regularisation term (the spatial regulariser). The notation for this last term is changed slightly here according to my own understanding, since the paper's notation seems a bit problematic. Its purpose is to make the solution spatially smooth by reducing the feature distance between the top-scoring proposal and the proposals around it, i.e., proposals close to the correct one should also get a high score.
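
To make the two loss terms more concrete, here is a hedged NumPy sketch based on my reading of the paper, not the authors' implementation. Everything named here is an assumption for illustration: the helper names, the labels encoded as +1/-1, the sign convention (a loss to be minimised), the IoU threshold of 0.6, and the weighting of each feature distance by the squared region score. Weight decay and any normalisation constants are left out.

```python
import numpy as np

def classification_loss(image_scores, labels, eps=1e-12):
    """Sum of per-class binary log losses (assumed form).

    image_scores: (C,) values in (0, 1); labels: (C,) values in {-1, +1}.
    """
    # Equals the score for a positive label and (1 - score) for a negative one.
    p = labels * (image_scores - 0.5) + 0.5
    return -np.sum(np.log(np.clip(p, eps, 1.0)))

def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def spatial_regulariser(region_scores, feats, boxes, labels, iou_thresh=0.6):
    """Pull features of boxes near the top-scoring box towards it (assumed form).

    region_scores: (R, C), feats: (R, D), boxes: (R, 4), labels: (C,) in {-1, +1}.
    Only positive classes contribute.
    """
    total = 0.0
    for k in np.flatnonzero(labels > 0):
        top = int(np.argmax(region_scores[:, k]))  # top-scoring region for class k
        for r in range(len(boxes)):
            if r != top and box_iou(boxes[top], boxes[r]) >= iou_thresh:
                diff = feats[top] - feats[r]
                total += 0.5 * region_scores[r, k] ** 2 * float(diff @ diff)
    return total

# Toy example: 3 regions, 2 classes, class 0 present and class 1 absent.
region_scores = np.array([[0.60, 0.02], [0.30, 0.03], [0.05, 0.05]])
labels = np.array([1, -1])
boxes = np.array([[0, 0, 10, 10], [0, 0, 10, 9], [20, 20, 30, 30]], dtype=float)
feats = np.array([[1.0, 0.0], [0.0, 0.0], [5.0, 5.0]])

loss_cls = classification_loss(region_scores.sum(axis=0), labels)
loss_reg = spatial_regulariser(region_scores, feats, boxes, labels)
print(loss_cls, loss_reg)
```

In this toy example only the second box overlaps the top-scoring box enough to be regularised; the third box is far away and the absent class is skipped entirely.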

Experimental results

The paper reports four models that differ in the base network: S (VGG-F), M (VGG-M-1024), L (VGG-VD16) and Ens. (an ensemble of the first three models).

  • Ablation:
      • Object proposals
        • Baseline mAP with Selective Search: S 31.1%, M 30.9%, L 24.3%, Ens. 33.3%
        • Edge Boxes: +0~1.2%
        • Edge Boxes + Edge Boxes score: +1.8~5.9%
      • Spatial regulariser (compared with Edge Boxes + Edge Boxes score): mAP +1.2~4.4%
  • VOC2007
      • mAP on test: S +2.9%, M +3.3%, L +3.2%, Ens. +7.7% compared with [36] + context
      • CorLoc on trainval: S +5.7%, M +7.6%, L +5%, Ens. +9.5% compared with [36]
      • Classification AP on test: S +7.9% compared with VGG-F, M +6.5% compared with VGG-M-1024, L +0.4% compared with VGG-VD16, Ens. -0.3% compared with VGG-VD16
  • VOC2010
      • mAP on test: +8.8% compared with [4]
      • CorLoc on trainval: +4.5% compared with [4]
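
For readers unfamiliar with CorLoc, here is a small sketch of how it is usually computed for one class: the fraction of positive images in which the top-scoring box overlaps some ground-truth box of that class with IoU of at least 0.5. The function names and the toy boxes are my own assumptions for illustration; this is not the evaluation code used in the paper.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def corloc(top_boxes, gt_boxes, iou_thresh=0.5):
    """CorLoc for one class (hypothetical helper).

    top_boxes: one top-scoring box per positive image.
    gt_boxes:  for each of those images, the list of ground-truth boxes of the class.
    """
    if not top_boxes:
        raise ValueError("need at least one positive image")
    hits = sum(
        any(box_iou(pred, gt) >= iou_thresh for gt in gts)
        for pred, gts in zip(top_boxes, gt_boxes)
    )
    return hits / len(top_boxes)

# Toy example with two positive images: the first box is a hit, the second is not.
top_boxes = [(0, 0, 10, 10), (0, 0, 4, 4)]
gt_boxes = [[(1, 1, 10, 10)], [(5, 5, 10, 10), (0, 0, 10, 10)]]
score = corloc(top_boxes, gt_boxes)
print(score)
```

Since only the single top-scoring box per image is checked, CorLoc measures localisation on the training images and does not penalise missed additional instances, unlike mAP on the test set.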

Disadvantages

One obvious drawback of this paper is that it only considers a single instance of each object class per image (the regulariser only constrains the top-scoring box and the boxes around it), which is also reflected in the failure cases shown in the paper.
