Object detection requires a model not only to determine which kinds of objects appear in an input image, but also to localize each one with a bounding box.
To meet both requirements, the traditional approach is the sliding window: windows of different scales and aspect ratios are slid across the image to exhaustively enumerate every possible sub-image. Each sub-image is then fed to an object recognition model for classification. The amount of data this produces is huge; a single image typically has to be split into about 10^6 sub-images ...
The alternative to the sliding window is the object proposal (OP) method. The basic idea is to find a set of potential objects in the image rather than enumerating exhaustively. These candidate regions are then fed to the object recognition model for classification.
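To get a feel for why the sliding window produces so many sub-images, here is a minimal sketch that enumerates windows over a set of scales and aspect ratios. The function names, the definition of a scale as the square root of the window area, and the stride parameter are assumptions made for illustration, not details from the paper.

```python
def sliding_windows(img_w, img_h, scales, aspect_ratios, stride):
    """Yield (x, y, w, h) for every window that fits inside the image.

    Assumption: a scale is the square root of the window area in pixels,
    and an aspect ratio is width / height.
    """
    for s in scales:
        for ar in aspect_ratios:
            w = int(round(s * ar ** 0.5))
            h = int(round(s / ar ** 0.5))
            # Skip degenerate windows and windows larger than the image.
            if w < 1 or h < 1 or w > img_w or h > img_h:
                continue
            for y in range(0, img_h - h + 1, stride):
                for x in range(0, img_w - w + 1, stride):
                    yield (x, y, w, h)


def count_windows(img_w, img_h, scales, aspect_ratios, stride):
    """Count the windows without storing them."""
    return sum(1 for _ in sliding_windows(img_w, img_h, scales, aspect_ratios, stride))
```

The count is roughly the number of scales times the number of aspect ratios times the number of positions per window shape, so a dense stride on a normal-sized image quickly reaches very large numbers.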
What follows is mainly my summary of one paper:
The Max Planck Institute has a study titled "What makes for effective detection proposals?" that comprehensively analyzes the performance of various OP methods.
OP methods at a glance (detection proposal methods):
Summary of how the OP methods perform:
Repeatability of the OP methods:
The authors argue that a good OP method should be repeatable, that is, the objects it retrieves from similar images should be consistent. To verify this, they apply various perturbations to PASCAL images (Figure 2) and check whether the same objects are still detected and what the recall is. Sweeping the IoU threshold from loose to strict yields a curve, and the area under that curve is the repeatability score.
From the figure above, the authors conclude that Bing and EdgeBoxes are the most repeatable. The reason may be that both algorithms use an SVM. The authors also consider the sensitivity of superpixels to be the main reason for the lower repeatability of some OP algorithms.
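The repeatability measurement described above can be sketched roughly as follows. This is only an illustration under stated assumptions: boxes are (x1, y1, x2, y2) tuples, the proposals from the perturbed image have already been mapped back into the coordinates of the original image, and the function names and threshold grid are hypothetical rather than taken from the paper.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matched_fraction(ref_boxes, new_boxes, thresh):
    """Fraction of reference proposals with a match (IoU >= thresh) in new_boxes."""
    if not ref_boxes:
        return 0.0
    hits = sum(1 for r in ref_boxes if any(iou(r, n) >= thresh for n in new_boxes))
    return hits / len(ref_boxes)


def repeatability(ref_boxes, new_boxes, thresholds=None):
    """Area under the matched-fraction vs. IoU-threshold curve, scaled to [0, 1]."""
    if thresholds is None:
        thresholds = [i / 20 for i in range(1, 21)]  # assumed grid: 0.05 .. 1.0
    ys = [matched_fraction(ref_boxes, new_boxes, t) for t in thresholds]
    # Trapezoid rule over the threshold grid.
    area = sum(
        (thresholds[i + 1] - thresholds[i]) * (ys[i] + ys[i + 1]) / 2
        for i in range(len(ys) - 1)
    )
    return area / (thresholds[-1] - thresholds[0])
```

A method that returns exactly the same boxes on the original and the perturbed image scores 1 here, and a method whose boxes never overlap scores 0.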
Recall:
Again, straight to the conclusions:
MCG, EdgeBoxes, SelectiveSearch, Rigor and Geodesic perform well across different numbers of proposals.
If the number of proposals is limited to fewer than 1000, MCG, Endres and CPMC work best.
If the candidate boxes are not well localized to begin with, recall drops faster as the IoU criterion becomes stricter. This affects Bing, Rahtu, Objectness and EdgeBoxes, and the drop for Bing is particularly noticeable.
Under the AR (average recall) criterion, MCG performs consistently. Endres and EdgeBoxes do better when few proposals are allowed, while Rigor and SelectiveSearch outperform the others when more proposals are allowed.
On PASCAL and ImageNet each OP method performs similarly, which shows that these methods generalize well.
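To make the difference between recall at a single IoU threshold and AR concrete, here is a small sketch. It assumes AR means recall averaged over IoU thresholds between 0.5 and 1, and that for each ground-truth object we already have the best IoU achieved by any proposal; the function names and the number of threshold steps are illustrative choices.

```python
def recall_at(best_ious, thresh):
    """Fraction of ground-truth objects whose best proposal reaches IoU >= thresh."""
    if not best_ious:
        return 0.0
    return sum(1 for v in best_ious if v >= thresh) / len(best_ious)


def average_recall(best_ious, lo=0.5, hi=1.0, steps=50):
    """Mean recall over evenly spaced IoU thresholds in [lo, hi]."""
    thresholds = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return sum(recall_at(best_ious, t) for t in thresholds) / len(thresholds)


# Two hypothetical methods that both cover every object at IoU 0.5,
# but one localizes much more tightly than the other.
loose = [0.55, 0.55]
tight = [0.95, 0.95]
```

Both lists give the same recall at IoU 0.5, but `average_recall(tight)` is higher than `average_recall(loose)`, which is the kind of localization difference a single loose threshold hides.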
How the OP methods perform in actual detection tasks:
In actual detection, the proposals produced by an OP method are fed to a detector for recognition. The authors use two well-known detectors: LM-LLDA and R-CNN. They extract 1K proposals with each OP method.
Summary of the paper:
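The proposal-then-detector pipeline described here can be sketched in a few lines. Everything below is an assumption for illustration: proposals are taken to be (score, box) pairs ranked by score, and `classify` is a hypothetical stand-in for a detector such as LM-LLDA or R-CNN, not a real API of either.

```python
def detect(image, proposals, classify, k=1000):
    """Keep the k highest-scoring proposals and run a classifier on each.

    proposals: list of (score, box) pairs from some OP method.
    classify:  hypothetical callable(image, box) -> (label, confidence),
               returning label None when the box is judged to be background.
    """
    top = sorted(proposals, key=lambda p: p[0], reverse=True)[:k]
    detections = []
    for _, box in top:
        label, conf = classify(image, box)
        if label is not None:
            detections.append((box, label, conf))
    return detections
```

The point of the sketch is only that the classifier runs on about a thousand boxes per image instead of on every sliding-window position.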
On the repeatability criterion, current OP methods all perform only moderately. Repeatability could be improved by using features that are more robust to noise and perturbations. Low repeatability does not necessarily mean a low final mAP, however (SelectiveSearch is an example), so in the end it depends on the application scenario.
The more accurately an OP method localizes objects, the more it helps the classifier. Recall at an IoU of 0.5 is therefore not a good criterion for OP methods: high recall with inaccurate localization hurts the final mAP.
MCG, SelectiveSearch, EdgeBoxes, Rigor and Geodesic are currently the five best-performing methods, and EdgeBoxes and Geodesic are also excellent in speed.
Current OP methods perform similarly on VOC07 and ImageNet, indicating that they all generalize well.
Discussion from the paper:
If computational power goes up, are OP methods still useful? The authors believe that, if computation permits, a sliding window combined with a strong classifier such as a CNN will work better.
The authors observe that the features used by current OP methods (such as object boundaries and superpixels) are not used in the classifier, and that, apart from MultiBox, OP methods do not use the classifier's features either. They look forward to work that combines the advantages of both.
Finally, the authors make three predictions: top-down cues may play a more important role in OP methods, the link between OP methods and detectors will become tighter, and the segmentation masks generated by OP methods will play a more important role.