Understanding the Deepbox algorithm



The paper was published at ICCV 2015; the first author is Weicheng Kuo, a doctoral student at Berkeley:

@inproceedings{KuoICCV15DeepBox,
    Author = {Weicheng Kuo, Bharath Hariharan, Jitendra Malik},
    Title = {DeepBox: Learning Objectness with Convolutional Networks},
    Booktitle = {International Conference on Computer Vision ({ICCV})},
    Year = {2015}
}

The code is open source on GitHub: https://github.com/weichengkuo/DeepBox

The paper essentially does one thing: it uses a convolutional network to re-score (re-rank) the proposals produced by a bottom-up method (mainly EdgeBox). That is, the proposal regions produced by EdgeBox are reordered so that the accurate regions receive a higher objectness score.

Put more plainly: each proposal produced by EdgeBox comes with a score, and the scores vary, but sometimes the most accurate box does not get a high score while a high-scoring box is not accurate. DeepBox uses a convolutional network to correct these scores.
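The re-ranking idea can be sketched as follows. Note that `toy_cnn_score` is a hypothetical stand-in for DeepBox's small network, not the actual model; only the sort-by-new-score structure reflects the paper.

```python
# Sketch of DeepBox-style re-ranking with a stand-in scorer.
# Each proposal is a box (x1, y1, x2, y2) with an EdgeBox score attached.

def rerank(proposals, edgebox_scores, cnn_score_fn):
    """Re-order proposals by the CNN objectness score instead of
    the hand-designed EdgeBox score."""
    cnn_scores = [cnn_score_fn(p) for p in proposals]
    order = sorted(range(len(proposals)),
                   key=lambda i: cnn_scores[i], reverse=True)
    return [proposals[i] for i in order], [cnn_scores[i] for i in order]

# Toy stand-in scorer: pretend larger boxes are more object-like.
def toy_cnn_score(box):
    x1, y1, x2, y2 = box
    return (x2 - x1) * (y2 - y1)

boxes = [(0, 0, 10, 10), (0, 0, 50, 50), (0, 0, 20, 20)]
eb_scores = [0.9, 0.2, 0.5]   # EdgeBox happened to rank the big box low
ranked, scores = rerank(boxes, eb_scores, toy_cnn_score)
print(ranked[0])              # (0, 0, 50, 50)
```

The point of the sketch: the EdgeBox scores are only used to produce the candidate list; the final ordering comes entirely from the learned scorer.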

The proposed method

First, EdgeBox (or another traditional bottom-up method) is used to extract proposal regions, which are then fed into a small network for training/inference.

The paper therefore claims an accuracy improvement over EdgeBox, which is easy to understand: the method stands on the shoulders of its predecessor. But precisely because of that, the total time overhead should be EdgeBox's 0.25 s plus the convolutional network's inference time. The original text only reports the network's own overhead; either way, the pipeline is slower than EdgeBox alone.

As for the network, the authors say they also tried VGG16 and AlexNet, but a smaller 4-layer network turned out to be almost as accurate while being much faster, so they decisively chose the small network.
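One way to see why a shallow network is cheap is to track how quickly feature-map sizes shrink through conv/pool layers. The layer parameters below are purely illustrative, not DeepBox's actual architecture; only the shape formula is general.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a conv/pool layer (floor convention)."""
    return (size + 2 * pad - kernel) // stride + 1

# Illustrative 4-layer stack on a 128x128 crop (NOT the paper's config):
size = 128
for kernel, stride, pad in [(7, 2, 3), (3, 2, 0), (5, 1, 2), (3, 2, 0)]:
    size = conv_out(size, kernel, stride, pad)
    print(size)   # 64, 31, 31, 15
```

With only four layers and small feature maps, the per-proposal forward pass is far cheaper than pushing each crop through VGG16.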

Network training: the original text mentions that DeepBox's 4-layer small network needs two-stage training. In the first stage, the samples are generated by sliding windows, which are easy samples, and the first two layers of the network still need to be initialized from AlexNet. The second stage fine-tunes from the result of the first stage; the samples are replaced with the proposal regions produced by EdgeBox, and positives/negatives are still distinguished by computing IoU against the ground-truth bounding boxes (BBGT), except that the IoU thresholds change from 0.5 in the first stage to 0.3/0.7 in the second.
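The IoU-based labeling described above can be sketched as follows, using the second-stage thresholds (0.3/0.7); the box format and helper names are my own, not the repo's.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

def label_proposal(proposal, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Second-stage labeling: positive at IoU >= 0.7, negative below 0.3,
    otherwise ignored (returns None)."""
    best = max(iou(proposal, gt) for gt in gt_boxes)
    if best >= pos_thresh:
        return 1
    if best < neg_thresh:
        return 0
    return None  # ambiguous overlap, not used for training

gt = [(0, 0, 100, 100)]
print(label_proposal((0, 0, 100, 100), gt))      # 1  (IoU = 1.0)
print(label_proposal((200, 200, 300, 300), gt))  # 0  (IoU = 0.0)
```

The gap between 0.3 and 0.7 discards ambiguous proposals, so the classifier is trained only on clearly positive or clearly negative examples.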

As for the Fast DBox variant mentioned in the paper, it seems natural in retrospect, because SPPnet and Fast R-CNN took the same route to save computation: feed the whole image through the convolutional network once instead of feeding each region separately, commonly known as feature sharing. The underlying reason is still that there are too many region proposals, and they overlap heavily.
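The feature-sharing trick can be illustrated with a coordinate-mapping sketch: the image is forwarded once, and each region's coordinates are projected onto the shared feature map instead of cropping pixels. The stride value below is the common choice in SPPnet/Fast R-CNN style backbones, not necessarily DeepBox's.

```python
def project_roi(box, feat_stride=16):
    """Map an image-space box (x1, y1, x2, y2) onto a feature map whose
    spatial resolution is 1/feat_stride of the input image: floor the
    near edge, ceil the far edge."""
    x1, y1, x2, y2 = box
    return (x1 // feat_stride, y1 // feat_stride,
            -(-x2 // feat_stride), -(-y2 // feat_stride))

# One forward pass over the whole image; afterwards every proposal is
# just a cheap window on the shared feature map:
print(project_roi((32, 48, 200, 160)))  # (2, 3, 13, 10)
```

This is why overlapping proposals cost almost nothing extra: the expensive convolutions over their shared pixels are computed once.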

At the end of the paper, it is also mentioned that training with only one stage (the second stage alone) is actually possible. The code confirms this: the first two layers of the network are initialized from CaffeNet.v2.caffemodel. That CaffeNet.v2.caffemodel actually comes from the imagenet_models.tgz provided with RBG's open-source Fast R-CNN code (the fact that py-faster-rcnn does not ship this file is another matter).

Using it out of the box

Well, actually I just wanted to see what the code does when it runs. The supplied code runs Fast DBox by default, and the source says this fast version only provides training and test results for the MS COCO dataset. The EdgeBox pre-processed data has to be downloaded from Berkeley, which is unusually hard on a network inside China; oddly enough, downloading it with Xunlei (Thunder) with the membership acceleration turned off worked.

The code can be described as a fork of Fast R-CNN with additions and modifications. Some of the prepared data is stored in .mat files, and it turns out that .mat files saved in MATLAB with the -v7.3 flag can be read through the HDF5 interface, so the Python side reads them with the h5py package. Because of how the Python runtime itself manages memory, it does not return memory to the operating system after loading the data; training the DeepBox code on the entire MS COCO 2014 dataset consumes about 24-30 GB of memory, which is horrifying. What PC has that much memory? And I could not find a suitable server to use. Fortunately, on Ubuntu you can create swap files and mount them to manually enlarge the swap space, and with that it runs.

Back to the main topic.

In fact, although Fast R-CNN's paper does not mention it, the open-source code is said to use EdgeBox as the proposal generator, and Fast R-CNN can be loosely understood as proposal generator + AlexNet + assorted other black magic, where AlexNet to some extent serves as a classifier. So while DeepBox looks like proposal-region re-ranking, it can also be seen as turning Fast R-CNN's multi-class network into a binary object/non-object network whose output is a "better region proposal", which can then be handed "to Fast R-CNN" for further classification. That makes it a simple cascade of classification networks: the first level does a coarse classification, the objectness score, and the second level produces the fine score.

From this point of view, the insight may lie in seeing through the hand-designed scoring mechanisms of EdgeBox and similar algorithms. The author who hand-designed the EdgeBox scoring method is in fact a recognized expert. But since hand-designed scores are not accurate enough, and deciding whether a region is an object region based on edges alone is somewhat arbitrary, the DeepMask/SharpMask/FastMask line of approaches came into being, although those already consider detailed mask proposals rather than rough bounding-box proposals.
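The cascade reading above can be sketched as a two-stage pipeline. Both scorers here are toy stand-ins of my own; only the coarse-filter-then-fine-classify structure reflects the argument.

```python
def cascade(proposals, objectness_fn, classifier_fn, keep_top=2):
    """Level 1: coarse objectness re-ranking (the DeepBox role).
    Level 2: fine classification (the Fast R-CNN role), run only
    on the survivors of level 1."""
    ranked = sorted(proposals, key=objectness_fn, reverse=True)[:keep_top]
    return [(box, classifier_fn(box)) for box in ranked]

# Toy stand-ins: objectness favors square boxes, classifier labels by size.
def toy_objectness(box):
    w, h = box[2] - box[0], box[3] - box[1]
    return min(w, h) / float(max(w, h))   # 1.0 for a perfect square

def toy_classifier(box):
    area = (box[2] - box[0]) * (box[3] - box[1])
    return "big" if area > 1000 else "small"

boxes = [(0, 0, 40, 40), (0, 0, 100, 10), (0, 0, 20, 20)]
print(cascade(boxes, toy_objectness, toy_classifier))
```

The elongated box is discarded at the cheap first level, so the (in reality expensive) second-level classifier only runs on plausible objects.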
