Improving SegNet's Caffe Source Code

Source: Internet
Author: User
Problem:

The official release code of SegNet (TPAMI 2017) is implemented in the Caffe framework, but it requires a modified Caffe — see caffe-segnet-cudnn5 [3]. While using this fork I ran into an inconvenience: the .prototxt must specify upsample_w and upsample_h for every upsample layer. This is done to avoid the following situation: in the encoder, a layer of odd size (for example 107) becomes 54 after pooling, but in the decoder the 2x upsample turns 54 into 108, which is inconsistent with the dims of the corresponding encoder layer. So upsample_w/upsample_h must be fixed to 107. The consequence is that once upsample_w and upsample_h are specified, all inputs are required to have exactly the same size, which is cumbersome for datasets whose images vary in size (such as MS COCO).
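For reference, a typical upsample layer in the SegNet train.prototxt looks roughly like this (the layer and blob names here are illustrative, not copied from the official files):

```protobuf
layer {
  name: "upsample4"
  type: "Upsample"
  bottom: "conv_decode4"   # decoder feature map to be upsampled
  bottom: "pool4_mask"     # pooling mask from the encoder
  top: "upsample4"
  upsample_param {
    scale: 2
    upsample_w: 107   # must be fixed ahead of time, which forces
    upsample_h: 107   # all inputs to have exactly the same size
  }
}
```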

Solutions:

Given this problem, I want the network to resize automatically for every mini-batch. Since the Upsample layer is a new layer added to Caffe by the original author, we need to modify the Caffe C++ source. The steps are as follows:

1. Locate the author's new Upsample layer: include/caffe/layers/upsample_layer.hpp and src/caffe/layers/upsample_layer.cpp. The key part is the UpsampleLayer<Dtype>::Reshape function in the .cpp file;

2. Studying this code shows that inside Reshape you only know the size of the current layer (e.g. 54) and cannot know that the corresponding encoder layer was 107. So at this point you can only do the plain 2x upsampling, producing 108. When this size meets the encoder's pooling mask of size 107, the CHECK_EQ calls near the start of Reshape (the 5th and 6th statements) verify that height and width match and raise an error if they don't, e.g. they detect bottom[0]->height() == 108 but bottom[1]->height() == 107. So before the CHECK_EQ, reshape the 108 blob to 107:
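To make the mismatch concrete, here is a minimal sketch of the size arithmetic involved (plain C++, not Caffe code; the helper names are made up):

```cpp
#include <cassert>

// Sketch of the encoder/decoder size arithmetic described above.
// Caffe's pooling output size uses ceiling division, so an odd
// input dimension rounds up after a 2x2 pool with stride 2.
int pool2x(int h) { return (h + 1) / 2; }  // 107 -> 54
int up2x(int h)   { return 2 * h; }        // naive 2x upsample: 54 -> 108
```

Running the two in sequence on 107 yields 108, not 107, which is exactly the mismatch the CHECK_EQ catches.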

3. After this, another problem appears: the code above first resizes to make the dims consistent and then upsamples. So if the original size is 427x640, the next layer down is 214x320; upsampling the 214x320 layer yields 428x640, which never enters the resize branch, i.e. it is never resized back to 427x640. The error typically looks like this:

softmax_loss_layer.cpp:56] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (273920 vs. 273280) Number of labels must match number of predictions; e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.

Here is a solution: after upsampling, when the computed size turns out larger than the original image (e.g. 428 > 427), it obviously needs to be resized to 427. How do we determine whether it is larger than the original image size? Simply pass the original image in and compare at every upsample layer (for the intermediate upsample layers the result is still smaller than the original, so the check only ever triggers at the final upsample layer). The original image can be fed in through the .prototxt as an extra bottom of the upsample layer:
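The clamping rule itself is tiny; a minimal sketch (the function name is hypothetical, not from the Caffe source):

```cpp
#include <cassert>

// If the 2x upsample overshoots the original image size (which happens
// when the encoder input dimension was odd), fall back to the original
// size; otherwise keep the upsampled result unchanged.
int clamp_to_original(int upsampled, int original) {
  return upsampled > original ? original : upsampled;
}
```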

Note that all 5 upsample layers here get an extra bottom: "label", i.e. the label map at the original size. And then:
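The modified layer then looks roughly like this (names illustrative; the third bottom is used only for its dimensions):

```protobuf
layer {
  name: "upsample4"
  type: "Upsample"
  bottom: "conv_decode4"
  bottom: "pool4_mask"
  bottom: "label"     # original-size label map, consulted for its dims
  top: "upsample4"
  upsample_param {
    scale: 2          # upsample_w / upsample_h no longer specified
  }
}
```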

Inside Reshape, bottom[2] is now the incoming label. The code means: if the computed upsampling size is 428 and is found to be greater than 427, then set 428 back to 427.
Also note that the outer if condition is no longer the same as before: the previous condition, if (upsample_h_ <= 0 || upsample_w_ <= 0), would cause upsample_h_ and upsample_w_ to be set only once, after which they would never adapt to the different sizes of subsequent data.
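The guard-condition bug can be demonstrated with a small stand-alone simulation (plain C++ sketch of the assumed behavior, not the actual Caffe class):

```cpp
#include <cassert>

// Simulates the guard-condition bug: with the old check the output
// size is computed once and then goes stale for later mini-batches.
struct UpsampleSim {
  int upsample_h_ = 0;
  int reshape_old(int bottom_h) {            // old: only sets once
    if (upsample_h_ <= 0) upsample_h_ = 2 * bottom_h;
    return upsample_h_;
  }
  int reshape_new(int bottom_h) {            // new: recomputed each call
    upsample_h_ = 2 * bottom_h;
    return upsample_h_;
  }
};
```

With the old guard, a second mini-batch of a different height still gets the first batch's output size; the new version recomputes it every time.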

4. After this, you will hit one more problem:

layer.hpp:374] Check failed: ExactNumBottomBlobs() == bottom.size() (2 vs. 3) Upsample Layer takes 2 bottom blob(s) as input.

The cause: we added a bottom in the .prototxt, but include/caffe/layers/upsample_layer.hpp declares virtual inline int ExactNumBottomBlobs() const { return 2; }, i.e. it fixes the number of bottoms at 2, so change it to 3:
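A stub illustrating the changed override (UpsampleLayerStub is a stand-in for the real class, which derives from Caffe's Layer base class):

```cpp
#include <cassert>

// Stand-in showing the one-line change in upsample_layer.hpp:
// the layer must now accept three bottoms (data, mask, label).
class UpsampleLayerStub {
 public:
  virtual ~UpsampleLayerStub() = default;
  // Was: { return 2; } -- changed so the extra "label" bottom passes
  // the ExactNumBottomBlobs() == bottom.size() check in layer.hpp.
  virtual inline int ExactNumBottomBlobs() const { return 3; }
};
```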

That should be it.

Summary

A total of 3 places were changed: include/caffe/layers/upsample_layer.hpp (one change, see above); the train/test .prototxt (a few changes, see above); and src/caffe/layers/upsample_layer.cpp, mainly the UpsampleLayer<Dtype>::Reshape function, which changed the most. The full code of this function follows:

template <typename Dtype>
void UpsampleLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  CHECK_EQ(4, bottom[0]->num_axes()) << "Input must have 4 axes, "
      << "corresponding to (num, channels, height, width)";
  CHECK_EQ(4, bottom[1]->num_axes()) << "Input mask must have 4 axes, "
      << "corresponding to (num, channels, height, width)";
  CHECK_EQ(bottom[0]->num(), bottom[1]->num());
  CHECK_EQ(bottom[0]->channels(), bottom[1]->channels());
  // Keep the dims of the original and the upsampled blob the same
  if (bottom[0]->height() != bottom[1]->height() ||
      bottom[0]->width() != bottom[1]->width()) {
    bottom[0]->Reshape(bottom[0]->num(), bottom[0]->channels(),
        bottom[1]->height(), bottom[1]->width());
  }
  CHECK_EQ(bottom[0]->height(), bottom[1]->height());
  CHECK_EQ(bottom[0]->width(), bottom[1]->width());
  // Recompute the output size unless upsample_h && upsample_w are given
  if (!(this->layer_param_.upsample_param().has_upsample_h()
      && this->layer_param_.upsample_param().has_upsample_w())) {
    upsample_h_ = bottom[0]->height() * scale_h_ - int(pad_out_h_);
    upsample_w_ = bottom[0]->width() * scale_w_ - int(pad_out_w_);
    // bottom[2] is the label map at the original size; clamp to it
    if (upsample_h_ > bottom[2]->height() ||
        upsample_w_ > bottom[2]->width()) {
      upsample_h_ = bottom[2]->height();
      upsample_w_ = bottom[2]->width();
    }
  }
  // LOG(INFO) << "## upsample_h_ " << upsample_h_;
  LOG(INFO) << "## upsample_w_ " << upsample_w_;
  LOG(INFO) << "## bottom[2]->height() " << bottom[2]->height();
  LOG(INFO) << "## bottom[2]->width() " << bottom[2]->width();
  // Upsampling
  top[0]->Reshape(bottom[0]->num(), bottom[0]->channels(),
      upsample_h_, upsample_w_);
  LOG(INFO) << "## top[0]->height() " << top[0]->height();
  LOG(INFO) << "## top[0]->width() " << top[0]->width();
  channels_ = bottom[0]->channels();
  height_ = bottom[0]->height();
  width_ = bottom[0]->width();
}
References:

[1] Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12) (2017) 2481–2495
[2] SegNet official code: https://github.com/alexgkendall/SegNet-Tutorial
[3] caffe-segnet-cudnn5: https://github.com/timosaemann/caffe-segnet-cudnn5
