[Caffe] Layer Source Code Analysis


http://imbinwang.github.io/blog/inside-caffe-code-layer/
Bin Wang

June 30, 2015
8 minute Read

Layer is the largest and most complicated module in Caffe; it is the basic computing unit of the network. Because Caffe emphasizes modular design, each layer is only allowed to perform one specific class of computation, such as convolution, pooling, a non-linear transformation, an inner product, or data loading, normalization, and loss calculation.

Module Description

The input data of each layer comes from some 'bottom' blobs, and it outputs some 'top' blobs. The parameter description of each type of layer in Caffe is defined in the caffe.proto file, and the concrete layer parameter values are given in the application's protocol buffer network structure description file. For example, the parameter description of the convolution layer (ConvolutionLayer) is defined in caffe.proto as follows,

// In caffe.proto
// Message that stores parameters used by ConvolutionLayer
message ConvolutionParameter {
  optional uint32 num_output = 1; // The number of outputs for the layer
  optional bool bias_term = 2 [default = true]; // whether to have bias terms
  // Pad, kernel size, and stride are all given as a single value for equal
  // dimensions in height and width or as Y, X pairs.
  optional uint32 pad = 3 [default = 0]; // The padding size (equal in Y, X)
  optional uint32 pad_h = 9 [default = 0]; // The padding height
  optional uint32 pad_w = 10 [default = 0]; // The padding width
  optional uint32 kernel_size = 4; // The kernel size (square)
  optional uint32 kernel_h = 11; // The kernel height
  optional uint32 kernel_w = 12; // The kernel width
  optional uint32 group = 5 [default = 1]; // The group size for group conv
  optional uint32 stride = 6 [default = 1]; // The stride (equal in Y, X)
  optional uint32 stride_h = 13; // The stride height
  optional uint32 stride_w = 14; // The stride width
  optional FillerParameter weight_filler = 7; // The filler for the weight
  optional FillerParameter bias_filler = 8; // The filler for the bias
  enum Engine {
    DEFAULT = 0;
    CAFFE = 1;
    CUDNN = 2;
  }
  optional Engine engine = 15 [default = DEFAULT];
}

The parameter description covers the number of output feature maps, the kernel size, and the stride of the convolution. In the examples/mnist/lenet_train_test.prototxt network structure description file, a concrete convolution layer (ConvolutionLayer) is defined like this,

# In examples/mnist/lenet_train_test.prototxt
layer {
  name: "conv1"        # the layer's name
  type: "Convolution"  # the layer's type, i.e. which computation it implements
  bottom: "data"       # name of the layer's input data blob
  top: "conv1"         # name of the layer's output data blob
  param {              # parameters related to the layer's weights and biases
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {  # parameters of the convolution operation
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}

The input and output structure of the layer:
bottom blob -> conv layer -> top blob

Each type of layer needs to define three key operations: LayerSetUp, Forward, and Backward (a minimal sketch of a layer overriding these hooks follows the list):

LayerSetUp: initializes the layer and its connections while the network is being constructed
Forward: forward pass of network data; given the bottom input data, computes the output into the top blobs
Backward: backward pass of the network error; given the gradient of the top blobs, computes the gradient of the bottom blobs and stores it in them
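
The sketch below illustrates these three hooks with a made-up element-wise scaling layer. It is not code from the Caffe source; the class name MyScaleLayer, the constant factor, and the use of caffe_cpu_scale are assumptions made only for illustration, and the class is assumed to be compiled inside the caffe namespace with the usual Caffe headers.

  // Hypothetical layer that computes top = 2 * bottom, for illustration only
  template <typename Dtype>
  class MyScaleLayer : public Layer<Dtype> {
   public:
    explicit MyScaleLayer(const LayerParameter& param) : Layer<Dtype>(param) {}

    virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
        const vector<Blob<Dtype>*>& top) {
      // one-time, layer-specific initialization, e.g. reading fields from layer_param_
    }
    virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
        const vector<Blob<Dtype>*>& top) {
      top[0]->ReshapeLike(*bottom[0]);  // the output has the same shape as the input
    }

   protected:
    virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
        const vector<Blob<Dtype>*>& top) {
      // top = 2 * bottom
      caffe_cpu_scale(bottom[0]->count(), Dtype(2),
          bottom[0]->cpu_data(), top[0]->mutable_cpu_data());
    }
    virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
        const vector<bool>& propagate_down,
        const vector<Blob<Dtype>*>& bottom) {
      if (propagate_down[0]) {
        // bottom diff = 2 * top diff, the chain rule for y = 2x
        caffe_cpu_scale(top[0]->count(), Dtype(2),
            top[0]->cpu_diff(), bottom[0]->mutable_cpu_diff());
      }
    }
  };

Note that a layer never calls these hooks itself; the SetUp, Forward and Backward methods of the parent class Layer do, as shown in the following sections.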
Implementation Details

There are seven header files related to Layer in Caffe.

layer.hpp: the parent class Layer, which defines the basic interface of all layers.
data_layers.hpp: inherits from the parent class Layer; defines the child layers related to input data, such as DataLayer, HDF5DataLayer and ImageDataLayer.
vision_layers.hpp: inherits from the parent class Layer; defines the child layers related to feature extraction, such as ConvolutionLayer, PoolingLayer and LRNLayer.
neuron_layers.hpp: inherits from the parent class Layer; defines the child layers related to non-linear transformations, such as ReLULayer, TanHLayer and SigmoidLayer.
loss_layers.hpp: inherits from the parent class Layer; defines the child layers related to computing the output error, such as EuclideanLossLayer, SoftmaxWithLossLayer and HingeLossLayer.
common_layers.hpp: inherits from the parent class Layer; defines the child layers that reshape intermediate results or perform element-wise operations, such as ConcatLayer, InnerProductLayer and SoftmaxLayer.
layer_factory.hpp: the Layer factory class, which maintains the mapping from existing layer names to the corresponding layer construction methods.

Each layer defines a CPU and/or GPU implementation depending on its needs; for example, the CPU and GPU implementations of ConvolutionLayer live in the two files conv_layer.cpp and conv_layer.cu.

Parent Class Layer

layer.hpp defines the basic interface of Layer. The member variables,

 protected:
  /** The protobuf that stores the layer parameters */
  // layer description parameters, read from the protocol buffer network structure description file
  LayerParameter layer_param_;
  /** The phase: TRAIN or TEST */
  // the layer's state: whether it takes part in network training or testing
  Phase phase_;
  /** The vector that stores the learnable parameters as a set of blobs. */
  // the layer's weight and bias parameters; a vector is used because the weights
  // and the biases are stored separately in two blobs
  vector<shared_ptr<Blob<Dtype> > > blobs_;
  /** Vector indicating whether to compute the diff of each param blob. */
  // marks whether each parameter blob needs its backward-pass gradient computed
  vector<bool> param_propagate_down_;

  /** The vector that indicates whether each top blob has a non-zero weight in
   *  the objective function. */
  // zero for layers other than loss layers; gives the loss weight of each top blob
  vector<Dtype> loss_;

Constructor and destructor,

  /**
   * You should not implement your own constructor. Any set up code should go
   * to SetUp(), where the dimensions of the bottom blobs are provided to the
   * layer.
   */
  // The constructor does not need to be overridden; any initialization work is done
  // in SetUp(). The constructor only copies the layer parameter description, and also
  // copies the weights and biases if the layer description provides them.
  explicit Layer(const LayerParameter& param)
    : layer_param_(param) {
      // Set phase and copy blobs (if there are any).
      phase_ = param.phase();
      if (layer_param_.blobs_size() > 0) {
        blobs_.resize(layer_param_.blobs_size());
        for (int i = 0; i < layer_param_.blobs_size(); ++i) {
          blobs_[i].reset(new Blob<Dtype>());
          blobs_[i]->FromProto(layer_param_.blobs(i));
        }
      }
    }
  // virtual destructor
  virtual ~Layer() {}

The initialization function SetUp; every Layer object must follow this fixed invocation pattern,

  /**
   * @brief Implements common layer setup functionality.
   * @brief Implements the setup of every Layer object
   * @param bottom the preshaped input blobs
   * @param bottom the layer's input data; storage for these blobs has already been allocated
   * @param top
   *     the allocated but unshaped output blobs, to be shaped by Reshape
   * @param top the output data; the blob objects are constructed but their storage has not
   *     been allocated; the concrete size is decided jointly by the bottom blobs' shapes and
   *     layer_param_, and is actually set in the Reshape function
   *
   * Checks that the number of bottom and top blobs is correct.
   * Calls LayerSetUp to do special layer setup for individual layer types,
   * followed by Reshape to set up sizes of top blobs and internal buffers.
   * Sets up the loss weight multiplier blobs for any non-zero loss weights.
   * This method may not be overridden.
   * 1. Check whether the numbers of input and output blobs meet the requirements; each layer
   *    can handle different numbers of input and output blobs
   * 2. Call LayerSetUp to do the layer-specific initialization; each subclass layer needs to
   *    override this function to complete its custom initialization
   * 3. Call Reshape to allocate storage of the appropriate size for the top blobs
   * 4. Set the loss weight multiplier of each top blob; the value is zero for non-loss layers
   *
   * This method is not virtual and is not overridden; the invocation pattern is fixed.
   */
  void SetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    CheckBlobCounts(bottom, top);
    LayerSetUp(bottom, top);
    Reshape(bottom, top);
    SetLossWeights(top);
  }

The LayerSetUp initialization function, which each subclass layer must override,

  /**
   * @brief Does layer-specific setup: your layer should implement this function
   *        as well as Reshape.
   * @brief Custom initialization; each subclass layer must implement this virtual function
   *
   * @param bottom
   *     the preshaped input blobs, whose data fields store the input data for this layer
   * @param bottom the input blobs; their data members data_ and diff_ hold the related data
   * @param top
   *     the allocated but unshaped output blobs
   * @param top the output blobs; the blob objects are constructed but the storage of their
   *     data members has not yet been allocated
   *
   * This method should do one-time layer specific setup. This includes reading
   * and processing relevant parameters from the <code>layer_param_</code>.
   * Setting up the shapes of top blobs and internal buffers should be done in
   * <code>Reshape</code>, which will be called before the forward pass to
   * adjust the top blob sizes.
   * This method performs customized layer initialization, including reading and processing
   * the related layer weight and bias parameters from layer_param_; the Reshape function is
   * then called to allocate storage for the top blobs.
   */
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {}

The Reshape function, which each subclass layer must override; it sets the shapes of the top blobs and allocates storage for them.

  /**
   * @brief Adjust the shapes of top blobs and internal buffers to accommodate
   *        the shapes of the bottom blobs.
   * @brief Compute the shapes of the top blobs from the shapes of the bottom blobs and
   *        layer_param_, and allocate storage accordingly
   *
   * @param bottom the input blobs, with the requested input shapes
   * @param top the top blobs, which should be reshaped as needed
   *
   * This method should reshape top blobs as needed according to the shapes
   * of the bottom (input) blobs, as well as reshaping any internal buffers
   * and making any other necessary adjustments so that the layer can
   * accommodate the bottom blobs.
   */
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) = 0;

The forward propagation function Forward and the backward propagation function Backward,

  inline Dtype Forward(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  inline void Backward(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,
      const vector<Blob<Dtype>*>& bottom);

These two functions are not virtual. They call the following virtual functions to carry out the forward pass of data and the backward pass of errors; each subclass layer must override the CPU versions, and may also override the GPU versions depending on its execution environment (otherwise the CPU code is used as a backup),
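
Before the hook declarations, here is a simplified sketch of how the non-virtual Forward wrapper in layer.hpp dispatches between the CPU and GPU hooks. It is not the verbatim code; in particular, the accumulation of the loss over the top blobs of loss layers is only indicated by comments.

// Simplified sketch of the dispatch inside Layer<Dtype>::Forward (layer.hpp)
template <typename Dtype>
inline Dtype Layer<Dtype>::Forward(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  Dtype loss = 0;
  switch (Caffe::mode()) {
  case Caffe::CPU:
    Forward_cpu(bottom, top);
    // for loss layers: accumulate the top blobs' data weighted by their loss weights
    break;
  case Caffe::GPU:
    Forward_gpu(bottom, top);
    // same loss accumulation, reading the results back from the GPU
    break;
  default:
    LOG(FATAL) << "Unknown caffe mode.";
  }
  return loss;
}

Backward dispatches to Backward_cpu or Backward_gpu in the same way.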

  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) = 0;
  virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    // LOG(WARNING) << "Using CPU code as backup.";
    return Forward_cpu(bottom, top);
  }

  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,
      const vector<Blob<Dtype>*>& bottom) = 0;
  virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,
      const vector<Blob<Dtype>*>& bottom) {
    // LOG(WARNING) << "Using CPU code as backup.";
    Backward_cpu(top, propagate_down, bottom);
  }

The layer serialization function: the layer description parameters layer_param_ and the layer's weight and bias parameters blobs_ are copied into a LayerParameter object, so that they can easily be written to disk,

// Serialize LayerParameter to protocol buffer
template <typename Dtype>
void Layer<Dtype>::ToProto(LayerParameter* param, bool write_diff) {
  param->Clear();
  param->CopyFrom(layer_param_);  // copy the layer description parameters layer_param_
  param->clear_blobs();
  // copy the layer's weight and bias parameters blobs_
  for (int i = 0; i < blobs_.size(); ++i) {
    blobs_[i]->ToProto(param->add_blobs(), write_diff);
  }
}
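
As a usage sketch (not taken from the original post), this is roughly how a whole network ends up on disk: Net::ToProto calls each layer's ToProto, and the resulting NetParameter is written out. The snippet assumes an already constructed Net<float> object named net; the file name is made up.

// Sketch: serializing all layers of a constructed Net<float> `net` to disk
caffe::NetParameter net_param;
net.ToProto(&net_param, false);  // false: do not write the diff blobs
caffe::WriteProtoToBinaryFile(net_param, "my_snapshot.caffemodel");  // hypothetical file name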
Subclass Data Layers

Data enters Caffe's processing pipeline through the data layers, which sit at the bottom of the network Net. The data can come from efficient databases (LevelDB or LMDB), directly from memory, or, when efficiency matters less, from HDF5 files or common image formats on disk. The data layers inherit from Layer,

The final subclass layers include DataLayer, ImageDataLayer, WindowDataLayer, MemoryDataLayer, HDF5DataLayer, HDF5OutputLayer and DummyDataLayer. Only DataLayer is analyzed here; the other data layers are similar.

First, look at the LayerSetUp implementation of DataLayer. DataLayer inherits this method directly from its parent class BasePrefetchingDataLayer,

// In base_data_layer.cpp
template <typename Dtype>
void BasePrefetchingDataLayer<Dtype>::LayerSetUp(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  // 1. Call the parent class BaseDataLayer's LayerSetUp method
  BaseDataLayer<Dtype>::LayerSetUp(bottom, top);
  // Now, start the prefetch thread. Before calling prefetch, we make two
  // cpu_data calls so that the prefetch thread does not accidentally make
  // simultaneous cudaMalloc calls when the main thread is running. In some
  // GPUs this seems to cause failures if we do not.
  // 2. Touch the prefetch data space; this allocates the storage for the
  //    prefetched data in advance
  this->prefetch_data_.mutable_cpu_data();
  if (this->output_labels_) {
    this->prefetch_label_.mutable_cpu_data();
  }

  // 3. Create the thread that prefetches data
  DLOG(INFO) << "Initializing prefetch";
  this->CreatePrefetchThread();
  DLOG(INFO) << "Prefetch initialized";
}

The execution process is roughly:

1. Call the parent class BaseDataLayer's LayerSetUp method,

    // In base_data_layer.cpp
    template <typename Dtype>
    void BaseDataLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
          const vector<Blob<Dtype>*>& top) {
      if (top.size() == 1) {
        output_labels_ = false;
      } else {
        output_labels_ = true;
      }
      // The subclasses should setup the size of bottom and top
      DataLayerSetUp(bottom, top);
      data_transformer_.reset(
          new DataTransformer<Dtype>(transform_param_, this->phase_));
      data_transformer_->InitRand();
    }

It decides, from the number of top blobs, whether a data label is output, assigns output_labels_ accordingly, and then calls its own DataLayerSetUp method,

    // In data_layer.cpp
    template <typename Dtype>
    void DataLayer<Dtype>::DataLayerSetUp(const vector<Blob<Dtype>*>& bottom,
          const vector<Blob<Dtype>*>& top) {
      // Initialize DB
      // open the source database
      db_.reset(db::GetDB(this->layer_param_.data_param().backend()));
      db_->Open(this->layer_param_.data_param().source(), db::READ);
      cursor_.reset(db_->NewCursor());

      // Check if we should randomly skip a few data points
      if (this->layer_param_.data_param().rand_skip()) {
        unsigned int skip = caffe_rng_rand() %
            this->layer_param_.data_param().rand_skip();
        LOG(INFO) << "Skipping " << skip << " data points";
        while (skip-- > 0) {
          cursor_->Next();
        }
      }
      // Read a data point, with which to initialize the top blob.
      // read one data object, used only to work out the required storage size;
      // it is not exported to the top blob
      Datum datum;
      datum.ParseFromString(cursor_->value());
      bool force_color = this->layer_param_.data_param().force_encoded_color();
      if ((force_color && DecodeDatum(&datum, true)) ||
          DecodeDatumNative(&datum)) {
        LOG(INFO) << "Decoding Datum";
      }
      // image
      // preprocessing of the data object
      int crop_size = this->layer_param_.transform_param().crop_size();
      if (crop_size > 0) {
        // allocate storage for the top blob and for the prefetched data
        top[0]->Reshape(this->layer_param_.data_param().batch_size(),
            datum.channels(), crop_size, crop_size);
        this->prefetch_data_.Reshape(this->layer_param_.data_param().batch_size(),
            datum.channels(), crop_size, crop_size);
        this->transformed_data_.Reshape(1, datum.channels(), crop_size, crop_size);
      } else {
        top[0]->Reshape(this->layer_param_.data_param().batch_size(),
            datum.channels(), datum.height(), datum.width());
        this->prefetch_data_.Reshape(this->layer_param_.data_param().batch_size(),
            datum.channels(), datum.height(), datum.width());
        this->transformed_data_.Reshape(1, datum.channels(), datum.height(), datum.width());
      }
      LOG(INFO) << "output data size: " << top[0]->num() << ","
          << top[0]->channels() << "," << top[0]->height() << ","
          << top[0]->width();
      // label
      if (this->output_labels_) {
        vector<int> label_shape(1, this->layer_param_.data_param().batch_size());
        top[1]->Reshape(label_shape);
        this->prefetch_label_.Reshape(label_shape);
      }
    }

It opens the source database, reads one data object, preprocesses it, and allocates storage for the top blobs and for the prefetched data.

2. Touch the prefetch data space, so that the storage for the prefetched data is allocated in advance.

3. Call the CreatePrefetchThread method to create the thread that prefetches data.
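
For reference, in the Caffe version this post analyzes, BasePrefetchingDataLayer inherits from InternalThread, and the two helpers used here and in Forward_cpu below are thin wrappers around it. The following is a sketch under that assumption, not necessarily the exact code of base_data_layer.cpp:

// Sketch of the prefetch-thread helpers, assuming the InternalThread-based implementation
template <typename Dtype>
void BasePrefetchingDataLayer<Dtype>::CreatePrefetchThread() {
  this->data_transformer_->InitRand();
  CHECK(StartInternalThread()) << "Thread execution failed";
}

template <typename Dtype>
void BasePrefetchingDataLayer<Dtype>::JoinPrefetchThread() {
  CHECK(WaitForInternalThreadToExit()) << "Thread joining failed";
}

// The thread body runs the subclass's InternalThreadEntry(), which for DataLayer reads
// the next batch from the database into prefetch_data_ / prefetch_label_.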

This completes the layer's initialization. Next, look at the Forward implementation of DataLayer. Because DataLayer sits at the bottom of the network, it does not need to implement Backward. DataLayer inherits the Forward method directly from its parent class BasePrefetchingDataLayer, and only the CPU version Forward_cpu is implemented,

// In base_data_layer.cpp
template <typename Dtype>
void BasePrefetchingDataLayer<Dtype>::Forward_cpu(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  // First, join the thread
  // wait for the prefetch thread to finish fetching data
  JoinPrefetchThread();
  DLOG(INFO) << "Thread joined";
  // Reshape to loaded data.
  top[0]->Reshape(this->prefetch_data_.num(), this->prefetch_data_.channels(),
      this->prefetch_data_.height(), this->prefetch_data_.width());
  // Copy the data
  // copy the prefetched data into the top blobs
  caffe_copy(prefetch_data_.count(), prefetch_data_.cpu_data(),
      top[0]->mutable_cpu_data());
  DLOG(INFO) << "Prefetch copied";
  if (this->output_labels_) {
    caffe_copy(prefetch_label_.count(), prefetch_label_.cpu_data(),
        top[1]->mutable_cpu_data());
  }
  // Start a new prefetch thread
  // create a new thread to prefetch the next batch of data
  DLOG(INFO) << "CreatePrefetchThread";
  CreatePrefetchThread();
}

As you can see, DataLayer's Forward_cpu fetches data from the data source ahead of time in another thread, copies the prefetched data into the top blobs when it is needed, and thereby completes the forward propagation of the data.

P.S. Note that at the end of the data_layer.cpp file there are two macro invocations,

INSTANTIATE_CLASS(DataLayer);
REGISTER_LAYER_CLASS(Data);

What are they used for? Take a look at their definitions,

// ------ in common.hpp ------
// Instantiate a class with float and double specifications.
#define INSTANTIATE_CLASS(classname) \
  char gInstantiationGuard##classname; \
  template class classname<float>; \
  template class classname<double>
// ------ in common.hpp ------

// ------ in layer_factory.hpp ------
#define REGISTER_LAYER_CREATOR(type, creator)                                  \
  static LayerRegisterer<float> g_creator_f_##type(#type, creator<float>);     \
  static LayerRegisterer<double> g_creator_d_##type(#type, creator<double>)    \

#define REGISTER_LAYER_CLASS(type)                                             \
  template <typename Dtype>                                                    \
  shared_ptr<Layer<Dtype> > Creator_##type##Layer(const LayerParameter& param) \
  {                                                                            \
    return shared_ptr<Layer<Dtype> >(new type##Layer<Dtype>(param));           \
  }                                                                            \
  REGISTER_LAYER_CREATOR(type, Creator_##type##Layer)
// ------ in layer_factory.hpp ------

INSTANTIATE_CLASS(DataLayer) instantiates the DataLayer class template for float and double, and REGISTER_LAYER_CLASS(Data) registers DataLayer's construction method with the layer factory, so that a layer object can be obtained directly from the layer's type name ("Data"). The built-in layers in Caffe all add these two macros at the end of their implementation code.
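
As a small usage sketch (not from the original post), once a layer class has been registered this way, an object of it can be obtained from the factory purely by its type name. The snippet assumes the Caffe headers and namespace are available:

// Sketch: creating a layer through the factory by its registered type name
caffe::LayerParameter param;
param.set_type("Data");  // the name registered by REGISTER_LAYER_CLASS(Data)
boost::shared_ptr<caffe::Layer<float> > layer =
    caffe::LayerRegistry<float>::CreateLayer(param);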
