"Turn" Caffe preliminary Examination (vii) other commonly used layers and parameters


This article covers some of the other commonly used layers, including the Softmax-loss layer, the Inner Product layer, the Accuracy layer, the Reshape layer, and the Dropout layer, together with their parameter configuration.

1. Softmax-loss

The Softmax-loss layer and the Softmax layer compute roughly the same thing. Softmax is a classifier that outputs the probability (likelihood) of each class and is a generalization of logistic regression.

Logistic regression can only be used for binary classification, while Softmax can be used for multi-class classification.

The difference between Softmax and Softmax-loss:

The Softmax calculation formula, for a score vector z over K classes:

    \sigma_j(z) = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}

and the Softmax-loss calculation formula, where y is the ground-truth class:

    L = -\log \sigma_y(z) = -z_y + \log \sum_{k=1}^{K} e^{z_k}
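To make the difference concrete, here is a minimal NumPy sketch (not Caffe code; the score vector z and the label y are made-up values) that computes the softmax probabilities and the softmax loss separately:

    import numpy as np

    def softmax(z):
        # Subtract the max for numerical stability; the result is unchanged.
        e = np.exp(z - np.max(z))
        return e / e.sum()

    def softmax_loss(z, y):
        # Negative log-likelihood of the true class y under softmax(z).
        return -np.log(softmax(z)[y])

    z = np.array([2.0, 1.0, 0.1])   # made-up class scores
    y = 0                           # made-up ground-truth label
    print(softmax(z))               # probabilities, approx. [0.659 0.242 0.099]
    print(softmax_loss(z, y))       # -log(0.659), approx. 0.417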

A more detailed comparison of the two can be found in: Softmax vs. Softmax-loss

A user may only want the probability of each category, in which case only a Softmax layer is needed and no Softmax-loss operation. Conversely, a user may have already obtained likelihood values by some other means and only want to run the maximum-likelihood (loss) computation on them; in that case only the Softmax-loss part is needed, without the preceding Softmax operation. Providing two separate layer types is therefore much more flexible than providing only a single combined softmax-loss layer.

Neither the Softmax layer nor the Softmax-loss layer has any parameters; only the layer type differs.

Softmax-loss layer (outputs the loss value):

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip1"
  bottom: "label"
  top: "loss"
}

Softmax layer (outputs the likelihood values):

layer {
  name: "prob"
  type: "Softmax"
  bottom: "cls3_fc"
  top: "prob"
}

2. Inner Product

This is the fully connected layer: the input is treated as a single vector, and the output is also a simple vector (the width and height of the output blob both become 1).

Input: n*c0*h*w

Output: n*c1*1*1

The fully connected layer is actually a convolution layer whose kernels are the same size as the input data, so its parameters are essentially the same as those of a convolution layer.
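A minimal NumPy sketch of this equivalence (not Caffe code; the shapes n = 2, c0*h*w = 3*4*4 and num_output c1 = 5 are made up): the fully connected layer flattens each input blob and multiplies it by a weight matrix, which is exactly the arithmetic of a convolution whose kernel covers the entire input.

    import numpy as np

    n, c0, h, w = 2, 3, 4, 4            # made-up input blob shape
    c1 = 5                              # made-up num_output

    x = np.random.randn(n, c0, h, w)
    W = np.random.randn(c1, c0 * h * w) # one full-size "kernel" per output
    b = np.zeros(c1)

    # Fully connected view: flatten each sample, then matrix-multiply.
    fc_out = x.reshape(n, -1) @ W.T + b                  # shape (n, c1)

    # Convolution view: each kernel has the same size as the whole input,
    # so every output feature map collapses to a single 1x1 value.
    conv_out = np.einsum('nchw,ochw->no',
                         x, W.reshape(c1, c0, h, w)) + b

    print(fc_out.shape)                  # (2, 5), i.e. n*c1*1*1 in blob terms
    print(np.allclose(fc_out, conv_out)) # True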

Layer type: InnerProduct

lr_mult: the multiplier on the learning rate; the effective learning rate is this number multiplied by base_lr in the solver.prototxt configuration file. If there are two lr_mult entries, the first applies to the weights and the second to the bias. The bias learning rate is generally set to twice the weight learning rate. For example, with base_lr: 0.01, lr_mult: 1 gives the weights a learning rate of 0.01 and lr_mult: 2 gives the bias 0.02.

Parameters that must be set:

  num_output: the number of filters, i.e. the output dimension c1

Other parameters:

  weight_filler: initialization of the weights. The default is "constant" with all values 0; in practice the "xavier" algorithm is often used, and "gaussian" is also an option.

  bias_filler: initialization of the bias. This is generally set to "constant" with value 0.

  bias_term: whether to use a bias term; defaults to true (enabled).

Layer {name:"ip1"Type:"innerproduct"Bottom:"pool2"Top:"ip1"param {lr_mult:1} param {lr_mult:2} inner_product_param {num_output:500Weight_filler {type:"Xavier"} bias_filler {type:"constant"      }    }  }  

3. Accuracy

Outputs the classification (prediction) accuracy. It is only computed during the test phase, so the include parameter needs to be added.

Layer type: Accuracy

layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
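For reference, the accuracy reported by this layer is simply the fraction of samples whose highest-scoring class matches the label. A minimal NumPy sketch (not Caffe code) with made-up scores and labels:

    import numpy as np

    scores = np.array([[2.0, 0.5, 0.1],   # made-up bottom[0], e.g. ip2 output
                       [0.2, 1.5, 0.3],
                       [0.1, 0.9, 0.8]])
    labels = np.array([0, 1, 2])          # made-up bottom[1], the labels

    pred = scores.argmax(axis=1)          # predicted class per sample
    print((pred == labels).mean())        # 2 of 3 correct here: approx. 0.667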

4. Reshape

Changes the dimensions of the input blob without changing its data.

Layer type: Reshape

First, look at an example:

Layer {name:"Reshape"Type:"Reshape"Bottom:"input"Top:"Output"Reshape_param {shape {dim:0#copy the dimension from belowDim:2Dim:3Dim:-1#infer it from the other dimensions      }      }    }  

There is an optional parameter group, shape, that specifies the value of each dimension of the blob (a blob is four-dimensional data: n*c*h*w).

dim: 0 means the dimension is unchanged, i.e. the input and output have the same value for that dimension.

dim: 2 or dim: 3 sets that dimension of the output to 2 or 3.

dim: -1 means the value of that dimension is computed automatically by the system. Since the total amount of data must stay constant, the system derives it from the blob's other three dimensions.

Assume the original data is 64*3*28*28, representing 64 3-channel color images of size 28*28.

After the following reshape transformation:

reshape_param {
  shape {
    dim: 0
    dim: 0
    dim: 14
    dim: -1
  }
}

Output data is: 64*3*14*56
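As a sanity check, here is the same transformation in NumPy (a sketch, not Caffe code): dim: 0 copies the axis from the input, and -1 is inferred so that the total element count is preserved.

    import numpy as np

    x = np.zeros((64, 3, 28, 28))   # the original blob

    out = x.reshape(x.shape[0],     # dim: 0  -> keep 64
                    x.shape[1],     # dim: 0  -> keep 3
                    14,             # dim: 14
                    -1)             # dim: -1 -> inferred as 28*28/14 = 56

    print(out.shape)                # (64, 3, 14, 56); element count unchanged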

5. Dropout

Dropout is a trick for preventing overfitting: during training, it randomly makes some hidden-layer nodes of the network stop working.

First, look at an example:

Layer {name:"DROP7"Type:"Dropout"Bottom:"Fc7-conv"Top:"Fc7-conv"Dropout_param {dropout_ratio:0.5}}layer {name:"DROP7"Type:"Dropout"Bottom:"Fc7-conv"Top:"Fc7-conv"Dropout_param {dropout_ratio:0.5    }  }  

Only the dropout_ratio parameter needs to be set.
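For intuition, here is a minimal NumPy sketch (not Caffe's actual implementation) of what the layer does in the forward pass: during training each activation is zeroed with probability dropout_ratio and the survivors are scaled by 1/(1 - dropout_ratio) so the expected value is unchanged, while at test time the data passes through untouched.

    import numpy as np

    def dropout_forward(x, ratio=0.5, train=True):
        # Zero each activation with probability `ratio` during training and
        # scale the survivors so the expected value stays the same; at test
        # time the data passes through unchanged.
        if not train:
            return x
        mask = np.random.rand(*x.shape) >= ratio
        return x * mask / (1.0 - ratio)

    x = np.ones((2, 4))                     # made-up activations
    print(dropout_forward(x))               # about half the entries zeroed
    print(dropout_forward(x, train=False))  # unchanged at test time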

    

"Turn" Caffe preliminary Examination (vii) other commonly used layers and parameters

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.