Input:
bottom[0]: N×K×1×1, where N is the number of samples and K is the number of categories. It holds the predicted scores.
bottom[1]: N×1×1×1, where N is the number of samples. Each element is the ground-truth label of one sample, taking a value in {0, 1, 2, ..., K-1}.
Output:
top[0]: 1×1×1×1, the scalar hinge loss.
About the hinge loss:

    E = (1/N) · Σ_n Σ_k [ max(0, 1 - δ(l_n = k) · t_nk) ]^p

p: the norm; defaults to the L1 norm (p = 1) and can be set to L1 or L2 in the layer configuration.
δ(l_n = k): the indicator function; it equals 1 if the true label l_n of the nth sample is k, and -1 otherwise.
t_nk: the predicted value of the kth dimension of the nth sample in bottom[0].
Forward propagation source code analysis:

template <typename Dtype>
void HingeLossLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // Predicted scores: num samples with dim values each.
  const Dtype* bottom_data = bottom[0]->cpu_data();
  Dtype* bottom_diff = bottom[0]->mutable_cpu_diff();
  // Ground-truth labels of the num samples.
  const Dtype* label = bottom[1]->cpu_data();
  int num = bottom[0]->num();
  int count = bottom[0]->count();
  int dim = count / num;
  caffe_copy(count, bottom_data, bottom_diff);
  // label[i] stores the true class of sample i, in {0, 1, ..., K-1}.
  // Multiplying the label[i]-th prediction of sample i by -1 implements
  // the delta(l_n = k) sign in the hinge-loss formula.
  for (int i = 0; i < num; ++i) {
    bottom_diff[i * dim + static_cast<int>(label[i])] *= -1;
  }
  // Compute max(0, 1 - delta * t_nk) and store it in bottom_diff,
  // i.e. bottom[0]->mutable_cpu_diff().
  for (int i = 0; i < num; ++i) {
    for (int j = 0; j < dim; ++j) {
      bottom_diff[i * dim + j] = std::max(
          Dtype(0), 1 + bottom_diff[i * dim + j]);
    }
  }
  Dtype* loss = top[0]->mutable_cpu_data();
  switch (this->layer_param_.hinge_loss_param().norm()) {
  case HingeLossParameter_Norm_L1:
    // L1 norm: sum of absolute values.
    loss[0] = caffe_cpu_asum(count, bottom_diff) / num;
    break;
  case HingeLossParameter_Norm_L2:
    // L2 norm: sum of squares.
    loss[0] = caffe_cpu_dot(count, bottom_diff, bottom_diff) / num;
    break;
  default:
    LOG(FATAL) << "Unknown Norm";
  }
}
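The norm read from layer_param_.hinge_loss_param().norm() is chosen in the network definition. A typical prototxt fragment would look like the sketch below (the blob names fc_out and label are placeholders, not taken from this article):

```protobuf
layer {
  name: "loss"
  type: "HingeLoss"
  bottom: "fc_out"   # N x K predictions
  bottom: "label"    # N x 1 ground-truth labels
  top: "loss"
  hinge_loss_param { norm: L2 }  # omit for the default L1 norm
}
```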
Principle of backward propagation:
Because bottom[1] holds the ground truth, no gradient needs to be propagated to it; only bottom[0] receives a gradient, namely the partial derivative of the loss E with respect to each prediction t_nk.
Taking the L2 norm as an example, the partial derivative is:

    ∂E/∂t_nk = -(2/N) · δ(l_n = k) · max(0, 1 - δ(l_n = k) · t_nk)

where δ(l_n = k) equals 1 if the true label of the nth sample is k, and -1 otherwise.
Backward propagation source code analysis:

template <typename Dtype>
void HingeLossLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down,
    const vector<Blob<Dtype>*>& bottom) {
  if (propagate_down[1]) {
    LOG(FATAL) << this->type()
               << " Layer cannot backpropagate to label inputs.";
  }
  if (propagate_down[0]) {
    // Forward_cpu already left max(0, 1 - delta * t_nk) in bottom_diff.
    Dtype* bottom_diff = bottom[0]->mutable_cpu_diff();
    const Dtype* label = bottom[1]->cpu_data();
    int num = bottom[0]->num();
    int count = bottom[0]->count();
    int dim = count / num;
    // Multiply the true-class entry of each sample by -1, restoring the
    // -delta(l_n = k) factor of the partial derivative.
    for (int i = 0; i < num; ++i) {
      bottom_diff[i * dim + static_cast<int>(label[i])] *= -1;
    }
    const Dtype loss_weight = top[0]->cpu_diff()[0];
    switch (this->layer_param_.hinge_loss_param().norm()) {
    case HingeLossParameter_Norm_L1:
      // Derivative of the L1 norm: the sign of each element
      // (1 if positive, -1 if negative, 0 if zero).
      caffe_cpu_sign(count, bottom_diff, bottom_diff);
      // Scale by loss_weight / num.
      caffe_scal(count, loss_weight / num, bottom_diff);
      break;
    case HingeLossParameter_Norm_L2:
      // Derivative of the L2 norm: just scale by 2 * loss_weight / num.
      caffe_scal(count, loss_weight * 2 / num, bottom_diff);
      break;
    default:
      LOG(FATAL) << "Unknown Norm";
    }
  }
}
Copyright notice: this article is the blogger's original work and may not be reproduced without the blogger's permission.