Talking about the difference between train_val.prototxt and deploy.prototxt files in Caffe
Tags: Caffe, deep learning, CaffeNet    2016-11-02 16:10
Copyright notice: this is an original article by the author and may not be reproduced without permission. When I first started studying Caffe, I thought the train_val.prototxt and deploy.prototxt files looked very similar, so I tried to reconstruct the train_val.prototxt file from the deploy.prototxt file and compared the two. My understanding is limited, so some points may not be explained well; corrections and criticism are welcome.
This article takes CaffeNet as an example.
1. train_val.prototxt is the network definition file used during training.
2. deploy.prototxt is the network definition file used at test (deployment) time.
The difference: deploy.prototxt is essentially obtained by deleting things from train_val.prototxt. Because of how the two files are used, the parts of train_val.prototxt that exist only for training are removed in deploy.prototxt.
The train_val.prototxt file begins with the training data setup. In transform_param there are mirror: true (enable mirroring), crop_size (the crop size of the input images) and mean_file (the path to the mean file); in data_param there are source (the path to the preprocessed training set), batch_size (the number of images per training batch) and backend: LMDB (the data format). Next comes a second data layer for testing; training and test modes are distinguished by include { phase: TRAIN } or include { phase: TEST }. The test data layer is configured the same way, except that its batch_size can be somewhat smaller, because testing does not need as many images per batch. In deploy.prototxt, all of this is replaced by a single input layer that only specifies name, type, top and input_param, as sketched below.

Next comes the first convolutional layer. The train_val.prototxt version has two extra param blocks (learning-rate settings used during back-propagation): one for the weights and one for the bias, where the bias learning-rate multiplier is normally twice the weight multiplier. Both files then set convolution_param, but in train_val the weight_filler and bias_filler initializers must also be given. The ReLU activation layer that follows has no parameters to initialize, so it is identical in the two files. The same is true of the pooling layer, which only reduces the spatial resolution: it needs just kernel_size, stride and pool, with no parameter initialization. After that comes the LRN layer, whose full name is local response normalization; it normalizes the local input, although some papers report that this layer has little effect on the final result. Its definition is also the same in both files.

The network then repeats the pattern "conv2", "relu2", "pool2", "norm2", and so on; as before, train_val mainly adds parameter initialization and learning-rate settings. After the fifth convolutional layer comes the fully connected layer "fc6". Here train_val again has the two extra param learning-rate blocks plus the weight_filler and bias_filler initializers, while both files share the inner_product_param block that sets the number of output elements. This is followed by the ReLU activation and then a dropout layer, which is intended to prevent overfitting; its dropout_ratio is typically set to 0.5. "fc7" is defined the same way as "fc6", "relu7" and "drop7" follow the same pattern, and so does "fc8".

Then comes the accuracy layer, which computes the accuracy of the network output relative to the target labels. It is not really a loss layer and has no backward pass, although the Caffe documentation lists it in the loss-layer section. This layer does not appear in deploy.prototxt at all. Finally, the last layer of train_val.prototxt is a "SoftmaxWithLoss" layer, again defined simply by name, type, bottom and top; this layer is also absent from deploy.prototxt, which instead ends with a layer of type "Softmax".
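As a rough sketch of what this looks like in the reference CaffeNet deploy.prototxt (the batch dimension of 10 is simply the default shipped with that file), the whole data section reduces to a single input definition:

    # deploy.prototxt: no TRAIN/TEST Data layers, no mean file, no LMDB source;
    # only the input blob name and its shape (batch, channels, height, width).
    layer {
      name: "data"
      type: "Input"
      top: "data"
      input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } }
    }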
Comparing the two CaffeNet files layer by layer, the difference is that deploy.prototxt removes the training-related settings from many layers and drops the parts that exist only for back-propagation training.
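As a concrete illustration, here is a sketch of the first convolutional layer in the two files, based on the reference CaffeNet definitions; only the training-related blocks differ:

    # train_val.prototxt: learning-rate/decay multipliers and initializers are present.
    layer {
      name: "conv1"  type: "Convolution"  bottom: "data"  top: "conv1"
      param { lr_mult: 1  decay_mult: 1 }   # weights
      param { lr_mult: 2  decay_mult: 0 }   # bias (twice the weight learning rate)
      convolution_param {
        num_output: 96  kernel_size: 11  stride: 4
        weight_filler { type: "gaussian"  std: 0.01 }
        bias_filler { type: "constant"  value: 0 }
      }
    }

    # deploy.prototxt: the same layer with the training-only blocks removed;
    # the learned weights are loaded from the .caffemodel file instead.
    layer {
      name: "conv1"  type: "Convolution"  bottom: "data"  top: "conv1"
      convolution_param { num_output: 96  kernel_size: 11  stride: 4 }
    }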
One further difference needs explaining: why does train_val.prototxt use a SoftmaxWithLoss layer while deploy.prototxt uses a Softmax layer (neither has learnable parameters)? This is simply the softmax-regression part of the model: the Softmax layer only computes the class probabilities in the forward pass, whereas SoftmaxWithLoss also implements the backward pass needed for training. That is the whole difference; the details can be seen in the C++ definitions of the two layers.
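The corresponding final layers look roughly like this (taken from the reference CaffeNet files, where fc8 produces the 1000-class scores):

    # train_val.prototxt: loss layer; needs the label blob and supports the backward pass.
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "fc8"
      bottom: "label"
      top: "loss"
    }

    # deploy.prototxt: forward-only softmax that outputs class probabilities.
    layer {
      name: "prob"
      type: "Softmax"
      bottom: "fc8"
      top: "prob"
    }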
The listing below is the train_val.prototxt file (the deploy.prototxt file is the same network with the training-specific parts removed, as described above).
Name: "Caffenet" Layer { Name: "Data" Type: "Data" Top: "Data" Top: "Label" Include { Phase:train } Transform_param { Mirror:true crop_size:227 Mean_file: "Data/ilsvrc12/imagenet_mean.binaryproto" } # mean pixel/channel-wise mean instead of mean image # Transform_param { # crop_size:227 # mean_value:104 # mean_value:117 # mean_value:123 # Mirror:true # } Data_param { Source: "Examples/imagenet/ilsvrc12_train_lmdb" batch_size:256 Backend:lmdb } } Layer { Name: "Data" Type: "Data" Top: "Data" Top: "Label" Include { Phase:test } Transform_param { Mirror:false crop_size:227 Mean_file: "Data/ilsvrc12/imagenet_mean.binaryproto" } # mean pixel/channel-wise mean instead of mean image # Transform_param { # crop_size:227 # mean_value:104 # mean_value:117 # mean_value:123 # Mirror:false # } Data_param { Source: "Examples/imagenet/ilsvrc12_val_lmdb" Batch_size:50 Backend:lmdb } } Layer { Name: "Conv1" Type: "Convolution" Bottom: "Data" Top: "Conv1" param { Lr_mult:1 Decay_mult:1 } param { Lr_mult:2 decay_mult:0 } Convolution_param { num_output:96 Kernel_size:11 Stride:4 Weight_filler { Type: "Gaussian" std:0.01 } Bias_filler { Type: "Constant" value:0 } } } Layer { Name: "RELU1" Type: "ReLU" Bottom: "Conv1" Top: "Conv1" } Layer { Name: "Pool1" Type: "Pooling" Bottom: "Conv1" Top: "Pool1" Pooling_param { Pool:max Kernel_size:3 Stride:2 } } Layer { Name: "Norm1" Type: "LRN" Bottom: "Pool1" Top: "Norm1" Lrn_param { Local_size:5 alpha:0.0001 beta:0.75 } } Layer { Name: "Conv2" Type: "Convolution" Bottom: "Norm1" Top: "Conv2" param { Lr_mult:1 Decay_mult:1 } param { Lr_mult:2 decay_mult:0 } Convolution_param { num_output:256 Pad:2 Kernel_size:5 Group:2 Weight_filler { Type: "Gaussian" std:0.01 } Bias_filler { Type: "Constant" Value:1 } } } Layer { Name: "RELU2" Type: "ReLU" Bottom: "Conv2" Top: "Conv2" } Layer { Name: "Pool2" Type: "Pooling" Bottom: "Conv2" Top: "Pool2" Pooling_param { Pool:max Kernel_size:3 Stride:2 } } Layer { Name: "Norm2" Type: "LRN" Bottom: "Pool2" Top: "Norm2" Lrn_param { Local_size:5 alpha:0.0001 beta:0.75 } } Layer { Name: "Conv3" Type: "Convolution" Bottom: "Norm2" Top: "Conv3" param { Lr_mult:1 Decay_mult:1 } param { Lr_mult:2 decay_mult:0 } Convolution_param { num_output:384 Pad:1 Kernel_size:3 Weight_filler { Type: "Gaussian" std:0.01 } Bias_filler { Type: "Constant" value:0 } } } Layer { Name: "RELU3" Type: "ReLU" Bottom: "Conv3" Top: "Conv3" } Layer { Name: "Conv4" Type: "Convolution" Bottom: "Conv3" Top: "Conv4" param { Lr_mult:1 Decay_mult:1 } param { Lr_mult:2 decay_mult:0 } Convolution_param { num_output:384 Pad:1 Kernel_size:3 Group:2 Weight_filler { Type: "Gaussian" std:0.01 } Bias_filler { Type: "Constant" Value:1 } } } Layer { Name: "Relu4" Type: "ReLU" Bottom: "Conv4" Top: "Conv4" } Layer { Name: "CONV5" Type: "Convolution" Bottom: "Conv4" Top: "CONV5" param { Lr_mult:1 Decay_mult:1 } param { Lr_mult:2 decay_mult:0 } Convolution_param { num_output:256 Pad:1 Kernel_size:3 Group:2 Weight_filler { Type: "Gaussian" std:0.01 } Bias_filler { Type: "Constant" Value:1 } } } Layer { Name: "Relu5" Type: "ReLU" Bottom: "Conv5" Top: "CONV5" } Layer { Name: "Pool5" Type: "Pooling" Bottom: "Conv5" Top: "Pool5" Pooling_param { Pool:max Kernel_size:3 Stride:2 } } Layer { Name: "Fc6" Type: "Innerproduct" Bottom: "Pool5" Top: "Fc6" param { Lr_mult:1 Decay_mult:1 } param { Lr_mult:2 decay_mult:0 } Inner_product_param { num_output:4096 Weight_filler { Type: "Gaussian" std:0.005 } Bias_filler { Type: "Constant" Value:1 } } } Layer { Name: "Relu6" Type: "ReLU" Bottom: "Fc6" Top: "Fc6" } 
Layer { Name: "DROP6" Type: "Dropout" Bottom: "Fc6" Top: "Fc6" Dropout_param { dropout_ratio:0.5 } } Layer { Name: "Fc7" Type: "Innerproduct" Bottom: "Fc6" Top: "Fc7" param { Lr_mult:1 Decay_mult:1 } param { Lr_mult:2 decay_mult:0 } Inner_product_param { num_output:4096 Weight_filler { Type: "Gaussian" std:0.005 } Bias_filler { Type: "Constant" Value:1 } } } Layer { Name: "Relu7" Type: "ReLU" Bottom: "Fc7" Top: "Fc7" } Layer { |