How to configure each layer in Caffe
I recently installed Caffe on my computer. Since neural networks are built from different layer structures, and different layer types take different parameters, this post is a brief summary of each layer's configuration based on the documentation on the official Caffe website.
1. Vision Layers
1.1 Convolution Layer (Convolution)
Type: CONVOLUTION
Example
Layers { Name: "Conv1" type:convolution Bottom: "Data" Top: "Conv1" blobs_lr:1 # Learning Rate multiplier for the filters blobs_lr:2 # Learning rate multiplier for the biases weight_decay:1 # we ight decay multiplier for the filters weight_decay:0 # Weight Decay multiplier for the biases Convolution_ param { num_output:96 # Learn filters kernel_size:11 # Each filter is 11x11 stride:4 # step 4 pixels between each filter application Weight_filler { type: ' Gaussian ' # Initialize the filters from a Gaussian std:0.01 # Distribution with Stdev 0.01 (default mean:0) } Bias_filler { type: "Cons Tant "# Initialize the biases to zero (0) value:0}}}
blobs_lr: multipliers on the learning rate for this layer's parameters. In the example above, the filter (weight) learning rate is the same as the learning rate given by the solver at run time, and the bias learning rate is twice that.
weight_decay: multipliers on the global weight decay for this layer's parameters; in the example, weight decay is applied to the filters but not to the biases.
Important parameters of the convolution layer
Required Parameters:
num_output (c_o): the number of filters
kernel_size (or kernel_h and kernel_w): the size of each filter
Optional Parameters:
weight_filler [default type: "constant" value: 0]: the initialization method for the filter weights
bias_filler: the initialization method for the biases
bias_term [default true]: specifies whether to learn and apply a bias term
pad (or pad_h and pad_w) [default 0]: specifies the number of pixels to add to each side of the input
stride (or stride_h and stride_w) [default 1]: specifies the step size of the filter
group (g) [default 1]: if g > 1, the connectivity of each filter is restricted to a subset of the input. Specifically, the input and output channels are separated into g groups, and the i-th output group channels are connected only to the i-th input group channels.
Changes in size after convolution:
Input: N * c_i * h_i * w_i
Output: N * c_o * h_o * w_o, where h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1, and w_o is calculated the same way.
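As a quick sanity check (assuming, purely for illustration, a 3 * 227 * 227 input, which is not specified above): with kernel_size 11, stride 4, and pad 0, h_o = (227 + 2 * 0 - 11) / 4 + 1 = 55, so the conv1 layer above would produce a 96 * 55 * 55 output per image.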
1.2 Pooling Layer (Pooling)
Type: POOLING
Example
Layers { Name: "Pool1" type:pooling Bottom: "conv1" Top: "Pool1" pooling_param { Pool:max Kernel_size:3 # Pool over a 3x3 region stride:2 # Step-pixels (in the bottom blob) between pooling Regi ONS }}
Important parameters of the pooling layer
Required Parameters:
kernel_size (or kernel_h and kernel_w): the size of the pooling region
Optional Parameters:
pool [default MAX]: the pooling method; currently MAX, AVE, and STOCHASTIC are supported
pad (or pad_h and pad_w) [default 0]: specifies the number of pixels to add to each side of the input
stride (or stride_h and stride_w) [default 1]: specifies the step size of the pooling
Changes in size after pooling:
Input: N * c_i * h_i * w_i
Output: N * c_i * h_o * w_o (pooling does not change the number of channels), where h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1, and w_o is calculated the same way.
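Continuing the illustrative example from the convolution section: applying the pool1 layer above (kernel_size 3, stride 2, pad 0) to a 96 * 55 * 55 input gives h_o = (55 - 3) / 2 + 1 = 27, i.e. a 96 * 27 * 27 output.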
1.3 Local Response Normalization (LRN)
Type: LRN
Local response normalization normalizes over local input regions: each activation a is divided by a normalization term (the denominator) to produce a new activation b. It comes in two forms: in one, the local region extends across nearby channels (cross-channel LRN); in the other, it is a spatial region within the same channel (within-channel LRN).
Calculation formula: each input is divided by (1 + (alpha / n) * sum_i x_i^2)^beta, where n is the size of the local region and the sum is taken over the region centered at that value (with zero padding where necessary).
Optional Parameters:
local_size [default 5]: for cross-channel LRN, the number of neighboring channels to sum over; for within-channel LRN, the side length of the square spatial region to sum over
alpha [default 1]: the scaling parameter
beta [default 5]: the exponent
norm_region [default ACROSS_CHANNELS]: which form of LRN to use, ACROSS_CHANNELS or WITHIN_CHANNEL
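The documentation above gives no example, so here is a minimal sketch in the same old-style prototxt syntax used elsewhere in this post (the blob names and parameter values are only illustrative):
layers {
  name: "norm1"
  type: LRN
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}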
2. Loss Layers
Learning in deep networks is driven by loss: the loss layer compares the network output with the target, and training minimizes that loss.
2.1 Softmax
Type: SOFTMAX_LOSS
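A minimal sketch (the blob names "pred" and "label" are only illustrative, following the hinge-loss example further below):
layers {
  name: "loss"
  type: SOFTMAX_LOSS
  bottom: "pred"
  bottom: "label"
  top: "loss"
}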
2.2 Sum-of-squares/euclidean
Type: EUCLIDEAN_LOSS
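The Euclidean loss computes the sum of squared differences between its two inputs, 1/(2N) * sum_n ||x1_n - x2_n||^2. A minimal sketch (blob names are only illustrative):
layers {
  name: "loss"
  type: EUCLIDEAN_LOSS
  bottom: "pred"
  bottom: "label"
  top: "loss"
}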
2.3 Hinge/margin
Type: HINGE_LOSS
Example:
# L1 norm
layers {
  name: "loss"
  type: HINGE_LOSS
  bottom: "pred"
  bottom: "label"
}

# L2 norm
layers {
  name: "loss"
  type: HINGE_LOSS
  bottom: "pred"
  bottom: "label"
  top: "loss"
  hinge_loss_param {
    norm: L2
  }
}
Optional Parameters:
norm [default L1]: the norm used, currently L1 or L2
Input:
n * c * h * w predictions
n * 1 * 1 * 1 labels
Output:
1 * 1 * 1 * 1 computed loss
2.4 Sigmoid Cross-entropy
Type: SIGMOID_CROSS_ENTROPY_LOSS
2.5 Infogain
Type: INFOGAIN_LOSS
2.6 Accuracy and Top-k
Type: ACCURACY
Accuracy scores the output against the target; it is not actually a loss and has no backward step.
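A minimal sketch (blob names are only illustrative):
layers {
  name: "accuracy"
  type: ACCURACY
  bottom: "pred"
  bottom: "label"
  top: "accuracy"
}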
3. Activation Layers (Activation / Neuron Layers)
In general, activation layers are element-wise operators: they take a bottom blob and produce a top blob of the same size, and in general they apply a nonlinear function.
3.1 ReLU / Rectified-Linear and Leaky-ReLU
Type: RELU
Example:
Layers { Name: "RELU1" type:relu Bottom: "conv1" Top: "Conv1"}
Optional Parameters:
negative_slope [default 0]: specifies the slope used for input values less than zero
ReLU is currently the most commonly used activation function, mainly because it converges faster while achieving comparable accuracy.
The standard ReLU function is max(x, 0): for x > 0 the output is x, and for x <= 0 the output is negative_slope * x (zero by default). The ReLU layer supports in-place computation, meaning the bottom and top blobs can be the same, which avoids extra memory consumption.
3.2 Sigmoid
Type: SIGMOID
Example:
Layers { Name: "Encode1neuron" Bottom: "encode1" Top: "Encode1neuron" type:sigmoid}
The SIGMOID layer calculates the output of each input x by SIGMOID (x), such as the function.
3.3 Tanh/hyperbolic Tangent
Type: TANH
Example:
Layers { Name: "Encode1neuron" Bottom: "encode1" Top: "Encode1neuron" type:sigmoid}
The Tanh layer calculates the output of each input x by Tanh (x), such as the function.
3.4 Absolute Value
Type: ABSVAL
Example:
layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: ABSVAL
}
The ABSVAL layer computes the output for each input x as abs(x).
3.5 Power
Type: POWER
Example:
layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: POWER
  power_param {
    power: 1
    scale: 1
    shift: 0
  }
}
Optional Parameters:
power [default 1]
scale [default 1]
shift [default 0]
The POWER layer computes the output for each input x as (shift + scale * x) ^ power.
3.6 BNLL
Type: BNLL
Example:
layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: BNLL
}
The BNLL (binomial normal log likelihood) layer computes the output for each input x as log(1 + exp(x)).
4. Data Layers
Data enters Caffe through data layers, which sit at the bottom of the network. Data can come from an efficient database (LevelDB or LMDB) or directly from memory. If efficiency is not a concern, data can also be read from disk in HDF5 format or as ordinary image files.
4.1 Database
Type: DATA
Required Parameters:
source: the name of the database directory containing the data
batch_size: the number of inputs to process at one time
Optional Parameters:
rand_skip: skip up to this number of inputs at the beginning; useful for asynchronous stochastic gradient descent (SGD)
backend [default LEVELDB]: choose whether to use LEVELDB or LMDB
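A minimal sketch in the same old-style syntax as the earlier examples (the database path is hypothetical):
layers {
  name: "data"
  type: DATA
  top: "data"
  top: "label"
  data_param {
    source: "train_lmdb" # hypothetical database directory
    backend: LMDB
    batch_size: 64
  }
}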
4.2 in-memory
Type: MEMORY_DATA
Required Parameters:
batch_size, channels, height, width: specify the size of the data chunks read from memory
The memory data layer reads data directly from memory, without copying it. To use it, call MemoryDataLayer::Reset (from C++) or Net.set_input_arrays (from Python) to specify a source of contiguous data (as a 4D row-major array), which is read one batch-sized chunk at a time.
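A minimal sketch (the dimensions are only illustrative):
layers {
  name: "data"
  type: MEMORY_DATA
  top: "data"
  top: "label"
  memory_data_param {
    batch_size: 32
    channels: 3
    height: 224
    width: 224
  }
}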
4.3 HDF5 Input
Type: HDF5_DATA
Required Parameters:
source: the name of the file to read
batch_size: the number of inputs to process at one time
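A minimal sketch (the source filename is hypothetical):
layers {
  name: "data"
  type: HDF5_DATA
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "train_hdf5_list.txt" # hypothetical file listing the HDF5 files
    batch_size: 64
  }
}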
4.4 HDF5 Output
Type: HDF5_OUTPUT
Required Parameters:
file_name: the name of the output file
The HDF5 output layer plays the opposite role to the other layers in this section: it writes its input blobs to disk.
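A minimal sketch (the output filename is hypothetical):
layers {
  name: "data_output"
  type: HDF5_OUTPUT
  bottom: "data"
  bottom: "label"
  hdf5_output_param {
    file_name: "output.h5" # hypothetical output file
  }
}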
4.5 Images
Type: IMAGE_DATA
Required Parameters:
source: the name of a text file in which each line gives an image filename and its label
batch_size: the number of images in a batch
Optional Parameters:
rand_skip: skip up to this number of inputs at the beginning; useful for asynchronous stochastic gradient descent (SGD)
shuffle [default false]: whether to shuffle the order of the images
new_height, new_width: resize all images to this size
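A minimal sketch (the source filename and sizes are only illustrative):
layers {
  name: "data"
  type: IMAGE_DATA
  top: "data"
  top: "label"
  image_data_param {
    source: "file_list.txt" # hypothetical list of "image_path label" lines
    batch_size: 32
    shuffle: true
    new_height: 256
    new_width: 256
  }
}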
4.6 Windows
Type: WINDOW_DATA
4.7 Dummy
Type: DUMMY_DATA
The DUMMY_DATA layer is used for development and debugging; see DummyDataParameter for its specific parameters.
5. Common Layers
5.1 Fully Connected Layer (Inner Product)
Type: INNER_PRODUCT
Example:
Layers { Name: "Fc8" type:inner_product blobs_lr:1 # Learning rate multiplier for the filters Blobs_lr:2 # Learning rate multiplier for the biases weight_decay:1 # Weight Decay multiplier for the filter s weight_decay:0 # Weight Decay multiplier for the biases inner_product_param { num_output:1000 Weight_filler { type: ' Gaussian ' std:0.01 } bias_filler { type: ' Constant ' value:0 } } bottom: "fc7" Top: "Fc8"}
Required Parameters:
num_output (c_o): the number of filters
Optional Parameters:
weight_filler [default type: "constant" value: 0]: the initialization method for the weights
bias_filler: the initialization method for the biases
bias_term [default true]: specifies whether to learn and apply a bias term
Size changes after the fully connected layer:
Input: N * c_i * h_i * w_i
Output: N * c_o * 1 * 1
5.2 Splitting
Type: SPLIT
The splitting layer splits an input blob into multiple output blobs. It is used when a blob needs to be fed into multiple subsequent layers.
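A minimal sketch (blob names are only illustrative):
layers {
  name: "split"
  type: SPLIT
  bottom: "in"
  top: "out1"
  top: "out2"
}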
5.3 Flattening
Type: FLATTEN
The FLATTEN layer turns an input of size n * c * h * w into a simple vector of size n * (c*h*w) * 1 * 1.
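A minimal sketch (blob names are only illustrative):
layers {
  name: "flatten"
  type: FLATTEN
  bottom: "in"
  top: "out"
}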
5.4 Concatenation
Type: CONCAT
Example:
Layers { Name: "Concat" Bottom: "in1" Bottom: "in2" Top: "Out" type:concat concat_param { concat_dim:1 }}
Optional Parameters:
concat_dim [default 1]: 0 concatenates along num, 1 concatenates along channels
Size changes after the concatenation layer:
Input: k blobs, where the i-th blob (i from 1 to k) has size n_i * c_i * h * w
Output:
If concat_dim = 0: (n_1 + n_2 + ... + n_k) * c_1 * h * w; all input c_i must be the same.
If concat_dim = 1: n_1 * (c_1 + c_2 + ... + c_k) * h * w; all input n_i must be the same.
The concatenation layer links multiple blobs into a single blob.
5.5 Slicing
The SLICE layer is a utility layer that slices an input layer into multiple output layers along a given dimension (currently num or channel only) at given slice indices.
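A minimal sketch in the same old-style syntax (the blob names and slice point are only illustrative):
layers {
  name: "slice"
  type: SLICE
  bottom: "in"
  top: "out1"
  top: "out2"
  slice_param {
    slice_dim: 1   # slice along channels
    slice_point: 3 # the first 3 channels go to out1, the rest to out2
  }
}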
5.6 Elementwise Operations
Type: ELTWISE
5.7 Argmax
Type: ARGMAX
5.8 Softmax
Type: SOFTMAX
5.9 Mean-variance Normalization
Type: MVN
6. References
Caffe
Caffe layer structure