Disclaimer: This Caffe series is an internal learning document written by Huang Jiabin, the Caffe guru of our lab, who has kindly granted permission to share it.
This guide is written for Ubuntu 14.04 and assumes that the environment required by Caffe is already configured; the following teaches you how to build Kaiming He's residual network (ResNet).
Cite: He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 770-778. Cited by 1330.
1. ResNet Structure Introduction
The structure of ResNet is as follows:
Fig 1 34-layer ResNet
Fig 1 shows the main frame of the network. A residual module (Fig 2) consists of two convolution layers plus an identity mapping. Feature maps within blocks of the same color have the same size, so the input and output of a residual module have the same dimensions and can be added directly (the solid curves in Fig 1). When the network moves on to a block of a different color, downsampling by a factor of 2 is performed with a stride=2 convolution: the feature map size is halved while the number of convolution kernels is doubled, so that the complexity per layer stays roughly the same. What should be done when the input and output sizes of a residual module differ? The paper uses option B: a 1x1 convolution maps the input to the same dimensions as the output (the dashed curves in Fig 1). The overall structure of ResNet still follows the VGG network.
Fig 2 Residual module
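To make Fig 2 concrete, here is a minimal pycaffe (NetSpec) sketch of one residual module with an identity shortcut. It assumes pycaffe is already built and importable (Step 2 below takes care of that); the dummy input and the layer names are illustrative only, not the exact definitions used later in this series.

import caffe
from caffe import layers as L

n = caffe.NetSpec()
# dummy 16-channel 32x32 input, just to illustrate one residual module
n.data = L.Input(shape=dict(dim=[1, 16, 32, 32]))
# residual branch: two 3x3 convolutions (Fig 2)
n.conv1 = L.Convolution(n.data, kernel_size=3, num_output=16, stride=1, pad=1)
n.relu1 = L.ReLU(n.conv1, in_place=True)
n.conv2 = L.Convolution(n.relu1, kernel_size=3, num_output=16, stride=1, pad=1)
# identity shortcut: input and output have the same shape, so they are summed directly
n.add = L.Eltwise(n.conv2, n.data)     # Eltwise defaults to element-wise SUM
n.relu2 = L.ReLU(n.add, in_place=True)
print(n.to_proto())
# When the sizes differ (the first module of a new stage), option B replaces n.data in the
# Eltwise with a 1x1 convolution of stride 2 that projects the input to the new shape.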
This guide builds the 20-layer ResNet used for the CIFAR10 experiment in the paper. Its structure is as follows:
layer_name | output_size | 20-layer ResNet
Conv1 | 32 x 32 | kernel_size = 3 x 3, num_output = 16, stride = 1, pad = 1
conv2_x | 32 x 32 | {3x3, 16; 3x3, 16} x 3
conv3_x | 16 x 16 | {3x3, 32; 3x3, 32} x 3
conv4_x | 8 x 8 | {3x3, 64; 3x3, 64} x 3
innerproduct | 1 x 1 | average pooling, 10-d FC
Each convN_x contains 3 residual modules; every convolution in them is 3 x 3 with pad 1 and stride 1. The output of conv4_x is mapped to 64 feature maps of size 1 x 1 by global average pooling, and the result is then passed through a fully connected layer with 10 neurons to produce the output.
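As a quick sanity check that this layout adds up to 20 weighted layers (the numbers below simply restate the table):

stages = [(16, 3), (32, 3), (64, 3)]   # (num_output, residual modules) for conv2_x, conv3_x, conv4_x
depth = 1 + sum(2 * modules for _, modules in stages) + 1   # conv1 + two convs per module + the final FC
print(depth)                           # prints 20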
2. Data Preparation
CIFAR10 Database Introduction:
The images in the CIFAR10 database are 3 x 32 x 32 (channel number x image height x image width); there are 50000 training images and 10000 test images. There is also CIFAR100, which is a 100-class image database.
Fig 3 CIFAR10 Database
We first create a folder named ResNet under /home/your_name/, put the caffe-master archive into it, and unzip it there, then follow the steps below.
Fig 4 put Caffe-master into ResNet and unzip
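For reference, this setup can be done from a terminal roughly like this (assuming the archive was downloaded from GitHub as caffe-master.zip; adjust the file name if you have a tar.gz instead):

$ mkdir -p /home/your_name/ResNet
$ mv ~/Downloads/caffe-master.zip /home/your_name/ResNet/
$ cd /home/your_name/ResNet
$ unzip caffe-master.zip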
Step 1:
Locate Makefile.config.example in caffe-master (the Caffe root directory) and copy it to Makefile.config.
Fig 5 Copying Makefile.config.example to Makefile.config
Set the relevant parameters in this freshly copied Makefile.config file. Because every computer environment is different, the settings used for this guide are posted here for reference only (in most cases you simply remove the "#" that comments a line out):
# WITH_PYTHON_LAYER := 1 becomes WITH_PYTHON_LAYER := 1
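For example, these are the switches most often touched in Makefile.config; which ones you actually need depends entirely on your machine, so treat this excerpt as illustrative:

# CPU_ONLY := 1          (leave commented unless there is no CUDA-capable GPU)
# USE_CUDNN := 1         (leave commented unless cuDNN is installed)
WITH_PYTHON_LAYER := 1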
Step 2:
Open a terminal in caffe-master (the Caffe root directory), enter make clean (to clear any previously compiled files, even though nothing has been compiled yet), then enter make all (to compile Caffe), and finally enter make pycaffe (this guide will later use Python to build the residual network, so the interface files for Python calls need to be generated).
Fig 6 make clean & make all & make pycaffe
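For reference, the command sequence looks like this (the -j4 flag just parallelizes the build over 4 cores and is optional; the PYTHONPATH line assumes the folder layout from Fig 4 and is only needed so that "import caffe" works later):

$ cd /home/your_name/ResNet/caffe-master
$ make clean
$ make all -j4
$ make pycaffe
$ export PYTHONPATH=/home/your_name/ResNet/caffe-master/python:$PYTHONPATH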
Step 3:
Among the Caffe sample programs there is a CIFAR10 demo, which comes with a script for fetching the CIFAR10 data. Open a terminal in caffe-master (the Caffe root directory) and enter:
$./data/cifar10/get_cifar10.sh
Why enter this from the Caffe root directory? Because the data folder lives there and the script path is relative to it. After the command runs, the download progress appears, as follows:
Fig 7 Download CIFAR10 interface
After the download is complete, a number of data batch files are generated in the Caffe root under data/cifar10/. These are binary files, and we need to convert them to LMDB format.
Fig 8 binary data batch after download
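Out of interest, the layout of these binary files is simple: each record is 1 label byte followed by 3 x 32 x 32 image bytes. A few lines of numpy are enough to peek at one record (the file name below assumes the standard CIFAR10 binary release downloaded in Step 3):

import numpy as np

record_bytes = 1 + 3 * 32 * 32               # one label byte plus 3072 image bytes
with open('data/cifar10/data_batch_1.bin', 'rb') as f:
    record = np.frombuffer(f.read(record_bytes), dtype=np.uint8)
label = int(record[0])                        # class index, 0-9
image = record[1:].reshape(3, 32, 32)         # channels x height x width
print(label, image.shape)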
Step 4:
Again, open a terminal in caffe-master (the Caffe root directory) and enter:
$./examples/cifar10/create_cifar10.sh
This converts the binary files above into LMDB data and generates the mean file mean.binaryproto of the training data (cifar10_train_lmdb). The mean file is computed by averaging all samples at each position: for CIFAR10 the training data has dimensions 50000 x 3 x 32 x 32 (sample number x image channels x image height x image width), so the mean file has dimensions 3 x 32 x 32.
Fig 9 generated Test data, training data, and mean files
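Once pycaffe is built (Step 2), the mean file can be inspected from Python. A minimal sketch, assuming the file was written to examples/cifar10/mean.binaryproto (the location used by the conversion script):

import caffe

blob = caffe.proto.caffe_pb2.BlobProto()
with open('examples/cifar10/mean.binaryproto', 'rb') as f:
    blob.ParseFromString(f.read())
mean = caffe.io.blobproto_to_array(blob)[0]   # numpy array of shape (3, 32, 32)
print(mean.shape, mean.mean())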
So the data has been generated; next we will use Python to build the network.