Because I am handing over my work, the usage of Caffe and its overall structure need to be described clearly.
Several students have also asked me about this, so I decided to write a short tutorial here to make it easy for everyone to try Caffe out.
This article briefly covers the following:
- What can Caffe do?
- Why Choose Caffe?
- Environment
- Overall structure
- Protocol Buffer
- Training Basic Process
- Training in Python
- Debug
What can Caffe do?
- Define the network structure
- Train the network
- Written in C++/CUDA
- Command-line, Python, and MATLAB interfaces
- CPU and GPU working modes
- Provides some reference models and pretrained weights
Why Choose Caffe?
- Good modularity
- Simple: the network structure can be changed without modifying code
- Open source: the code is maintained jointly by the community
Environment:
$ lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 12.04.4 LTS
Release: 12.04
Codename: precise
$ cat /proc/version
Linux version 3.2.0-29-generic ([email protected]) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)) #46-Ubuntu SMP Fri Jul 17:03:23 UTC 2012
Vim + Taglist + cscope
Overall structure:
Let $CAFFE denote the Caffe root folder. The core code is under $CAFFE/src/caffe and mainly consists of the following parts: net, blob, layer, and solver.
net.cpp:
Net defines the network. The whole network contains many layers; net.cpp is responsible for running the forward and backward passes over the entire network during training, that is, computing each layer's outputs in the forward pass and each layer's gradients in the backward pass.
Layers:
Layers are in $CAFFE/src/caffe/layers. When a layer is used in a protocol buffer definition (the .proto file defines the message types; the .prototxt or .binaryproto file holds the message values), its properties include name, type (data/conv/pool ...), connection structure (input blobs and output blobs), and layer-specific parameters (such as the kernel size of a conv layer). Defining a layer requires defining its setup, forward, and backward procedures.
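
To make the setup/forward/backward contract concrete, here is a minimal pure-Python sketch. It is not Caffe code: ScaleLayer and ToyNet are hypothetical names invented for illustration, and blobs are reduced to plain lists of floats. It only shows the order of calls: the net runs the layers front to back in the forward pass and back to front in the backward pass.

```python
class ScaleLayer:
    """Hypothetical toy layer: multiplies its input by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor
        self.count = None

    def setup(self, bottom):
        # One-time setup: remember the input size so forward() can check it.
        self.count = len(bottom)

    def forward(self, bottom):
        # top = factor * bottom
        assert len(bottom) == self.count
        return [self.factor * x for x in bottom]

    def backward(self, top_diff):
        # d(loss)/d(bottom) = factor * d(loss)/d(top)
        return [self.factor * g for g in top_diff]


class ToyNet:
    """Hypothetical toy net: an ordered list of layers."""

    def __init__(self, layers, input_size):
        self.layers = layers
        dummy = [0.0] * input_size
        for layer in layers:
            # Set up each layer with the shape produced by the previous one.
            layer.setup(dummy)
            dummy = layer.forward(dummy)

    def forward(self, data):
        for layer in self.layers:
            data = layer.forward(data)
        return data

    def backward(self, top_diff):
        # Gradients flow through the layers in reverse order.
        for layer in reversed(self.layers):
            top_diff = layer.backward(top_diff)
        return top_diff


net = ToyNet([ScaleLayer(2.0), ScaleLayer(3.0)], input_size=2)
output = net.forward([1.0, -1.0])
input_grad = net.backward([1.0, 1.0])
```

In real Caffe the same three steps are C++ methods on each layer class, and the data and gradients live in blobs rather than Python lists.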
blob.cpp:
Data and derivative results in the net are passed around as 4-D blobs. A layer can have many blobs, e.g.:
- For data, the blob size is Number * Channels * Height * Width, such as 256*3*224*224.
- For a conv layer, the weight blob size is Output nodes * Input nodes * Height * Width; for example, the first conv layer of AlexNet has a weight blob of size 96 * 3 * 11 * 11.
- For an inner product layer, the weight blob size is 1 * 1 * Output nodes * Input nodes, and the bias blob size is 1 * 1 * 1 * Output nodes. (Conv layers and inner product layers both have a weight and a bias, so the network definition contains two blobs_lr entries: the first is for the weights and the second is for the bias. Similarly, weight_decay also has two entries, one for the weights and one for the bias.)
In a blob, cpu_data()/gpu_data() and mutable_cpu_data()/mutable_gpu_data() are used to access and manage the data memory; cpu_diff()/gpu_diff() and mutable_cpu_diff()/mutable_gpu_diff() hold the derivative results.
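
The blob sizes above can be checked with a few lines of Python. This is a sketch, not Caffe's Blob class: blob_count and blob_offset are hypothetical helper names, and the conv shape assumes 96 output channels, 3 input channels, and an 11 x 11 kernel for AlexNet's first conv layer. The offset formula assumes the row-major (number, channel, height, width) layout.

```python
from functools import reduce
from operator import mul


def blob_count(shape):
    """Number of elements in a blob with the given shape."""
    return reduce(mul, shape, 1)


def blob_offset(shape, n, c, h, w):
    """Flat index of element (n, c, h, w) in a row-major 4-D blob."""
    _, channels, height, width = shape
    return ((n * channels + c) * height + h) * width + w


data_shape = (256, 3, 224, 224)        # number * channels * height * width
conv1_weight_shape = (96, 3, 11, 11)   # outputs * inputs * height * width

data_count = blob_count(data_shape)
conv1_count = blob_count(conv1_weight_shape)

# The last element of the data blob should sit at index count - 1.
last_offset = blob_offset(data_shape, 255, 2, 223, 223)
```

Multiplying a count by 4 gives the size in bytes of the data (or of the diff) when stored as float32.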
solver.cpp:
The solver combines the loss with the gradients to update the weights.
Main functions:
Init(),
Solve(),
ComputeUpdateValue(),
Snapshot(), Restore(), // snapshot (copy) and restore the network state
Test().
solver.cpp contains three solver classes: AdaGradSolver, SGDSolver, and NesterovSolver.
About the loss: multiple losses can be used at the same time, and regularization (L1/L2) can be added.
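
As a rough illustration of what "update the weights with the gradient" means for plain SGD, the sketch below applies one momentum step with L2 regularization to a list of weights. It is an assumption-laden simplification, not the code in solver.cpp: sgd_update is a hypothetical function, and lr_mult stands in for the per-blob learning-rate multiplier (blobs_lr) mentioned above.

```python
def sgd_update(weights, grads, history, lr, momentum=0.9,
               weight_decay=0.0, lr_mult=1.0):
    """One SGD step with momentum and L2 weight decay (illustrative only).

    history holds the previous update for each weight.
    Returns the new weights and the new history.
    """
    local_lr = lr * lr_mult
    new_weights, new_history = [], []
    for w, g, v in zip(weights, grads, history):
        g = g + weight_decay * w          # L2 regularization term
        v = momentum * v + local_lr * g   # momentum-smoothed update
        new_history.append(v)
        new_weights.append(w - v)
    return new_weights, new_history


weights = [1.0, -2.0]
grads = [0.5, 0.5]
history = [0.0, 0.0]

# Plain step without momentum: each weight moves by lr * grad = 0.05.
stepped, _ = sgd_update(weights, grads, history, lr=0.1, momentum=0.0)

# With a learning-rate multiplier of 0 the blob is effectively frozen.
frozen, _ = sgd_update(weights, grads, history, lr=0.1, lr_mult=0.0)
```

The second call shows why a zero learning-rate multiplier keeps a layer's weights fixed: the update term is always zero when the history starts at zero.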
Protocol Buffer:
As mentioned above, protocol buffers define the message types in a .proto file and the message values in a .prototxt or .binaryproto file.
Caffe
All of Caffe's messages are defined in $CAFFE/src/caffe/proto/caffe.proto.
Experiment
Experiments mainly use two protocol buffer files, solver and model, which define the solver parameters (learning rate and so on) and the model structure (network structure) respectively.
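
To show roughly what a solver file contains, the sketch below renders a small dictionary as prototxt-style "key: value" lines. The helper to_prototxt is hypothetical and handles only flat scalar fields; the file names and numeric values are made-up placeholders, and the set of fields a given Caffe version accepts should be checked against caffe.proto.

```python
def to_prototxt(params):
    """Render a flat dict as prototxt 'key: value' lines (scalars only)."""
    lines = []
    for key, value in params.items():
        if isinstance(value, bool):
            text = "true" if value else "false"
        elif isinstance(value, str):
            text = '"%s"' % value          # strings are double-quoted
        else:
            text = str(value)
        lines.append("%s: %s" % (key, text))
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
solver_params = {
    "net": "train_val.prototxt",          # hypothetical model file name
    "base_lr": 0.01,
    "lr_policy": "fixed",
    "momentum": 0.9,
    "weight_decay": 0.0005,
    "max_iter": 10000,
    "snapshot": 1000,
    "snapshot_prefix": "snapshots/model",  # hypothetical path
    "debug_info": False,
}

solver_text = to_prototxt(solver_params)
```

Writing solver_text to a file such as solver.prototxt gives a starting point that can then be edited by hand.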
Tips:
- To freeze a layer so that it does not take part in training, set its blobs_lr to 0.
- For images, try not to read data with an HDF5 data layer: it can only store float32 and float64, not uint8, so the data takes up too much space.
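
The space cost in the second tip is easy to estimate. The sketch below compares raw storage for the same images held as uint8 and as float32; dataset_bytes is a hypothetical helper and the dataset size (10000 images of 3 x 256 x 256) is an arbitrary example, with compression and file-format overhead ignored.

```python
from array import array


def dataset_bytes(num_images, channels, height, width, typecode):
    """Raw size in bytes of an image set stored with the given element type.

    typecode follows the stdlib array module: 'B' is uint8, 'f' is float32.
    """
    return num_images * channels * height * width * array(typecode).itemsize


uint8_bytes = dataset_bytes(10000, 3, 256, 256, "B")
float32_bytes = dataset_bytes(10000, 3, 256, 256, "f")
ratio = float32_bytes // uint8_bytes
```

Storing pixels as float32 therefore takes four times the space of uint8, and float64 would take eight times.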
Training Basic Process:
- Data processing
Method 1: convert the data to a format Caffe accepts: LMDB, LevelDB, HDF5/.mat, a list of images, etc. Method 2: write your own data layer (e.g. https://github.com/tnarihi/tnarihi-caffe-helper/blob/master/python/caffe_helper/layers/data_layers.py)
- Define the network structure
- Configure the solver parameters
- Training: for example, caffe train -solver solver.prototxt -gpu 0
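
When training is launched from a script, the command above can be assembled and run from Python. This is a sketch: build_train_command and run_training are hypothetical helpers, and it assumes the caffe binary is on PATH and that solver.prototxt exists in the working directory.

```python
import subprocess


def build_train_command(solver_path, gpu_id=None, caffe_bin="caffe"):
    """Build the argument list for 'caffe train'."""
    cmd = [caffe_bin, "train", "-solver", solver_path]
    if gpu_id is not None:
        # Omit -gpu to let Caffe use its default device selection.
        cmd += ["-gpu", str(gpu_id)]
    return cmd


def run_training(solver_path, gpu_id=None):
    """Run training and return the process exit code."""
    return subprocess.call(build_train_command(solver_path, gpu_id))


command = build_train_command("solver.prototxt", gpu_id=0)
```

Passing the arguments as a list avoids shell quoting problems when the solver path contains spaces.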
Training in Python:
Documentation & examples: https://github.com/bvlc/caffe/pull/1733
Core code:
- $CAFFE/python/caffe/_caffe.cpp
Defines the Blob, Layer, Net, and Solver classes
- $CAFFE/python/caffe/pycaffe.py
Adds enhanced functionality to the Net class
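
A minimal sketch of driving training from Python might look like the following. It assumes pycaffe has been built and is importable, and that the solver file passed in points at a valid network; train_with_pycaffe is a hypothetical wrapper, and the import is done inside the function so that the file still loads on a machine without Caffe.

```python
def train_with_pycaffe(solver_path, num_iters, use_gpu=False, gpu_id=0):
    """Run num_iters training iterations and return the training net."""
    import caffe  # requires pycaffe to be built and on PYTHONPATH

    if use_gpu:
        caffe.set_device(gpu_id)
        caffe.set_mode_gpu()
    else:
        caffe.set_mode_cpu()

    solver = caffe.SGDSolver(solver_path)
    # step(n) runs n iterations of forward, backward, and weight update.
    solver.step(num_iters)
    return solver.net
```

With the returned net, net.blobs maps blob names to activations and net.params maps layer names to their weight and bias blobs, which makes it possible to inspect intermediate results between steps.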
Debug:
- Set DEBUG := 1 in Makefile.config
- Set debug_info: true in solver.prototxt
- In Python/MATLAB, check how the weights change after one round of forward & backward
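
The last check can be sketched as a small comparison helper. It works on plain dictionaries of flat float lists so that it runs without Caffe; max_abs_change is a hypothetical name and the numbers are made-up. With pycaffe, such lists could presumably be obtained by copying each layer's weight array before and after one training step.

```python
def max_abs_change(before, after):
    """Largest absolute per-element change for each named parameter."""
    return {
        name: max(abs(a - b) for a, b in zip(after[name], before[name]))
        for name in before
    }


# Made-up weights for two layers before and after one iteration.
before = {"conv1": [0.10, -0.20], "fc1": [0.50, 0.50]}
after = {"conv1": [0.09, -0.20], "fc1": [0.50, 0.50]}

changes = max_abs_change(before, after)
unchanged_layers = [name for name, delta in changes.items() if delta == 0.0]
```

A layer that unexpectedly shows up in unchanged_layers is worth a closer look: its learning rate may be 0, or no gradient may be reaching it.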
Classical Literature:
[DeCAF] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. DeCAF: A deep convolutional activation feature for generic visual recognition. ICML, 2014.
[R-CNN] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR, 2014.
[Zeiler-Fergus Visualizing] M. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. ECCV, 2014.
[LeNet] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.
[AlexNet] A. Krizhevsky, I. Sutskever, and G. Hinton. ImageNet classification with deep convolutional neural networks. NIPS, 2012.
[OverFeat] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun. OverFeat: Integrated recognition, localization and detection using convolutional networks. ICLR, 2014.
[Image-Style (Transfer Learning)] S. Karayev, M. Trentacoste, H. Han, A. Agarwala, T. Darrell, A. Hertzmann, and H. Winnemoeller. Recognizing image style. BMVC, 2014.
[Karpathy14] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. CVPR, 2014.
[Sutskever13] I. Sutskever. Training recurrent neural networks. PhD thesis, University of Toronto, 2013.
[Chopra05] S. Chopra, R. Hadsell, and Y. LeCun. Learning a similarity metric discriminatively, with application to face verification. CVPR, 2005.