Caffe: Deep Learning in Practice

Because of a work handover, I needed to write down clearly how Caffe is used and how it is structured overall. Since several students have also asked me about this, I decided to write a simple tutorial for everyone's reference.
This article briefly covers the following topics:
- What Caffe can do
- Why choose Caffe
- Environment
- Overall structure
- Protocol Buffer
- Basic training flow
- Training in Python
- Debug

What Caffe can do:
- Define network structures
- Train networks
- Core written in C++/CUDA
- Command-line, Python, and MATLAB interfaces
- CPU and GPU working modes
- Provides some reference models and pretrained weights

Why choose Caffe?
- Modular and simple: the network structure can be modified without touching the code
- Open source: the code is maintained jointly by the community

Environment:

$ lsb_release -a
Distributor ID: Ubuntu
Description:    Ubuntu 12.04.4 LTS
Release:        12.04
Codename:       precise

$ cat /proc/version
Linux version 3.2.0-29-generic (buildd@allspice) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)) #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012

Vim + taglist + cscope

Overall structure:

Let $CAFFE denote the Caffe root directory. The core code lives under $CAFFE/src/caffe and consists mainly of the following parts: net, blob, layer, and solver.

net.cpp:
Net defines the network. A network contains many layers, and net.cpp is responsible for running the forward and backward passes over the whole network during training, that is, calling forward/backward on each layer.

Layers:
Layers live in $CAFFE/src/caffe/layers. Each layer's message type is defined with Protocol Buffers (the .proto file defines the message type; a .prototxt or .binaryproto file supplies the message values). When a layer is used in a network definition, it specifies its name, type (data/conv/pool ...), connection structure (input blobs and output blobs), and layer-specific parameters (such as the kernel size of a conv layer). Defining a new layer requires defining its setup, forward, and backward steps.
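The relationship between a net and its layers can be sketched in a few lines of Python. This is a toy illustration only, not Caffe's implementation: ScaleLayer and ToyNet are hypothetical names, and plain lists stand in for blobs. It shows a layer with setup, forward, and backward steps, and a net that calls forward on the layers in order and backward in reverse order.

```python
# Toy illustration only: plain Python lists stand in for Caffe blobs.

class ScaleLayer:
    """Hypothetical layer computing y = w * x for one learnable scalar w."""

    def setup(self, w=2.0):
        self.w = w          # learnable parameter
        self.w_diff = 0.0   # gradient of the loss with respect to w

    def forward(self, bottom):
        self.bottom = bottom                  # cache the input for backward
        return [self.w * x for x in bottom]

    def backward(self, top_diff):
        # dL/dw = sum(dL/dy * x); dL/dx = dL/dy * w
        self.w_diff = sum(g * x for g, x in zip(top_diff, self.bottom))
        return [g * self.w for g in top_diff]


class ToyNet:
    """Runs forward through the layers in order and backward in reverse."""

    def __init__(self, layers):
        self.layers = layers

    def forward(self, data):
        for layer in self.layers:
            data = layer.forward(data)
        return data

    def backward(self, top_diff):
        for layer in reversed(self.layers):
            top_diff = layer.backward(top_diff)
        return top_diff


first, second = ScaleLayer(), ScaleLayer()
first.setup(w=2.0)
second.setup(w=3.0)
net = ToyNet([first, second])

output = net.forward([1.0, 2.0])         # y = 3 * (2 * x)
input_diff = net.backward([1.0, 1.0])    # pretend dL/dy = 1 for every output
print(output, input_diff, first.w_diff, second.w_diff)
```

After the backward pass, each layer holds the gradient for its own parameter, which is what a solver would then use to update the weights.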

blob.cpp:
Data and derivatives in the net are passed around as 4-D blobs. A layer has several blobs, for example:
- data blob: number * channels * height * width, such as 256*3*224*224;
- weight blob of a conv layer: number of output nodes * number of input nodes * height * width, for example the weight blob of the first conv layer of AlexNet is 96 x 3 x 11 x 11;
- weight blob of an inner product layer: 1 * 1 * number of output nodes * number of input nodes;
- bias blob: 1 * 1 * 1 * number of output nodes.
Conv layers and inner product layers each have both a weight and a bias, so in the network structure definition we see two blobs_lr entries: the first is for the weights, the second for the bias. Similarly, there are two weight_decay entries, one for the weights and one for the bias.
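To make the shape conventions above concrete, here is a small sketch that counts the elements of a few 4-D blobs. The data and AlexNet conv1 shapes come from the text; the inner product sizes (4096 outputs, 9216 inputs) are illustrative values chosen for this example, and count is a helper defined here, not a Caffe call.

```python
from functools import reduce
from operator import mul


def count(shape):
    """Number of elements in a blob with the given 4-D shape."""
    return reduce(mul, shape, 1)


data_blob = (256, 3, 224, 224)    # number * channels * height * width
conv1_weight = (96, 3, 11, 11)    # outputs * inputs * kernel height * kernel width
conv1_bias = (1, 1, 1, 96)        # 1 * 1 * 1 * outputs

# Illustrative inner product layer: 4096 outputs, 9216 inputs (assumed sizes)
ip_weight = (1, 1, 4096, 9216)

for name, shape in [("data", data_blob), ("conv1 weight", conv1_weight),
                    ("conv1 bias", conv1_bias), ("ip weight", ip_weight)]:
    print(name, shape, count(shape))
```

The comparison shows why inner product layers tend to dominate the parameter count even though conv layers process much larger data blobs.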


In a blob, cpu_data()/gpu_data() and mutable_cpu_data()/mutable_gpu_data() give access to the stored values, while cpu_diff()/gpu_diff() and mutable_cpu_diff()/mutable_gpu_diff() give access to the computed derivatives.
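The split between data and diff, and between read-only and mutable accessors, can be mimicked with a toy class. ToyBlob below is a hypothetical stand-in written for this tutorial, not the Caffe class; it only models the CPU side and uses Python lists instead of synchronized CPU/GPU memory.

```python
class ToyBlob:
    """Toy stand-in for a blob: values in `data`, derivatives in `diff`."""

    def __init__(self, values):
        self._data = list(values)
        self._diff = [0.0] * len(self._data)

    def cpu_data(self):
        return tuple(self._data)    # read-only copy of the values

    def mutable_cpu_data(self):
        return self._data           # writable values

    def cpu_diff(self):
        return tuple(self._diff)    # read-only copy of the derivatives

    def mutable_cpu_diff(self):
        return self._diff           # writable derivatives

    def update(self):
        """Apply the stored diff in place: data = data - diff."""
        self._data[:] = [d - g for d, g in zip(self._data, self._diff)]


weights = ToyBlob([1.0, 2.0, 3.0])
weights.mutable_cpu_diff()[:] = [0.5, 0.5, 0.5]   # as if filled in by backward and the solver
weights.update()
print(weights.cpu_data())
```

In Caffe itself, the const/mutable distinction also tells the library when memory has to be synchronized between CPU and GPU, which this sketch leaves out.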

solver.cpp:
The solver combines the loss with the gradients to update the weights. Main functions:
Init(),
Solve(),
ComputeUpdateValue(),
Snapshot(), Restore(), // snapshot (save a copy of) and restore the network state
Test();

solver.cpp provides three solvers, that is, three classes to choose from: AdaGradSolver, SGDSolver, and NesterovSolver.
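For intuition, the three update rules can be written out for a single scalar weight. This is a minimal textbook-style sketch, not Caffe's code: the function names, the state dictionary, and the default hyperparameters are all assumptions made for this example, and Caffe's own formulas may differ in detail.

```python
import math


def sgd_step(w, grad, state, lr=0.1, momentum=0.9):
    """SGD with momentum: v = momentum * v + lr * grad; w = w - v."""
    v = momentum * state.get("v", 0.0) + lr * grad
    state["v"] = v
    return w - v


def nesterov_step(w, grad, state, lr=0.1, momentum=0.9):
    """Nesterov momentum in a common look-ahead reformulation."""
    v_prev = state.get("v", 0.0)
    v = momentum * v_prev + lr * grad
    state["v"] = v
    return w - ((1 + momentum) * v - momentum * v_prev)


def adagrad_step(w, grad, state, lr=0.1, eps=1e-8):
    """AdaGrad: divide the step by the root of the accumulated squared gradients."""
    h = state.get("h", 0.0) + grad * grad
    state["h"] = h
    return w - lr * grad / (math.sqrt(h) + eps)


def minimize(step, w=5.0, iters=200):
    """Minimize f(w) = w**2 (gradient 2 * w) with the given update rule."""
    state = {}
    for _ in range(iters):
        w = step(w, 2.0 * w, state)
    return w


for step in (sgd_step, nesterov_step, adagrad_step):
    print(step.__name__, minimize(step))
```

All three rules move the weight toward the minimum at 0; they differ in how the history of past gradients shapes the size of each step.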

As for the loss, a network can have several losses at the same time, and regularization (L1/L2) can be added.
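A rough sketch of what these two points mean numerically: several losses are combined as a weighted sum, and regularization adds an extra term to each weight's gradient. The helper names and the numbers below are made up for illustration, and the 0.5 factor in the L2 penalty is a common convention assumed here.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def total_loss(losses, loss_weights):
    """Weighted sum of several loss terms."""
    return sum(w * l for w, l in zip(loss_weights, losses))


def regularized_gradient(grad, weight, weight_decay, kind="L2"):
    """Add the regularization term to the data gradient of one weight."""
    if kind == "L2":
        return grad + weight_decay * weight         # d/dw of 0.5 * decay * w**2
    if kind == "L1":
        return grad + weight_decay * sign(weight)   # d/dw of decay * |w|
    raise ValueError("kind must be 'L1' or 'L2'")


print(total_loss([0.8, 0.2], [1.0, 0.5]))
print(regularized_gradient(0.1, weight=-2.0, weight_decay=0.01, kind="L2"))
print(regularized_gradient(0.1, weight=-2.0, weight_decay=0.01, kind="L1"))
```

L2 pulls each weight toward zero in proportion to its size, while L1 applies a constant pull regardless of magnitude.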

Protocol Buffer:

As mentioned above, Protocol Buffers define message types in a .proto file, and the message values are stored in .prototxt or .binaryproto files.

Caffe:
All Caffe messages are defined in $CAFFE/src/caffe/proto/caffe.proto.

Experiment:
Experiments mainly use two Protocol Buffer files, solver and model, which define the solver parameters (learning rate and so on) and the model structure (network structure), respectively.
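As an illustration of what a solver file contains, the sketch below assembles the text of a small solver definition. The field names are ones found in Caffe's SolverParameter message; the file names and all values are hypothetical placeholders, not recommendations.

```python
# Hypothetical file names and values; adjust them to the experiment at hand.
solver_params = [
    ("net", '"train_val.prototxt"'),          # model definition (network structure)
    ("base_lr", "0.01"),                      # initial learning rate
    ("lr_policy", '"step"'),                  # drop the rate every `stepsize` iterations
    ("gamma", "0.1"),                         # factor applied at each drop
    ("stepsize", "100000"),
    ("momentum", "0.9"),
    ("weight_decay", "0.0005"),
    ("max_iter", "450000"),
    ("snapshot", "10000"),                    # save the state every 10000 iterations
    ("snapshot_prefix", '"snapshots/model"'),
    ("solver_mode", "GPU"),
]

solver_text = "\n".join(f"{key}: {value}" for key, value in solver_params) + "\n"
print(solver_text)
```

Saving the printed text as solver.prototxt gives a file in the format the solver reads; the model file referenced by net holds the layer definitions.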

Tips:
- To freeze a layer so that it does not take part in training, set its blobs_lr to 0.
- For images, avoid reading data through the HDF5 data layer where possible: it can only store float32 and float64, not uint8, so it wastes a lot of space.
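Both tips can be checked with simple arithmetic. The sketch below assumes that blobs_lr acts as a multiplier on the solver's base learning rate and uses a plain gradient step without momentum; the function names and numbers are made up for illustration.

```python
def effective_lr(base_lr, blobs_lr):
    """Per-blob learning rate: the solver's rate times the layer's multiplier."""
    return base_lr * blobs_lr


def sgd_update(weight, grad, base_lr, blobs_lr):
    """Plain gradient step using the per-blob learning rate."""
    return weight - effective_lr(base_lr, blobs_lr) * grad


frozen = sgd_update(weight=0.7, grad=0.3, base_lr=0.01, blobs_lr=0.0)
trained = sgd_update(weight=0.7, grad=0.3, base_lr=0.01, blobs_lr=1.0)

# Storage for one batch of 256 RGB images at 224 x 224
num_values = 256 * 3 * 224 * 224
bytes_uint8 = num_values * 1      # one byte per value
bytes_float32 = num_values * 4    # four bytes per value

print(frozen, trained)
print(bytes_uint8, bytes_float32, bytes_float32 // bytes_uint8)
```

With a multiplier of 0 the weight does not move whatever the gradient is, and storing pixel data as float32 takes four times the space of uint8 (eight times for float64).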

Basic training flow:
1. Data processing
   Method one: convert the data into a format Caffe accepts: LMDB, LevelDB, HDF5 / .mat, list of images, etc.
   Method two: write your own data-reading layer (e.g. https://github.com/tnarihi/tnarihi-caffe-helper/blob/master/python/caffe_helper/layers/data_layers.py)
2. Define the network structure
3. Configure the solver parameters
4. Train, for example: caffe train -solver solver.prototxt -gpu 0
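When training runs are launched from scripts, the command line above can be assembled programmatically. The helper below is a hypothetical convenience function, not part of Caffe; it only builds the argument list, using the -solver and -gpu flags shown above, and does not start training by itself.

```python
import shlex


def caffe_train_command(solver_path, gpu_id=None, caffe_bin="caffe"):
    """Build the argument list for `caffe train`."""
    cmd = [caffe_bin, "train", "-solver", solver_path]
    if gpu_id is not None:
        cmd += ["-gpu", str(gpu_id)]
    return cmd


cmd = caffe_train_command("solver.prototxt", gpu_id=0)
print(" ".join(shlex.quote(part) for part in cmd))

# To actually launch training (requires a built caffe binary on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting problems when the solver path contains spaces.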



Training in Python:
Documentation & examples: https://github.com/bvlc/caffe/pull/1733

Core code:
- $CAFFE/python/caffe/_caffe.cpp: defines the Blob, Layer, Net, and Solver classes
- $CAFFE/python/caffe/pycaffe.py: enhancements to the Net class
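A minimal sketch of driving training from Python might look like the following. It assumes a pycaffe build that exposes caffe.SGDSolver, solver.step, caffe.set_mode_cpu/set_mode_gpu, and the net.blobs and net.params dictionaries; older versions may differ. The function names and the solver.prototxt path are placeholders for this example.

```python
def run_training_steps(solver_path, n_steps, use_gpu=False):
    """Run n_steps solver iterations and return the solver object."""
    import caffe  # imported lazily so this file loads without pycaffe installed

    if use_gpu:
        caffe.set_mode_gpu()
    else:
        caffe.set_mode_cpu()

    solver = caffe.SGDSolver(solver_path)   # reads the solver and net definitions
    solver.step(n_steps)                    # forward, backward, and weight update
    return solver


def describe_net(net):
    """Map each blob name and each parameter name to its array shape(s)."""
    blob_shapes = {name: blob.data.shape for name, blob in net.blobs.items()}
    param_shapes = {name: [p.data.shape for p in params]
                    for name, params in net.params.items()}
    return blob_shapes, param_shapes


# Example usage, assuming a hypothetical solver.prototxt in the working directory:
#   solver = run_training_steps("solver.prototxt", n_steps=100)
#   print(describe_net(solver.net))
```

Stepping the solver a few iterations at a time, rather than running it to completion, leaves room to inspect blobs and parameters between steps.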

Debug:
- Set DEBUG := 1 in Makefile.config
- Set debug_info: true in solver.prototxt
- In Python/MATLAB, inspect the forward results and how the weights change after a round of backward
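For the last debugging step, one simple check is to snapshot the weights before and after an iteration and compare them. The helpers below are hypothetical and work on plain lists; with pycaffe the lists could come from something like net.params[name][0].data.flatten().tolist(), which is an assumption about the installed version.

```python
def mean_abs_change(before, after):
    """Mean absolute difference between two equally long sequences of weights."""
    if len(before) != len(after):
        raise ValueError("weight lists must have the same length")
    return sum(abs(a - b) for a, b in zip(after, before)) / len(before)


def report_changes(snap_before, snap_after):
    """Compare two {param_name: flat list of weights} snapshots."""
    return {name: mean_abs_change(snap_before[name], snap_after[name])
            for name in snap_before}


# Hypothetical snapshots taken before and after one forward/backward/update round
before = {"conv1": [0.5, -0.25, 1.0], "fc": [0.0, 2.0]}
after = {"conv1": [0.5, -0.25, 1.0], "fc": [0.5, 1.5]}

changes = report_changes(before, after)
print(changes)
```

A change of exactly zero for a layer that is supposed to train usually points to a zero learning-rate multiplier or a gradient that never reaches the layer.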

Classic literature:
[DeCAF] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. DeCAF: A deep convolutional activation feature for generic visual recognition. ICML, 2014.
[R-CNN] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR, 2014.
[Zeiler-Fergus Visualizing] M. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. ECCV, 2014.
[LeNet] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.
[AlexNet] A. Krizhevsky, I. Sutskever, and G. Hinton. ImageNet classification with deep convolutional neural networks. NIPS, 2012.
[OverFeat] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun. OverFeat: Integrated recognition, localization and detection using convolutional networks. ICLR, 2014.
[Image-Style (Transfer Learning)] S. Karayev, M. Trentacoste, H. Han, A. Agarwala, T. Darrell, A. Hertzmann, and H. Winnemoeller. Recognizing image style. BMVC, 2014.
[Karpathy14] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. CVPR, 2014.
[Sutskever13] I. Sutskever. Training recurrent neural networks. PhD thesis, University of Toronto, 2013.
[Chopra05] S. Chopra, R. Hadsell, and Y. LeCun. Learning a similarity metric discriminatively, with application to face verification. CVPR, 2005.


From: http://blog.csdn.net/abcjennifer/article/details/46424949
