Caffe Step by Step: Basic Operations and Analysis of the Caffe Framework

Although Caffe has been installed for nearly a month, progress in actually using it has been slow. As Mr. Liu said, setting up the Caffe framework environment is relatively simple, but completing the full pipeline of data preparation, model training, and testing is a long process. Along the way you need a deep understanding of many parts of Caffe, so that you know why you got a particular result and how to adjust during training or fine-tuning. The following explains how to use Caffe.

The Caffe website provides detailed usage instructions. If you still run into difficulties, you can use Google or Baidu to search for your specific problem and the topics you want to learn.

First, the basic composition of a Caffe model

To train a Caffe model, you need to configure two files: the network model definition and the solver (parameter) configuration, corresponding to the ***.prototxt and ****_solver.prototxt files respectively.

Explanation of the Caffe model files:

    1. LevelDB construction (preprocessing the images)
      Input: a batch of images and their labels (items 2 and 3)
      Output: a LevelDB (item 4)
      The command involves the following information (a usage sketch follows this list):
      1. convert_imageset (the program that builds the LevelDB)
      2. train/ (the JPEG or other-format images in this directory)
      3. label.txt (the image file names and their label information)
      4. The name of the output LevelDB folder
      5. CPU/GPU (specifies whether to run the code on the CPU or on the GPU)
    2. CNN network configuration files

      1. imagenet_solver.prototxt (the file configuring the global training parameters)
      2. imagenet.prototxt (the file configuring the training network)
      3. imagenet_val.prototxt (the file configuring the test network)
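
As a concrete usage sketch for item 1, here is a hedged invocation of Caffe's convert_imageset tool; train/, label.txt, and my_train_leveldb are placeholder paths for your own data:

./build/tools/convert_imageset --backend=leveldb --shuffle train/ label.txt my_train_leveldb

Each line of label.txt holds an image file name (relative to train/) followed by its integer label, e.g. "cat01.jpg 0"; passing --backend=lmdb instead produces an LMDB database.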

Network model: defines each layer of your network. For example, a Siamese model drawn with python/draw_net.py in Caffe shows the structure very clearly.

The layers include the following (taking LeNet as an example):

data: covers both the training-data and test-data layer types, and generally refers to the input layer. It specifies source (the data path), batch_size (the number of images per batch), and scale, which maps the data into [0,1]; 0.00390625 is 1/256.

Training Data layer:

Layer {name:"Mnist"type: "data" top: "data" top: "label" include {Phase:train} TR Ansform_param {scale: 0.00390625 } data_param {source: "examples/mnist/mnist_train_lmdb" batch_size:  Backend:lmdb}}            

Test Data layer:

Layer {name:"mnist"Type:"Data"Top:"Data"Top:"label"include {phase:test} transform_param {scale:0.00390625} data_param {Source:"Examples/mnist/mnist_test_lmdb"batch_size: -Backend:lmdb}}

convolution: the convolution layer. blobs_lr: 1 and blobs_lr: 2 (lr_mult in newer Caffe versions) are the learning-rate multipliers for the weight and bias updates respectively. The weight learning rate is the base learning rate defined in the solver.prototxt file, and the bias learning rate is set to twice the weight learning rate, which generally gives a good convergence rate.

num_output is the number of filters, kernel_size is the size of the filter, stride is the step size, and weight_filler is the weight initialization type.

Layer {name:"CONV1"Type:"convolution"Bottom:"Data"Top:"CONV1"param {lr_mult:1//weight Learning rate} param {lr_mult:2 //bias study rate, generally twice times of weight } convolution_param {num_output: //filter number kernel_size:5Stride:1 //step Weight_filler {type:"Xavier"} bias_filler {type:"constant"    }  }}

POOLING: Pool layer

layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

inner_product: actually the fully connected layer; don't be misled by the name.

Layer {name:"ip1"Type:"innerproduct"Bottom:"pool2"Top:"ip1"param {lr_mult:1} param {lr_mult:2} inner_product_param {num_output: -Weight_filler {type:"Xavier"} bias_filler {type:"constant"    }  }}

ReLU: the activation function, a nonlinear transform max(0, x), usually paired with a convolution layer.

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}

softmax: the loss layer (SoftmaxWithLoss):

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

Parameter configuration file:

The _solver.prototxt file defines the parameters needed during training, such as the learning rate, the weight decay coefficient, the number of iterations, and whether to use the GPU or CPU.

# The train/test net protocol buffer definition
net: "examples/mnist/lenet_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# Snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# Solver mode: CPU or GPU
solver_mode: GPU
# Under the cmdcaffe interface GPU numbering starts at 0; with a single GPU, device_id: 0
device_id: 0

The trained model is saved as ***.caffemodel and is available for later use.
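
One common later use is evaluating the saved weights with the cmdcaffe test command; a minimal sketch, assuming the LeNet files from above (the iteration count in the weights file name depends on your snapshot settings):

./build/tools/caffe test --model=examples/mnist/lenet_train_test.prototxt --weights=examples/mnist/lenet_iter_10000.caffemodel -iterations 100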

Second, training a model with Caffe consists of the following steps:

    1. Prepare the data
    2. Build the LMDB/LevelDB files; Caffe supports three input data formats: images, LevelDB, and LMDB
    3. Define the name.prototxt and name_solver.prototxt files
    4. Train the model

Third, commonly used basic interfaces in Caffe (cmdcaffe)

Note: when using cmdcaffe, you need to switch to the CAFFE_ROOT directory first.

1. Train a model, taking MNIST as an example:

./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt

Note: the examples on the Caffe website cannot be executed directly; you need to invoke the caffe interface under tools with the command above, because Caffe by default expects to be run from its root directory.
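
A related note: training can be resumed from an intermediate snapshot with the --snapshot flag; a hedged sketch, assuming a snapshot was taken at iteration 5000:

./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt --snapshot=examples/mnist/lenet_iter_5000.solverstate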

2. To observe the running time of each stage, use:

./build/tools/caffe time --model=models/bvlc_reference_caffenet/train_val.prototxt
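
The time command also accepts -iterations (how many forward/backward passes to average over) and -gpu; for example, a sketch timing 50 iterations on GPU 0:

./build/tools/caffe time --model=models/bvlc_reference_caffenet/train_val.prototxt -iterations 50 -gpu 0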

3. Extract features using an existing model:

./build/tools/extract_features.bin models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel examples/_temp/imagenet_val.prototxt conv5 examples/_temp/features 10 leveldb

Here conv5 names the blob to extract (the output of the fifth convolution layer), examples/_temp/features is the directory where the results are stored (it needs to be created in advance), 10 is the number of mini-batches to run, and leveldb is the output format.

4. Fine-tune an existing model. For example, suppose we have a 1000-class classification model but currently need only 20 classes. We do not have to retrain a model from scratch; we just change the last layer into a 20-class softmax layer, then use our own data to fine-tune the original model, as sketched below.
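
As a sketch of "change the last layer": in a CaffeNet-style train_val.prototxt, rename the final inner-product layer (the name fc8_20 below is hypothetical) and set num_output to 20. Because the new name matches no layer in the pretrained .caffemodel, Caffe reinitializes this layer's weights while copying all the others:

layer {
  name: "fc8_20"            # renamed so the pretrained fc8 weights are not copied
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_20"
  inner_product_param {
    num_output: 20          # 20 classes instead of the original 1000
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}

Remember to point the bottom of the corresponding loss and accuracy layers at fc8_20 as well.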

In many cases, when learning a deep learning model with the Caffe framework, it is difficult to train a well-fitting model from scratch on ImageNet or other large datasets; the best approach is to fine-tune an already trained model. Going through this process deepens your understanding of deep learning and of Caffe, which helps later when training and fine-tuning your own models.

Trained Caffe models can be downloaded from the Caffe project on GitHub; the classic ones include alexnet.caffemodel, lenet.caffemodel, and rcnn.caffemodel, and others are available from Caffe's official GitHub page.
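
The Caffe repository also ships a helper script for fetching these weights; a sketch using the bundled CaffeNet model directory:

python scripts/download_model_binary.py models/bvlc_reference_caffenet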

Use your own dataset to fine-tune the already trained model (using the cmdcaffe interface):

./build/tools/caffe train --solver=models/finetune_flickr_style/solver.prototxt --weights=models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel -gpu 0

First parameter: the compiled caffe binary

train: selects the training function

The arguments that follow are, in order: the configuration flag (--solver) with the configuration file path, the fine-tuning flag (--weights) with the base model file the fine-tuning depends on, and the choice of training mode, GPU or CPU (when using the CPU the flag can be omitted, as CPU is the default).

Note: the fine-tuning process is similar to the training process; only the command used when invoking the Caffe interface differs. So before fine-tuning you still need to prepare the data just as for training.

Download the data and build the trainset and testset -> build the db -> set the paths -> fine-tune.

5. There is also an interface under python: draw_net.py can render a .prototxt model file as a diagram; the model figure at the beginning of this blog was drawn with this interface.

./python/draw_net.py ./examples/siamese/mnist_siamese.prototxt ./examples/siamese/mnist_siamese.png

This is a sample invocation of the network-drawing interface.

The first parameter is the model file; the second is the path where the drawn model diagram is saved.
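
draw_net.py also accepts a --rankdir option that controls the layout direction (LR for left-to-right, the default, or TB for top-to-bottom); for example:

./python/draw_net.py --rankdir TB ./examples/siamese/mnist_siamese.prototxt ./examples/siamese/mnist_siamese.png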

The role of batch_size in deep learning:

During deep learning training there are two basic regimes: batch training and stochastic (per-sample) training. batch_size controls how many samples are processed per iteration, and mini-batch training with an intermediate batch_size sits between these two extremes.
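
batch_size also ties into the solver's test_iter: the number of images covered per test phase is the test-layer batch_size times test_iter. In the LeNet example above, 100 × 100 = 10,000, exactly the size of the MNIST test set.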

Solver: uses the forward and backward interfaces to update the parameters and iterate on the loss (the available optimization methods include stochastic gradient descent (SGD), adaptive gradient (AdaGrad), and Nesterov's accelerated gradient (NAG)).

Solver functions (specifying the optimization method):

1. Scaffold the optimization: create the training network for learning and the test network for evaluation;

2. Iteratively optimize the network parameters by calling forward and backward;

3. Periodically evaluate the test network;

4. Record the intermediate state of training, snapshotting the model and solver state during the optimization process.

Reference blog: First Acquaintance with Caffe, Lifting the Veil
