Part of this content comes from the CDA deep learning hands-on course taught by Tang Yudi.
If you try to train a model on the CPU, you are in for a very long wait. The most time-consuming factor in training is the image size: with typical 227*227 images on a CPU, 10,000 iterations may take more than a week. Different network structures may require different image sizes, so check this before training and generate LMDB data that directly meets the model's input requirements.
If you build your own framework and do not know how to test it against the general-purpose frameworks, you can go to a benchmark site and compare with others: http://human-pose.mpi-inf.mpg.de/
Caffe official website: Examples: mainly focus on training models. Notebook examples: focus on fine-tuning models.
I. Training file configuration details
1. Parameter file: solver.prototxt
Taking Caffenet as an example, the parameters are interpreted as follows:
net: "/caffe/examples/lmdb_test/train/bvlc_reference_caffenet/train_val.prototxt"
# path of the training prototxt
test_iter: 1000
# number of batches per test pass; test_iter * batch_size (test set) = test set size
test_interval: 500
# test on the test set every 500 iterations
base_lr: 0.01
# initial learning rate of 0.01
lr_policy: "step"
# learning rate decay policy
gamma: 0.1
stepsize: 100000
# the initial learning rate is 0.01 and is multiplied by gamma (0.1) every 100,000 iterations
display: 20
# print some information every 20 iterations
max_iter: 50000
# maximum number of iterations
momentum: 0.9
# almost always fixed at 0.9; momentum speeds up convergence
weight_decay: 0.0005
# weight decay factor of 0.0005
snapshot: 10000
# save a snapshot of the current state every 10,000 iterations
snapshot_prefix: "/caffe/examples/lmdb_test/train/bvlc_reference_caffenet"
# path prefix under which model snapshots are saved
solver_mode: CPU
# can be set to GPU or CPU
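To make the "step" policy concrete, here is a minimal sketch of how the learning rate evolves under the settings above. It assumes the rule documented for Caffe's step policy, lr = base_lr * gamma ^ floor(iter / stepsize); the function name `step_lr` is made up for illustration.

```python
def step_lr(iteration, base_lr=0.01, gamma=0.1, stepsize=100000):
    """Learning rate under the "step" policy:
    base_lr * gamma ** floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


# Print the rate at a few iteration counts.
for it in (0, 49999, 100000, 200000):
    print(it, step_lr(it))
```

Note that with max_iter: 50000 and stepsize: 100000, training stops before the first drop, so the rate stays at 0.01 for the whole run unless stepsize is lowered.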
The main purpose of the snapshot: if something interrupts training, all progress would otherwise be lost. The snapshot stores the intermediate training state, which is very convenient, because training can be resumed from it the next time you run. Simply point the final execution file at an existing snapshot with the -snapshot flag.
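As a rough illustration of what the snapshot settings produce, the sketch below lists the file pairs written during a run. It assumes Caffe's usual naming scheme, `<prefix>_iter_<N>.caffemodel` for weights and `<prefix>_iter_<N>.solverstate` for the solver state; the helper name `snapshot_files` is hypothetical.

```python
def snapshot_files(prefix, snapshot_every, max_iter):
    """Return (weights, solver state) file name pairs, one per snapshot."""
    pairs = []
    for it in range(snapshot_every, max_iter + 1, snapshot_every):
        stem = "%s_iter_%d" % (prefix, it)
        pairs.append((stem + ".caffemodel", stem + ".solverstate"))
    return pairs


pairs = snapshot_files(
    "/caffe/examples/lmdb_test/train/bvlc_reference_caffenet", 10000, 50000)
for weights, state in pairs:
    print(weights, state)

# To resume, pass the most recent .solverstate file to -snapshot.
latest_state = pairs[-1][1]
```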
2. Network definition file: train_val.prototxt
For the specific meaning of each layer in the network definition file, refer to the official Caffe page: http://caffe.berkeleyvision.org/tutorial/layers.html
Training file: configure the image dataset for the training phase, the label dataset for the training phase, the image dataset for the test phase, the label dataset for the test phase, and a multi-label loss function (blog: "Caffe LMDB interface for multi-label data preparation and training").
Network configuration file: defining the network
name: "" # network name, anything you like
layer
{
name: "" # layer name
type: "" # layer type; very strict, must match a type Caffe knows
top: "label" # the label output; it reappears as bottom: "label" in the layers after the last fully connected layer
}
transform_param
# a scale of 1/256 (0.00390625) normalizes the pixel values
batch_size: 64
# number of samples processed per iteration
layer
{Data}
# two data layers: one for training, one for validation
layer
{
conv1
}
Points to note:
1. For the final fully connected layer, num_output must equal the number of classes:
For multi-class classification, this number is the number of classes used during training.
inner_product_param {
num_output: 2
weight_filler {
type: "gaussian"
std: 0.01
}
}
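One way to avoid a mismatch is to derive num_output from the label list itself. The sketch below assumes a "path label" text list, the format commonly fed to LMDB conversion, with labels numbered contiguously from 0; the file names in `sample` are invented.

```python
def count_classes(lines):
    """Count the distinct integer labels in 'path label' lines."""
    labels = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The last whitespace-separated field is the integer label.
        labels.add(int(line.rsplit(None, 1)[1]))
    return len(labels)


# Hypothetical list entries for a two-class problem.
sample = ["a/img_0001.jpg 0", "b/img_0002.jpg 1", "a/img_0003.jpg 0"]
num_output = count_classes(sample)
print(num_output)
```

If the labels are not contiguous from 0, the largest label plus one is the safer value for num_output.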
2. Defining your own layer
If you need a layer that Caffe does not provide, you have to write it yourself in C++, which is quite troublesome.
3. Picture size
The picture size is determined by the network: large networks such as VGG and AlexNet use 227*227 (224*224), while small networks such as LeNet can use 28*28.
4. The role of the batch
In general, the larger the batch the better; 64 is typical. With a small batch, progress is shown in finer-grained iteration steps.
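The batch size also ties back to the solver's test_iter, since test_iter * batch_size should cover the test set. A minimal sketch of that arithmetic follows; the test set size of 64,000 images is a made-up figure chosen so that the result matches test_iter: 1000.

```python
import math


def iterations_to_cover(num_images, batch_size):
    """Number of batches needed to see every image at least once."""
    return int(math.ceil(num_images / float(batch_size)))


# Hypothetical test set of 64,000 images with batch_size 64.
test_iter = iterations_to_cover(64000, 64)
print(test_iter)
```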
3. Model execution file: train.sh
The execution file is the script you run under Linux to start training once the configuration files are ready.
./build/tools/caffe train \
# the caffe tool, usually under build/tools (this is an annotated listing; remove the # lines before running)
-gpu 0 \
# optional; selects the GPU. If you have several GPUs, each has a number and you can pick one directly. If you have four GPUs, you can use -gpu all
-model path/to/trainval.prototxt \
# optional; the solver file that follows already points to trainval.prototxt, so this is generally omitted
-solver path/to/solver.prototxt \
# required; the solver prototxt file described above
-weights path/to/pretrained_weights.caffemodel
# optional; -weights is used for fine-tuning, starting from already learned parameters
-snapshot examples/imagenet/myself/model/caffenet_train_1000.solverstate
# if training was interrupted, continue from a snapshot by adding the snapshot path to the execution file (use either -weights or -snapshot, not both)
The snapshot is a very powerful tool. I think it has two uses:
1. Temporary downtime, where training on the machine is interrupted. Training writes a snapshot every 10,000 iterations as configured, so after a shutdown you can pick up from the last snapshot and continue training by running this file again.
2. Fine-tuning other people's models. Download their model snapshot and continue training from it; when continuing, the learning rate can be reduced to a very small value, and the fully connected layers can differ slightly.
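The two uses above map onto two mutually exclusive flags. The sketch below assembles the argument list for `caffe train` and refuses to combine -weights with -snapshot, on the understanding that the caffe tool itself rejects that combination. The function name and default binary path are illustrative assumptions.

```python
def caffe_train_command(solver, gpu=None, weights=None, snapshot=None,
                        caffe_bin="./build/tools/caffe"):
    """Build the argument list for `caffe train`."""
    if weights and snapshot:
        raise ValueError("use -weights (fine-tune) or -snapshot (resume), not both")
    cmd = [caffe_bin, "train", "-solver", solver]
    if gpu is not None:
        cmd += ["-gpu", str(gpu)]
    if weights:
        cmd += ["-weights", weights]
    if snapshot:
        cmd += ["-snapshot", snapshot]
    return cmd


# Resume an interrupted run from the snapshot named in the listing above.
resume = caffe_train_command(
    "path/to/solver.prototxt", gpu=0,
    snapshot="examples/imagenet/myself/model/caffenet_train_1000.solverstate")
print(" ".join(resume))
```

The resulting list could be handed to `subprocess.call`, which avoids shell quoting problems with paths.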
4. Validation (deployment) file: deploy.prototxt
This file is used at prediction time and is similar to the train_val.prototxt training file.
train_val.prototxt = data input + convolution layers + fully connected layers + loss/accuracy
deploy.prototxt = simplified data input + convolution layers + fully connected layers + prob prediction layer
Little else needs to change. The ImageData input layer of train_val must be replaced with an Input layer.
Taking AlexNet as an example, the differences are in the data input section and the last layer. The data input section of deploy.prototxt:
layer {
name: "data"
type: "Input"
top: "data"
input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } }
}
The meaning of the four dim values in input_param is:
First: the batch size, that is, how many images are fed to the network in one forward pass. With test-time data augmentation, one image becomes 10, which are then input to the network for recognition. If no data augmentation is done, it can be set to 1.
Second: the number of channels in the picture. A grayscale picture has a single channel, so the value is 1; a 3-channel color picture uses 3.
Third: the height of the picture, in pixels.
Fourth: the width of the picture, in pixels.
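To get a feel for what these four numbers imply, the following sketch computes the element count and memory footprint of an input blob of shape (10, 3, 227, 227). It assumes 4-byte single-precision values; the helper name `blob_size` is hypothetical.

```python
def blob_size(shape, bytes_per_value=4):
    """Return (number of values, size in bytes) for a blob shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count, count * bytes_per_value


count, nbytes = blob_size((10, 3, 227, 227))
print(count, nbytes)
```

Setting the first dim to 1 instead of 10 divides both figures by ten.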
Content from the blog post "Caffe: generating the LeNet-5 deploy.prototxt file".
The prob section after the fully connected layers in deploy.prototxt:
layer {
name: "prob"
type: "Softmax"
bottom: "fc8"
top: "prob"
}
Its output is a probability value, whereas in train_val.prototxt the layers after the last fully connected layer are loss/accuracy. In short, the training file mainly outputs loss/accuracy to measure training accuracy, while the deployment file mainly outputs the image classification.
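What the Softmax layer does to the fc8 scores can be sketched in a few lines of plain Python. This is the standard softmax formula, not Caffe's implementation, and the two input scores are invented.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical fc8 outputs for a two-class network.
probs = softmax([1.0, 2.0])
print(probs)
```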
II. How to test new data after training
An official example is available in IPYNB (notebook) format.
1. How to convert mean.binaryproto to mean.npy
Since validation is done in Python, a new picture must be read and have the mean subtracted before it is validated, so the mean must be in a form Python can read.
One blog post ("Caffe mean file mean.binaryproto to mean.npy") summarizes two methods: converting mean.binaryproto to mean.npy, and constructing mean.npy from known mean values.
(1) Converting mean.binaryproto
With the Caffe C++ interface, the image mean file is in protobuf format, commonly named mean.binaryproto. With the Python interface, the image mean file must be in NumPy format, for example mean.npy. So when working across the two languages, mean.binaryproto has to be converted to mean.npy, with the following code:
import caffe
import numpy as np
mean_proto_path = 'mean.binaryproto' # path of the protobuf-format image mean file to convert
mean_npy_path = 'mean.npy' # path of the converted NumPy-format image mean file
blob = caffe.proto.caffe_pb2.BlobProto() # create a protobuf blob
data = open(mean_proto_path, 'rb').read() # read the contents of mean.binaryproto
blob.ParseFromString(data) # parse the file contents into the blob
array = np.array(caffe.io.blobproto_to_array(blob)) # convert the mean in the blob to NumPy format; array shape is (mean_number, channel, height, width)
mean_npy = array[0] # an array can hold several sets of means, so select one by index
np.save(mean_npy_path, mean_npy)
(2) Constructing mean.npy from known image means
If the mean of each channel is known, for example 104, 117, 123 for a 3-channel image, we can also construct mean.npy directly. The code is as follows:
import numpy as np
mean_npy_path = 'mean.npy'
mean = np.ones([3, 256, 256], dtype=np.float64)
mean[0, :, :] = 104
mean[1, :, :] = 117
mean[2, :, :] = 123
np.save(mean_npy_path, mean)
(3) How to load the mean.npy file
Above we constructed the mean file mean.npy in two ways. The code to load mean.npy when it is used is as follows:
import numpy as np
mean_npy_path = 'mean.npy'
mean_npy = np.load(mean_npy_path)
mean = mean_npy.mean(1).mean(1) # average over height and width, leaving one value per channel
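The effect of `.mean(1).mean(1)` is easiest to see on the constant mean image from method (2). This sketch uses only NumPy; the 227*227 crop is an assumed input size, taken from the AlexNet-style deploy file earlier in the post.

```python
import numpy as np

# Mean image as built in method (2): shape (channels, height, width).
mean_image = np.ones([3, 256, 256], dtype=np.float64)
mean_image[0, :, :] = 104
mean_image[1, :, :] = 117
mean_image[2, :, :] = 123

# First mean(1) averages over height, second over width.
channel_mean = mean_image.mean(1).mean(1)
print(channel_mean.shape)

# A per-channel mean broadcasts against any image size,
# here an assumed 3 x 227 x 227 network input.
crop = np.zeros((3, 227, 227))
centered = crop - channel_mean[:, np.newaxis, np.newaxis]
```

Reducing the mean to one value per channel is what lets a 256*256 mean file be used with a 227*227 input.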
2. Using Python to make predictions
(1) Loading modules and setting up the environment
# Load modules and set image display parameters
import numpy as np
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (10, 10) # large images
plt.rcParams['image.interpolation'] = 'nearest' # don't interpolate: show square pixels
plt.rcParams['image.cmap'] = 'gray'
# Paths of the deploy file and the trained weights
import caffe
import os
caffe.set_mode_cpu()
caffe_root = '/caffe/' # assumed Caffe root directory; adjust to your installation
model_def = caffe_root + 'examples/facedetech/deploy.prototxt'
model_weights = caffe_root + 'examples/faceDetech/Alexnet_iter_50000_full_conv.caffemodel'
# Model loading
net = caffe.Net(model_def, # defines the structure of the model
model_weights, # contains the trained weights
caffe.TEST)
What if you do not have a trained model of your own? Caffe officially provides a caffemodel trained with the CaffeNet model on ImageNet pictures for everyone to download. It is a good choice for classifying a picture. Download address: http://dl.caffe.berkeleyvision.org/bvlc_reference_caffenet.caffemodel
Or download it from the command line:
# sudo ./scripts/download_model_binary.py models/bvlc_reference_caffenet
(2) Model preprocessing phase: the case without mean subtraction
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
# the transformer automatically resizes the validation picture to the input shape
transformer.set_transpose('data', (2, 0, 1)) # move image channels to outermost dimension
# transpose puts the channel axis first, for example 300*100*3 becomes 3*300*100
transformer.set_raw_scale('data', 255) # rescale from [0, 1] to [0, 255]
# pixel rescale: puts the pixel values in the [0, 255] interval
transformer.set_channel_swap('data', (2, 1, 0)) # swap channels from RGB to BGR
# CPU classification
net.blobs['data'].reshape(1, # batch size (1 is an assumption; the original value is missing)
3, # 3-channel (BGR) images
227, 227) # image size (assumed 227*227, as in the deploy.prototxt above)
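For readers without Caffe at hand, the three transformer settings above can be imitated in plain NumPy. This is a simplified sketch, not Caffe's implementation: it skips resizing and mean subtraction, and the function name `preprocess_sketch` is made up. The order (transpose, channel swap, scale) follows my reading of caffe/io.py.

```python
import numpy as np


def preprocess_sketch(image_hwc_rgb, raw_scale=255.0, channel_swap=(2, 1, 0)):
    """Imitate transpose, channel swap and raw scale on an H x W x C image."""
    x = image_hwc_rgb.astype(np.float32)
    x = x.transpose((2, 0, 1))       # H x W x C -> C x H x W
    x = x[list(channel_swap), :, :]  # RGB -> BGR
    x = x * raw_scale                # [0, 1] -> [0, 255]
    return x


# A tiny 2 x 2 pure-red image with values in [0, 1].
img = np.zeros((2, 2, 3), dtype=np.float32)
img[..., 0] = 1.0
out = preprocess_sketch(img)
print(out.shape)
```

After the swap, the red channel ends up last, which is the BGR order the reference models expect.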
(3) Single-image processing and recognition
image = caffe.io.load_image("/caffe/data/trainlmdb/val/test_female/image_00010.jpg")
# load the picture
transformed_image = transformer.preprocess('data', image)
# preprocess the picture
net.blobs['data'].data[...] = transformed_image
# copy the picture into the input blob before running the network
output = net.forward()
# one forward pass through the network
output_prob = output['prob'][0]
# output probabilities
print('predicted class is: %d' % output_prob.argmax())
# index of the class with the maximum probability
A binary classification result from the author's training is:
array([0.34624347, 0.65375656], dtype=float32)
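Reading such a result is a matter of taking the index of the largest probability. A minimal sketch in plain Python follows, using the two values above; the label names are placeholders, since the post does not say which class is which.

```python
output_prob = [0.34624347, 0.65375656]

# Index of the largest probability, equivalent to argmax().
predicted = max(range(len(output_prob)), key=lambda i: output_prob[i])

# Placeholder names; substitute the real class names in label order.
label_names = ["class_0", "class_1"]
print(predicted, label_names[predicted], output_prob[predicted])
```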
Reprint: III. Reading multiple images in a loop
This section mainly refers to the blog post "Caffe Learning Series (20): Using a trained caffemodel to classify".
With the data ready, we can start classifying. Two versions of the classification method are provided:
First, the C++ method
In the examples/cpp_classification/ folder under the Caffe root directory there is a classification.cpp file, which is used for classification. After compiling, the binary is placed under build/examples/cpp_classification/.
We run the command directly:
# sudo ./build/examples/cpp_classification/classification.bin \
models/bvlc_reference_caffenet/deploy.prototxt \
models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel \
data/ilsvrc12/imagenet_mean.binaryproto \
data/ilsvrc12/synset_words.txt \
examples/images/cat.jpg
The command is long, so backslashes are used to wrap it across lines. From the second line on, each line is one argument: the deploy file, the caffemodel, the mean file, and the label file, followed by the picture to classify.
After a successful run, the top-5 result is output:
---------- Prediction for examples/images/cat.jpg ----------
0.3134 - "n02123045 tabby, tabby cat"
0.2380 - "n02123159 tiger cat"
0.1235 - "n02124075 Egyptian cat"
0.1003 - "n02119022 red fox, Vulpes vulpes"
0.0715 - "n02127052 lynx, catamount"
That is, a probability of 0.3134 for tabby cat, 0.2380 for tiger cat, and so on.
Second, the Python method
The Python interface can be used with Jupyter notebook for visualization, so this method is recommended.
I do not visualize here; instead I write a .py file named py-classify.py:
#coding=utf-8
# Load the necessary libraries
import numpy as np
import sys, os
# Set the current directory
caffe_root = '/home/xxx/caffe/'
sys.path.insert(0, caffe_root + 'python')
import caffe
os.chdir(caffe_root)
net_file = caffe_root + 'models/bvlc_reference_caffenet/deploy.prototxt'
caffe_model = caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'
mean_file = caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy' # ImageNet mean file shipped with Caffe
net = caffe.Net(net_file, caffe_model, caffe.TEST)
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_mean('data', np.load(mean_file).mean(1).mean(1))
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2, 1, 0))
im = caffe.io.load_image(caffe_root + 'examples/images/cat.jpg')
net.blobs['data'].data[...] = transformer.preprocess('data', im)
out = net.forward()
imagenet_labels_filename = caffe_root + 'data/ilsvrc12/synset_words.txt'
labels = np.loadtxt(imagenet_labels_filename, str, delimiter='\t')
# indices of the five largest probabilities, highest first
top_k = net.blobs['prob'].data[0].flatten().argsort()[-1:-6:-1]
for i in np.arange(top_k.size):
    print('%d %s' % (top_k[i], labels[top_k[i]]))
Executing this file outputs:
281 n02123045 tabby, tabby cat
282 n02123159 tiger cat
285 n02124075 Egyptian cat
277 n02119022 red fox, Vulpes vulpes
287 n02127052 lynx, catamount
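The slice `argsort()[-1:-6:-1]` used in the script is compact but easy to misread. The sketch below applies it to a small made-up probability vector: `argsort()` returns indices in ascending order of value, and the slice walks the last five of them backwards.

```python
import numpy as np

# Hypothetical probabilities for seven classes.
probs = np.array([0.05, 0.40, 0.10, 0.25, 0.02, 0.15, 0.03])

# Ascending argsort, then take the last five entries in reverse:
# the indices of the five largest values, highest first.
top_k = probs.argsort()[-1:-6:-1]
for i in top_k:
    print(i, probs[i])
```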
The Caffe development team has actually written a Python classification script as well, at python/classify.py.
Running it requires two arguments: the input picture file and the output result file. It must also be run from the python directory. Assuming the current directory is the Caffe root directory, run:
# cd python
# sudo python classify.py ../examples/images/cat.jpg result.npy
The classification results are saved as result.npy in the current directory and are not displayed. The script also has a bug; when run, it reports the error
Mean shape incompatible with input shape
Therefore, to use this file, we have to make some modifications:
1. Modify the mean calculation:
Locate the line
mean = np.load(args.mean_file)
and add one line below it:
mean = mean.mean(1).mean(1)
This resolves the error.
2. Modify the file so that the results are displayed on the command line:
Locate
# Classify.
start = time.time()
predictions = classifier.predict(inputs, not args.center_only)
print("Done in %.2f s." % (time.time() - start))
and add a few lines after it, as follows:
# Classify.
start = time.time()
predictions = classifier.predict(inputs, not args.center_only)
print("Done in %.2f s." % (time.time() - start))
imagenet_labels_filename = '../data/ilsvrc12/synset_words.txt'
labels = np.loadtxt(imagenet_labels_filename, str, delimiter='\t')
top_k = predictions.flatten().argsort()[-1:-6:-1]
for i in np.arange(top_k.size):
    print('%d %s' % (top_k[i], labels[top_k[i]]))
That is all. The script now runs without error, and the results are displayed on the command line.
Extension 1: Visualizing the network structure in Caffe
Another Python interface is draw_net.py, which renders a model as a diagram from its .prototxt file. The model diagram at the beginning of the blog was drawn with this interface.
./python/draw_net.py ./examples/siamese/mnist_siamese.prototxt ./examples/siamese/mnist_siamese.png
An example of drawing a network with this interface.
The first argument is the model file, and the second is the path where the drawn model diagram is saved. Reference blog: "Caffe basic operations and analysis, step by step: using the Caffe framework".