1. Installation
To install Caffe on a Mac, refer to the earlier wiki page (Installing Caffe on Mac); if you run into other problems, search online.
For the various Linux distributions, plenty of tutorials are already available on the web.
2. A brief introduction to the Caffe code and directory layout
Caffe is written in C++ and depends on several external libraries, including BLAS (matrix computation), CUDA (GPU computation), gflags, glog, Boost, protobuf, HDF5, LevelDB, LMDB and so on.
Once all of them are installed, edit Caffe's Makefile.config (paths and build options) and you can compile the whole project.
The Caffe source tree contains the following folders:
build: where all compiled files are stored
data: data folder
docs: tutorials and notes (worth reading carefully; some of the content is very detailed)
include: header files
examples: various demos; related applications can refer to, or directly reuse, the corresponding demo and its configuration
    mnist: handwritten digit recognition
    cifar10: small-image object classification
    imagenet: image classification
    cpp_classification: classification through the C++ interface
    feature_extraction: feature extraction demo
matlab: MATLAB interface
python: Python interface
models: model files; for pretrained models, see the Model Zoo on the official Caffe website: http://caffe.berkeleyvision.org/model_zoo.html
tools: assorted tools
src: where all source code is stored
The files in docs/tutorial are well worth reading. They give a thorough picture of Caffe's architecture and basic usage and are required reading for beginners.
3. Some applications of Caffe
(1) Network training and parameter tuning: refer to the mnist or cifar10 demos.
Parameters are chosen from experience or by consulting docs/tutorial/solver.md.
The parameters to tune (in solver.prototxt) mainly include:
base_lr: initial learning rate; this is a very important parameter
momentum: generally set to 0.9; if base_lr is particularly low, it can also be set to 0.99 or 0.999
weight_decay: default 0.005; can be adjusted as appropriate; it acts like a regularization term
lr_policy: learning rate schedule; common choices are fixed, inv, step and so on. For a detailed explanation see http://stackoverflow.com/questions/30033096/what-is-lr-policy-in-caffe
or the GetLearningRate function in the source file src/caffe/solver.cpp
Commonly used settings include the fixed, inv and step policies in examples/mnist, or the two-stage approach of the quick example in cifar10.
The main things to control are the initial learning rate and a learning rate policy that matches it; the batch size also needs to be kept at a suitable value.
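To make the three policies named above concrete, here is a minimal sketch in plain Python of how the rate evolves with the iteration count. It does not call Caffe; the function name `learning_rate` is hypothetical, and the formulas follow the descriptions given for `fixed`, `step` and `inv` in the Caffe solver source as I understand them (`gamma`, `power` and `stepsize` are the corresponding solver fields).

```python
def learning_rate(policy, it, base_lr, gamma=0.0, power=0.0, stepsize=1):
    """Return the learning rate at iteration `it` for three common policies."""
    if policy == "fixed":
        # The rate never changes.
        return base_lr
    if policy == "step":
        # Multiply the rate by gamma once every `stepsize` iterations.
        return base_lr * gamma ** (it // stepsize)
    if policy == "inv":
        # Smooth decay: base_lr * (1 + gamma * it) ^ (-power).
        return base_lr * (1.0 + gamma * it) ** (-power)
    raise ValueError("unsupported lr_policy: " + policy)


if __name__ == "__main__":
    for it in (0, 1000, 5000, 10000):
        print(it,
              learning_rate("step", it, 0.01, gamma=0.1, stepsize=5000),
              learning_rate("inv", it, 0.01, gamma=0.0001, power=0.75))
```

Printing a few values like this before training is a cheap way to check that the rate does not collapse too early for the chosen max_iter.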
Other parameters:
test_iter: how many iterations each test pass runs
display: how many iterations between progress outputs
max_iter: maximum number of iterations
snapshot: how many iterations between saved snapshots
snapshot_prefix: file name prefix for the saved snapshots
solver_mode: whether to use the CPU or the GPU
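As an illustration of how these fields sit together in one solver file, the sketch below renders a dictionary as prototxt-style `key: value` lines. All values and paths are placeholders rather than recommendations, `render_solver` is a hypothetical helper, and `net`, `test_interval`, `gamma` and `stepsize` are solver fields that the notes above do not discuss.

```python
def render_solver(params):
    """Render a dict of solver parameters as prototxt-style 'key: value' lines."""
    lines = []
    for key, value in params.items():
        # String fields are quoted; solver_mode is an enum and stays bare.
        if isinstance(value, str) and key != "solver_mode":
            value = '"%s"' % value
        lines.append("%s: %s" % (key, value))
    return "\n".join(lines)


# Placeholder values for illustration only.
solver = {
    "net": "path/to/train_val.prototxt",
    "test_iter": 100,
    "test_interval": 500,
    "base_lr": 0.01,
    "momentum": 0.9,
    "weight_decay": 0.005,
    "lr_policy": "step",
    "gamma": 0.1,
    "stepsize": 5000,
    "display": 100,
    "max_iter": 10000,
    "snapshot": 5000,
    "snapshot_prefix": "path/to/snapshots/net",
    "solver_mode": "GPU",
}

if __name__ == "__main__":
    print(render_solver(solver))
```

Generating the file from a script is optional; it mainly helps when several runs differ only in base_lr or lr_policy.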
(2) Fine-tuning from an existing network
The advantage of fine-tuning is that a network structure and parameters already tuned by others can be adapted to your own application with modest changes.
Notes:
0. The fine-tuned network should be basically consistent with the original network structure, especially the input; otherwise the number of parameters in corresponding layers will differ and an error is raised.
1. Add --weights when training to specify the model from which the parameters of the corresponding layers are imported (matched by name).
2. Caffe decides whether to fill in a layer's parameters by looking up the layer name. Layers whose parameters you want to import must keep the same name as in the original network; layers you want to retrain must be renamed.
3. If you modify the network structure (for example, changing from 1000 classes to 28), the corresponding layer must be renamed, or an error occurs.
4. For the layers you are retraining, set the learning rate multiplier lr_mult larger (10, 20); keep it as small as possible for the other layers, and set it to 0 for layers that need no adjustment.
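The name-matching rule in notes 2 and 3 can be sketched without Caffe. The helper below is a hypothetical simulation, not Caffe's loader: layers that share a name with the pretrained model get their weights copied, renamed layers start from scratch, and a shared name with a different shape fails. The layer names and shapes are made up for illustration.

```python
def match_layers_by_name(pretrained_shapes, new_shapes):
    """Split the new network's layers into (copied, fresh) by layer name."""
    copied, fresh = [], []
    for name, shape in new_shapes.items():
        if name not in pretrained_shapes:
            # Unknown name: nothing to import, the layer is trained from scratch.
            fresh.append(name)
        elif pretrained_shapes[name] != shape:
            # Same name but different parameter count: loading cannot succeed.
            raise ValueError("layer %r: shape %r does not match pretrained %r"
                             % (name, shape, pretrained_shapes[name]))
        else:
            copied.append(name)
    return copied, fresh


# Hypothetical shapes: the last layer goes from 1000 classes to 28.
pretrained = {"conv1": (96, 3, 11, 11), "fc8": (1000, 4096)}
renamed = {"conv1": (96, 3, 11, 11), "fc8_ocr": (28, 4096)}

if __name__ == "__main__":
    print(match_layers_by_name(pretrained, renamed))
```

Keeping the name fc8 while changing its output size to 28 would hit the error branch, which mirrors the failure described in note 3.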
(3) Classification with C++
Classifying with a trained network is also a very common application.
The code is in examples/cpp_classification/classification.cpp, and the executable is build/examples/cpp_classification/classification.bin.
classification.cpp reads the configuration, the weights, the mean file and the image, runs one forward pass to get the probabilities of the output layer, and then reports the classification result.
The basic functions are Predict and Classify. Several functions were added for specific needs, and main was rewritten to read images from a file list and write their features to a file.
examples/cpp_classification/extract_feature.cpp: a rewrite of classification.cpp that includes classification and batch feature extraction
Normalize: helper function for L2-norm normalization
Featureandpredict: extracts the feature (returned through the feature argument passed by reference) and returns the classification result at the same time; the layer whose features are extracted is selected by the string passed in.
Featurebylayername: extracts the features of a particular layer, selected by the string passed in.
main: all parameters are specified in main, which calls the feature extraction function to extract features from the images listed in the file and writes them to a binary file.
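The L2-norm normalization performed by the Normalize helper can be sketched in a few lines. This is a plain Python illustration of the technique, not the code from extract_feature.cpp; the name `l2_normalize` and the `eps` guard are assumptions.

```python
import math


def l2_normalize(feature, eps=1e-12):
    """Scale a feature vector so that its L2 norm is 1."""
    norm = math.sqrt(sum(v * v for v in feature))
    if norm < eps:
        # An all-zero vector has no direction; return it unchanged.
        return list(feature)
    return [v / norm for v in feature]


if __name__ == "__main__":
    print(l2_normalize([3.0, 4.0]))
```

Normalizing before writing features to disk makes dot products between two features equal to their cosine similarity.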
(4) Feature extraction
To extract the features of a particular layer with an existing network, use examples/feature_extraction or the rewritten extract_feature.cpp described in (3) above.
4. Some pitfalls encountered during OCR and their solutions
(1) When first tuning a network you will run into all kinds of puzzling errors, such as lock or state errors. The most likely causes are:
Errors in the data files or their format, or format errors in the configuration files (the solver file or the network file)
(2) Once basically familiar with Caffe, you can tune on top of a specific network. The common problems at this stage are failure to converge or unsatisfactory accuracy.
General approach:
Check whether the LMDB or LevelDB was generated incorrectly and whether the configuration files contain mistakes.
Watch how the loss changes, and adjust the learning rate and its policy accordingly.
Think about the relationship between the amount of training data, the number of classes and the depth of the network, and consider whether the data and the capacity of the network match each other. If they do not, consider changing the network or adding more data.
(3) Even when the network converges and the accuracy is fairly high, problems can still occur.
For example, the classification is accurate, but every output probability is either 0 or 1. This is mostly caused by the data fed into softmax. Solutions:
1. Change the network, add more data, or fine-tune from an existing good network.
2. Take the data before softmax and apply the softmax transform yourself: p_i = exp(x_i - max) / sum_j exp(x_j - max), where max is the largest x_j; subtracting the maximum prevents numerical overflow. For the code, see the Predict function in examples/cpp_classification/classification.cpp.
Another problem: classification.cpp runs very slowly, taking a few seconds for a single image.
The reason is that the classification function reloads the configuration on every call, which takes a long time. Load the configuration once manually and then run the classification operations to solve this. For the code, see examples/cpp_classification/extract_feature.cpp.
When everything works, a forward pass takes on the order of 0.x seconds for larger networks and 0.0x or 0.00x seconds for smaller ones. Still, when there are many candidate targets to classify, as with a sliding window, the speed often falls short of requirements.
You can improve this by following the method in examples/extract_feature: read the images as one batch and do a single forward pass. The code is yet to be completed.
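The max-subtraction trick in solution 2 can be sketched in plain Python as follows. This is an illustration of the formula, not the code in classification.cpp; the function name `softmax` is an assumption.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of pre-softmax scores."""
    m = max(scores)
    # Subtracting the maximum keeps every exponent <= 0, so exp cannot overflow.
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Scores this large would overflow math.exp without the shift.
    print(softmax([1000.0, 1001.0, 1002.0]))
```

The shift does not change the result, because the common factor exp(-max) cancels between numerator and denominator.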
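The batching idea can be sketched independently of Caffe: group the candidate windows into fixed-size batches so that each forward pass handles several of them at once. `make_batches` is a hypothetical helper, and the forward pass itself is not shown.

```python
def make_batches(items, batch_size):
    """Split a list into consecutive batches; the last one may be shorter."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


if __name__ == "__main__":
    # Hypothetical window file names standing in for sliding-window crops.
    windows = ["win_%d.png" % i for i in range(5)]
    for batch in make_batches(windows, 2):
        print(batch)  # each batch would go through one forward pass
```

The batch size here would need to match the batch dimension of the network input, and a short final batch needs either padding or a reshape of the input.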
(4) After solving the 0/1 probability problem and applying the classifier directly in a sliding window, the accuracy was still found to be very low.
The main reason is that the confidence probability given by the classifier is often not very reliable, even when the classification accuracy is high.
The solution is to let the classifier give the class and then use a prototype method to give the confidence probability. With this improvement, the CAPTCHA recognition rate of CNN+SVM+prototype can exceed that of traditional methods.