Caffe itself does not support multi-label input, because the framework is mainly used for single-label image classification. At present, two important problems require multi-label input: multi-task learning and multi-label classification. This article implements multi-label input for these two problems.
The multi-label input methods commonly found online fall into the following four kinds:
1. The simplest: use MXNet, which natively supports multi-label classification and therefore also multi-label input
2. Use HDF5 with a Slice layer. This is not hard to implement, but when the dataset is very large, HDF5 storage consumes dozens of times more disk space than the images themselves, and the process is very slow. I mainly used this method at first, often less than ...
3. Use two data inputs (two LMDBs), one that outputs only the images and one that outputs only the labels. This is somewhat harder than method 2, but it should work well
4. Modify the Caffe source directly so that it accepts multi-label input. To make later experiments easier, this is the method I implemented
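To get a feel for the disk cost mentioned under method 2, here is a rough back-of-the-envelope sketch. It assumes the images are stored in HDF5 as uncompressed float32 arrays, which is a common way to prepare HDF5 input for Caffe; the image size and JPEG file size are hypothetical numbers chosen only for illustration.

```python
def float32_blob_bytes(height, width, channels=3):
    """Bytes needed to store one image as an uncompressed float32 array."""
    return height * width * channels * 4  # 4 bytes per float32 value


def blowup_factor(height, width, jpeg_bytes, channels=3):
    """Ratio of the uncompressed float32 size to the compressed file size."""
    return float32_blob_bytes(height, width, channels) / jpeg_bytes


# Hypothetical example: a 256x256 RGB image whose JPEG file is 30 KiB.
raw = float32_blob_bytes(256, 256)
ratio = blowup_factor(256, 256, 30 * 1024)
print(raw, round(ratio, 1))  # prints: 786432 25.6
```

Under these assumptions each image grows by a factor of roughly 25, which is in line with the "dozens of times" figure above; the real ratio depends on the image content and compression quality.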
Method description: most data conversion in Caffe starts from the ./.build_release/tools/convert_imageset tool, so starting with convert_imageset should be the right choice. By tracing the data input path, I modified the following files in turn: convert_imageset.cpp, io.hpp, io.cpp, data_layers.hpp, caffe.proto, data_layer.cpp, image_data_layer.cpp, memory_data_layer.cpp and so on. Because this was needed for an engineering project, I made the changes directly in the Caffe that ships with py-faster-rcnn
Main steps: (in all the pictures on the blog, the modified code is on the left and the original is on the right)
1. Modify convert_imageset.cpp: lines holds the information that is read, namely the image path and the label; change the label to a vector to support multi-label input
2. Modify io.hpp: mainly change the various label parameters to vectors
3. Modify io.cpp: mainly modify the two functions ReadImageToDatum and ReadFileToDatum, chiefly the set_label call, so that all the labels are written in
4. Modify caffe.proto: mainly to support multi-label input, and to add some parameters for the input layers
5. Modify data_layer.cpp: implement multi-label input for the Data layer type, mainly by modifying the two functions DataLayerSetUp and load_batch
6. Modify data_layers.hpp: mainly modify some layer parameters and add variables such as the number of labels
7. Modify image_data_layer.cpp
8. Modify memory_data_layer.cpp
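The core idea behind these steps can be sketched outside Caffe. The following is a minimal illustration in Python rather than the author's actual C++ changes. It assumes a list file in which each line holds an image path followed by whitespace-separated labels (an assumed format; paths containing spaces are not handled), and the names parse_line, fill_label_buffer and label_num are hypothetical. It shows the label being read as a vector, and a batch of labels laid out row by row in one flat buffer, which is how a batch-loading function would typically fill a label blob of shape batch_size x label_num.

```python
def parse_line(line):
    """Split 'path l1 l2 ...' into (path, [labels as floats])."""
    parts = line.split()
    if len(parts) < 2:
        raise ValueError("expected an image path followed by at least one label")
    return parts[0], [float(v) for v in parts[1:]]


def fill_label_buffer(entries, label_num):
    """Lay out labels row by row in a flat buffer of size batch * label_num."""
    buf = [0.0] * (len(entries) * label_num)
    for item_id, (_, labels) in enumerate(entries):
        if len(labels) != label_num:
            raise ValueError("every image must carry exactly label_num labels")
        for i, value in enumerate(labels):
            # Same offset arithmetic a C++ batch loader would use.
            buf[item_id * label_num + i] = value
    return buf


lines = ["a.jpg 1 0 3", "b.jpg 0 1 2"]  # hypothetical list-file content
entries = [parse_line(l) for l in lines]
batch = fill_label_buffer(entries, label_num=3)
print(batch)  # prints: [1.0, 0.0, 3.0, 0.0, 1.0, 2.0]
```

Requiring every image to carry the same number of labels keeps the buffer rectangular; whether the modified Caffe enforces this is not stated in the article.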
At this point all the modifications are complete. After compiling, I tested the result:
From the experimental results, we can see that the input label is consistent with the train.txt.
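A check like this can also be automated. The sketch below is only an illustration: it assumes the labels from train.txt and the labels emitted by the data layer have each been collected into a dict keyed by image path (how to dump them from the network is not covered here), and the function name find_label_mismatches is hypothetical.

```python
import math


def find_label_mismatches(expected, observed, tol=1e-6):
    """Return image paths whose observed labels differ from the expected ones."""
    mismatched = []
    for path, want in expected.items():
        got = observed.get(path)
        same = (
            got is not None
            and len(got) == len(want)
            and all(math.isclose(g, w, abs_tol=tol) for g, w in zip(got, want))
        )
        if not same:
            mismatched.append(path)
    return mismatched


# Hypothetical data: b.jpg comes back with a wrong second label.
expected = {"a.jpg": [1.0, 0.0], "b.jpg": [0.0, 1.0]}
observed = {"a.jpg": [1.0, 0.0], "b.jpg": [0.0, 2.0]}
print(find_label_mismatches(expected, observed))  # prints: ['b.jpg']
```

An empty result means every image listed in the expected dict came through with the same labels, within the given tolerance.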
Summary: this article implements multi-label input in Caffe by modifying its internal code. The changes cover three kinds of input layers: DataLayer, ImageDataLayer and MemoryDataLayer. Note, however, that I have only tested DataLayer and ImageDataLayer; MemoryDataLayer and the other input types are untested
Finally, thanks to Lxionghao, a senior and an expert in my laboratory. During the implementation I worked mainly by compiling repeatedly to locate and fix errors, and by learning from the way he made his modifications.
His blog and project are posted below and are well worth consulting:
Explanation: http://blog.csdn.net/baobei0112/article/details/47606559
Project: Https://gitcafe.com/lxiongh/Caffe_for_Multi-label
I will upload my code to GitHub later and announce it then.