I recently took part in a competition whose task involved a number of label categories. My original plan was to train one classification model per major category, but that is wasteful: the same image would be pushed through the convolutional layers of several classification networks, repeating computation and hurting efficiency. I then found that multi-task learning in deep learning can achieve multi-label classification: a single model handles all categories, with the different attribute classifiers sharing the convolutional layers. All of my project development is based on the Caffe framework, and by default the data layer in Caffe only supports single-dimensional labels, not multi-label classification. Following an expert's blog, I modified the Caffe source code so that Caffe supports multi-label classification. Below I describe how to modify the Caffe source, covering both the training and the testing process.
Caffe source modification:
You need to modify convert_imageset.cpp, located under tools/ in the Caffe root directory, so that it supports multiple labels. I directly replaced my original convert_imageset.cpp with the modified version I had downloaded. Then recompile Caffe: enter the Caffe directory and run:
make clean
make -j4
make pycaffe
If there are no errors, Caffe now supports multi-label classification, and the next step is to train a network model on your own data with the required number of label categories.
Note: since many people have asked me for convert_imageset.cpp, I have uploaded it here:
http://download.csdn.net/detail/xjz18298268521/9776275
Download it there if you need it.
The modified code is as follows:
std::ifstream infile(argv[2]);
std::vector<std::pair<std::string, std::vector<float> > > lines;
std::string filename;
std::string label_count_string = argv[5];
int label_count = std::atoi(label_count_string.c_str());
std::vector<float> label(label_count);
while (infile >> filename) {
  // Read label_count label values after each filename
  for (int i = 0; i < label_count; i++)
    infile >> label[i];
  lines.push_back(std::make_pair(filename, label));
}
// Create new DBs: one for the image data, one for the multi-dimensional labels
scoped_ptr<db::DB> db_image(db::GetDB(FLAGS_backend));
scoped_ptr<db::DB> db_label(db::GetDB(FLAGS_backend));
db_image->Open(argv[3], db::NEW);
db_label->Open(argv[4], db::NEW);
scoped_ptr<db::Transaction> txn_image(db_image->NewTransaction());
scoped_ptr<db::Transaction> txn_label(db_label->NewTransaction());
Training the model:
The code above covers the data-input part of multi-task deep learning. To stay compatible with the Caffe framework, I again followed the expert's blog: rather than the open-source approach of adding a label-dimension option to the data layer and modifying its code, I simply use two data layers, one reading the image data and one reading the multi-dimensional labels. The steps and modifications required for training are described next.
1. LMDB data production
For reasons of space I only posted a screenshot of the main part of the code; note the parts marked in red. The first is the number of label categories, and the second is some data paths. Because data and labels are now stored separately to support multiple labels (whereas with a single label the data layer held data and label together), the third and fourth red marks are the LMDB data and the corresponding multi-dimensional labels used for training and testing. I have put the script that produces this LMDB data in my blog resources:
http://download.csdn.net/detail/xjz18298268521/9708564
You can download it and modify it to your own needs. After the script runs, the four corresponding LMDB data files are generated under the relative path. That completes the LMDB data production; the subsequent mean-file production is the same as before.
2. Modify the training network model train_val.prototxt
#Training data layer
name: "CaffeNet"
layer {
name: "data"
type: "Data"
top: "data" # originally this layer had two tops (data and label)
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 227
mean_file: "/home/xjz/multiple-caffe/caffe-master/examples/multiple-lable/caffenet/mean.binaryproto"
}
data_param {
source: "/home/xjz/multiple-caffe/caffe-master/examples/multiple-lable/caffenet/ten_classes_train_lmdb"
batch_size: 128
backend: LMDB
}
}
#Training data label layer
layer {
name: "data"
type: "Data"
top: "label"
include {
phase: TRAIN
}
data_param {
source: "/home/xjz/multiple-caffe/caffe-master/examples/multiple-lable/caffenet/ten_classes_train_label_lmdb"
batch_size: 128
backend: LMDB
}
}
#Test data layer
layer {
name: "data"
type: "Data"
top: "data"
include {
phase: TEST
}
transform_param {
mirror: false
crop_size: 227
mean_file: "/home/xjz/multiple-caffe/caffe-master/examples/multiple-lable/caffenet/mean.binaryproto"
}
data_param {
source: "/home/xjz/multiple-caffe/caffe-master/examples/multiple-lable/caffenet/ten_classes_val_lmdb"
batch_size: 100
backend: LMDB
}
}
#Test data label layer
layer {
name: "data"
type: "Data"
top: "label"
include {
phase: TEST
}
data_param {
source: "/home/xjz/multiple-caffe/caffe-master/examples/multiple-lable/caffenet/ten_classes_val_label_lmdb"
batch_size: 100
backend: LMDB
}
}
After modifying the data layers of the network model, we need to slice the contents of the label database so that each attribute gets its own label blob, by adding a Slice layer. The Slice layer cuts its input into multiple output blobs along the dimension given by the slicing parameter (currently only num and channel are supported), as shown in the figure below. Define one top per label category, each with a different name, so they can be connected to the corresponding accuracy layers. Regarding the Slice layer parameters: slice_dim is the target dimension, 0 for num and 1 for channel; 1 is the usual choice. slice_point specifies the indices at which the chosen dimension is cut (the number of indices must equal the number of output blobs minus one); I have 4 labels, so I need 4 minus 1, that is 3, slice points.
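As a concrete illustration (the layer and blob names here are my own placeholders; adjust them to your labels), a Slice layer that splits a 4-dimensional label blob along the channel axis into four single-label blobs might look like this:

```prototxt
layer {
  name: "slice_label"
  type: "Slice"
  bottom: "label"
  top: "label_1"
  top: "label_2"
  top: "label_3"
  top: "label_4"
  slice_param {
    slice_dim: 1     # 1 = slice along the channel axis
    slice_point: 1   # three cut points produce four output blobs
    slice_point: 2
    slice_point: 3
  }
}
```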
3. Design of the final loss functions
With a single label, only one loss function had to be designed; with multi-label classification we need multiple loss layers, one per major category. The figure below shows the loss layer and the corresponding test layer for one category:
For the two bottoms of the accuracy layer: the first connects to the corresponding fully connected layer, and the second connects to the corresponding label blob cut off by the Slice layer in front.
The same holds for the two bottoms of the SoftmaxWithLoss layer: the first connects to the corresponding fully connected layer, the second to the corresponding label blob cut by the Slice layer. loss_weight gives the weight of this loss in the final total loss. It is generally recommended that the weights of all tasks sum to 1. If these values are not set, the network may converge unstably, because in multi-task learning the gradients of the different tasks accumulate; the gradient can become too large and even trigger a parameter-overflow error, causing training to fail.
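As a sketch, the loss and accuracy layers for one attribute could look as follows (the blob names fc_label_1 and label_1 are my own placeholders; with four tasks, a loss_weight of 0.25 each makes the weights sum to 1):

```prototxt
layer {
  name: "loss_1"
  type: "SoftmaxWithLoss"
  bottom: "fc_label_1"   # per-attribute fully connected layer
  bottom: "label_1"      # matching label blob cut off by the Slice layer
  top: "loss_1"
  loss_weight: 0.25      # the four task weights sum to 1
}
layer {
  name: "accuracy_1"
  type: "Accuracy"
  bottom: "fc_label_1"
  bottom: "label_1"
  top: "accuracy_1"
  include { phase: TEST }
}
```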
One more problem remains. With a single label, the num_output of the last fully connected layer, fc8, was fixed to the number of classes of the single-label task, but with multiple labels each label attribute has a different number of classes. So, before each label's loss function, add a fully connected layer whose num_output equals the number of classes of that label. All of these added layers connect to the original second fully connected layer, that is, fc7, as shown in the following figure. With that, the preparation for training is essentially complete; the training steps are basically the same as for the original single-label case, and training can begin.
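For example (the layer name and class count are placeholders), one per-attribute fully connected layer branching off fc7 might be written as:

```prototxt
layer {
  name: "fc_label_1"
  type: "InnerProduct"
  bottom: "fc7"        # all per-label branches share the fc7 features
  top: "fc_label_1"
  inner_product_param {
    num_output: 5      # number of classes of this attribute (placeholder)
  }
}
```

One such layer is added per label, each with its own name and its own num_output.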
Testing Process
Modify the deploy.prototxt file. The earlier network layers need no changes; only the final fully connected layers and the loss-function layers need modifying, in the same way as in train_val.prototxt above: each label category needs its own fully connected layer and output layer, as shown in the following figure:
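A sketch of the tail of deploy.prototxt for one attribute (names are placeholders; at deploy time the SoftmaxWithLoss layers are typically replaced by plain Softmax layers, since no labels are available):

```prototxt
layer {
  name: "fc_label_1"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc_label_1"
  inner_product_param {
    num_output: 5      # classes of this attribute (placeholder)
  }
}
layer {
  name: "prob_1"
  type: "Softmax"      # replaces SoftmaxWithLoss at deploy time
  bottom: "fc_label_1"
  top: "prob_1"
}
```

Repeat this pair of layers once per label category, each with distinct names.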