Caffe for Deep Learning (1): Extracting Features with the C++ Interface and Classifying with SVM
Please contact the author by direct message before reprinting; do not reprint without permission.
Recently, at my advisor's request, I started dabbling in deep learning and Caffe. One task was to use a ResNet network to extract features from a dataset and then classify them with SVM. As a beginner who has only just touched deep learning and Caffe, and whose programming is weak, I was thoroughly lost. I borrowed from several blogs, which are listed below.
The outline is as follows:
Preparing the model and network
Extracting features with the C++ interface provided by Caffe
Converting the extracted features to MATLAB .mat format
Classifying with SVM (LIBSVM)

Preparing the model and network
The features in this post are extracted with the 50-layer ResNet network (https://github.com/KaimingHe/deep-residual-networks) and its already-trained model (model link: https://onedrive.live.com/?authkey=%21aafw2-fvoxevrck&id=4006cbb8476ff777%2117887&cid=4006cbb8476ff777).
Microsoft's OneDrive may not open unless you add an entry to your hosts file; search online for the details yourself.
The network you get is deploy.prototxt. To use it to extract features for the training set and the test set, you need to prepend a data layer, making two separate .prototxt files for feature extraction.
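As a sketch, the prepended part might look like this (a minimal ImageData layer; the list-file path, batch size, crop size, and mean values here are illustrative assumptions, and you would make one copy pointing at the training image list and another at the test list):

```
# Hypothetical data layer prepended to deploy.prototxt for feature extraction
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  transform_param {
    mirror: false
    crop_size: 224
    mean_value: 104
    mean_value: 117
    mean_value: 123
  }
  image_data_param {
    source: "examples/_temp/train_file_list.txt"  # one "path label" pair per line
    batch_size: 10
    new_height: 256
    new_width: 256
  }
}
```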
Extracting features with the C++ interface provided by Caffe
Under the Caffe root directory, the ./examples/feature_extraction/ folder has a readme giving an example of how feature extraction is used:
./build/tools/extract_features.bin models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel examples/_temp/imagenet_val.prototxt fc7 examples/_temp/features 10 leveldb
'fc7' is the layer from which features are extracted; it is also the top layer of the example model. Features can be extracted from other layers as well.
'leveldb' is the format in which features are stored; I used the lmdb format for this experiment.
'examples/_temp/features' is the path where features are stored. Note that if you save in lmdb format, the path must not already exist, or you will get an error.
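Since extraction fails when the output folder already exists, it can help to clear any stale copy first, roughly like this (the path is the one used later in this post):

```shell
# lmdb creates the output folder itself and errors out if it already exists,
# so remove any leftover copy from a previous run before extracting.
OUT_DIR=./examples/_temp/features_fc7
rm -rf "$OUT_DIR"
```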
'10' is the number of batches; multiplied by the batch size, it gives the total number of images to extract features from. If it does not match, extraction apparently wraps around and repeats earlier images; other bloggers have written about this, but I did not try it. The batch size cannot be set too large, or you will run out of memory; adjust it to your own machine.
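To make the arithmetic concrete, here is a small sketch (the dataset size is hypothetical) of picking the batch count so that batch count times batch size covers the whole dataset:

```python
import math

# Hypothetical dataset size and a batch size small enough to fit in memory
num_images = 1005
batch_size = 10

# Number of batches needed so every image is covered at least once; when the
# division is not exact, the last batch wraps around to earlier images.
batch_num = int(math.ceil(num_images / float(batch_size)))
total_extracted = batch_num * batch_size
```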
'imagenet_val.prototxt' is the network used for extraction. SVM needs features for both the training set and the test set, so we extract them separately, using two networks.
Extracting in lmdb format produces a folder containing the database files. (Screenshots of the folder and its contents appeared here in the original post.)
That completes this step.

Converting the extracted features to MATLAB .mat format
This part draws on the blog http://m.blog.csdn.net/article/details?id=48180331; thanks to that blogger.
Referring to http://www.cnblogs.com/platero/p/3967208.html and the lmdb documentation (https://lmdb.readthedocs.org/en/release), we read the lmdb file, convert it to a .mat file, and then load the .mat in MATLAB for later use.
Install Caffe's Python dependency libraries, then convert lmdb to .mat using the following two helper files.
./feat_helper_pb2.py
# Generated by the protocol buffer compiler.  DO NOT EDIT!

from google.protobuf import descriptor
from google.protobuf import message
from google.protobuf import reflection
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)


DESCRIPTOR = descriptor.FileDescriptor(
  name='datum.proto',
  package='feat_extract',
  serialized_pb='\n\x0b\x64\x61tum.proto\x12\x0c\x66\x65\x61t_extract\"i\n\x05\x44\x61tum\x12\x10\n\x08\x63hannels\x18\x01 \x01(\x05\x12\x0e\n\x06height\x18\x02 \x01(\x05\x12\r\n\x05width\x18\x03 \x01(\x05\x12\x0c\n\x04\x64\x61ta\x18\x04 \x01(\x0c\x12\r\n\x05label\x18\x05 \x01(\x05\x12\x12\n\nfloat_data\x18\x06 \x03(\x02')


_DATUM = descriptor.Descriptor(
  name='Datum',
  full_name='feat_extract.Datum',
  filename=None,
  file=DESCRIPTOR,
  containing_type=None,
  fields=[
    descriptor.FieldDescriptor(
      name='channels', full_name='feat_extract.Datum.channels', index=0,
      number=1, type=5, cpp_type=1, label=1,
      has_default_value=False, default_value=0,
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
    descriptor.FieldDescriptor(
      name='height', full_name='feat_extract.Datum.height', index=1,
      number=2, type=5, cpp_type=1, label=1,
      has_default_value=False, default_value=0,
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
    descriptor.FieldDescriptor(
      name='width', full_name='feat_extract.Datum.width', index=2,
      number=3, type=5, cpp_type=1, label=1,
      has_default_value=False, default_value=0,
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
    descriptor.FieldDescriptor(
      name='data', full_name='feat_extract.Datum.data', index=3,
      number=4, type=12, cpp_type=9, label=1,
      has_default_value=False, default_value="",
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
    descriptor.FieldDescriptor(
      name='label', full_name='feat_extract.Datum.label', index=4,
      number=5, type=5, cpp_type=1, label=1,
      has_default_value=False, default_value=0,
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
    descriptor.FieldDescriptor(
      name='float_data', full_name='feat_extract.Datum.float_data', index=5,
      number=6, type=2, cpp_type=6, label=3,
      has_default_value=False, default_value=[],
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
  ],
  extensions=[],
  nested_types=[],
  enum_types=[],
  options=None,
  is_extendable=False,
  extension_ranges=[],
  serialized_start=29,
  serialized_end=134,
)

DESCRIPTOR.message_types_by_name['Datum'] = _DATUM


class Datum(message.Message):
  __metaclass__ = reflection.GeneratedProtocolMessageType
  DESCRIPTOR = _DATUM
  # @@protoc_insertion_point(class_scope:feat_extract.Datum)

# @@protoc_insertion_point(module_scope)
./lmdb2mat.py
import lmdb
import feat_helper_pb2
import numpy as np
import scipy.io as sio
import time

def main(argv):
    lmdb_name = sys.argv[1]
    print '%s' % sys.argv[1]
    batch_num = int(sys.argv[2])
    batch_size = int(sys.argv[3])
    window_num = batch_num * batch_size

    start = time.time()
    if 'db' not in locals().keys():
        db = lmdb.open(lmdb_name)
        txn = db.begin()
        cursor = txn.cursor()
        cursor.iternext()
        datum = feat_helper_pb2.Datum()

        keys = []
        values = []
        for key, value in enumerate(cursor.iternext_nodup()):
            keys.append(key)
            values.append(cursor.value())

    # one row per image; feature dimension is passed in as sys.argv[4]
    ft = np.zeros((window_num, int(sys.argv[4])))
    for im_idx in range(window_num):
        datum.ParseFromString(values[im_idx])
        ft[im_idx, :] = datum.float_data
    print 'time 1: %f' % (time.time() - start)

    sio.savemat(sys.argv[5], {'feats': ft})
    print 'time 2: %f' % (time.time() - start)
    print 'done!'

if __name__ == '__main__':
    import sys
    main(sys.argv)
The two files above need no changes; paste them directly into .py files, then run them with a .sh script like the following.
#!/usr/bin/env sh
LMDB=./examples/_temp/features_fc7   # lmdb file path
BATCHNUM=1
BATCHSIZE=10
# DIM=290400   # feature length for conv1
# DIM=43264    # conv5
DIM=4096       # fc7
OUT=./examples/_temp/features_fc7.mat   # path to save the .mat file
python./lmdb2mat.py $LMDB $BATCHNUM $BATCHSIZE $DIM $OUT
'BATCHNUM' and 'BATCHSIZE' are the batch count and batch size used when the features were extracted.
'DIM' is the dimensionality of the feature, which you work out yourself: use a command to view a layer's data dimensions; the first number is the batch size, and multiplying the remaining numbers (usually one or three of them) gives the dimension.

Classifying with SVM (LIBSVM)
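As an illustration (the blob shape below is a placeholder, not read from the actual network), DIM is computed by dropping the leading batch-size entry of a layer's data shape and multiplying the rest:

```python
import numpy as np

# Hypothetical shape reported for a layer's data: (batch_size, channels, height, width)
blob_shape = (10, 2048, 1, 1)

# DIM is the product of everything after the batch-size entry
dim = int(np.prod(blob_shape[1:]))
```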
Finally we come to classification with SVM. Time was limited, and I never managed to learn multi-label classification with SVM directly, so I could only compute the classification accuracy for each label separately and then take the average, which may not be very scientific.
NUS-WIDE has 81 concepts, so I compute 81 accuracies. The code is posted below.
clear
clc
addpath D:\dpTask\NUS-WIDE\NUS-WIDE-Lite
trainLabels = importdata('TrainLabels_lite.mat');
testLabels = importdata('TestLabels_lite.mat');
trFeatures = importdata('train.mat');
trFeatures_sparse = sparse(trFeatures);   % features must be in a sparse matrix
teFeatures = importdata('test.mat');
teFeatures_sparse = sparse(teFeatures);   % features must be in a sparse matrix

for la = 1:81
    % write LIBSVM-format data for this concept
    fprintf('iter=%d, processing data...\n', la);
    tic
    trLabel = trainLabels(:, la);
    teLabel = testLabels(:, la);
    libsvmwrite('SVMtrain.txt', trLabel, trFeatures_sparse);
    libsvmwrite('SVMtest.txt', teLabel, teFeatures_sparse);
    toc

    % use LIBSVM for training
    fprintf('iter=%d, LIBSVM training...\n', la);
    tic
    [train_label, train_feature] = libsvmread('SVMtrain.txt');
    model = svmtrain(train_label, train_feature, '-h 0');
    fprintf('model got!\n');
    % model = svmtrain(train_label, train_feature, '-c 2.0 -g 0.00048828125');  % with tuned parameters
    [test_label, test_feature] = libsvmread('SVMtest.txt');
    [predict_label, accur, score] = svmpredict(test_label, test_feature, model);
    fprintf('prediction done!\n');
    toc

    label(:, la) = predict_label;
    accuracy(la) = accur(1);
    fprintf('iter=%d, accuracy=%f\n', la, accur(1));
    name = sprintf('result/accuracy.txt');
    fid = fopen(name, 'at');
    fprintf(fid, 'No.%d\t%f\n', la, accur(1));
    fclose(fid);
end
I used the default parameters directly, without any tuning; if you need tuning, you can use grid.py, or run the whole experiment directly with easy.py.
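For reference, grid.py in common LIBSVM distributions searches C = 2^-5, 2^-3, ..., 2^15 and gamma = 2^-15, 2^-13, ..., 2^3 by default; the grid itself can be sketched in Python (purely illustrative, no LIBSVM calls):

```python
# Default exponent ranges of LIBSVM's grid.py: -log2c -5,15,2 and -log2g 3,-15,-2
c_values = [2.0 ** e for e in range(-5, 16, 2)]
g_values = [2.0 ** e for e in range(-15, 4, 2)]

# Every (C, gamma) pair that the cross-validated grid search would evaluate
grid = [(c, g) for c in c_values for g in g_values]
```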
References:
[1] http://m.blog.csdn.net/article/details?id=48180331
[2] http://www.cnblogs.com/jeffwilson/p/5122495.html
This one uses the MATLAB interface to extract features.
[3] http://www.cnblogs.com/denny402/p/5686257.html
This one has code for viewing the parameters of each layer.