Training and Testing Caffe with Your Own Data


In the examples that ship with Caffe, such as MNIST and CIFAR-10, dataset preparation is handled by the provided scripts, and the 1000-class ImageNet database is often more than a university laboratory's machines can handle. For practical applications, what matters more is being able to train and test Caffe on a dataset suited to your own circumstances. So it is worth building our own database and training and testing it in Caffe.
1. Data preparation
Create a new folder myself under data. We take two of the 1000 ImageNet classes, panda and sea_horse: 24 panda pictures for training and 6 for testing, and 38 sea_horse pictures for training and 7 for testing:
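One possible layout for these images (the subfolder names are assumptions; only the class names and image counts come from this post):

data/myself/
    train/panda/        24 training images
    train/sea_horse/    38 training images
    val/panda/          6 test images
    val/sea_horse/      7 test images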


The inputs for training and testing are described in train.txt and val.txt, which list all the image files and their labels. Note that for the 1000 ImageNet classes the labels are the integers 0-999; the mapping between class names and label numbers is kept in synset_words.txt (written by ourselves).
Run the following command:

find . -name "*.JPEG" | cut -d '/' -f 2-3 > train.txt

Note the paths used.
Then, because our database has only a small number of samples, we can write the class labels by hand: after each picture listed in train.txt, append its label, using 1 and 2 for our two classes, as in the example below.
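The labeled lines of train.txt would then look something like this (the filenames are hypothetical; the trailing number is the hand-added label):

panda/panda_0001.jpg 1
sea_horse/sea_horse_0001.jpg 2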

When there are too many samples to label by hand, write a batch script for it instead.
Similarly, produce val.txt. A MATLAB (under Windows) batch script for writing the labels is as follows:

% Batch-write image names and labels into a txt file
clear all; clc;
file = dir('F:\animal\sea_horse');
temp = length(file);
file = file(3:temp);                        % skip the '.' and '..' entries
fp = fopen('F:\animal\animal.txt', 'at');   % 'at': open or create the file for appending ('wt' would discard existing contents)
for n = 1:length(file)
    fprintf('iter=%d\n', n);
    txt = [file(n).name ' 2' '\n'];         % label 2 for the sea_horse class
    fprintf(fp, txt);
end
fclose(fp);

test.txt cannot be given labels (the test images are unlabeled), so its labels are all set to 0.
We also need to resize all the images to 256x256. Under Linux this can be done with ImageMagick's convert:

for name in *.JPEG; do
    convert -resize 256x256\! $name $name
done

But this did not work for me, so I had to fall back on MATLAB (under Windows) to do it:

for n = 1:length(file)
    temp = imread(['F:\animal\panda\' file(n).name]);
    temp = imresize(temp, 2);               % first scale up by a factor of 2
    temp = imresize(temp, [256 256]);       % then force the size to 256x256
    imwrite(temp, ['F:\animal\panda\' file(n).name]);
end

Then create a new myself folder under caffe-master/examples, copy caffe-master/examples/imagenet/create_imagenet.sh into it, rename it create_animal.sh, modify the training and test path settings inside, and run the script. The edits amount to changing a few variables at the top, as sketched below.
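A sketch of the variables to edit near the top of create_animal.sh (the exact paths are assumptions matching the layout above):

EXAMPLE=examples/myself          # where the lmdb databases will be written
DATA=data/myself                 # where train.txt and val.txt live
TOOLS=build/tools

TRAIN_DATA_ROOT=data/myself/train/   # assumed folder of training images
VAL_DATA_ROOT=data/myself/val/       # assumed folder of validation images

RESIZE=false                     # images were already resized to 256x256 above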


Finally we obtain myself_train_lmdb and myself_val_lmdb:

2. Computing the image mean
The model requires us to subtract the image mean from each picture, so we first have to compute the mean over the training set. This is implemented in tools/compute_image_mean.cpp, which is also a good example of how to manipulate the components Caffe builds on, such as protocol buffers, LevelDB, and logging. Again we copy caffe-master/examples/imagenet/make_imagenet_mean.sh to examples/myself, rename it make_animal_mean.sh, and modify the paths, roughly as sketched below.
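A minimal sketch of make_animal_mean.sh after the path changes (the output names are assumptions consistent with step 1):

EXAMPLE=examples/myself
DATA=data/myself
TOOLS=build/tools

# compute_image_mean takes the input lmdb and the output mean file
$TOOLS/compute_image_mean $EXAMPLE/myself_train_lmdb \
    $DATA/myself_mean.binaryproto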

3. Defining the network
Copy all the files in caffe-master/models/bvlc_reference_caffenet to the caffe-master/examples/myself folder, modify train_val.prototxt, and pay particular attention to the paths in the data layers; a sketch of the edited TRAIN-phase data layer is shown below.
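For instance, the TRAIN-phase data layer might end up looking like this (the source and mean_file paths are assumptions matching steps 1 and 2; this uses the newer layer syntax, while older Caffe versions write layers { type: DATA } instead; the TEST-phase layer is edited the same way, pointing at myself_val_lmdb):

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "data/myself/myself_mean.binaryproto"   # mean from step 2 (assumed path)
  }
  data_param {
    source: "examples/myself/myself_train_lmdb"        # lmdb from step 1 (assumed path)
    batch_size: 256
    backend: LMDB
  }
}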


If you look closely at train_val.prototxt, you will find that the training and validation networks are basically the same apart from their data sources and their final layers: during training we use a softmax-loss layer to compute the loss function and initialize backpropagation, while during validation we use an accuracy layer to measure how accurate the predictions are.

We also need the solver protocol that drives the run, solver.prototxt. Copy it over as well and change its first line to point at our network: net: "examples/myself/train_val.prototxt". Reading the file, we can see the schedule: we run batches of 256 images for 450,000 iterations (about 90 epochs); every 1,000 iterations we test the learned network on the validation data; the initial learning rate is 0.01 and is reduced every 100,000 iterations (about 20 epochs); progress is displayed every 20 iterations; training uses momentum 0.9 and a weight_decay of 0.0005; and every 10,000 iterations a snapshot of the current state is saved.
That is the schedule from the official tutorial, but it would take a very long time here, so we change it a little:
test_iter: 1000 is the number of test batches per evaluation; we have only a handful of validation photos, so we set it to 10.
test_interval: 1000 means a test pass every 1,000 iterations; we change it to test every 500.
base_lr: 0.01 is the base learning rate; because the dataset is small, 0.01 drops too fast, so we change it to 0.001.
lr_policy: "step" is the learning-rate decay policy; unchanged.
gamma: 0.1 is the factor applied to the learning rate at each step; unchanged.
stepsize: 100000 lowers the learning rate every 100,000 iterations; unchanged.
display: 20 prints progress every 20 iterations; unchanged.
max_iter: 450000 is the maximum number of iterations; unchanged.
momentum: 0.9 is a learning parameter; unchanged.
weight_decay: 0.0005 is a learning parameter; unchanged.
snapshot: 10000 saves a snapshot of the current state every 10,000 iterations; we change it to 2000.
solver_mode: GPU, the last line, means training runs on the GPU.
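Putting these changes together, the edited solver.prototxt would read roughly as follows (snapshot_prefix is an assumed value; everything else follows the settings above):

net: "examples/myself/train_val.prototxt"
test_iter: 10          # small validation set
test_interval: 500     # test every 500 iterations
base_lr: 0.001         # lowered from 0.01 for the small dataset
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 20
max_iter: 450000
momentum: 0.9
weight_decay: 0.0005
snapshot: 2000         # snapshot every 2,000 iterations
snapshot_prefix: "examples/myself/caffenet_train"   # assumed prefix
solver_mode: GPU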

4. Training
Copy train_caffenet.sh from caffe-master/examples/imagenet, rename it train_myself.sh, and modify the path inside; the script boils down to one call to the caffe binary, sketched below.
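A minimal sketch of train_myself.sh, assuming the solver path from step 3:

# launch training with our solver
./build/tools/caffe train --solver=examples/myself/solver.prototxt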

Of course, with only two categories the accuracy is quite high; for example, at iteration 2,000 the accuracy is 0.924, i.e. only 1 of the 13 validation samples is predicted wrong.


5. Resuming training
Copy resume_training.sh from caffe-master/examples/imagenet, adjust the paths inside, and run it with ./resume_training.sh; a sketch of the underlying command is shown below.
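A sketch of the resume command itself (the .solverstate filename is an assumption based on the snapshot: 2000 setting and the snapshot_prefix above):

# resume training from a saved solver state
./build/tools/caffe train \
    --solver=examples/myself/solver.prototxt \
    --snapshot=examples/myself/caffenet_train_iter_2000.solverstate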

References:
"Learning Note 3: Training and testing CaffeNet with your own data", 2014.7.22, Shikayu.
Caffe official website: ImageNet tutorial.
