Caffe Study Notes (v): Run your own JPG data with Caffe


1 Collect your own data

1-1 The source of my training set and test set: emoticon packs

Because downloading pictures from the web one at a time is tedious, I simply downloaded two EIF emoticon packs. Images in the same emoticon pack are strongly similar, so each pack can serve as one class of images. An EIF extraction tool can unpack an EIF file into GIF and JPG files. I then deleted the GIF files and kept only the JPG files; these pictures are my training set and test set.

1-2 Using rename to rename images in batches

(1) For a folder root containing an image src.jpg, create a new file test.txt in root, write "rename src.jpg dst.jpg" in it, and change the .txt suffix to .bat. Double-clicking test.bat then renames src.jpg to dst.jpg.

(2) Using the method from the previous post, store all the filenames under the root folder in a TXT file. With that list, you can easily write rename commands for every image under root.

(3) An example. This is the root folder before renaming:

Following Caffe Study Notes (iv), get a TXT file with all the filenames under the root folder, then use find and replace to turn it into the following TXT file:

Change the suffix to .bat and run it; the result is as shown. Not every rename succeeded, probably because the odd symbols in those filenames are not recognized inside test.bat.

There is no way around this: those symbols have to be cleaned up first. Select all the images above and rename the first file to "Biao.jpg" (the simplest bulk-rename method); you will then see the following:

But now the names contain spaces and parentheses, and I feared this would affect Caffe training, so I still used test.bat to rename all the files to "Biaox.jpg".

Note that when a filename contains a space, the plain "rename src.jpg dst.jpg" form does not work. The solution is to enclose the src.jpg name that contains spaces in double quotes in test.bat, namely:

rename "src.jpg" dst.jpg

As shown, enclose the src.jpg with spaces in double quotation marks.

And this time it all worked out.

(4) That gives the first set of JPG images, "expression cat"; now for the second set, "real cat". The test.bat file created above can be reused: use find and replace to change every "Biao" in it to "Zhen", then paste it into the "real cat" folder:

The second set of JPG images is then renamed in an instant:
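The batch-rename steps above can also be scripted. The following is a minimal sketch, not the author's actual procedure: it builds the lines of a test.bat file from a list of filenames and always quotes the source name, so names with spaces or parentheses are handled. The function name make_rename_commands and the sample filenames are hypothetical.

```python
def make_rename_commands(filenames, prefix):
    """Build one Windows `rename` command per JPG, quoting the source name."""
    # Keep only JPG files and sort them so the numbering is reproducible.
    jpgs = sorted(name for name in filenames if name.lower().endswith(".jpg"))
    commands = []
    for index, name in enumerate(jpgs, start=1):
        new_name = f"{prefix}{index}.jpg"
        # Double quotes keep a name with spaces or parentheses as one argument.
        commands.append(f'rename "{name}" {new_name}')
    return commands


# Hypothetical folder listing, as it might look after the bulk rename step.
commands = make_rename_commands(["Biao (2).jpg", "Biao (1).jpg", "notes.txt"], "Biao")
print("\n".join(commands))
```

Writing the printed lines into a file with the .bat suffix and running it in the image folder would perform the renames. Changing the prefix argument to "Zhen" covers the second image set.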

1-3 Making train.txt and val.txt

What needs attention here is exactly how these two TXT files should be filled in, and how their entries correspond one-to-one with the images in the train and val folders.

In my case, there are two subfolders in my train folder, one called "Zhen" (with 50 real cat images inside) and one called "Biao" (with 50 emoticon images inside):

The corresponding train.txt must therefore also indicate which subfolder each image is located in.

By contrast, my val folder has no subfolders; it directly contains 20 test images:

Therefore, the corresponding val.txt does not have to indicate subfolders.

In short, train.txt and val.txt store the "relative path + label" of each image.
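To make the "relative path + label" layout concrete, here is a minimal sketch that produces the lines of both files. The label numbers (Zhen = 0, Biao = 1), the sample filenames, and the assumption that validation filenames start with their class name are all hypothetical; the source does not state them.

```python
# Hypothetical label assignment; the original post does not say which class gets which number.
LABELS = {"Zhen": 0, "Biao": 1}


def train_lines(files_by_class):
    """Lines for train.txt: images live in one subfolder per class."""
    lines = []
    for class_name in sorted(files_by_class):
        for filename in sorted(files_by_class[class_name]):
            lines.append(f"{class_name}/{filename} {LABELS[class_name]}")
    return lines


def val_lines(filenames):
    """Lines for val.txt: no subfolders, so the class is read from the filename prefix."""
    lines = []
    for filename in sorted(filenames):
        class_name = next(name for name in LABELS if filename.startswith(name))
        lines.append(f"{filename} {LABELS[class_name]}")
    return lines


train = train_lines({"Zhen": ["Zhen1.jpg"], "Biao": ["Biao1.jpg"]})
val = val_lines(["Zhen51.jpg", "Biao51.jpg"])
print("\n".join(train + val))
```

Each line is a path relative to the train or val folder, a space, and an integer label.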

2 Convert training set and test set to LevelDB format

Before reading this section, please refer to Caffe Study Notes (iv), which converts your own data into LMDB format. Building on it, this post proposes solutions to the three open problems left at the end of that post and generates files in LevelDB format.

Question one: Why did training with the LMDB file generated in the last post fail?

If you open the folder of the LMDB file, you will find that the LMDB file generated by the above method is only 8 KB. Both LMDB and LevelDB are packaged versions of the image set, so the size of the LMDB or LevelDB file should be roughly comparable to that of the training set or test set. An 8 KB file means that the "build LMDB files with create_imagenet.sh" step did not package the training set and test set successfully.

This answers question 3-2 of the three open questions from the last post: there is a problem with the input data. The generated LMDB file is not valid, which is why training with it fails.
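The size check described above can be automated. This is a rough sketch under stated assumptions: the threshold min_ratio is an arbitrary value chosen for illustration, and the demo uses throwaway folders with dummy files rather than a real database.

```python
import os
import tempfile


def folder_size(path):
    """Total size in bytes of all files under path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


def looks_packed(db_dir, image_dir, min_ratio=0.1):
    """Heuristic: a real database should not be tiny compared with the images.

    min_ratio is a hypothetical threshold, not a value taken from Caffe.
    """
    image_bytes = folder_size(image_dir)
    return image_bytes > 0 and folder_size(db_dir) >= min_ratio * image_bytes


# Demo on throwaway folders: an 8-byte "database" next to 10,000 bytes of images.
with tempfile.TemporaryDirectory() as root:
    image_dir = os.path.join(root, "train")
    db_dir = os.path.join(root, "db")
    os.makedirs(image_dir)
    os.makedirs(db_dir)
    with open(os.path.join(image_dir, "a.jpg"), "wb") as handle:
        handle.write(b"x" * 10000)
    with open(os.path.join(db_dir, "data.mdb"), "wb") as handle:
        handle.write(b"x" * 8)
    tiny_ok = looks_packed(db_dir, image_dir)
    with open(os.path.join(db_dir, "data.mdb"), "wb") as handle:
        handle.write(b"x" * 9000)
    full_ok = looks_packed(db_dir, image_dir)

print(tiny_ok, full_ok)
```

A result of False for a real database folder would suggest that the packaging step failed and that the database should not be used for training.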

Question two: Error when generating the LMDB file: Check failed: mdb_status == 0

The sign that LMDB packaging has failed is this message when you run create_imagenet.sh:

Check failed: mdb_status == 0 (112 vs. 0)

After searching online for a long time, I finally saw in a CSDN question that some people say Windows cannot use the LMDB data format, so the create_imagenet.sh step should build a LevelDB file instead.

This answers question 3-1 from the last post: whenever this error appears, your LMDB data is unusable. I do not know whether this problem occurs only on Windows. Either way, the solution is not to generate LMDB files but to set create_imagenet.sh to generate LevelDB files.

Question three: How do I use create_imagenet.sh to generate files in LevelDB format?

Now it is time to answer question 3-3 from the last post. The default data format in Caffe is LMDB, so the previous post did not need to specify whether LMDB or LevelDB should be generated: create_imagenet.sh generates LMDB files automatically. But as mentioned, LMDB files did not work for me on Windows (whether they can be used at all I do not know; in any case, packaging the JPG images into an LMDB file failed at this step). So create_imagenet.sh needs a setting that makes it package the JPG images into LevelDB files.

The method is to add the following line to the create_imagenet.sh file:

--backend=leveldb

Add this line at the position shown in the script to make create_imagenet.sh generate LevelDB files, and do not forget the trailing backslash.

This part of the work is done in caffe_root/data/myself, where myself is a folder I created. It holds all the training data (train), all the test data (val), and the TXT files train.txt and val.txt that give the labels of the training and test data.

After running create_imagenet.sh, the myself folder contains two new folders, imagenet_train_leveldb and imagenet_val_leveldb, which are the generated LevelDB files.
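For reference, create_imagenet.sh is a wrapper around Caffe's convert_imageset tool. The sketch below only assembles the argument list such a call might use, with the backend switched to LevelDB. The executable name, the paths, and the resize value of 256 are assumptions for illustration; the post does not show the author's actual script.

```python
def convert_imageset_args(image_root, list_file, db_path, backend="leveldb", resize=256):
    """Assemble a convert_imageset command line as a list of arguments."""
    # convert_imageset joins the root folder and each listed path directly,
    # so the root folder should end with a slash.
    if not image_root.endswith("/"):
        image_root += "/"
    return [
        "convert_imageset",
        f"--resize_height={resize}",
        f"--resize_width={resize}",
        "--shuffle",
        f"--backend={backend}",  # the line the post adds to the script
        image_root,
        list_file,
        db_path,
    ]


# Hypothetical paths under caffe_root/data/myself.
args = convert_imageset_args(
    "data/myself/train",
    "data/myself/train.txt",
    "data/myself/imagenet_train_leveldb",
)
print(" ".join(args))
```

The same function with the val folder, val.txt, and imagenet_val_leveldb would describe the second call.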

3 Calculate the image mean

3-1 Generate the compute_image_mean.exe file

Follow the steps in Caffe Study Notes (iii) to build the required EXE file under VS2013: load compute_image_mean.cpp from the tools folder into Caffe's VS2013 project, then build it to generate compute_image_mean.exe.

Note, however, that the executable produced by the build is named caffe.exe. So that it is not overwritten when the next .cpp file is built, rename the caffe.exe generated from compute_image_mean.cpp to compute_image_mean.exe.

3-2 Edit make_imagenet_mean.sh

Under caffe_root/examples/imagenet there is a script make_imagenet_mean.sh. Copy it to caffe_root/data/myself, then open the file for editing.

Note that by default the mean is also computed from an LMDB file, so you must explicitly specify that the mean is computed from the LevelDB file. If you do not add this, a "Check failed" error appears.

After make_imagenet_mean.sh runs, a file imagenet_mean.binaryproto is generated under caffe_root/data/myself.
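Conceptually, the mean file holds the element-wise average of all training images, which is later subtracted from each input. The following is a toy sketch of that idea in plain Python, not the compute_image_mean implementation; images are represented as flat lists of pixel values, and the numbers are made up.

```python
def mean_image(images):
    """Element-wise mean of equally sized images given as flat lists of pixel values."""
    if not images:
        raise ValueError("need at least one image")
    size = len(images[0])
    if any(len(image) != size for image in images):
        raise ValueError("all images must have the same number of values")
    count = len(images)
    return [sum(image[i] for image in images) / count for i in range(size)]


def subtract_mean(image, mean):
    """Center one image by subtracting the mean image from it."""
    return [pixel - m for pixel, m in zip(image, mean)]


# Two made-up three-pixel "images".
images = [[0, 100, 200], [50, 100, 250]]
mean = mean_image(images)
centered = subtract_mean(images[0], mean)
print(mean, centered)
```

This is why the images must share one size before the mean is computed: the average is taken position by position.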

4 Network definition

For this section, be sure to refer to Caffe Study Notes (i), Caffe_example training MNIST. The network used in this post differs from LeNet and has one extra step, "set the mean file path" (that is, the imagenet_mean.binaryproto just generated), but the overall process is the same: you need to set up "two prototxt files + one sh file".

4-1 Setting the train_val.prototxt file

Note that for the "network definition" step, Shikayu's notes do not mention this file. He copied two files from caffe_root\examples\imagenet, imagenet_train.prototxt and imagenet_val.prototxt, used for the training network and the test network respectively. From the description in his notes, I infer that my train_val.prototxt is effectively a combination of imagenet_train.prototxt and imagenet_val.prototxt.

Why is mine different from his? Because I could not find imagenet_train.prototxt and imagenet_val.prototxt in my caffe_root\examples\imagenet. I searched Caffe's official website and found that examples\imagenet there does not contain those two prototxt files either. The ImageNet example on the official website does, however, include this description:

We are going to describe a reference implementation for the approach first proposed by Krizhevsky, Sutskever, and Hinton in their NIPS 2012 paper.
The network definition (models/bvlc_reference_caffenet/train_val.prototxt) follows the one in Krizhevsky et al.

Following the path given in the parentheses, you can find the train_val.prototxt file. Copy its contents into a TXT file and change the suffix to .prototxt.

Now edit train_val.prototxt. As in Caffe Study Notes (i), Caffe_example training MNIST, set the paths of imagenet_train_leveldb and imagenet_val_leveldb, and also set the path of the mean file just computed, imagenet_mean.binaryproto.

Note that the default backend is LMDB; we have to change it to LEVELDB.
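The backend change can be made by hand in a text editor. As a minimal sketch, the same substitution can be done with a regular expression. The data_param fragment below is an illustrative excerpt in the style of the reference train_val.prototxt, and the function name use_leveldb is hypothetical. The source path still has to be edited separately.

```python
import re


def use_leveldb(prototxt_text):
    """Replace every `backend: LMDB` entry with `backend: LEVELDB`."""
    return re.sub(r"(backend:\s*)LMDB\b", r"\1LEVELDB", prototxt_text)


# Illustrative data_param fragment; the source path is left untouched on purpose.
sample = '''data_param {
  source: "examples/imagenet/ilsvrc12_train_lmdb"
  batch_size: 256
  backend: LMDB
}'''

patched = use_leveldb(sample)
print(patched)
```

Both data layers in the file (training and test) have a backend entry, so a global substitution like this covers both.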

4-2 Setting the solver.prototxt file

This file is not in caffe_root\examples\imagenet either. As above, the ImageNet example on the Caffe official website gives the solver path:

Sound good? This is implemented in
models/bvlc_reference_caffenet/solver.prototxt.

Now set Solver as follows:

Here net is the path of train_val.prototxt, snapshot_prefix is the path prefix under which the trained network weights are saved, and whether to use the CPU or the GPU is also specified here.
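As a sketch of the three settings just described, the snippet below rewrites the net, snapshot_prefix, and solver_mode lines of a solver file held as text. The helper name set_solver_field and all path values are hypothetical; only the three field names come from Caffe's solver format.

```python
import re


def set_solver_field(solver_text, key, value):
    """Replace the value of a top-level `key: value` line, or append the line if absent."""
    pattern = re.compile(rf"^{re.escape(key)}:.*$", flags=re.MULTILINE)
    line = f"{key}: {value}"
    if pattern.search(solver_text):
        # A function replacement avoids backslash escapes in Windows paths being interpreted.
        return pattern.sub(lambda _match: line, solver_text)
    return solver_text.rstrip("\n") + "\n" + line + "\n"


# Illustrative starting point in the style of the reference solver.prototxt.
solver = (
    'net: "models/bvlc_reference_caffenet/train_val.prototxt"\n'
    'snapshot_prefix: "models/bvlc_reference_caffenet/caffenet_train"\n'
    "solver_mode: GPU\n"
)

# Hypothetical values for a setup under data/myself that trains on the CPU.
solver = set_solver_field(solver, "net", '"data/myself/train_val.prototxt"')
solver = set_solver_field(solver, "snapshot_prefix", '"data/myself/caffenet_train"')
solver = set_solver_field(solver, "solver_mode", "CPU")
print(solver)
```

The other solver fields (learning rate, iteration counts, and so on) are left as they are in this sketch.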

4-3 Setting the train_caffenet.sh file

This file can indeed be found in caffe_root/examples/imagenet. Copy it to myself and set it as follows:

It actually sets three things:

(1) caffe.exe: the executable used to train the network, built from the caffe.cpp file under tools.

(2) train: the mode selection, meaning training mode.

(3) solver: the path of the solver.prototxt file you just set up.

This part is exactly the same as the setting in Caffe Study Notes (i), Caffe_example training MNIST.

5 Training Network

After setting up train_caffenet.sh, we start training the network. It is very slow: basically, a loss report appears about once an hour.

It can be seen, however, that the loss decreases as the number of iterations increases (it already looks usable after 20 iterations), which is what we want.
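To check the loss trend without reading the log by eye, the loss reports can be extracted with a regular expression. This is a minimal sketch that assumes log lines of the form "Iteration N, loss = X"; the two sample lines and their loss values are made up for illustration and are not the author's results.

```python
import re

# Matches "Iteration 20, loss = 0.9", tolerating extra text between the number and ", loss".
LOSS_PATTERN = re.compile(r"Iteration (\d+)[^\n]*?, loss = ([0-9.eE+-]+)")


def parse_losses(log_text):
    """Return (iteration, loss) pairs found in a training log."""
    return [(int(m.group(1)), float(m.group(2))) for m in LOSS_PATTERN.finditer(log_text)]


def is_decreasing(losses):
    """True if every reported loss is lower than the one before it."""
    values = [loss for _iteration, loss in losses]
    return all(later < earlier for earlier, later in zip(values, values[1:]))


# Made-up sample lines, for illustration only.
sample_log = """\
I0526 10:00:00.000000  1234 solver.cpp:189] Iteration 0, loss = 7.2
I0526 11:00:00.000000  1234 solver.cpp:189] Iteration 20, loss = 0.9
"""

losses = parse_losses(sample_log)
print(losses, is_decreasing(losses))
```

On a real log, a strictly decreasing sequence is not guaranteed; the loss usually fluctuates, so a looser check may be more appropriate.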

2016.5.26

By Yau Wang Nanshan
