Deep Learning Series (13): Transfer Learning and Fine-tuning with Caffe

1. Transfer Learning

In practice, because of limited database size, we usually do not train a convolutional neural network from scratch (with randomly initialized parameters). Instead, a CNN is first trained on a large database (for example ImageNet, a 1000-class image classification database of about 1.2 million images), and the resulting pre-trained network (hereafter ConvNet) is then used in our own project in one of the following two ways.

Use the ConvNet as a feature extractor. Remove the ConvNet's final layer and use the remainder as a feature extractor for our classification task. For AlexNet this produces a 4096-dimensional feature vector, on which a linear classifier (a linear SVM or a softmax classifier) is trained for the new classification task.

Fine-tune the ConvNet with data from the new database. Starting from the ImageNet-trained ConvNet, we fine-tune its weights with the back-propagation algorithm on the new database. During fine-tuning we can update the parameters of all layers, or fix the parameters of the earlier layers and fine-tune only the later ones. The motivation is that the first few layers of a pre-trained network usually learn generic features (such as edge detectors), while the later layers are more specific to the original database and classification task; given a new database and task, it is therefore often enough to adjust only the later layers.

2. How to Fine-tune

Exactly which form of transfer learning to choose depends on many factors; the two most important are the size of the new database and its similarity to the pre-training database. According to the different combinations of these two factors, there are four scenarios.

The new database is small and similar to the pre-training database. Because the database is small, fine-tuning may overfit; the better approach is to use the pre-trained network as a feature extractor and then train a linear classifier for the new task.

The new database is large and similar to the pre-training database. In this case overfitting is not a concern, and the entire network can safely be fine-tuned.

The new database is small and not similar to the pre-training database. Fine-tuning is again inadvisable, and because the databases are dissimilar, using the pre-trained network with only its last layer removed as a feature extractor is also not appropriate. A feasible solution is to use the activations of the earlier layers of the pre-trained network as features and then train a linear classifier on them.

The new database is large and not similar to the pre-training database. You can train from scratch, or you can still fine-tune from the pre-trained weights.

3. Fine-tune with Caffe

As with training a CNN from scratch, fine-tuning with Caffe can be broken down into four steps: converting the data into a format Caffe can read, defining the net, defining the solver, and training on the basis of the pre-trained weights.

3.1. Data Format Conversion

The most commonly used format is lmdb. Assume that the database we downloaded consists of image files and that the training list is a txt file in which each row contains an image path and its category label, i.e. [path/to/image.jpeg] [label]:

/home/dumengnan/caffe-master/data/flickr_style/images/12123529133_c1fb58f6dc.jpg
/home/dumengnan/caffe-master/data/flickr_style/images/11603781264_0a34b7cc0a.jpg
...

The tool used to convert this to lmdb is convert_imageset; the contents of the .sh script that calls it are as follows:

EXAMPLE=data/flickr_style
DATA=data/flickr_style
TOOLS=build/tools
TRAIN_DATA_ROOT=/

# Set RESIZE=true to resize the images to 256x256.
# Leave as false if the images have already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

echo "Creating train lmdb..."

# GLOG_logtostderr=1 prints the log to the console instead of writing it under /tmp
# --shuffle randomly shuffles the order of the images in the list
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/flickr_train_lmdb

echo "Creating test lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/test.txt \
    $EXAMPLE/flickr_test_lmdb

echo "Done."

After the conversion is complete, the lmdb files are saved in the flickr_train_lmdb and flickr_test_lmdb folders.

3.2. Define the Net

When fine-tuning the parameters, the .prototxt file describing the net does not need to be written from scratch: copy the .prototxt file used during pre-training and then make the following modifications: modify the data layer; modify the output layer, including the layer's name and the number of output categories; reduce the batch size (in proportion to the size of the database); lower the learning rates of the earlier layers (setting a layer's learning rate to 0 is what is meant by "freezing" that layer's parameters); and raise the learning rate of the last layer. A sketch of a modified data layer is shown below.
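
For illustration, a minimal sketch of what the modified training-phase data layer might look like; the lmdb path, mean file and batch size are assumptions that follow the conversion step above, not values taken from the original article:

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    mirror: true
    crop_size: 227
    # mean file shipped with Caffe for ImageNet-trained models
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  data_param {
    # lmdb created in step 3.1 (assumed path)
    source: "data/flickr_style/flickr_train_lmdb"
    # reduced batch size for the smaller database
    batch_size: 50
    backend: LMDB
  }
}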

The reason for changing the output layer's name is that a layer whose name does not appear in the pre-trained model has its parameters re-initialized instead of copied from the pre-trained weights, as sketched below:
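
A hypothetical fragment of the renamed output layer, loosely following Caffe's finetune_flickr_style example; the layer name, lr_mult values and filler settings are illustrative:

layer {
  # new name, so the weights are not copied from the pre-trained model
  name: "fc8_flickr"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_flickr"
  # learning rate multipliers raised relative to the earlier layers
  param { lr_mult: 10 decay_mult: 1 }   # weights
  param { lr_mult: 20 decay_mult: 0 }   # biases
  inner_product_param {
    num_output: 20   # 20 categories in the new task instead of 1000
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}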


3.3. Define the Solver

The solver's .prototxt file also does not need to be written from scratch: copy the .prototxt file used during pre-training and make the following modifications: change net from the pre-trained net to the net now being used; lower the learning rate (divide it by 100), the reason being that, compared with randomly initialized weights, the pre-trained parameters are already fairly good and do not need to be updated as quickly; the maximum number of iterations and the snapshot settings can also be modified. A sketch of such a solver is given below.
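
For reference, a minimal sketch of what such a solver.prototxt might look like; the paths and numeric values are illustrative, chosen to match the description above rather than taken verbatim from the original example:

# net now points to the fine-tuning net defined in step 3.2 (assumed path)
net: "models/finetune_flickr_style/train_val.prototxt"
test_iter: 100
test_interval: 1000
# base learning rate lowered compared with training from scratch
base_lr: 0.0001
lr_policy: "step"
gamma: 0.1
stepsize: 20000
# maximum number of iterations and snapshot settings can be adjusted as well
max_iter: 80000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "models/finetune_flickr_style/finetune_flickr_style"
solver_mode: GPU

3.4. Training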

Compared with training from randomly initialized parameters, training on the basis of the pre-trained network only adds a -weights argument to the command, giving the location of the pre-trained parameters, as follows:

./build/tools/caffe train -solver models/finetune_flickr_style/solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
4. Results of Fine-tuning

The new database is the Flickr style database, and the classification task has 20 categories (the pre-training classification task had 1000 categories). About 700 images were used. On a machine with a GTX 970 graphics card, after 6 hours of training (80,000 iterations), the classification accuracy converged to around 34%.

Resources:
1. http://caffe.berkeleyvision.org/gathered/examples/finetune_flickr_style.html
2. http://cs231n.github.io/transfer-learning/
