Continuing from the previous article:
Four: Transfer Learning:
1. For small or medium-sized datasets, transfer learning is useful. 2. Experiment based on ImageNet: split all ImageNet classes in half into parts A and B: (1). First train on part A and save the parameters of the first n layers; then re-initialize the parameters of the layers after the n-th and train them on part B. The previously saved parameters are combined with the parameters obtained by training on part B, and the result is validated on B's validation set:
(2). First train on part A; after training on A, re-initialize the parameters of the layers after the n-th, then train the whole network on B, and finally validate on B's validation set:
(3). Train on part B, fix and save the parameters of the first n layers, re-initialize the parameters of the layers after the n-th, and train on B again. Finally, combine the previously saved first-n-layer parameters with the retrained later layers and validate on B's validation set:
(4). Train on part B, re-initialize the parameters of the layers after the n-th, train on B again, and finally validate on B's validation set:
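The shared mechanics of schemes (1)-(4) can be sketched as follows. This is a minimal NumPy illustration (not the course's actual code): the network is represented as a plain list of weight matrices, the first n are copied from the pretrained network, and the layers after the n-th are re-initialized before further training. The layer sizes and `transfer` helper are made up for illustration.

```python
import numpy as np

def init_layers(layer_sizes, rng):
    """Randomly initialize one weight matrix per layer."""
    return [rng.standard_normal((m, k)) * 0.01
            for m, k in zip(layer_sizes[:-1], layer_sizes[1:])]

def transfer(pretrained, layer_sizes, n, rng):
    """Keep the first n pretrained layers, re-initialize the rest.

    Mirrors the idea above: the first n layers come from the network
    trained on part A (or on B itself), and the layers after the n-th
    are re-initialized before training continues on part B.
    """
    kept = [w.copy() for w in pretrained[:n]]   # saved parameters
    fresh = init_layers(layer_sizes, rng)[n:]   # re-initialized tail
    return kept + fresh

rng = np.random.default_rng(0)
sizes = [64, 32, 16, 10]                  # a tiny 3-layer network
net_a = init_layers(sizes, rng)           # stand-in for "trained on A"
net_b = transfer(net_a, sizes, n=2, rng=rng)
```

After this step, training on B either updates all of `net_b` (fine-tuning) or only the fresh layers (with the first n fixed), which is exactly the difference between the variants above.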
3. Summary of the above experimental results:
4. The following should be the main points of Fei-Fei Li's TED talk:
5. Some recommendations for working with small datasets:
V: Squeezing Out the Last Few Percent
1. Using small filters works much better than using large ones: stacking small filters adds more non-linearities while reducing the number of parameters to train. (Imagine convolving a 7*7 patch with a single 7*7 filter versus with three stacked 3*3 filters: both produce a scalar, i.e. both see the full 7*7 patch.) More non-linearities and deeper networks give better results:
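The 7*7-vs-three-3*3 comparison can be checked with a little arithmetic. This sketch (helper names are my own) computes the receptive field of a stack of stride-1 conv layers and the weight count when the channel depth C is kept constant:

```python
def stacked_receptive_field(filter_sizes):
    """Receptive field of a stack of stride-1 conv layers."""
    rf = 1
    for f in filter_sizes:
        rf += f - 1          # each layer widens the field by f - 1
    return rf

def conv_params(filter_sizes, channels):
    """Weight count for a conv stack that keeps `channels` channels."""
    return sum(f * f * channels * channels for f in filter_sizes)

C = 64
rf_small = stacked_receptive_field([3, 3, 3])   # three 3x3 layers -> 7
rf_big = stacked_receptive_field([7])           # one 7x7 layer    -> 7
params_small = conv_params([3, 3, 3], C)        # 27 * C * C weights
params_big = conv_params([7], C)                # 49 * C * C weights
```

Both stacks see the same 7*7 patch, but the three 3*3 layers use roughly half the parameters and insert three non-linearities instead of one.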
2. You can also experiment with the pooling layers:
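One common pooling variation (my example, not specified in the notes) is overlapping pooling: a 3x3 window with stride 2 instead of the standard non-overlapping 2x2/stride-2. A naive single-channel sketch:

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    """Naive 2-D max pooling over a single-channel feature map."""
    h = (x.shape[0] - size) // stride + 1
    w = (x.shape[1] - size) // stride + 1
    out = np.empty((h, w), dtype=x.dtype)
    for i in range(h):
        for j in range(w):
            out[i, j] = x[i*stride:i*stride+size,
                          j*stride:j*stride+size].max()
    return out

x = np.arange(36, dtype=float).reshape(6, 6)
standard = max_pool(x, size=2, stride=2)   # non-overlapping 2x2 windows
overlap = max_pool(x, size=3, stride=2)    # overlapping 3x3 windows
```

With overlapping windows each input value can influence several outputs, which slightly changes the downsampling behaviour at the same stride.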
3. Data Augmentation: if the dataset is not large enough, you can try to expand it in the following ways: (1) Horizontal flipping: you can enlarge the dataset by mirroring each image left-to-right; this works especially well if the original image is square:
(2) Multi-scale cropping: cropping at multiple scales not only enlarges the dataset but also improves results; cropping a single image 150 times is quite common:
(3) Various random combinations of the above:
(4) Color jittering:
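The augmentations above can be sketched in plain NumPy (a minimal illustration; the helper names, crop size, and jitter strength are my own choices, and real pipelines would sample these randomly per training example):

```python
import numpy as np

def horizontal_flip(img):
    """(1) Mirror an H x W x C image left-to-right."""
    return img[:, ::-1]

def random_crop(img, size, rng):
    """(2) Cut one random size x size patch; repeat for many crops."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top + size, left:left + size]

def color_jitter(img, rng, strength=0.1):
    """(4) Randomly rescale each color channel of an image in [0, 1]."""
    scale = 1.0 + rng.uniform(-strength, strength, size=(1, 1, img.shape[2]))
    return np.clip(img * scale, 0.0, 1.0)

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))
# (3) a random combination: flip, then crop, then jitter
augmented = color_jitter(random_crop(horizontal_flip(img), 24, rng), rng)
```

Chaining the transforms, as in the last line, is what "(3) various random combinations" amounts to in practice.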
Copyright notice: this is the blogger's original article; please do not reproduce it without the blogger's permission.
CNN for Visual Recognition --- Stanford 2015 (ii)