Self-taught learning concatenates a sparse autoencoder with a softmax regression classifier.
The sparse autoencoder performs unsupervised learning and uses unlabeled data.
The softmax regression classifier performs supervised learning and uses labeled data.
In real life we can easily obtain large amounts of unlabeled data (for example, by randomly downloading massive numbers of images from the Internet),
but it is difficult to obtain large amounts of labeled data (labeled databases are usually not very large and are expensive to build).
If we have only a small amount of labeled data on hand but a large amount of unlabeled data, we can use self-taught learning to extract useful features from the unlabeled data, and thereby achieve a much better result than plain softmax regression.
We still use the MNIST database. The handwritten digits 5~9 are used as the unlabeled data, while the digits 0~4 are split into two parts: one part is the training data and the other part is the test data.
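The split described above can be sketched in NumPy. This is an illustrative sketch on random stand-in data, not the actual MNIST loading code (the real exercise uses loadMNISTImages/loadMNISTLabels); the array shapes follow the exercise's convention of one image per column.

```python
import numpy as np

# Stand-in for MNIST: 100 random digit labels and 784-pixel image columns.
rng = np.random.RandomState(0)
labels = rng.randint(0, 10, size=100)
data = rng.rand(784, 100)

# Digits 5~9 form the unlabeled set; digits 0~4 form the labeled set.
unlabeled_idx = np.flatnonzero(labels >= 5)
labeled_idx = np.flatnonzero(labels <= 4)

# Split the labeled set in half: training data and test data.
half = labeled_idx.size // 2
train_idx, test_idx = labeled_idx[:half], labeled_idx[half:]

unlabeled_data = data[:, unlabeled_idx]
train_data, train_labels = data[:, train_idx], labels[train_idx]
test_data, test_labels = data[:, test_idx], labels[test_idx]
```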
In terms of code, with the groundwork from the previous sections, it mostly comes down to calling the relevant functions and files:
minFunc, display_network, initializeParameters, loadMNISTImages, loadMNISTLabels, softmaxCost, softmaxPredict, softmaxTrain, sparseAutoencoderCost, train-images.idx3-ubyte, train-labels.idx1-ubyte
The core code of each part is as follows:
stlExercise
opttheta = theta;
addpath minFunc/
options.Method = 'lbfgs';
options.maxIter = 400;
options.display = 'on';
[opttheta, cost] = minFunc( @(p) sparseAutoencoderCost(p, ...
                                 inputSize, hiddenSize, ...
                                 lambda, sparsityParam, ...
                                 beta, unlabeledData), ...
                            theta, options);
feedForwardAutoencoder
activation = sigmoid(bsxfun(@plus, W1 * data, b1));
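The feedforward step above computes hidden-layer activations by applying the sigmoid to the affine transform of each input column. A minimal NumPy sketch of the same computation (the layer sizes here are illustrative, not the exercise's actual hyperparameters):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward_autoencoder(W1, b1, data):
    # Hidden activations: one column of learned features per input column.
    # b1[:, None] broadcasts the bias across columns, like bsxfun(@plus, ...).
    return sigmoid(W1 @ data + b1[:, None])

rng = np.random.RandomState(0)
W1 = 0.01 * rng.randn(25, 784)   # hiddenSize x inputSize (illustrative sizes)
b1 = np.zeros(25)
data = rng.rand(784, 10)         # 10 images as columns
features = feed_forward_autoencoder(W1, b1, data)
print(features.shape)            # (25, 10)
```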
Train a softmax classifier
options.maxIter = 100;
softmaxModel = softmaxTrain(hiddenSize, numLabels, 1e-4, ...
                            trainFeatures, trainLabels, options);
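Under the hood, softmaxTrain minimizes the regularized softmax cost with L-BFGS. The following is a hedged NumPy/SciPy sketch of that idea on random stand-in data; the function name, shapes, and hyperparameters are illustrative, not the exercise's actual implementation:

```python
import numpy as np
from scipy.optimize import minimize

def softmax_cost_grad(theta, X, y, num_classes, lam):
    # theta is flattened to (num_classes, n_features); X is n_features x m.
    n, m = X.shape
    W = theta.reshape(num_classes, n)
    scores = W @ X
    scores -= scores.max(axis=0)          # subtract max for numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=0)
    Y = np.eye(num_classes)[:, y]         # one-hot labels, num_classes x m
    cost = -np.sum(Y * np.log(probs)) / m + 0.5 * lam * np.sum(W ** 2)
    grad = -(Y - probs) @ X.T / m + lam * W
    return cost, grad.ravel()

rng = np.random.RandomState(0)
X = rng.rand(20, 50)                      # 20 features, 50 examples (illustrative)
y = rng.randint(0, 5, 50)                 # 5 classes
theta0 = 0.005 * rng.randn(5 * 20)
res = minimize(softmax_cost_grad, theta0, args=(X, y, 5, 1e-4),
               jac=True, method='L-BFGS-B', options={'maxiter': 100})
```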
Make predictions
[pred] = softmaxPredict(softmaxModel, testFeatures);
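Prediction itself is just an argmax over class scores: since the softmax normalization is monotone, the class with the largest raw score already has the largest probability. A minimal sketch, with an illustrative function name and a tiny hand-made example:

```python
import numpy as np

def softmax_predict(theta, features):
    # theta: numClasses x featureSize weight matrix;
    # features: featureSize x numExamples (one example per column).
    # argmax over raw scores equals argmax over softmax probabilities.
    return np.argmax(theta @ features, axis=0)

theta = np.array([[2.0, 0.0],
                  [0.0, 2.0]])            # 2 classes, 2 features
features = np.array([[1.0, 0.1],
                     [0.2, 1.0]])         # 2 examples as columns
print(softmax_predict(theta, features))   # [0 1]
```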
Several similarly named variables are easy to mix up:
trainData, trainLabels, testData, testLabels, trainFeatures, testFeatures
This experiment takes a long time to run. As Andrew Ng put it:
"For us, the training step took less than 25 minutes on a fast desktop."
On my ThinkPad i5 it took about half an hour. Then a slip of the hand accidentally overwrote the original data, and it took another half hour...
After that half hour, you can see the features learned by the sparse autoencoder:
Figure 1
The final result is quite good, a substantial improvement over plain softmax:
Test accuracy: 98.215453%
You are welcome to discuss and follow this blog, as well as my Weibo and Zhihu home pages, for further updates ~
If you reprint this article, please respect the author's work and keep the text and link above intact. Thank you for your support!