From Self-Taught Learning to Deep Networks: Build Your First Deep Network Classifier


Self-taught learning is a sparse autoencoder whose features feed a softmax classifier. As shown in the previous section, training it for 400 iterations reaches an accuracy of 98.2%.

On this basis, we can build our first deep network: a stacked autoencoder (two layers) plus a softmax classifier.

 

In short, we use the output of one sparse autoencoder as the input of the next, higher-level sparse autoencoder.

At first glance this looks like self-taught learning with just one more layer, but it is not only that:

the new technique is a fine-tuning step, in which the error (residual) is propagated back from the top layer all the way to the input layer and the weights of the entire network are adjusted.

 

Fine-tuning improves network performance markedly, as we will see later.
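In symbols, fine-tuning is ordinary backpropagation applied to the whole stack (standard formulas, consistent with the cost-and-gradient code later in this post; with the sigmoid activation used here, $f'(z^{(l)}) = a^{(l)} \odot (1 - a^{(l)})$, where $\odot$ is the element-wise product):

\delta^{(l)} = \big( (W^{(l)})^{\top} \delta^{(l+1)} \big) \odot f'\big(z^{(l)}\big),
\qquad
\nabla_{W^{(l)}} J = \frac{1}{m}\, \delta^{(l+1)} \big(a^{(l)}\big)^{\top}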

 

Network Structure:


Figure 1: network structure (two stacked sparse autoencoder layers feeding a softmax classifier)

Files to load in advance

minFunc
computeNumericalGradient
display_network
feedForwardAutoencoder
initializeParameters
loadMNISTImages
loadMNISTLabels
softmaxCost
softmaxTrain
sparseAutoencoderCost
train-images.idx3-ubyte
train-labels.idx1-ubyte
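The snippets below assume the MNIST data has been loaded and the hyperparameters defined first. The post does not show this setup; here is a minimal sketch, with values taken from the UFLDL stacked-autoencoder exercise defaults (assumptions, not stated in this article):

inputSize = 28 * 28;       % MNIST images are 28x28 pixels
numClasses = 10;           % digits 0-9
hiddenSizeL1 = 200;        % hidden units of the first autoencoder
hiddenSizeL2 = 200;        % hidden units of the second autoencoder
sparsityParam = 0.1;       % target average activation of the hidden units
lambda = 3e-3;             % weight decay parameter
beta = 3;                  % weight of the sparsity penalty

trainData   = loadMNISTImages('train-images.idx3-ubyte');
trainLabels = loadMNISTLabels('train-labels.idx1-ubyte');
trainLabels(trainLabels == 0) = 10;    % remap digit 0 to class 10 (MATLAB indexing is 1-based)

% Random initialization of both autoencoders
sae1Theta = initializeParameters(hiddenSizeL1, inputSize);
sae2Theta = initializeParameters(hiddenSizeL2, hiddenSizeL1);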


Train the first sparse autoencoder

addpath minFunc/
options.Method = 'lbfgs';
options.maxIter = 400;
options.display = 'on';

[sae1OptTheta, cost] = minFunc( @(p) sparseAutoencoderCost(p, ...
                                   inputSize, hiddenSizeL1, ...
                                   lambda, sparsityParam, ...
                                   beta, trainData), ...
                                sae1Theta, options);
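The second autoencoder is trained on the features produced by the first, not on the raw images. That feature computation is implied but not shown in the post; a one-line sketch using the preloaded feedForwardAutoencoder:

% Features of the first hidden layer on the training data
sae1Features = feedForwardAutoencoder(sae1OptTheta, hiddenSizeL1, ...
                                      inputSize, trainData);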

Train the second sparse autoencoder

sae2options.Method = 'lbfgs';
sae2options.maxIter = 400;
sae2options.display = 'on';

[sae2OptTheta, cost] = minFunc( @(p) sparseAutoencoderCost(p, ...
                                   hiddenSizeL1, hiddenSizeL2, ...
                                   lambda, sparsityParam, ...
                                   beta, sae1Features), ...
                                sae2Theta, sae2options);
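Likewise, the softmax classifier in the next step is trained on the second layer's features, which would be computed as:

% Features of the second hidden layer, fed by the first layer's features
sae2Features = feedForwardAutoencoder(sae2OptTheta, hiddenSizeL2, ...
                                      hiddenSizeL1, sae1Features);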

Train a softmax classifier

smoptions.maxIter = 100;
[softmaxModel] = softmaxTrain(hiddenSizeL2, numClasses, lambda, ...
                              sae2Features, trainLabels, smoptions);
saeSoftmaxOptTheta = softmaxModel.optTheta(:);
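Fine-tuning expects all the pretrained parameters packed into one vector, stackedAETheta. This packing step is not shown in the post; a sketch following the UFLDL starter code, where stack2params flattens the layer weights and records the layer sizes in netconfig:

% Unpack each autoencoder's encoding weights (W1, b1) into a stack
stack = cell(2, 1);
stack{1}.w = reshape(sae1OptTheta(1 : hiddenSizeL1*inputSize), ...
                     hiddenSizeL1, inputSize);
stack{1}.b = sae1OptTheta(2*hiddenSizeL1*inputSize+1 : ...
                          2*hiddenSizeL1*inputSize+hiddenSizeL1);
stack{2}.w = reshape(sae2OptTheta(1 : hiddenSizeL2*hiddenSizeL1), ...
                     hiddenSizeL2, hiddenSizeL1);
stack{2}.b = sae2OptTheta(2*hiddenSizeL2*hiddenSizeL1+1 : ...
                          2*hiddenSizeL2*hiddenSizeL1+hiddenSizeL2);

% One parameter vector: softmax weights first, then the stack
[stackparams, netconfig] = stack2params(stack);
stackedAETheta = [saeSoftmaxOptTheta; stackparams];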

Fine-tune the entire network

ftoptions.Method = 'lbfgs';
ftoptions.display = 'on';
ftoptions.maxIter = 100;

[stackedAEOptTheta, cost] = minFunc( @(p) stackedAECost(p, ...
                                        inputSize, hiddenSizeL2, ...
                                        numClasses, netconfig, ...
                                        lambda, trainData, trainLabels), ...
                                     stackedAETheta, ftoptions);

Cost Function and Gradient

% Forward pass through the two stacked autoencoder layers
a2 = sigmoid(bsxfun(@plus, stack{1}.w * data, stack{1}.b));
a3 = sigmoid(bsxfun(@plus, stack{2}.w * a2, stack{2}.b));

% Softmax on top (M is the number of training examples)
temp = softmaxTheta * a3;
temp = bsxfun(@minus, temp, max(temp, [], 1));              % prevent numerical overflow
hypothesis = bsxfun(@rdivide, exp(temp), sum(exp(temp)));   % class-probability matrix

cost = -(groundTruth(:)' * log(hypothesis(:))) / M ...
       + lambda / 2 * sumsqr(softmaxTheta);                 % cost function
softmaxThetaGrad = -(groundTruth - hypothesis) * a3' / M ...
                   + lambda * softmaxTheta;                 % gradient of the softmax weights

% Backpropagate the error through the stack
delta3 = (softmaxTheta' * (hypothesis - groundTruth)) .* a3 .* (1 - a3);
delta2 = (stack{2}.w' * delta3) .* a2 .* (1 - a2);

stackgrad{2}.w = delta3 * a2' / M;
stackgrad{2}.b = sum(delta3, 2) / M;
stackgrad{1}.w = delta2 * data' / M;
stackgrad{1}.b = sum(delta2, 2) / M;
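The sigmoid used above is not a MATLAB built-in; the UFLDL code defines it as a small helper:

function sigm = sigmoid(x)
    % Element-wise logistic function
    sigm = 1 ./ (1 + exp(-x));
end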

Prediction function

% Forward pass, then pick the most probable class for each example
a2 = sigmoid(bsxfun(@plus, stack{1}.w * data, stack{1}.b));
a3 = sigmoid(bsxfun(@plus, stack{2}.w * a2, stack{2}.b));
[~, pred] = max(softmaxTheta * a3);   % keep the index of the maximum probability, not its value
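The accuracies reported below come from running this prediction on the MNIST test set. The evaluation code is not shown in the post; a sketch following the exercise, assuming the test files (t10k-*) sit next to the training files and that stackedAEPredict wraps the snippet above:

testData   = loadMNISTImages('t10k-images.idx3-ubyte');
testLabels = loadMNISTLabels('t10k-labels.idx1-ubyte');
testLabels(testLabels == 0) = 10;

% Before fine-tuning: the parameters straight from layer-wise pretraining
[pred] = stackedAEPredict(stackedAETheta, inputSize, hiddenSizeL2, ...
                          numClasses, netconfig, testData);
fprintf('Before fine-tuning test accuracy: %0.3f%%\n', ...
        100 * mean(testLabels(:) == pred(:)));

% After fine-tuning: the parameters returned by minFunc above
[pred] = stackedAEPredict(stackedAEOptTheta, inputSize, hiddenSizeL2, ...
                          numClasses, netconfig, testData);
fprintf('After fine-tuning test accuracy: %0.3f%%\n', ...
        100 * mean(testLabels(:) == pred(:)));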

After more than two hours of training, the final results are very good:

Before fine-tuning test accuracy: 86.620%

After fine-tuning test accuracy: 99.800%

 

It can be seen that fine-tuning plays a vital role in deep network training.


You are welcome to join the discussion and follow this blog, as well as my Weibo and Zhihu homepages, for further updates~

If you reprint this article, please respect the author's work and keep the full text and its links intact. Thank you for your support!

