Self-taught learning is a sparse autoencoder whose features feed a softmax classifier. As shown in the previous section, training it for 400 iterations reached an accuracy of 98.2%.
On this basis we can build our first deep network: a stacked autoencoder (2 layers) + a softmax classifier.
In short, we use the output of one sparse autoencoder as the input to the next, higher-level sparse autoencoder.
Compared with self-taught learning it may look as if we have simply added one more layer, but that is not the whole story:
the new ingredient is a fine-tuning step that backpropagates the error from the top layer all the way down to the input layer and adjusts the weights of the entire network.
As we will see later, this fine-tuning improves network performance dramatically.
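Concretely, fine-tuning is ordinary backpropagation over the whole stack. As a sketch of the standard relations (sigmoid activations, $m$ training examples, $h$ the softmax output, $y$ the one-hot labels), matching what the cost-function code below computes:

$$\delta^{(3)} = \bigl(\theta_{\mathrm{softmax}}^{T}(h - y)\bigr)\circ a^{(3)}\circ\bigl(1-a^{(3)}\bigr),\qquad \delta^{(2)} = \bigl((W^{(2)})^{T}\delta^{(3)}\bigr)\circ a^{(2)}\circ\bigl(1-a^{(2)}\bigr)$$

$$\nabla_{W^{(l)}} J = \tfrac{1}{m}\,\delta^{(l+1)}\bigl(a^{(l)}\bigr)^{T},\qquad \nabla_{b^{(l)}} J = \tfrac{1}{m}\textstyle\sum_{i=1}^{m}\delta^{(l+1)}_{i}$$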
Network Structure:
Figure 1
Pre-Load
minFunc
computeNumericalGradient
display_network
feedForwardAutoencoder
initializeParameters
loadMNISTImages
loadMNISTLabels
softmaxCost
softmaxTrain
sparseAutoencoderCost
train-images.idx3-ubyte
train-labels.idx1-ubyte
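Before training, the MNIST data has to be loaded and the hyperparameters set. A minimal sketch using the loaders listed above; the concrete hidden-layer sizes and penalty weights are typical values assumed here, not taken from the text:

% Load the MNIST training set
trainData   = loadMNISTImages('train-images.idx3-ubyte');
trainLabels = loadMNISTLabels('train-labels.idx1-ubyte');
trainLabels(trainLabels == 0) = 10;   % remap digit 0 to label 10 for 1-based indexing

% Network sizes and penalty weights (assumed typical values)
inputSize     = 28 * 28;   % each MNIST image is 28x28 pixels
numClasses    = 10;
hiddenSizeL1  = 200;       % hidden layer of the first autoencoder
hiddenSizeL2  = 200;       % hidden layer of the second autoencoder
sparsityParam = 0.1;       % desired average activation of the hidden units
lambda        = 3e-3;      % weight decay
beta          = 3;         % weight of the sparsity penalty

% Random initialization of both autoencoders
sae1Theta = initializeParameters(hiddenSizeL1, inputSize);
sae2Theta = initializeParameters(hiddenSizeL2, hiddenSizeL1);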
Train the first sparse autoencoder
addpath minFunc/
options.Method = 'lbfgs';
options.maxIter = 400;
options.display = 'on';

[sae1OptTheta, cost] = minFunc( @(p) sparseAutoencoderCost(p, ...
                                     inputSize, hiddenSizeL1, ...
                                     lambda, sparsityParam, ...
                                     beta, trainData), ...
                                sae1Theta, options);
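The second autoencoder is trained on the features produced by the first. A minimal sketch of computing those features, assuming feedForwardAutoencoder follows the usual UFLDL signature:

% Hidden-layer activations of the first autoencoder on the raw training data
sae1Features = feedForwardAutoencoder(sae1OptTheta, hiddenSizeL1, ...
                                      inputSize, trainData);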
Train the second sparse autoencoder
sae2options.Method = 'lbfgs';
sae2options.maxIter = 400;
sae2options.display = 'on';

[sae2OptTheta, cost] = minFunc( @(p) sparseAutoencoderCost(p, ...
                                     hiddenSizeL1, hiddenSizeL2, ...
                                     lambda, sparsityParam, ...
                                     beta, sae1Features), ...
                                sae2Theta, sae2options);
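Likewise, the softmax classifier is trained on the features of the second autoencoder; a sketch under the same assumption about feedForwardAutoencoder:

% Hidden-layer activations of the second autoencoder on the first layer's features
sae2Features = feedForwardAutoencoder(sae2OptTheta, hiddenSizeL2, ...
                                      hiddenSizeL1, sae1Features);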
Train a softmax classifier
smoptions.maxIter = 100;

[softmaxModel] = softmaxTrain(hiddenSizeL2, numClasses, lambda, ...
                              sae2Features, trainLabels, smoptions);
saeSoftmaxOptTheta = softmaxModel.optTheta(:);
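Fine-tuning needs all the greedily trained weights packed into a single parameter vector. A minimal sketch, assuming the UFLDL helper stack2params and the standard parameter layout produced by initializeParameters:

% Collect the two autoencoder layers into a "stack" structure
stack = cell(2, 1);
stack{1}.w = reshape(sae1OptTheta(1 : hiddenSizeL1 * inputSize), ...
                     hiddenSizeL1, inputSize);
stack{1}.b = sae1OptTheta(2 * hiddenSizeL1 * inputSize + 1 : ...
                          2 * hiddenSizeL1 * inputSize + hiddenSizeL1);
stack{2}.w = reshape(sae2OptTheta(1 : hiddenSizeL2 * hiddenSizeL1), ...
                     hiddenSizeL2, hiddenSizeL1);
stack{2}.b = sae2OptTheta(2 * hiddenSizeL2 * hiddenSizeL1 + 1 : ...
                          2 * hiddenSizeL2 * hiddenSizeL1 + hiddenSizeL2);

% Flatten the stack and prepend the softmax weights; netconfig records the layer sizes
[stackparams, netconfig] = stack2params(stack);
stackedAETheta = [saeSoftmaxOptTheta; stackparams];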
Fine-tune the entire network
ftoptions.Method = 'lbfgs';
ftoptions.display = 'on';
ftoptions.maxIter = 100;

[stackedAEOptTheta, cost] = minFunc( @(p) stackedAECost(p, ...
                                          inputSize, hiddenSizeL2, ...
                                          numClasses, netconfig, ...
                                          lambda, trainData, trainLabels), ...
                                     stackedAETheta, ftoptions);
Cost Function and Gradient
% Forward pass through the two stacked autoencoder layers
a2 = sigmoid(bsxfun(@plus, stack{1}.w * data, stack{1}.b));
a3 = sigmoid(bsxfun(@plus, stack{2}.w * a2, stack{2}.b));

% Softmax layer
temp = softmaxTheta * a3;
temp = bsxfun(@minus, temp, max(temp, [], 1));              % prevent numerical overflow
hypothesis = bsxfun(@rdivide, exp(temp), sum(exp(temp)));   % probability matrix

% Cost and gradient for the softmax weights
cost = -(groundTruth(:)' * log(hypothesis(:))) / M + lambda / 2 * sumsqr(softmaxTheta);
softmaxThetaGrad = -(groundTruth - hypothesis) * a3' / M + lambda * softmaxTheta;

% Backpropagate the error through the stack
delta3 = (softmaxTheta' * (hypothesis - groundTruth)) .* a3 .* (1 - a3);
delta2 = (stack{2}.w' * delta3) .* a2 .* (1 - a2);

stackgrad{2}.w = delta3 * a2' / M;
stackgrad{2}.b = sum(delta3, 2) / M;
stackgrad{1}.w = delta2 * data' / M;
stackgrad{1}.b = sum(delta2, 2) / M;
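Before committing to the full two-hour run, it is worth checking the analytic gradient of stackedAECost against a numerical one on a tiny network. A minimal sketch, assuming computeNumericalGradient has the usual UFLDL signature; all sizes and data here are made up for the check:

% Tiny random problem: 8-dimensional inputs, 5 examples, 4 classes, 3 hidden units per layer
checkData   = randn(8, 5);
checkLabels = [1; 2; 3; 4; 1];

checkStack = cell(2, 1);
checkStack{1}.w = 0.1 * randn(3, 8);  checkStack{1}.b = zeros(3, 1);
checkStack{2}.w = 0.1 * randn(3, 3);  checkStack{2}.b = zeros(3, 1);
[checkStackParams, checkNetconfig] = stack2params(checkStack);
checkTheta = [0.005 * randn(4 * 3, 1); checkStackParams];   % softmax weights + stack

costFun = @(p) stackedAECost(p, 8, 3, 4, checkNetconfig, 0, checkData, checkLabels);
[~, grad] = costFun(checkTheta);
numGrad = computeNumericalGradient(costFun, checkTheta);

% The relative difference should be very small (e.g. below 1e-9)
disp(norm(numGrad - grad) / norm(numGrad + grad));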
Prediction Function
a2 = sigmoid(bsxfun(@plus, stack{1}.w * data, stack{1}.b));
a3 = sigmoid(bsxfun(@plus, stack{2}.w * a2, stack{2}.b));
[~, pred] = max(softmaxTheta * a3);   % take the index of the largest probability, not its value
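To reproduce the numbers below, the classifier is evaluated on the MNIST test set both before and after fine-tuning. A minimal sketch, assuming a stackedAEPredict wrapper built around the prediction code above and the standard MNIST test files (neither of which appears in the pre-load list, so treat them as assumptions):

testData   = loadMNISTImages('t10k-images.idx3-ubyte');
testLabels = loadMNISTLabels('t10k-labels.idx1-ubyte');
testLabels(testLabels == 0) = 10;

% Accuracy with greedy layer-wise pre-training only
[pred] = stackedAEPredict(stackedAETheta, inputSize, hiddenSizeL2, ...
                          numClasses, netconfig, testData);
fprintf('Before finetuning test accuracy: %0.3f%%\n', ...
        100 * mean(pred(:) == testLabels(:)));

% Accuracy after fine-tuning the whole network
[pred] = stackedAEPredict(stackedAEOptTheta, inputSize, hiddenSizeL2, ...
                          numClasses, netconfig, testData);
fprintf('After finetuning test accuracy: %0.3f%%\n', ...
        100 * mean(pred(:) == testLabels(:)));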
After more than two hours of training, the final result is very good:
Before finetuning test accuracy: 86.620%
After finetuning test accuracy: 99.800%
It can be seen that fine-tuning plays a vital role in deep network training.