UFLDL Tutorial Notes and Exercise Answers IV (Building Deep Networks for Classification)


These are notes from my self-study of deep learning, briefly recorded as follows:

(1) Deep learning is more expressive than shallow network learning: a deep architecture can represent a much larger set of functions compactly and concisely than a shallow one.

(2) Simply extending traditional shallow neural networks to many layers runs into three problems: the difficulty of obtaining enough labeled data, poor local optima, and gradient diffusion (vanishing gradients in the lower layers).

(3) A stacked autoencoder network is a neural network built from multiple layers of sparse autoencoders, with a softmax (or logistic regression) classifier as the final layer. Its initial parameters are obtained by greedy layer-wise training, also called pre-training, which makes full use of unlabeled data and thus eases the data-acquisition problem; the whole network is then fine-tuned on a labeled data set.

Exercise Answer:

(1) %% STEP 2: Train the first sparse autoencoder

addpath minFunc/
options.Method = 'lbfgs';   % Here, we use L-BFGS to optimize our cost
                            % function. Generally, for minFunc to work, you
                            % need a function pointer with two outputs: the
                            % function value and the gradient. In our problem,
                            % sparseAutoencoderCost.m satisfies this.
options.maxIter = 400;      % Maximum number of iterations of L-BFGS to run
options.display = 'on';

[sae1OptTheta, cost] = minFunc( @(p) sparseAutoencoderCost(p, ...
                                   inputSize, hiddenSizeL1, ...
                                   lambda, sparsityParam, ...
                                   beta, trainData), ...
                              sae1Theta, options);
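Between this step and the next, the exercise feeds the training data through the trained first encoder to obtain the features on which the second autoencoder is trained. That call is not shown in these notes; in the starter code it is roughly the following (a reconstruction, not the author's exact code):

% Layer-1 features: activations of the first hidden layer on the training data.
[sae1Features] = feedForwardAutoencoder(sae1OptTheta, hiddenSizeL1, ...
                                        inputSize, trainData);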

(2) %% STEP 2: Train the second sparse autoencoder

options.Method = 'lbfgs';   % Here, we use L-BFGS to optimize our cost
                            % function. Generally, for minFunc to work, you
                            % need a function pointer with two outputs: the
                            % function value and the gradient. In our problem,
                            % sparseAutoencoderCost.m satisfies this.
options.maxIter = 400;      % Maximum number of iterations of L-BFGS to run
options.display = 'on';

[sae2OptTheta, cost2] = minFunc( @(p) sparseAutoencoderCost(p, ...
                                   hiddenSizeL1, hiddenSizeL2, ...
                                   lambda, sparsityParam, ...
                                   beta, sae1Features), ...
                              sae2Theta, options);
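Likewise, the features used to train the softmax classifier in the next step come from feeding the layer-1 features through the trained second encoder, again roughly as in the starter code:

% Layer-2 features: activations of the second hidden layer on the layer-1 features.
[sae2Features] = feedForwardAutoencoder(sae2OptTheta, hiddenSizeL2, ...
                                        hiddenSizeL1, sae1Features);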

(3) %% STEP 3: Train the softmax classifier

addpath minFunc/
options.Method = 'lbfgs';   % Here, we use L-BFGS to optimize our cost
                            % function. Generally, for minFunc to work, you
                            % need a function pointer with two outputs: the
                            % function value and the gradient. In our problem,
                            % softmaxCost.m satisfies this.
options.display = 'on';
lambda = 1e-4;

[saeSoftmaxOptTheta, cost3] = minFunc( @(p) softmaxCost(p, ...
                                   numClasses, hiddenSizeL2, lambda, ...
                                   sae2Features, trainLabels), ...
                              saeSoftmaxTheta, options);
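Between this step and the fine-tuning step, the exercise assembles the parameters of the full deep network (the two encoder layers plus the trained softmax layer). That part is not shown in these notes; the sketch below follows the template in stackedAEExercise.m from the starter code, so treat it as a reconstruction rather than the author's exact code:

% Build the "stack" of encoder weights from the two trained autoencoders.
% Each autoencoder's theta is laid out as [W1(:); W2(:); b1(:); b2(:)].
stack = cell(2, 1);
stack{1}.w = reshape(sae1OptTheta(1:hiddenSizeL1*inputSize), hiddenSizeL1, inputSize);
stack{1}.b = sae1OptTheta(2*hiddenSizeL1*inputSize+1:2*hiddenSizeL1*inputSize+hiddenSizeL1);
stack{2}.w = reshape(sae2OptTheta(1:hiddenSizeL2*hiddenSizeL1), hiddenSizeL2, hiddenSizeL1);
stack{2}.b = sae2OptTheta(2*hiddenSizeL2*hiddenSizeL1+1:2*hiddenSizeL2*hiddenSizeL1+hiddenSizeL2);

% Flatten the stack and prepend the softmax parameters; netconfig records
% the layer sizes so stackedAECost can rebuild the stack from the vector.
[stackparams, netconfig] = stack2params(stack);
stackedAETheta = [saeSoftmaxOptTheta; stackparams];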

(4) %% STEP 5: Finetune softmax model

addpath minFunc/
options.Method = 'lbfgs';   % Here, we use L-BFGS to optimize our cost
                            % function. Generally, for minFunc to work, you
                            % need a function pointer with two outputs: the
                            % function value and the gradient. In our problem,
                            % stackedAECost.m satisfies this.
options.display = 'on';

[stackedAEOptTheta, cost3] = minFunc( @(p) stackedAECost(p, ...
                                   inputSize, hiddenSizeL2, ...
                                   numClasses, netconfig, ...
                                   lambda, trainData, trainLabels), ...
                              stackedAETheta, options);

stackedAECost.m
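The snippet below is only the body that the exercise asks you to fill in. It relies on setup already present in the starter version of stackedAECost.m, roughly the following (paraphrased from the starter file; hiddenSize is the function argument, bound to hiddenSizeL2 by the caller):

% Unroll the parameter vector: the softmax weights come first, and the
% encoder stack follows and is rebuilt with params2stack.
softmaxTheta = reshape(theta(1:hiddenSize*numClasses), numClasses, hiddenSize);
stack = params2stack(theta(hiddenSize*numClasses+1:end), netconfig);
numCases = size(data, 2);
groundTruth = full(sparse(labels, 1:numCases, 1));   % numClasses x numCases indicator matrix
% sigmoid is defined at the bottom of the file: sigm = 1 ./ (1 + exp(-x));
% at the end, the gradients computed below are packed as:
%   grad = [softmaxThetaGrad(:); stack2params(stackgrad)];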

depth = numel(stack);
z = cell(depth+1, 1);                    % pre-activations of the input + hidden layers
a = cell(depth+1, 1);                    % activations of the input + hidden layers
a{1} = data;

for i = 1:depth                          % forward pass: compute z and a for each hidden layer
    z{i+1} = stack{i}.w * a{i} + repmat(stack{i}.b, 1, numCases);
    a{i+1} = sigmoid(z{i+1});
end

M = softmaxTheta * a{depth+1};           % softmax inputs (class scores)
M = bsxfun(@minus, M, max(M, [], 1));    % subtract the column-wise max for numerical stability
M = exp(M);
p = bsxfun(@rdivide, M, sum(M));         % class probabilities

cost = -1/numCases .* sum(groundTruth(:)' * log(p(:))) ...
       + lambda/2 * sum(softmaxTheta(:).^2);                            % cost function
softmaxThetaGrad = -1/numCases .* (groundTruth - p) * a{depth+1}' ...
       + lambda * softmaxTheta;                                         % gradient of the softmax parameters

delta = cell(depth+1);                   % error terms are only needed for the hidden layers
delta{depth+1} = -(softmaxTheta' * (groundTruth - p)) ...
                 .* a{depth+1} .* (1 - a{depth+1});                     % error of the last hidden layer

for layer = depth:-1:2                   % backpropagate the error; no sparsity penalty term is used here
    delta{layer} = (stack{layer}.w' * delta{layer+1}) .* a{layer} .* (1 - a{layer});
end

for layer = depth:-1:1                   % gradients of each hidden layer's W and b
    stackgrad{layer}.w = delta{layer+1} * a{layer}' ./ numCases;
    stackgrad{layer}.b = sum(delta{layer+1}, 2) ./ numCases;
end
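As a sanity check (my own addition, not part of the original notes), the analytic gradient above can be compared with a numerical estimate on a small subset of the data before running the full fine-tuning. This assumes computeNumericalGradient.m from the earlier sparse-autoencoder exercise is on the path; it is slow unless the network sizes are reduced for the check:

checkData   = trainData(:, 1:10);                    % a handful of examples keeps the check cheap
checkLabels = trainLabels(1:10);
costFun = @(p) stackedAECost(p, inputSize, hiddenSizeL2, numClasses, ...
                             netconfig, lambda, checkData, checkLabels);
[~, grad] = stackedAECost(stackedAETheta, inputSize, hiddenSizeL2, numClasses, ...
                          netconfig, lambda, checkData, checkLabels);
numGrad = computeNumericalGradient(costFun, stackedAETheta);
disp(norm(numGrad - grad) / norm(numGrad + grad));   % should be tiny, e.g. around 1e-9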


(5) %% STEP 6: Test

numCases = size(data, 2);
depth = numel(stack);
z = cell(depth+1, 1);                    % pre-activations of the input + hidden layers
a = cell(depth+1, 1);                    % activations of the input + hidden layers
a{1} = data;

for i = 1:depth                          % forward pass through the hidden layers
    z{i+1} = stack{i}.w * a{i} + repmat(stack{i}.b, 1, numCases);
    a{i+1} = sigmoid(z{i+1});
end

[~, pred] = max(softmaxTheta * a{depth+1});   % the predicted class is the row with the largest score
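For reference, the accuracy figures quoted below come from running this prediction code twice, once with the pre-trained parameters and once with the fine-tuned ones, roughly as in the starter script stackedAEExercise.m (a reconstruction, not the author's code):

% Accuracy before fine-tuning (pre-trained parameters only).
[pred] = stackedAEPredict(stackedAETheta, inputSize, hiddenSizeL2, ...
                          numClasses, netconfig, testData);
acc = mean(testLabels(:) == pred(:));
fprintf('Before Finetuning Test Accuracy: %0.3f%%\n', acc * 100);

% Accuracy after fine-tuning the whole stack.
[pred] = stackedAEPredict(stackedAEOptTheta, inputSize, hiddenSizeL2, ...
                          numClasses, netconfig, testData);
acc = mean(testLabels(:) == pred(:));
fprintf('After Finetuning Test Accuracy: %0.3f%%\n', acc * 100);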

In the end I got the following results: before fine-tuning, test accuracy 92.150%; after fine-tuning, test accuracy 96.680%, which differs slightly from the reference figures for the exercise.

