UFLDL Learning Notes and Programming Assignments: Multi-Layer Neural Networks (Multilayer Neural Networks + Handwriting Recognition Exercise)




UFLDL has released a new tutorial, and I find it better than the old one: it starts from the basics, is systematic and clearly organized, and comes with programming exercises.

In a deep learning discussion group, I heard some experienced members say that you don't need to dig into other machine learning algorithms first; you can start learning DL directly.

So I recently started working through it. The tutorial combined with MATLAB programming exercises is a great fit.

The address of the new tutorial is: http://ufldl.stanford.edu/tutorial/


The page for this section: http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/


The general procedure for training a neural network:

1. Forward propagation: compute the activation values of each layer and the total cost.

For the hidden layers, each activation is basically a weighted sum of the previous layer's activations plus a bias, passed through an activation function such as the sigmoid.

The values of the output layer are perhaps better thought of as the hypothesis than as activation values. Taking softmax as an example, the activations of the previous layer serve as the feature input x, the weights W play the role of the parameters theta, and h is computed according to the softmax formula.

2. Backpropagation.

First compute the residual (error term) of the output layer; it follows directly from differentiating the loss function.

The gradients of W and b at layer l are obtained from the residual of layer l+1 and the activations of layer l.

The residual of layer l is obtained from the residual of layer l+1, the weights W of layer l, and the derivative of layer l's activation function.

3. Add weight decay to prevent overfitting; the cost and the gradients must be adjusted accordingly. The equations for all three steps are summarized below.
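
For reference, here is a compact summary of the equations behind the three steps, written in the UFLDL notation (superscript l indexes layers, f is the sigmoid, n_l is the output layer, and y is the one-hot label vector). This is only a restatement of the tutorial's formulas for convenience:

z^{(l+1)} = W^{(l)} a^{(l)} + b^{(l)}, \qquad a^{(l+1)} = f\bigl(z^{(l+1)}\bigr), \qquad a^{(1)} = x
\hat{y}_k = P(y = k \mid x) = \frac{\exp\bigl(z^{(n_l)}_k\bigr)}{\sum_j \exp\bigl(z^{(n_l)}_j\bigr)}
\delta^{(n_l)} = \hat{y} - y
\delta^{(l)} = \bigl((W^{(l)})^{\top} \delta^{(l+1)}\bigr) \odot f'\bigl(z^{(l)}\bigr)
\nabla_{W^{(l)}} J = \delta^{(l+1)} \bigl(a^{(l)}\bigr)^{\top} + \lambda W^{(l)}, \qquad \nabla_{b^{(l)}} J = \delta^{(l+1)}
J = J_{\mathrm{CE}} + \frac{\lambda}{2} \sum_{l} \bigl\lVert W^{(l)} \bigr\rVert_F^{2}

For the sigmoid, f'(z^{(l)}) = a^{(l)} \odot (1 - a^{(l)}), which is exactly the factor used in the code below.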


Here is the code for supervised_dnn_cost.m:

function [cost, grad, pred_prob] = supervised_dnn_cost(theta, ei, data, labels, pred_only)
%SPNETCOSTSLAVE Slave cost function for simple phone net
%   Does all the work of cost / gradient computation
%   Returns cost broken into cross-entropy, weight norm, and prox reg
%       components (ceCost, wCost, pCost)

%% default values
po = false;
if exist('pred_only', 'var')
    po = pred_only;
end;

%% reshape into network
numHidden = numel(ei.layer_sizes) - 1;
numSamples = size(data, 2);
hAct = cell(numHidden + 1, 1);
gradStack = cell(numHidden + 1, 1);
stack = params2stack(theta, ei);

%% forward prop
%%% YOUR CODE HERE %%%
for l = 1:numHidden                      % hidden layer activations
    if (l == 1)
        z = stack{l}.W * data;
    else
        z = stack{l}.W * hAct{l-1};
    end
    z = bsxfun(@plus, z, stack{l}.b);
    hAct{l} = sigmoid(z);
end
% output layer (softmax)
h = stack{numHidden+1}.W * hAct{numHidden};
h = bsxfun(@plus, h, stack{numHidden+1}.b);
e = exp(h);
pred_prob = bsxfun(@rdivide, e, sum(e, 1));   % table of class probabilities
hAct{numHidden+1} = pred_prob;
%[~, pred_labels] = max(pred_prob, [], 1);

%% return here if only predictions desired.
if po
    cost = -1; ceCost = -1; wCost = -1; numCorrect = -1;
    grad = [];
    return;
end;

%% compute cost
% output layer softmax cost
%%% YOUR CODE HERE %%%
ceCost = 0;
c = log(pred_prob);
%fprintf('%d,%d\n', size(labels,1), size(labels,2));   % 60000,1
% Linear indices into c: the row is given by labels, the column by 1:size(c,2).
I = sub2ind(size(c), labels', 1:size(c, 2));
values = c(I);
ceCost = -sum(values);

%% compute gradients using backpropagation
%%% YOUR CODE HERE %%%
% cross-entropy gradient
%d = full(sparse(labels, 1:size(c,2), 1));
d = zeros(size(pred_prob));
d(I) = 1;
error = (pred_prob - d);                 % residual of the output layer
% propagate the residual backwards through the layers
for l = numHidden+1 : -1 : 1
    gradStack{l}.b = sum(error, 2);
    if (l == 1)
        gradStack{l}.W = error * data';
        break;                           % l == 1 is the first hidden layer; no further propagation needed
    else
        gradStack{l}.W = error * hAct{l-1}';
    end
    % the last factor is the derivative of the sigmoid activation
    error = (stack{l}.W)' * error .* hAct{l-1} .* (1 - hAct{l-1});
end

%% compute weight penalty cost and gradient for non-bias terms
%%% YOUR CODE HERE %%%
wCost = 0;
for l = 1:numHidden+1
    wCost = wCost + .5 * ei.lambda * sum(stack{l}.W(:) .^ 2);   % sum of squared weights over all layers
end
cost = ceCost + wCost;

% gradient of the weight decay term
% (note: wCost above includes the softmax layer's weights, but no weight-decay
%  gradient is added for that layer below, so a gradient check will show a
%  small mismatch on the softmax weights)
for l = numHidden : -1 : 1
    gradStack{l}.W = gradStack{l}.W + ei.lambda * stack{l}.W;   % no weight decay on the softmax layer
end

%% reshape gradients into vector
[grad] = stack2params(gradStack);
end
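
A quick way to sanity-check the backpropagation above is to compare the analytic gradient with a numerical finite-difference estimate. The following is only a minimal sketch, not part of the exercise's starter code; it assumes params, ei, data and labels have already been set up the way run_train.m sets them up, and that data/labels hold only a small batch of examples (otherwise this is very slow):

% Numerical gradient check for supervised_dnn_cost (sketch; assumes params,
% ei, data, labels are set up as in run_train.m, with data/labels a small batch).
[~, grad] = supervised_dnn_cost(params, ei, data, labels);
epsilon = 1e-4;
for k = 1:10                                     % spot-check 10 random coordinates
    i = randi(numel(params));
    e_i = zeros(size(params));
    e_i(i) = epsilon;
    Jp = supervised_dnn_cost(params + e_i, ei, data, labels);
    Jm = supervised_dnn_cost(params - e_i, ei, data, labels);
    numgrad = (Jp - Jm) / (2 * epsilon);
    fprintf('theta(%d): analytic %f, numeric %f\n', i, grad(i), numgrad);
end

As noted in the code comments, the softmax layer's weights will show a small mismatch in this check, because the weight-decay gradient is not added for that layer.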


The original training set has 60,000 examples, which takes a while to train, so I modified run_train.m to use only the first 10,000 examples.

Of course, this reduces accuracy somewhat.
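
The change might look like the following (a sketch; the variable names data_train and labels_train are assumptions based on the starter code and may differ in your copy of run_train.m):

% Keep only the first 10,000 training examples (sketch; variable names
% data_train / labels_train are assumed from the starter code).
data_train   = data_train(:, 1:10000);
labels_train = labels_train(1:10000);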




This article is by linger.

Original link: http://blog.csdn.net/lingerlanlan/article/details/38464317


