"UFLDL" exercise:convolutional neural Network

The exercise asks you to implement the forward pass, cost, error (delta), and gradient computations of a CNN. The key is to understand what each of these four steps does in every layer, and to make full use of MATLAB matrix operations. The overall process is roughly summarized below.
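
As a rough map before diving in, here is a minimal sketch of how the four computations fit together inside the cost function, cnnCost (the names are the ones used in this post; the starter code's exact signature and variable names may differ slightly):

% Sketch of cnnCost (names as used in this post; the starter code may differ):
%   1a. activations       = sigmoid(conv(images, Wc) + bc)        -> cnnConvolve
%       activationsPooled = meanPool(activations, poolDim)        -> cnnPool, then reshape
%       probs             = softmax(Wd * activationsPooled + bd)
%   1b. cost = crossEntropy(probs, labels)/numImages + lambda/2*(||Wc||^2 + ||Wd||^2)
%   1c. DeltaSoftmax -> DeltaPool -> DeltaUnpool -> DeltaConv
%   1d. Wd_grad, bd_grad from DeltaSoftmax; Wc_grad, bc_grad from DeltaConv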

STEP 1: Implement CNN Objective

STEP 1a: Forward Propagation

Forward propagation computes the network's output for each input image. The network has three layers: convolution -> pooling -> softmax (densely connected). The convolution layer convolves each image with all of the filters, the pooling layer downsamples the feature maps produced by the convolution layer, and the softmax layer predicts the image's class from the pooled features. The convolution and pooling operations were already implemented in earlier exercises, so the forward pass only needs to call them and then apply the softmax. The code is as follows:

%%% YOUR CODE HERE %%%
% Call the previously implemented functions: sigmoid(Wx + b), then mean pooling.
activations       = cnnConvolve(filterDim, numFilters, images, Wc, bc);
activationsPooled = cnnPool(poolDim, activations);

% Reshape activations into a 2-d matrix, hiddenSize x numImages, for the softmax
% layer: flatten each outputDim*outputDim*numFilters volume into a single column.
activationsPooled = reshape(activationsPooled, [], numImages);

%% Softmax Layer
%  Forward propagate the pooled activations calculated above into a
%  standard softmax layer. For your convenience we have reshaped
%  activationsPooled into a hiddenSize x numImages matrix. Store the
%  results in probs.
%  probs is numClasses x numImages, holding the probability that each
%  image belongs to each class.
probs = zeros(numClasses, numImages);

%%% YOUR CODE HERE %%%
h     = exp(bsxfun(@plus, Wd * activationsPooled, bd));
probs = bsxfun(@rdivide, h, sum(h, 1));
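
A quick way to keep the reshapes straight is to write the dimensions out; the numbers below assume the exercise's default configuration (imageDim = 28, filterDim = 9, numFilters = 20, poolDim = 2), and will differ if you changed cnnTrain:

convDim    = imageDim - filterDim + 1;    % 28 - 9 + 1 = 20
outputDim  = convDim / poolDim;           % 20 / 2 = 10
hiddenSize = outputDim^2 * numFilters;    % 10 * 10 * 20 = 2000
% activations:       convDim x convDim x numFilters x numImages
% activationsPooled: outputDim x outputDim x numFilters x numImages,
%                    then reshaped to hiddenSize x numImages
% probs:             numClasses x numImages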

STEP 1b: Calculate Cost

The objective function minimized by gradient descent has two parts: the error term measuring the difference between the classifier's output and the true labels, and the regularization term on the weights. The first part is the same cross-entropy loss used in softmax regression; the second part is the sum of squares of all entries of Wc and Wd. Dividing the first term by the total number of images is important: at first I did not divide, and the algorithm diverged, probably because the error term was so large that the regularization term was effectively ignored.
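
Written out, with m the number of images and λ the weight-decay parameter (this matches the code below, where the cross-entropy term is divided by numImages):

J = -\frac{1}{m} \sum_{i=1}^{m} \log p\left(y^{(i)} \mid x^{(i)}\right) + \frac{\lambda}{2}\left(\|W_c\|_F^2 + \|W_d\|_F^2\right)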

Code:

%%% YOUR CODE HERE %%%
logp   = log(probs);
% Pick out the log-probability of the correct class for every image.
index  = sub2ind(size(logp), labels', 1:size(probs, 2));
ceCost = -sum(logp(index));
wCost  = lambda/2 * (sum(Wd(:).^2) + sum(Wc(:).^2));
cost   = ceCost/numImages + wCost;
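
The sub2ind call simply converts (class, image) pairs into linear indices so the correct-class log-probabilities can be picked out in one shot. A tiny hypothetical example with 3 classes and 2 images:

probs  = [0.7 0.2; 0.2 0.5; 0.1 0.3];          % numClasses x numImages
labels = [1; 3];                                % image 1 is class 1, image 2 is class 3
idx    = sub2ind(size(probs), labels', 1:2);    % linear indices of probs(1,1) and probs(3,2)
picked = probs(idx);                            % [0.7 0.3]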

STEP 1c: Backpropagation

The backpropagation algorithm first computes each layer's error term δ, i.e. its contribution to the final error.

Softmax layer: this layer's error is the easiest to compute; it is simply the network output probs minus the ground-truth indicator matrix:

output        = zeros(size(probs));
output(index) = 1;                       % one-hot ground-truth matrix
DeltaSoftmax  = probs - output;
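
An equivalent way to build the one-hot matrix, if you prefer not to reuse index from the cost step, is the usual sparse/full idiom (assuming labels is a numImages x 1 vector of class ids in 1..numClasses):

groundTruth  = full(sparse(labels, 1:numImages, 1, numClasses, numImages));
DeltaSoftmax = probs - groundTruth;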

Pooling layer: the error of this layer follows the usual formula δ^(l) = ((W^(l))^T δ^(l+1)) .* f'(z^(l)), except that the pooling layer has no f'(z^(l)) factor since it applies no nonlinearity. This first gives a hiddenSize x numImages matrix, which the reshape function restores to an outputDim*outputDim*numFilters*numImages array. During pooling, each pooling-layer node took its input from a poolDim x poolDim block of convolution-layer outputs (2 x 2 nodes when poolDim = 2).
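
For reference, the general backpropagation recurrence being used here (the pooling layer simply drops the f' factor because it applies no nonlinearity):

\delta^{(l)} = \left( (W^{(l)})^{T} \delta^{(l+1)} \right) \circ f'\left(z^{(l)}\right)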

So each pooling-layer node spreads its error evenly over the poolDim x poolDim block of convolution-layer nodes it pooled over (evenly, because mean pooling was used). UFLDL suggests using the kron function to implement this: the error of one outputDim x outputDim feature map in the pooling layer is upsampled into the error of the corresponding convDim x convDim feature map in the convolution layer (e.g. a 2 x 2 pooled error map becomes a 4 x 4 convolution-layer error map when poolDim = 2). The code is as follows:

DeltaPool   = reshape(Wd' * DeltaSoftmax, outputDim, outputDim, numFilters, numImages);
DeltaUnpool = zeros(convDim, convDim, numFilters, numImages);
for imNum = 1:numImages
    for filterNum = 1:numFilters
        unpool = DeltaPool(:, :, filterNum, imNum);
        % Spread each pooled error evenly over its poolDim x poolDim block.
        DeltaUnpool(:, :, filterNum, imNum) = kron(unpool, ones(poolDim)) ./ (poolDim^2);
    end
end
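
As a tiny sanity check of the kron upsampling (a hypothetical 2 x 2 pooled error map with poolDim = 2):

d = [1 2; 3 4];
kron(d, ones(2)) ./ 4
% ans =
%     0.25    0.25    0.50    0.50
%     0.25    0.25    0.50    0.50
%     0.75    0.75    1.00    1.00
%     0.75    0.75    1.00    1.00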

Convolution layer: again follow the formula δ^(l) = ((W^(l))^T δ^(l+1)) .* f'(z^(l)); the weight propagation has already been handled by the unpooling step, so only the sigmoid derivative remains:

DeltaConv = DeltaUnpool .* activations .* (1 - activations);

STEP 1d: Gradient Calculation

The whole CNN has three layers, convolution -> pooling -> softmax (densely connected), but only the convolution and softmax layers have weights: Wc, bc and Wd, bd respectively. We therefore compute the derivatives of the objective J with respect to these parameters, which gradient descent then uses to update W and b.

Gradient calculations for Wd and bd:

According to the following two formulas:
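
\nabla_{W^{(l)}} J = \delta^{(l)} \left(a^{(l-1)}\right)^{T} \qquad \nabla_{b^{(l)}} J = \sum_{i=1}^{m} \delta^{(l,i)}

(These are the standard gradient formulas from the tutorial, with i ranging over the training images; in the code below the result is additionally averaged over the number of images, and the weight-decay term λ·Wd is added to the Wd gradient.)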

Here a^(l-1) is the pooling layer's activation (output), activationsPooled, and δ^(l) is this layer's error, DeltaSoftmax. The code is as follows:

Wd_grad = (1./numImages) .* DeltaSoftmax * activationsPooled' + lambda * Wd;
bd_grad = (1./numImages) .* sum(DeltaSoftmax, 2);

Gradient calculations for Wc and bc:

These follow the same two gradient formulas, but with one complication: layer l-1 is now the input layer, so a^(l-1) is the input image itself. We therefore have to loop over the images one by one and use the convolution-based formula given in UFLDL to compute the corresponding gradient.

For convenience, every slice of DeltaConv is first rotated by 180 degrees (conv2 flips its kernel, so rotating the delta turns the convolution into the cross-correlation the gradient formula requires), and then the gradients are accumulated inside the for loops:

%%% YOUR CODE HERE %%%
bc_grad = zeros(size(bc));
Wc_grad = zeros(filterDim, filterDim, numFilters);

% Bias gradient: sum each filter's error over all positions and images.
for filterNum = 1:numFilters
    e = DeltaConv(:, :, filterNum, :);
    bc_grad(filterNum) = (1./numImages) .* sum(e(:));
end

% Rotate all DeltaConv slices by 180 degrees so conv2 computes a cross-correlation.
for filterNum = 1:numFilters
    for imNum = 1:numImages
        DeltaConv(:, :, filterNum, imNum) = rot90(DeltaConv(:, :, filterNum, imNum), 2);
    end
end

for filterNum = 1:numFilters
    for imNum = 1:numImages
        Wc_grad(:, :, filterNum) = Wc_grad(:, :, filterNum) + ...
            conv2(images(:, :, imNum), DeltaConv(:, :, filterNum, imNum), 'valid');
    end
end
Wc_grad = (1./numImages) .* Wc_grad + lambda * Wc;
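
A quick dimension check of the conv2 call (assuming the default sizes): a 28 x 28 image convolved with a rotated 20 x 20 delta map in 'valid' mode yields 28 - 20 + 1 = 9, i.e. exactly a filterDim x filterDim slice, matching Wc_grad:

size(conv2(zeros(28), zeros(20), 'valid'))   % = [9 9] = [filterDim filterDim]
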
Step 2: Gradient Check

Oddly enough, even when my gradient descent clearly would not converge, this step still passed. =. =

Step 3: Learn Parameters

This step is relatively simple: following the UFLDL explanation of stochastic gradient descent, just add the momentum ("impulse") update in minFuncSGD:

%%% YOUR CODE HERE %%%
velocity = mom * velocity + alpha * grad;
theta    = theta - velocity;
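
For context, a rough sketch of how this update sits inside the SGD loop (assuming variables named it, momIncrease, momentum, and alpha as in the starter code; the actual minFuncSGD scaffolding may differ):

if it == momIncrease
    mom = momentum;           % switch from the small initial momentum to the full value
end
velocity = mom * velocity + alpha * grad;
theta    = theta - velocity;
% the learning rate alpha is also typically reduced (e.g. halved) after each epoch
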
Step 4: Test

Running cnnTrain, the final accuracy reaches 97%+.

That is the whole UFLDL CNN implementation. The most important thing is to figure out what each layer has to do in each of the four steps, which is what I tried to summarize at the beginning of the article. The biggest lesson MATLAB taught me is dimension matching: sometimes you know what the formula looks like, but you still have to think about the matrices' dimensions, since only matrices whose dimensions match can be multiplied or added. The upside is that when you do not know how to write a line of code, working out which dimensions the result must have often tells you. And CNNs are really hard to debug; when something goes wrong I often have no idea where the problem is. =. =

The full code is on my GitHub.

References:

[1] http://ufldl.stanford.edu/tutorial/supervised/ExerciseConvolutionalNeuralNetwork/

[2] http://blog.csdn.net/lingerlanlan/article/details/41390443

"UFLDL" exercise:convolutional neural Network

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.