"Deeplearning" Exercise:learning color features with Sparse autoencoders

Source: Internet
Author: User

Exercise: Learning color features with Sparse Autoencoders

Exercise link: Exercise: Learning color features with Sparse Autoencoders

sparseAutoencoderLinearCost.m

function [cost, grad] = sparseAutoencoderLinearCost(theta, visibleSize, hiddenSize, ...
                                                    lambda, sparsityParam, beta, data)
% visibleSize:   the number of input units
% hiddenSize:    the number of hidden units
% lambda:        weight decay parameter
% sparsityParam: the desired average activation for the hidden units (denoted in the
%                lecture notes by the Greek letter rho, which looks like a lower-case "p")
% beta:          weight of the sparsity penalty term
% data:          our 64x10000 matrix containing the training data,
%                so data(:, i) is the i-th training example.

% The input theta is a vector (because minFunc expects the parameters to be a vector).
% We first convert theta to the (W1, W2, b1, b2) matrix/vector format, so that this
% follows the notation convention of the lecture notes.

% W1(i,j) denotes the weight from the j-th node in the input layer to the i-th node
% in the hidden layer; thus it is a hiddenSize*visibleSize matrix.
W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
% W2(i,j) denotes the weight from the j-th node in the hidden layer to the i-th node
% in the output layer; thus it is a visibleSize*hiddenSize matrix.
W2 = reshape(theta(hiddenSize*visibleSize+1:2*hiddenSize*visibleSize), visibleSize, hiddenSize);
% b1(i) denotes the bias going into the i-th node of the hidden layer;
% thus it is a hiddenSize*1 vector.
b1 = theta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
% b2(i) denotes the bias going into the i-th node of the output layer;
% thus it is a visibleSize*1 vector.
b2 = theta(2*hiddenSize*visibleSize+hiddenSize+1:end);

%% ---------- YOUR CODE HERE --------------------------------------
% Instructions: Compute the cost/optimization objective J_sparse(W,b) for the
% sparse autoencoder, and the corresponding gradients W1grad, W2grad, b1grad, b2grad.
%
% W1grad, W2grad, b1grad and b2grad should be computed using backpropagation.
% Note that W1grad has the same dimensions as W1, b1grad has the same dimensions
% as b1, etc.  Your code should set W1grad to be the partial derivative of
% J_sparse(W,b) with respect to W1, i.e. W1grad(i,j) should be the partial
% derivative of J_sparse(W,b) with respect to the input parameter W1(i,j).  Thus,
% W1grad should equal the term [(1/m) \Delta W^{(1)} + \lambda W^{(1)}] in the last
% block of pseudo-code in Section 2.2 of the lecture notes (and similarly for
% W2grad, b1grad, b2grad).
%
% Stated differently, if we were using batch gradient descent to optimize the
% parameters, the gradient descent update to W1 would be W1 := W1 - alpha * W1grad,
% and similarly for W2, b1, b2.

% 1. Set \Delta W^{(l)} and \Delta b^{(l)} to 0 for all layers l.
%    Cost and gradient variables (your code needs to compute these values);
%    here we initialize them to zeros.
W1grad = zeros(size(W1));
W2grad = zeros(size(W2));
b1grad = zeros(size(b1));
b2grad = zeros(size(b2));

m = size(data, 2);   % for small data, keep the activations around while computing rho

% 2a. Use backpropagation to compute the derivatives of J_sparse(W,b;x,y)
%     with respect to W^{(l)} and b^{(l)}.

% 2a.1. Perform a feedforward pass, computing the activations of the hidden
%       layer and the output layer.
z2 = W1*data + repmat(b1, 1, m);   % z2 is a hiddenSize*m matrix
a2 = sigmoid(z2);                  % a2 is a hiddenSize*m matrix
z3 = W2*a2 + repmat(b2, 1, m);     % z3 is a visibleSize*m matrix
a3 = z3;                           % a3 is a visibleSize*m matrix (linear decoder)

% rho is a hiddenSize*1 vector of average hidden-unit activations
rho = sum(a2, 2) ./ m;
% KLterm is a hiddenSize*1 vector (derivative of the sparsity penalty)
KLterm = beta * (-sparsityParam./rho + (1-sparsityParam)./(1-rho));

% Accumulate the (not yet averaged) squared-error cost
cost = 1/2 * sum(sum((data - a3) .* (data - a3)));

% 2a.2. For the output layer, set delta3 (a visibleSize*m matrix).
delta3 = -(data - a3);
% 2a.3. For the hidden layer, set delta2 (a hiddenSize*m matrix).
delta2 = (W2'*delta3 + repmat(KLterm, 1, m)) .* sigmoidDiff(z2);

% 2a.4. Compute the desired partial derivatives.
JW1diff = delta2 * data';   % JW1diff is a hiddenSize*visibleSize matrix
Jb1diff = delta2;           % Jb1diff is a hiddenSize*m matrix
JW2diff = delta3 * a2';     % JW2diff is a visibleSize*hiddenSize matrix
Jb2diff = delta3;           % Jb2diff is a visibleSize*m matrix

% 2b. Update \Delta W^{(l)}
W1grad = W1grad + JW1diff;
W2grad = W2grad + JW2diff;
% 2c. Update \Delta b^{(l)}
b1grad = b1grad + sum(Jb1diff, 2);
b2grad = b2grad + sum(Jb2diff, 2);

% Compute the KL sparsity penalty term
KLpen = beta * sum(sparsityParam*log(sparsityParam./rho) + ...
                   (1-sparsityParam)*log((1-sparsityParam)./(1-rho)));

% Compute the weight decay term
tempW1 = W1 .* W1;
tempW2 = W2 .* W2;
WD = (lambda/2) * (sum(sum(tempW1)) + sum(sum(tempW2)));

cost = cost./m + WD + KLpen;
W1grad = W1grad./m + lambda.*W1;
W2grad = W2grad./m + lambda.*W2;
b1grad = b1grad./m;
b2grad = b2grad./m;

%-------------------------------------------------------------------
% 3. Update the parameters.  After computing the cost and gradient, we'll convert
%    the gradients back to a vector format (suitable for minFunc).  Specifically,
%    we'll unroll the gradient matrices into a vector.
grad = [W1grad(:); W2grad(:); b1grad(:); b2grad(:)];

end

%-------------------------------------------------------------------
% Here is an implementation of the sigmoid function, which you may find useful in
% your computation of the costs and the gradients.  It takes a (row or column)
% vector, say (z1, z2, z3), and returns (f(z1), f(z2), f(z3)).
function sigm = sigmoid(x)
    sigm = 1 ./ (1 + exp(-x));
end

% The derivative of the sigmoid function
function sigmDiff = sigmoidDiff(x)
    sigmDiff = sigmoid(x) .* (1 - sigmoid(x));
end
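As a quick sanity check on the cost and gradient above, the sketch below runs a centered finite-difference comparison on a tiny random problem. It is not part of the original post: it uses only base MATLAB (no UFLDL starter files such as computeNumericalGradient.m or minFunc), and the sizes and hyperparameters are made up purely for illustration.

% Minimal gradient check for sparseAutoencoderLinearCost (illustrative values only)
visibleSize   = 8;        % small sizes so the check runs quickly
hiddenSize    = 5;
lambda        = 3e-3;
sparsityParam = 0.035;
beta          = 5;
data  = rand(visibleSize, 10);                       % 10 random "patches"
theta = 0.01 * randn(2*hiddenSize*visibleSize + hiddenSize + visibleSize, 1);

% Analytic gradient from the function above
[~, grad] = sparseAutoencoderLinearCost(theta, visibleSize, hiddenSize, ...
                                        lambda, sparsityParam, beta, data);

% Centered finite differences: numGrad(i) ~ (J(theta+eps*e_i) - J(theta-eps*e_i)) / (2*eps)
epsilon = 1e-4;
numGrad = zeros(size(theta));
for i = 1:numel(theta)
    e = zeros(size(theta));
    e(i) = epsilon;
    costPlus  = sparseAutoencoderLinearCost(theta + e, visibleSize, hiddenSize, ...
                                            lambda, sparsityParam, beta, data);
    costMinus = sparseAutoencoderLinearCost(theta - e, visibleSize, hiddenSize, ...
                                            lambda, sparsityParam, beta, data);
    numGrad(i) = (costPlus - costMinus) / (2*epsilon);
end

% Small relative difference means the analytic gradient matches the numerical one
relDiff = norm(numGrad - grad) / norm(numGrad + grad);
fprintf('Relative difference between numerical and analytic gradient: %g\n', relDiff);

If the implementation is correct, the reported relative difference is typically on the order of 1e-9 or smaller.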

Results:

If the output you get looks like this instead, you have probably written a3 = sigmoid(z3) where it should be a3 = z3: the linear decoder uses the identity activation on the output layer.
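To make that difference concrete, here is a minimal sketch (not from the original post) with made-up toy values. A linear output layer can reproduce targets outside [0, 1], while a sigmoid output cannot, which matters here because the ZCA-whitened color patches are not confined to [0, 1]; the delta3 lines also show the extra f'(z3) factor that a sigmoid output would introduce.

% Toy comparison of linear vs. sigmoid output layers (illustrative values only)
sigmoid = @(x) 1 ./ (1 + exp(-x));

z3   = [-1.5; 0.2; 2.0];        % pre-activations of the output layer (toy values)
data = [-1.5; 0.2; 2.0];        % targets; whitened patches can lie outside [0, 1]

% Linear decoder (what this exercise expects): f(z) = z, so f'(z) = 1
a3_linear     = z3;
delta3_linear = -(data - a3_linear);            % exactly zero here: perfect reconstruction

% Sigmoid decoder (the usual sparse autoencoder; wrong for whitened color patches)
a3_sigmoid     = sigmoid(z3);                   % squashed into (0, 1)
delta3_sigmoid = -(data - a3_sigmoid) .* a3_sigmoid .* (1 - a3_sigmoid);

disp([a3_linear a3_sigmoid])    % sigmoid outputs cannot match targets below 0 or above 1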

"Deeplearning" Exercise:learning color features with Sparse autoencoders
