I recently wanted to study deep learning, so I started from the beginning with the UFLDL (Unsupervised Feature Learning and Deep Learning) tutorial. I am posting my answers to the exercises here as notes.
Notes:
1: The autoencoder is an unsupervised learning algorithm that learns h_{W,b}(x) ≈ x, so the output layer has the same number of units as the input layer, while the middle hidden layer has fewer units; this forces the network to learn a compressed representation of the input, similar in spirit to PCA.
2: Visualizing the autoencoder. The exercise asks to visualize W1, i.e. the parameters that need to be learned. At first I did not understand this: since the input is the pixels of an image, each hidden unit computes something like a1(2) = f(w11*x1 + w12*x2 + w13*x3 + ...). I still don't fully understand it, so I will keep it in mind as I study further.
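For what it's worth, the tutorial's "Visualizing a Trained Autoencoder" section explains this: under a unit-norm constraint on the input, the image that maximally activates hidden unit i is proportional to the i-th row of W1, so each row of W1 (rescaled) can be displayed as an 8x8 feature image. A NumPy sketch of that idea, using a random W1 as a stand-in for learned weights:

```python
import numpy as np

# Stand-in for learned weights: 25 hidden units, 8x8 = 64 pixel inputs,
# matching the exercise's shapes. The values here are random placeholders.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((25, 64))

def max_activation_input(W1, unit):
    """Unit-norm pixel input that maximally activates one hidden unit.

    For a_i = f(sum_j W1[i,j]*x_j + b), the norm-constrained maximizer is
    x_j = W1[i,j] / sqrt(sum_j W1[i,j]^2), i.e. the normalized row of W1.
    """
    row = W1[unit]
    return row / np.linalg.norm(row)

# The feature image that hidden unit 0 responds to most strongly.
img = max_activation_input(W1, 0).reshape(8, 8)
```

So the visualization is not mysterious: each tile in the exercise's output figure is just one row of W1 reshaped into a patch.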
Exercise answers:
1: Sparse autoencoder
Step 1: In sampleIMAGES.m, fill in the code that generates the training set; tic and toc are used to time it.
tic
image_size = size(IMAGES);
i = randi(image_size(1)-patchsize+1, 1, numpatches);  % 1*10000 random numbers in [1, image_size(1)-patchsize+1]
j = randi(image_size(2)-patchsize+1, 1, numpatches);
k = randi(image_size(3), 1, numpatches);              % randomly pick one of the images, 10000 times
for num = 1:numpatches
    patches(:,num) = reshape(IMAGES(i(num):i(num)+patchsize-1, j(num):j(num)+patchsize-1, k(num)), 1, patchsize*patchsize);
end
toc
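For readers more comfortable with NumPy, here is a sketch of the same patch sampling. The IMAGES array below is a random stand-in for the exercise's IMAGES.mat data; patchsize and numpatches match the exercise:

```python
import numpy as np

# Stand-in for the exercise's 512x512x10 natural-image stack.
rng = np.random.default_rng(0)
IMAGES = rng.random((512, 512, 10))
patchsize, numpatches = 8, 10000

# Random top-left corners and image indices, mirroring the randi calls above
# (NumPy integers() uses a half-open [low, high) interval, hence no +1).
i = rng.integers(0, IMAGES.shape[0] - patchsize + 1, numpatches)
j = rng.integers(0, IMAGES.shape[1] - patchsize + 1, numpatches)
k = rng.integers(0, IMAGES.shape[2], numpatches)

patches = np.empty((patchsize * patchsize, numpatches))
for n in range(numpatches):
    patch = IMAGES[i[n]:i[n]+patchsize, j[n]:j[n]+patchsize, k[n]]
    patches[:, n] = patch.reshape(-1)   # one flattened 8x8 patch per column
```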
Step 2: In sparseAutoencoderCost.m, complete the forward propagation, back propagation, and related code.
%1. forward propagation
data_size = size(data);                                    % data is 64*10000
active_value2 = repmat(b1, 1, data_size(2));               % expand b1 to 25*10000 (one column per sample)
active_value3 = repmat(b2, 1, data_size(2));               % expand b2 to 64*10000
active_value2 = sigmoid(W1*data + active_value2);          % hidden activations, 25*10000, one column per sample
active_value3 = sigmoid(W2*active_value2 + active_value3); % output activations, 64*10000, one column per sample
%2. computing error term and cost
ave_square = sum(sum((active_value3-data).^2)./2)/data_size(2);  % first cost term: average squared reconstruction error
weight_decay = lambda/2*(sum(sum(W1.^2))+sum(sum(W2.^2)));       % second cost term: squared sum of all weights (weight decay)
p_real = sum(active_value2,2)./data_size(2);                     % estimated average activation rho-hat, 25-dimensional
p_para = repmat(sparsityParam, hiddenSize, 1);                   % target sparsity rho
sparsity = beta.*sum(p_para.*log(p_para./p_real)+(1-p_para).*log((1-p_para)./(1-p_real)));  % KL divergence
cost = ave_square + weight_decay + sparsity;                     % final cost function
delta3 = (active_value3-data).*(active_value3).*(1-active_value3);               % output-layer error, 64*10000, one column per sample
average_sparsity = repmat(sum(active_value2,2)./data_size(2), 1, data_size(2));  % rho-hat term used in the error
default_sparsity = repmat(sparsityParam, hiddenSize, data_size(2));              % target sparsity rho
sparsity_penalty = beta.*(-(default_sparsity./average_sparsity)+((1-default_sparsity)./(1-average_sparsity)));  % sparsity term added to the hidden-layer error
%3. backpropagation (the snippet was cut off above; these are the standard gradient formulas)
delta2 = (W2'*delta3 + sparsity_penalty).*(active_value2).*(1-active_value2);
W2grad = delta3*active_value2'./data_size(2) + lambda.*W2;
W1grad = delta2*data'./data_size(2) + lambda.*W1;
b2grad = sum(delta3,2)./data_size(2);
b1grad = sum(delta2,2)./data_size(2);
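As a cross-check of the cost computation, here is a NumPy sketch of the same three cost terms (reconstruction error, weight decay, KL sparsity penalty). The shapes here are deliberately tiny stand-ins, not the exercise's 64/25/10000:

```python
import numpy as np

# Tiny stand-in shapes: 5 visible units, 3 hidden units, 4 samples.
rng = np.random.default_rng(0)
visible, hidden, m = 5, 3, 4
W1 = rng.standard_normal((hidden, visible)) * 0.1
W2 = rng.standard_normal((visible, hidden)) * 0.1
b1, b2 = np.zeros((hidden, 1)), np.zeros((visible, 1))
data = rng.random((visible, m))
lam, beta, rho = 3e-3, 3.0, 0.1   # lambda, beta, sparsityParam

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
a2 = sigmoid(W1 @ data + b1)      # hidden activations, hidden x m
a3 = sigmoid(W2 @ a2 + b2)        # reconstructions, visible x m

ave_square = np.sum((a3 - data) ** 2) / 2 / m             # average squared error
weight_decay = lam / 2 * (np.sum(W1**2) + np.sum(W2**2))  # weight decay
rho_hat = a2.mean(axis=1)                                 # average activation per hidden unit
kl = np.sum(rho * np.log(rho / rho_hat)
            + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
cost = ave_square + weight_decay + beta * kl              # total cost
```

Each term maps one-to-one onto ave_square, weight_decay, and sparsity in the MATLAB code above.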
Step 3: Gradient checking
epsilon = 0.0001;
for i = 1:numel(theta)     % numel, not size(theta), which would only use the first dimension
    theta_plus = theta;
    theta_minu = theta;
    theta_plus(i) = theta_plus(i) + epsilon;
    theta_minu(i) = theta_minu(i) - epsilon;
    numgrad(i) = (J(theta_plus) - J(theta_minu))/(2*epsilon);
end
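The same central-difference check in NumPy, applied to a toy objective J(theta) = sum(theta^2) whose analytic gradient 2*theta is known, so the numerical result can be verified directly:

```python
import numpy as np

def numerical_gradient(J, theta, epsilon=1e-4):
    """Central-difference estimate of dJ/dtheta, one coordinate at a time."""
    numgrad = np.zeros_like(theta)
    for i in range(theta.size):
        theta_plus, theta_minus = theta.copy(), theta.copy()
        theta_plus[i] += epsilon
        theta_minus[i] -= epsilon
        numgrad[i] = (J(theta_plus) - J(theta_minus)) / (2 * epsilon)
    return numgrad

theta = np.array([1.0, -2.0, 0.5])
numgrad = numerical_gradient(lambda t: np.sum(t**2), theta)
# numgrad should closely match the analytic gradient 2*theta
```

In the exercise, J is sparseAutoencoderCost and the check is run on a reduced problem, since each coordinate costs two full cost evaluations.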
Step 4: Visualization. Run the training in train.m with the gradient-checking code removed, because that part is very time-consuming.
2: Vectorized implementation
Only slight changes to the code above are needed.
Step 1: First set the parameters to
visibleSize = 28*28;    % number of input units
hiddenSize = 196;       % number of hidden units
sparsityParam = 0.1;    % desired average activation of the hidden units
                        % (this is denoted by the Greek letter rho in the lecture notes)
lambda = 3e-3;          % weight decay parameter
beta = 3;               % weight of sparsity penalty term
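As a quick sanity check on these settings (a small illustrative calculation, not part of the exercise code), the unrolled parameter vector theta packs W1, W2, b1, and b2, so its length is:

```python
# theta packs W1 (hiddenSize x visibleSize), W2 (visibleSize x hiddenSize),
# b1 (hiddenSize), and b2 (visibleSize).
visibleSize = 28 * 28
hiddenSize = 196
n_params = 2 * visibleSize * hiddenSize + hiddenSize + visibleSize
# 2*784*196 + 196 + 784 = 308308 parameters
```

This is much larger than the 8x8 exercise, which is exactly why the vectorized cost function matters here.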
Step 2: In the sparse autoencoder, replace the Step 1 code that generates the training set with the following:
images = loadMNISTImages('train-images.idx3-ubyte');
display_network(images(:,1:100));   % show the first 100 images
patches = images(:, randi(size(images,2), 1, 10000));
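loadMNISTImages comes with the starter code; as an illustration, here is a Python sketch of the idx3-ubyte layout it reads (big-endian header: magic 2051, image count, rows, cols, then raw uint8 pixels). The demo below runs on a fabricated in-memory payload, not the real file:

```python
import struct
import numpy as np

def load_mnist_images(raw: bytes) -> np.ndarray:
    """Parse an MNIST idx3-ubyte byte string into a (rows*cols, n) array,
    one image per column, scaled to [0,1] like loadMNISTImages."""
    magic, n, rows, cols = struct.unpack(">IIII", raw[:16])
    assert magic == 2051, "not an idx3-ubyte image file"
    pixels = np.frombuffer(raw, dtype=np.uint8, offset=16)
    return pixels.reshape(n, rows * cols).T / 255.0

# Self-contained demo on a fake two-image 28x28 payload of zeros:
fake = struct.pack(">IIII", 2051, 2, 28, 28) + bytes(2 * 28 * 28)
images = load_mnist_images(fake)   # shape (784, 2), one column per image
```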
This gives you the following visual results:
UFLDL Tutorial Exercise Answers (1): Sparse Autoencoder and Vectorized Implementation