UFLDL Lab Report 3: Self-Taught Learning

Self-Taught Learning Experiment Report

1. Self-taught learning experiment description

Self-taught learning is an unsupervised feature learning approach. It learns features from unlabeled data, so the machine learning algorithm can exploit a much larger amount of data and thereby achieve better performance. In this experiment, we use a sparse autoencoder together with a softmax classifier to build a handwritten digit classifier, following the self-taught learning procedure.

 

  1. Implementation Process

    Step 1: Generate the training and test sample sets

    Step 2: Train the sparse autoencoder

    Step 3: Extract features

    Step 4: Train and test the softmax classifier

    Step 5: Classify the test sample set and calculate the accuracy

     

  2. Key points, code, and comments for each step

    Step 1: Generate the training and test sample sets

    Use loadMNISTImages.m and loadMNISTLabels.m to load the data from the MNIST database. Pay attention to the paths and names of the data files.
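    Step 1 is not shown as code in this report; below is a minimal sketch of how the sample sets can be generated, assuming the file paths, variable names, and the 0-4 / 5-9 digit split used in the UFLDL exercise:

    % Sketch: load MNIST and build the unlabeled / labeled sample sets
    mnistData   = loadMNISTImages('mnist/train-images-idx3-ubyte');
    mnistLabels = loadMNISTLabels('mnist/train-labels-idx1-ubyte');

    % Digits 5-9 serve as the unlabeled set; digits 0-4 are the labeled set
    labeledSet   = find(mnistLabels >= 0 & mnistLabels <= 4);
    unlabeledSet = find(mnistLabels >= 5);

    % Split the labeled set in half: one half trains the softmax classifier,
    % the other half is used for testing
    numTrain = round(numel(labeledSet)/2);
    trainSet = labeledSet(1:numTrain);
    testSet  = labeledSet(numTrain+1:end);

    unlabeledData = mnistData(:, unlabeledSet);
    trainData     = mnistData(:, trainSet);
    trainLabels   = mnistLabels(trainSet) + 1;   % shift labels to the range 1-5
    testData      = mnistData(:, testSet);
    testLabels    = mnistLabels(testSet) + 1;    % shift labels to the range 1-5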

     

    Step 2: Train the sparse autoencoder

    Use the unlabeled training images as input and train the sparse autoencoder to obtain the optimal weights. This step calls minFunc and sparseAutoencoderCost.m from the previous experiment.

    The specific implementation code is as follows:

    %  Find optTheta by running the sparse autoencoder on
    %  unlabeledTrainingImages

    optTheta = theta;
    %  [cost, grad] = sparseAutoencoderCost(theta, inputSize, hiddenSize, lambda, ...
    %                                       sparsityParam, beta, unlabeledData);

    %  Use minFunc to minimize the cost function
    addpath minFunc/
    options.Method = 'lbfgs';  % Here, we use L-BFGS to optimize our cost
                               % function. Generally, for minFunc to work, you
                               % need a function pointer with two outputs: the
                               % function value and the gradient. In our problem,
                               % sparseAutoencoderCost.m satisfies this.
    options.maxIter = 400;     % Maximum number of iterations of L-BFGS to run
    options.display = 'on';

    [optTheta, cost] = minFunc( @(p) sparseAutoencoderCost(p, ...
                                     inputSize, hiddenSize, ...
                                     lambda, sparsityParam, ...
                                     beta, unlabeledData), ...
                                theta, options);

     

    Step 3: Extract features

    This step calls feedForwardAutoencoder.m to compute the outputs (activation values) of the hidden layer units of the sparse autoencoder. These outputs are the higher-order features, learned from the unlabeled images, that we now extract from the labeled training and test images.

    Add the following code to feedForwardAutoencoder.m:

    % Compute the activation of the hidden layer for the sparse autoencoder.
    m = size(data, 2);
    z2 = W1 * data + repmat(b1, 1, m);
    activation = sigmoid(z2);
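    Steps 4 and 5 use trainFeatures and testFeatures; a minimal sketch of how feedForwardAutoencoder.m can be called to produce them, assuming the variable names of the UFLDL exercise (optTheta from Step 2, and the original inputSize before it is reassigned in Step 4):

    % Extract features from the labeled training and test images
    trainFeatures = feedForwardAutoencoder(optTheta, hiddenSize, inputSize, ...
                                           trainData);
    testFeatures  = feedForwardAutoencoder(optTheta, hiddenSize, inputSize, ...
                                           testData);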

     

     

    Step 4: Train and test the softmax classifier

    Use softmaxCost.m and softmaxTrain.m from the previous experiment to train a multi-class classifier on the features extracted in Step 3 and the corresponding training label set trainLabels.

    The specific implementation code is as follows:

    inputSize = hiddenSize;

    % C = unique(A) for the array A returns the same values as in A but
    % with no repetitions. C will be sorted.
    % A = [9 9 9 9 9 9 8 8 8 7 7 7 6 6 6 5 4 2 1]
    % C = unique(A) -> C = [1 2 4 5 6 7 8 9]
    numClasses = numel(unique(trainLabels));
    lambda1 = 1e-4;

    options.maxIter = 100;
    softmaxModel = softmaxTrain(inputSize, numClasses, lambda1, ...
                                trainFeatures, trainLabels, options);

     

     

     

    Step 5: Classify the test sample set and calculate the accuracy

    Call softmaxPredict.m from the previous experiment to predict the labels of the test sample set and calculate the accuracy. The specific implementation code is as follows:

    [pred] = softmaxPredict(softmaxModel, testFeatures);
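    The accuracy itself is not computed in the snippet above; a short sketch of how the figure quoted below can be obtained from pred, assuming the test labels are stored in testLabels as in the UFLDL exercise:

    % Classification accuracy: fraction of test examples predicted correctly
    acc = mean(testLabels(:) == pred(:));
    fprintf('Test Accuracy: %f%%\n', 100 * acc);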

     

  2. Experiment results and running environment

    We can see that the hidden layer units learn to extract higher-level features that resemble image edges.
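    The figure of the learned features is not reproduced in this copy of the report; a minimal sketch of how they can be visualized with the display_network.m helper from the UFLDL starter code, assuming optTheta, hiddenSize, and the original inputSize from Step 2:

    % Visualize the learned hidden-unit weights
    % (run before Step 4 reassigns inputSize)
    W1 = reshape(optTheta(1:hiddenSize * inputSize), hiddenSize, inputSize);
    display_network(W1');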

    Training accuracy:

    Test accuracy: 98.247284%

    This is essentially consistent with the 98.3% reported in the lecture notes.

    Training time:

    Elapsed time is 2955.575443 seconds.

    About 50 minutes.

     

    Running Environment

    CPU: AMD A6-3420M APU with Radeon(TM) HD Graphics, 1.50 GHz

    RAM: 4.00 GB

    OS: Windows 7, 32-bit

    MATLAB: R2012b (8.0.0.783)

     

  3. Appendix: key code and explanation of the sparse autoencoder

     

    The output (activation) of a hidden layer unit is a function of the weighted sum of its inputs, and it can also be written compactly in vectorized form. This step is called forward propagation; more generally, the same relation holds between any two adjacent layers l and l+1 of the network.
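    The activation equations were embedded as images in the original report; the following LaTeX reconstruction of the standard formulas (sigmoid activation f, weights W^{(l)}, biases b^{(l)}) is consistent with the UFLDL notes and the code in this appendix:

    a^{(2)}_i = f\left( \sum_{j=1}^{n} W^{(1)}_{ij} x_j + b^{(1)}_i \right), \qquad f(z) = \frac{1}{1 + e^{-z}}

    z^{(2)} = W^{(1)} x + b^{(1)}, \qquad a^{(2)} = f(z^{(2)})

    z^{(l+1)} = W^{(l)} a^{(l)} + b^{(l)}, \qquad a^{(l+1)} = f(z^{(l+1)})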

     

    The cost function consists of three terms: the average reconstruction error, a weight decay (regularization) term, and a sparsity penalty,

    where rho (sparsityParam) is the desired average activation of the hidden units and rho^ is their actual average activation over the training set.

    Through iteration, the optimization tries to make rho^ close to rho.

    The gradient of the cost function is computed with the backpropagation algorithm, which propagates the prediction error backwards through the network.

    The algorithm then calls minFunc() to update the W and b parameters and obtain a better prediction model.
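    The cost and gradient equations were also images in the original report; the LaTeX reconstruction below gives the standard UFLDL formulas matching the vectorized code that follows (f is the sigmoid, \odot denotes element-wise multiplication):

    J(W,b) = \frac{1}{2m} \sum_{i=1}^{m} \left\| a^{(3)}(x^{(i)}) - x^{(i)} \right\|^2
           + \frac{\lambda}{2} \left( \|W^{(1)}\|_F^2 + \|W^{(2)}\|_F^2 \right)
           + \beta \sum_{j} \mathrm{KL}\left( \rho \,\|\, \hat\rho_j \right)

    \mathrm{KL}(\rho \,\|\, \hat\rho_j) = \rho \log\frac{\rho}{\hat\rho_j} + (1-\rho) \log\frac{1-\rho}{1-\hat\rho_j},
    \qquad \hat\rho_j = \frac{1}{m} \sum_{i=1}^{m} a^{(2)}_j(x^{(i)})

    \delta^{(3)} = -\left( x - a^{(3)} \right) \odot f'(z^{(3)})

    \delta^{(2)} = \left( (W^{(2)})^\top \delta^{(3)} + \beta \left( -\frac{\rho}{\hat\rho} + \frac{1-\rho}{1-\hat\rho} \right) \right) \odot f'(z^{(2)})

    \nabla_{W^{(l)}} J = \frac{1}{m}\, \delta^{(l+1)} (a^{(l)})^\top + \lambda W^{(l)},
    \qquad \nabla_{b^{(l)}} J = \frac{1}{m} \sum_{i=1}^{m} \delta^{(l+1),(i)}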

     

    The key to vectorization is to keep track of the dimensions of each variable.
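    The dimension table of the original report is not reproduced; the sizes below are inferred from the code that follows (n = visibleSize, m = number of training examples):

    data:    n x m
    W1:      hiddenSize x visibleSize
    W2:      visibleSize x hiddenSize
    b1:      hiddenSize x 1
    b2:      visibleSize x 1
    z2, a2:  hiddenSize x m
    z3, a3:  visibleSize x m
    rhoHat:  hiddenSize x 1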

    The key implementation code is as follows:

    function [cost, grad] = sparseAutoencoderCost(theta, visibleSize, hiddenSize, ...
                                                  lambda, sparsityParam, beta, data)

    % Unpack the parameter vector (UFLDL starter-code convention:
    % theta = [W1(:); W2(:); b1(:); b2(:)])
    W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
    W2 = reshape(theta(hiddenSize*visibleSize+1:2*hiddenSize*visibleSize), visibleSize, hiddenSize);
    b1 = theta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
    b2 = theta(2*hiddenSize*visibleSize+hiddenSize+1:end);

    % ---------- YOUR CODE HERE --------------------------------------
    [n, m] = size(data);   % m is the number of training examples, n the number of features

    % Forward propagation
    % B = repmat(A, m, n) -> replicate and tile an array -> m x n
    % b1 is a hiddenSize x 1 column vector, tiled over the m examples
    z2 = W1 * data + repmat(b1, 1, m);
    a2 = sigmoid(z2);
    z3 = W2 * a2 + repmat(b2, 1, m);
    a3 = sigmoid(z3);

    % Compute the first part of the cost: the average reconstruction error
    Jcost = 0.5/m * sum(sum((a3 - data).^2));

    % Compute the weight decay term
    Jweight = lambda/2 * (sum(sum(W1.^2)) + sum(sum(W2.^2)));

    % Compute the sparsity penalty
    % sparsityParam (rho): the desired average activation of the hidden units
    % rhoHat (rho^): the actual average activation of the hidden units
    rhoHat = 1/m * sum(a2, 2);
    Jsparse = beta * sum(sparsityParam .* log(sparsityParam ./ rhoHat) + ...
              (1 - sparsityParam) .* log((1 - sparsityParam) ./ (1 - rhoHat)));

    % The complete cost function
    cost = Jcost + Jweight + Jsparse;

    % Backpropagation: compute the gradient
    % sigmoidGradient(z) = sigmoid(z) .* (1 - sigmoid(z))
    d3 = -(data - a3) .* sigmoidGradient(z3);
    % Since we introduced the sparsity term Jsparse in the cost function,
    % the hidden-layer error gets an extra term
    extra_term = beta * (-sparsityParam ./ rhoHat + (1 - sparsityParam) ./ (1 - rhoHat));

    % Add the extra term
    d2 = (W2' * d3 + repmat(extra_term, 1, m)) .* sigmoidGradient(z2);

    % Compute W1grad
    W1grad = 1/m * d2 * data' + lambda * W1;

    % Compute W2grad
    W2grad = 1/m * d3 * a2' + lambda * W2;

    % Compute b1grad
    b1grad = 1/m * sum(d2, 2);

    % Compute b2grad
    b2grad = 1/m * sum(d3, 2);

    % Pack the gradients back into a single vector for minFunc
    grad = [W1grad(:); W2grad(:); b1grad(:); b2grad(:)];
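    As a sanity check (not shown in the original report), the analytic gradient can be compared against a numerical estimate using computeNumericalGradient.m from the earlier sparse autoencoder exercise; a minimal sketch:

    % Compare the analytic gradient with a numerical estimate (use a small model)
    [cost, grad] = sparseAutoencoderCost(theta, visibleSize, hiddenSize, ...
                                         lambda, sparsityParam, beta, data);
    numGrad = computeNumericalGradient( @(p) sparseAutoencoderCost(p, visibleSize, ...
                                        hiddenSize, lambda, sparsityParam, beta, data), theta);
    diff = norm(numGrad - grad) / norm(numGrad + grad);
    disp(diff);   % should be very small, e.g. on the order of 1e-9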
