Independent Component Analysis and Demo


1. Independent Component Analysis Modeling

The objective of independent component analysis (ICA) is to learn a complete orthonormal basis from a large amount of data, under which the data has a sparse representation. This leads to the following optimization problem:

    minimize_W   ||W x||_1
    subject to   W W^T = I

Here the objective is the sparsity term, and the constraint requires W to be a complete orthonormal basis. Readers familiar with sparse representation may notice that the problem contains no explicit reconstruction term: because W is an orthogonal matrix, the reconstruction error ||W^T W x - x||^2 is identically zero, so it can be dropped.
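A minimal MATLAB check of this claim (illustrative only; the random orthogonal matrix Q and sample x are not part of the exercise code):

    % For a complete orthonormal basis, Q'*Q = I, so reconstruction is exact
    n = 4;
    [Q, ~] = qr(randn(n));      % random n-by-n orthogonal matrix
    x = randn(n, 1);            % a random data sample
    disp(norm(Q'*Q*x - x));     % ~1e-16: reconstruction error vanishes
    disp(sum(abs(Q*x)));        % only the L1 sparsity objective remains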

2. Solving the optimization problem of independent component analysis

(1) Sparsity term: because the L1 norm is not differentiable at zero, the optimization problem is converted to the following smooth energy functional:

    J(W) = sum( sqrt( (W x).^2 + epsilon ) )

where epsilon is a small smoothing constant.

Its derivative with respect to W is:

    dJ/dW = ( (W x) ./ sqrt( (W x).^2 + epsilon ) ) * x^T
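A quick scalar sanity check of this derivative in MATLAB (an illustrative sketch; the values of w, x, and the finite-difference step h are arbitrary choices):

    % Check d/dw sqrt((w*x)^2 + epsilon) against a centered difference
    epsilon = 1e-6; w = 0.7; x = -1.3; h = 1e-7;
    g_analytic = (w*x / sqrt((w*x)^2 + epsilon)) * x;
    g_numeric  = (sqrt((w+h)^2*x^2 + epsilon) - sqrt((w-h)^2*x^2 + epsilon)) / (2*h);
    fprintf('analytic %g, numeric %g\n', g_analytic, g_numeric);  % should agree closely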

(2) Complete orthonormal basis constraint: the constraint is enforced by projection. After each gradient step, W is projected back onto the set of matrices satisfying W W^T = I via symmetric orthogonalization:

    W <- (W W^T)^(-1/2) W
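A minimal MATLAB sketch of this projection (assumes W has full row rank so that W W^T is invertible; the 3-by-5 size is arbitrary):

    W = rand(3, 5);                  % an unprojected weight matrix
    W = (W * W')^(-0.5) * W;         % project onto { W : W*W' = I }
    disp(norm(W * W' - eye(3)));     % ~0: constraint satisfied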

Note that since the complete orthonormal basis is derived from data, the data samples must be centered and ZCA-whitened before iterative training. With sigma = X X^T for the centered data X and the SVD sigma = U S V^T, the whitening transform is:

    X_white = U diag( 1 ./ sqrt(diag(S)) ) U^T X


(3) Optimization steps:

Step 1: Center the training data and apply ZCA whitening.

Step 2: Update W by gradient descent on the energy functional.

Step 3: Project W back onto the constraint set W W^T = I.

Step 4: Check the convergence condition; if it is satisfied, stop; otherwise repeat Steps 2-3.


(4) Sample code:

%% CS294A/CS294W Independent Component Analysis (ICA) Exercise
%  Instructions
%  ------------
%  This file contains code that helps you get started on the ICA exercise.
%  In this exercise, you will need to modify orthonormalICACost.m and a
%  small part of this file, ICAExercise.m.

%%======================================================================
%% STEP 0: Initialization
%  Here we initialize some parameters used for the exercise.

numPatches = 20000;
numFeatures = 121;
imageChannels = 3;
patchDim = 8;

visibleSize = patchDim * patchDim * imageChannels;
outputDir = '.';
epsilon = 1e-6;  % L1-regularisation epsilon: |Wx| ~ sqrt((Wx).^2 + epsilon)

%%======================================================================
%% STEP 1: Sample patches

patches = load('stlSampledPatches.mat');
patches = patches.patches(:, 1:numPatches);
displayColorNetwork(patches(:, 1:100));

%%======================================================================
%% STEP 2: ZCA whiten patches
%  In this step, we ZCA whiten the sampled patches. This is necessary for
%  orthonormal ICA to work.

patches = patches / 255;
meanPatch = mean(patches, 2);
patches = bsxfun(@minus, patches, meanPatch);

% ZCA whitening of the centered data
sigma = patches * patches';
[u, s, v] = svd(sigma);
ZCAWhite = u * diag(1 ./ sqrt(diag(s))) * u';
patches = ZCAWhite * patches;

%%======================================================================
%% STEP 3: ICA cost functions
%  Implement the cost function for orthonormal ICA (you don't have to
%  enforce the orthonormality constraint in the cost function) in the
%  function orthonormalICACost in orthonormalICACost.m.
%  Once you have implemented the function, check the gradient.

% Use fewer features and smaller patches for speed
debug = false;  % when true, the numerical gradient check below is run
if debug
    numFeatures = 5;
    patches = patches(1:3, 1:5);
    visibleSize = 3;
    numPatches = 5;

    weightMatrix = rand(numFeatures, visibleSize);

    [cost, grad] = orthonormalICACost(weightMatrix, visibleSize, numFeatures, patches, epsilon);
    numGrad = computeNumericalGradient(@(x) orthonormalICACost(x, visibleSize, numFeatures, patches, epsilon), weightMatrix(:));

    % Uncomment to display the numeric and analytic gradients side-by-side
    % disp([numGrad grad]);
    diff = norm(numGrad - grad) / norm(numGrad + grad);
    fprintf('Orthonormal ICA difference: %g\n', diff);
    assert(diff < 1e-7, 'Difference too large. Check your analytic gradients.');
    fprintf('Congratulations! Your gradients seem okay.\n');
end

%%======================================================================
%% STEP 4: Optimization for orthonormal ICA
%  Optimize the orthonormal ICA objective while enforcing the
%  orthonormality constraint. Code has been provided to do the gradient
%  descent with a backtracking line search using the orthonormalICACost
%  function (for more information on backtracking line search, you can
%  read the appendix of the exercise).
%
%  However, you will need to write code to enforce the orthonormality
%  constraint by projecting weightMatrix back into the space of matrices
%  satisfying WW^T = I.
%
%  Once you are done, you can run the code. 10000 iterations of gradient
%  descent take around 2 hours, and only a few bases are completely
%  learned within 10000 iterations. This highlights one of the weaknesses
%  of orthonormal ICA: it is difficult to optimize the objective while
%  enforcing the orthonormality constraint, so convergence using gradient
%  descent and projection is very slow.

weightMatrix = rand(numFeatures, visibleSize);
[cost, grad] = orthonormalICACost(weightMatrix(:), visibleSize, numFeatures, patches, epsilon);  % initial cost and gradient

fprintf('%11s%16s%10s\n', 'Iteration', 'Cost', 't');

startTime = tic();

% Initialize some parameters for the backtracking line search
alpha = 0.5;
t = 0.02;
lastCost = 1e40;

% Run gradient descent with projection
for iteration = 1:50000
    grad = reshape(grad, size(weightMatrix));
    newCost = Inf;
    linearDelta = sum(sum(grad .* grad));

    % Perform the backtracking line search
    while 1
        considerWeightMatrix = weightMatrix - alpha * grad;  % gradient descent update of W

        % Project considerWeightMatrix back into the space of matrices
        % satisfying WW^T = I (symmetric orthogonalization)
        considerWeightMatrix = (considerWeightMatrix * considerWeightMatrix')^(-0.5) * considerWeightMatrix;

        % Verify that the projection is correct; comment this check out
        % before running the full optimization
        temp = considerWeightMatrix * considerWeightMatrix';
        temp = temp - eye(numFeatures);
        assert(sum(temp(:).^2) < 1e-23, 'considerWeightMatrix does not satisfy WW^T = I. Check your projection again');

        [newCost, newGrad] = orthonormalICACost(considerWeightMatrix(:), visibleSize, numFeatures, patches, epsilon);
        if newCost > lastCost - alpha * t * linearDelta
            % fprintf('%14.6f %14.6f\n', newCost, lastCost - alpha * t * linearDelta);
            t = 0.9 * t;
        else
            break;
        end
    end

    lastCost = newCost;
    weightMatrix = considerWeightMatrix;

    fprintf('%9d%14.6f%8.7g\n', iteration, newCost, t);

    t = 1.1 * t;
    cost = newCost;
    grad = newGrad;

    % Visualize the learned bases as we go along, on a new figure every
    % 1000 iterations, to get a feel for the slow rate of convergence
    if mod(iteration, 1000) == 0
        duration = toc(startTime);
        figure(floor(iteration / 1000));
        displayColorNetwork(weightMatrix');
    end
end

% Visualize the learned bases
displayColorNetwork(weightMatrix');


% A first version that implements only the smoothed L1 cost (kept here,
% commented out, as in the original post):
%
% function [cost, grad] = orthonormalICACost(theta, visibleSize, numFeatures, patches, epsilon)
% %orthonormalICACost - Compute the cost and gradients for orthonormal ICA
% % (i.e. compute the cost ||Wx||_1 and its gradient)
%     weightMatrix = reshape(theta, numFeatures, visibleSize);
%     num_samples = size(patches, 2);                        % number of samples
%     aux1 = sqrt((weightMatrix * patches).^2 + epsilon);    % smoothed |Wx|
%     cost = sum(aux1(:)) / num_samples;
%     grad = ((weightMatrix * patches) ./ aux1) * patches' / num_samples;
%     grad = grad(:);
% end

function [cost, grad] = orthonormalICACost(theta, visibleSize, numFeatures, patches, epsilon)
%orthonormalICACost - Compute the cost and gradients for orthonormal ICA
% (i.e. compute the cost ||Wx||_1 and its gradient)

    weightMatrix = reshape(theta, numFeatures, visibleSize);
    cost = 0;
    grad = zeros(numFeatures, visibleSize);

    lambda = 8e-6;                  % weight of the sparsity term (0.5e-4 also works)
    num_samples = size(patches, 2);

    % Reconstruction term ||W'Wx - x||^2 plus smoothed L1 sparsity term
    cost_part1 = sum(sum((weightMatrix' * weightMatrix * patches - patches).^2)) ./ num_samples;
    cost_part2 = sum(sum(sqrt((weightMatrix * patches).^2 + epsilon))) * lambda;
    cost = cost_part1 + cost_part2;

    grad = (2 * weightMatrix * (weightMatrix' * weightMatrix * patches - patches) * patches' + ...
            2 * weightMatrix * patches * (weightMatrix' * weightMatrix * patches - patches)') ./ num_samples + ...
           (weightMatrix * patches ./ sqrt((weightMatrix * patches).^2 + epsilon)) * patches' * lambda;
    grad = grad(:);

    fprintf('%11s%16s\n', 'cost_part1', 'cost_part2');
    fprintf('%14.6f%14.6f\n', cost_part1, cost_part2);
end

Note: when computing the gradient, take care with the derivative of the reconstruction term ||W^T W x - x||^2 with respect to W.
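For reference, a worked form of that derivative (a standard matrix-calculus identity; it matches the two reconstruction terms in the gradient code above, before summing over samples and dividing by num_samples):

    grad_W ||W^T W x - x||^2 = 2 W (W^T W x - x) x^T + 2 (W x) (W^T W x - x)^T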

function numgrad = computeNumericalGradient(J, theta)
% numgrad = computeNumericalGradient(J, theta)
% theta: a vector of parameters
% J: a function that outputs a real number. Calling y = J(theta) returns
% the function value at theta.

% Initialize numgrad with zeros
numgrad = zeros(size(theta));

% numgrad(i) is (approximately) the partial derivative of J with respect
% to theta(i), evaluated at theta and computed one element at a time with
% a centered finite difference (see Section 2.3 of the lecture notes)
epsilon = 1e-8;
n = size(theta, 1);
E = eye(n);
for i = 1:n
    delta = E(:, i) * epsilon;
    numgrad(i) = (J(theta + delta) - J(theta - delta)) / (epsilon * 2.0);
end
end


function displayColorNetwork(A)
% Display receptive fields (basis vectors) for color image patches.
% A: the basis, with patches as column vectors

% In case the midpoint is not set at 0, we shift it dynamically
if min(A(:)) >= 0
    A = A - mean(A(:));
end

cols = round(sqrt(size(A, 2)));
channel_size = size(A, 1) / 3;
dim = sqrt(channel_size);
dimp = dim + 1;
rows = ceil(size(A, 2) / cols);

% Split the three color channels and normalize each column
B = A(1:channel_size, :);
C = A(channel_size+1:channel_size*2, :);
D = A(2*channel_size+1:channel_size*3, :);
B = B ./ (ones(size(B, 1), 1) * max(abs(B)));
C = C ./ (ones(size(C, 1), 1) * max(abs(C)));
D = D ./ (ones(size(D, 1), 1) * max(abs(D)));

% Initialization of the image
I = ones(dim*rows + rows - 1, dim*cols + cols - 1, 3);

% Transfer features to this image matrix
for i = 0:rows-1
    for j = 0:cols-1
        if i*cols + j + 1 > size(B, 2)
            break
        end
        % This sets the patch
        I(i*dimp+1:i*dimp+dim, j*dimp+1:j*dimp+dim, 1) = reshape(B(:, i*cols+j+1), [dim dim]);
        I(i*dimp+1:i*dimp+dim, j*dimp+1:j*dimp+dim, 2) = reshape(C(:, i*cols+j+1), [dim dim]);
        I(i*dimp+1:i*dimp+dim, j*dimp+1:j*dimp+dim, 3) = reshape(D(:, i*cols+j+1), [dim dim]);
    end
end

% Rescale from [-1, 1] to [0, 1] and display
I = I + 1;
I = I / 2;
imagesc(I);
axis equal
axis off
end


The first 100 sample patches are shown below:

[figure omitted]

The complete orthonormal basis learned after 10,000 training iterations:

[figure omitted]


References:

http://www.cnblogs.com/dmzhuo/p/5009274.html

http://deeplearning.stanford.edu/wiki/index.php/Exercise:Independent_Component_Analysis
