Reference: A. Krizhevsky, Learning Multiple Layers of Features from Tiny Images, Appendix.
Let the dataset X be a d x n matrix, with the data already centered (each dimension has zero mean). The covariance matrix is then
covx = 1/(n-1) * X * X'
If we want any two dimensions of the n d-dimensional vectors to be uncorrelated, let the de-correlating matrix be W:
Y = W * X
Then Y * Y' must be a diagonal matrix; we further require Y to satisfy
Y * Y' = (n-1) * I
Impose an additional constraint on W (mainly to make the derivation below easier):
W = W'
Then
Y * Y' = (n-1) * I
W * X * X' * W' = (n-1) * I
Since W = W', this reads W * X * X' * W = (n-1) * I. Right-multiplying both sides by W^(-1) gives W * X * X' = (n-1) * W^(-1), and left-multiplying by W then gives
W^2 * X * X' = (n-1) * I
so
W = sqrt(n-1) * (X * X')^(-1/2)
X * X' is symmetric positive semi-definite (assumed invertible here), so it can be decomposed as X * X' = P * D * P', where D is diagonal and P is orthogonal; then (X * X')^a = P * D^a * P'. Therefore
W = sqrt(n-1) * P * D^(-1/2) * P'
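A minimal numerical check of this closed form (my own sketch, not part of the original post): build W from random centered data and verify that Y * Y' = (n-1) * I.

d = 5; n = 1000;
x = randn(d, n);
x = x - repmat(mean(x, 2), 1, n);     % center each dimension
[p, dd] = eig(x * x');                % X*X' = P*D*P' (symmetric PSD)
w = sqrt(n-1) * p * dd^(-1/2) * p';   % W = sqrt(n-1) * (X*X')^(-1/2)
y = w * x;
disp(norm(y * y' - (n-1) * eye(d)));  % should be near 0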
MATLAB code

testzca.m

clear; clc
patches = [];
tx = imread('test.jpg');
% tx = load('pcadata.txt', '-ascii');   % alternative input
tx = double(tx);
x = zeros(size(tx, 1) * size(tx, 2), size(tx, 3));
tx = tx(:);
for i = 1:size(x, 2)
    x(:, i) = tx(1 + size(x, 1) * (i - 1) : size(x, 1) * i);
end
patches = x';   % one row per color channel (note: not centered here; the commented loop below centers each channel)
% for i = 1:size(x, 2)
%     im = imread(strcat('train\', num2str(i), '.png'));
%     im = reshape(im, [1, 32*32*3]);
%     im = double(im);
%     % centralize each 32x32 color channel
%     im(1:32*32) = im(1:32*32) - mean(im(1:32*32));
%     im(32*32+1 : 2*32*32) = im(32*32+1 : 2*32*32) - mean(im(32*32+1 : 2*32*32));
%     im(2*32*32+1 : 3*32*32) = im(2*32*32+1 : 3*32*32) - mean(im(2*32*32+1 : 3*32*32));
%     patches = [patches, im'];
% end
y = zca_whitening(patches);
y = y * y';                              % covariance of the whitened data
y = zca_normalize(y);                    % rescale to [0, 255] for display
covx = 1/1000 * (patches * patches');    % covariance before whitening
covx = zca_normalize(covx);
hold on
subplot(1, 2, 1)
imshow(uint8(covx))
subplot(1, 2, 2)
imshow(uint8(y));
zca_normalize.m

function y = zca_normalize(x)
% Linearly rescale all entries of x into [0, 255] for display.
[row, col] = size(x);
tx = [];
for i = 1:row
    tx = [tx, x(i, :)];
end
tx = tx - min(tx);
tx = tx / max(tx) * 255;
y = zeros(row, col);
for i = 1:row
    y(i, :) = tx((i-1)*col + 1 : i*col);
end
end
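For reference, this function is just a linear map of all entries onto [0, 255]; a vectorized equivalent (my own sketch, not from the original code) is:

mn = min(x(:));
y = (x - mn) / max(x(:) - mn) * 255;   % same result without the loops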
zca_whitening.m

% Assume each d-dimensional sample has zero mean; the covariance matrix of
% the n d-dimensional samples in X (d x n) is
%   covx = 1/(n-1) * X * X'
% To remove the correlation between dimensions, transform X by the
% de-correlation matrix W to get Y = W * X. Since W removes the
% correlation, Y * Y' is diagonal, and W is chosen to satisfy
%   Y * Y' = (n-1) * I
% Many W satisfy this condition, so we also require W = W'; the derivation
% is then natural. The step that is hard to think of is decomposing
% X * X' into P * D * P'.
function y = zca_whitening(x)
[dim, n] = size(x);
[p, d] = schur(x * x');              % eigendecomposition of symmetric X*X'
w = sqrt(n-1) * p * d^(-1/2) * p';   % W = sqrt(n-1) * (X*X')^(-1/2)
y = w * x;
end
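The call to schur works here because X * X' is symmetric: for a real symmetric matrix, schur returns an orthogonal P and a diagonal D with X * X' = P * D * P', i.e. exactly the eigendecomposition used in the derivation (eig would serve equally well). A quick check (my own sketch):

a = randn(4);
s = a * a';                     % symmetric positive semi-definite matrix
[p, d] = schur(s);
disp(norm(s - p * d * p'));     % ~0: s = P*D*P'
disp(norm(p' * p - eye(4)));    % ~0: P is orthogonal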
-- The dataset is taken from the tiny image classification competition on Kaggle.