Pattern Recognition Learning Notes (35) -- K-L Transform and PCA


Theoretical knowledge of the K-L transform

The K-L transform is another common feature extraction method besides PCA. It has many forms, the most basic of which is similar to PCA. It differs from PCA in that PCA is an unsupervised feature transformation, whereas the K-L transform can incorporate different class information and thus realize supervised feature extraction.

According to the K-L expansion theory of stochastic processes, a stochastic process can be described as a linear combination of infinitely many orthogonal functions. In a pattern recognition problem, a sample can usually be regarded as a realization of a random vector, so a D-dimensional random vector x can be written as a linear combination of orthogonal basis vectors, each of modulus 1:

$$x = \sum_{i=1}^{D} y_i \varphi_i, \qquad \varphi_i^T \varphi_j = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases}$$

Rearranging the above expression, the expansion coefficients are obtained by projecting x onto each basis vector:

$$y_i = \varphi_i^T x$$

(The K-L transform usually first requires centering the samples to zero mean, i.e., a translation of the samples.)
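As a minimal numerical sketch of this expansion (NumPy, with a random orthonormal basis standing in for the φ_i; all names here are my own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 5

# QR decomposition of a random matrix yields D orthonormal columns phi_i
Phi, _ = np.linalg.qr(rng.standard_normal((D, D)))

x = rng.standard_normal(D)        # a D-dimensional sample vector
y = Phi.T @ x                     # expansion coefficients y_i = phi_i^T x
x_rebuilt = Phi @ y               # x = sum_i y_i * phi_i

print(np.allclose(x, x_rebuilt))  # True: the orthonormal expansion is exact
```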

Assuming that the useful information is concentrated in q dimensions, we try to approximate x with only q terms:

$$\hat{x} = \sum_{i=1}^{q} y_i \varphi_i$$

The difference vector between the sample vector before and after the approximation is:

$$\Delta x = x - \hat{x} = \sum_{i=q+1}^{D} y_i \varphi_i$$

The mean square error (MSE) of the above difference vector is:

$$\varepsilon = E\left[\|\Delta x\|^2\right] = \sum_{i=q+1}^{D} E\left[y_i^2\right] = \sum_{i=q+1}^{D} \varphi_i^T E\left[x x^T\right] \varphi_i = \sum_{i=q+1}^{D} \varphi_i^T R \varphi_i$$

where the transformation matrix R = E[xx^T] is the second-order (autocorrelation) matrix of the original sample vector x (note that other matrices, such as the covariance matrix, can also be used). Comparing with PCA, the form is roughly the same, but the transformation matrix used in PCA is the covariance matrix.
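A small sketch (synthetic data of my own choosing) that checks this formula: the empirical truncation error of a q-term expansion equals the sum of the quadratic forms φ_i^T R φ_i over the discarded directions:

```python
import numpy as np

rng = np.random.default_rng(1)
D, q, n = 5, 3, 100_000

# Any orthonormal basis, and some correlated synthetic samples (rows of X)
Phi, _ = np.linalg.qr(rng.standard_normal((D, D)))
X = rng.standard_normal((n, D)) @ rng.standard_normal((D, D))

Y = X @ Phi                          # coefficients y_i for every sample
X_hat = Y[:, :q] @ Phi[:, :q].T      # keep only the first q expansion terms
mse_empirical = np.mean(np.sum((X - X_hat) ** 2, axis=1))

R = X.T @ X / n                      # second-order (autocorrelation) matrix
mse_formula = sum(Phi[:, i] @ R @ Phi[:, i] for i in range(q, D))

print(mse_empirical, mse_formula)    # the two numbers agree
```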

Our aim is to minimize the above MSE. With the same solution method as in PCA, adding the unit-norm constraints via Lagrange multipliers gives the following objective function:

$$J = \sum_{i=q+1}^{D} \varphi_i^T R \varphi_i - \sum_{i=q+1}^{D} \lambda_i \left(\varphi_i^T \varphi_i - 1\right)$$

Differentiating J with respect to each φ_i and setting the derivative to zero:

$$\frac{\partial J}{\partial \varphi_i} = 2\left(R \varphi_i - \lambda_i \varphi_i\right) = 0 \quad \Rightarrow \quad R \varphi_i = \lambda_i \varphi_i$$

A familiar face, haha: this is exactly the eigenvalue equation, so the mystery of the minimum mean square error above is unveiled:

$$\varepsilon_{\min} = \sum_{i=q+1}^{D} \lambda_i$$

From the analysis so far it should be easy to see that the K-L transform and PCA are practically a pair of twins. In fact, when the K-L transformation matrix is the covariance matrix, the K-L transform becomes PCA.
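To see this result numerically, here is a sketch with synthetic data: taking the eigenvectors of R arranged by eigenvalue and truncating to q of them, the empirical MSE matches the sum of the discarded eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(2)
D, q, n = 5, 3, 100_000
X = rng.standard_normal((n, D)) @ rng.standard_normal((D, D))  # synthetic samples

R = X.T @ X / n                          # second-order matrix of the samples
lam, Phi = np.linalg.eigh(R)             # ascending eigenvalues, orthonormal vectors
lam, Phi = lam[::-1], Phi[:, ::-1]       # arrange eigenvalues from large to small

X_hat = (X @ Phi[:, :q]) @ Phi[:, :q].T  # reconstruct from the top-q eigenvectors
mse = np.mean(np.sum((X - X_hat) ** 2, axis=1))

print(mse, lam[q:].sum())                # MSE equals the sum of discarded eigenvalues
```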

Returning to the question of approximating the sample vector x in q dimensions: the analysis above shows that if we want to represent the sample vector with q basis vectors while minimizing the MSE, the reasonable approach is to arrange the eigenvalues of the transformation matrix from large to small and then select the eigenvectors corresponding to the first q eigenvalues; this guarantees the minimum truncation error. These first q orthogonal vectors constitute a new feature space, and the expansion coefficients y_i of the original sample vector x in this new feature space form the new feature vector. This transformation is called the K-L transform; its other forms differ mainly in the specific choice of the transformation matrix.

It can be seen that the q new features are similar to the principal components in PCA, and the K-L transform is equivalent to PCA when the original feature x is centered.
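The whole procedure fits in a few lines. The sketch below is my own illustration (the function name kl_transform and its flag are not from any library); with use_covariance=True it is exactly PCA:

```python
import numpy as np

def kl_transform(X, q, use_covariance=True):
    """Sketch of a K-L transform of samples X (n x D) down to q features.

    use_covariance=True  -> transformation matrix is the covariance matrix (PCA);
    use_covariance=False -> the autocorrelation (second-order) matrix.
    """
    Xc = X - X.mean(axis=0) if use_covariance else X
    R = Xc.T @ Xc / len(Xc)          # chosen transformation matrix
    lam, A = np.linalg.eigh(R)
    order = np.argsort(lam)[::-1]    # eigenvalues from large to small
    A = A[:, order[:q]]              # kernel matrix: eigenvectors of the top-q eigenvalues
    return Xc @ A, A                 # new features y = A^T x (row-wise) and the kernel A

# Usage: with centering, this is PCA's projection onto 2 principal components.
rng = np.random.default_rng(3)
X = rng.standard_normal((1000, 5)) @ rng.standard_normal((5, 5))
Y, A = kl_transform(X, q=2)
```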

Several important properties of the K-L transform

1. The new features obtained after the transformation have zero mean (verified numerically in the sketch after this list):

Proof:

Consider the K-L transform y = A^T x, where the matrix A is the transformation kernel matrix;

Taking the mean of the transformed result y, and recalling that x has been centered so that E[x] = 0:

$$E[y] = E\left[A^T x\right] = A^T E[x] = 0$$
2. The K-L transform is an orthogonal transformation;

3. The new features after the K-L transform are mutually uncorrelated;

4. The second-order matrix of the new feature vector after the K-L transform is a diagonal matrix, and its diagonal elements are the eigenvalues of the second-order matrix of the original features;

Proof: since the columns of A are the eigenvectors of R, we have RA = AΛ and A^T A = I, so

$$E\left[y y^T\right] = E\left[A^T x x^T A\right] = A^T R A = A^T A \Lambda = \Lambda = \mathrm{diag}\left(\lambda_1, \ldots, \lambda_D\right)$$

which also proves property 3, since all off-diagonal correlations vanish.
5. The K-L transform is the optimal compressed representation of the signal: among all orthogonal coordinate transformations that represent the original samples with q features, it causes the smallest error;

6. Representing the original data in the K-L coordinate system minimizes the representation entropy; that is, the variance information of the samples is concentrated in fewer dimensions;
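Properties 1, 3, and 4 can be checked numerically; a minimal sketch with synthetic, centered data:

```python
import numpy as np

rng = np.random.default_rng(4)
n, D = 100_000, 4
X = rng.standard_normal((n, D)) @ rng.standard_normal((D, D))
X -= X.mean(axis=0)                  # center first, as the K-L transform requires

R = X.T @ X / n                      # second-order matrix of the original features
lam, A = np.linalg.eigh(R)           # A: transformation kernel matrix (eigenvectors)
Y = X @ A                            # K-L transform y = A^T x, for every sample

print(np.allclose(Y.mean(axis=0), 0))  # property 1: zero-mean new features
S = Y.T @ Y / n                      # second-order matrix of the new features
print(np.allclose(S, np.diag(lam)))  # properties 3 & 4: diagonal, eigenvalues of R
```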

The relation and difference between the K-L transform and PCA

Connections:

Both are orthogonal transformations;

When the original feature x is centered (that is, the transformation matrix is the covariance matrix), the K-L transform is equivalent to PCA;

PCA is a discrete K-L transform;

Both can realize dimensionality-reducing transformations;

Differences:

The K-L transform can realize supervised feature extraction, whereas the PCA transformation is unsupervised;

In scope, the K-L transform is more general, while PCA is narrower;

The K-L transform can handle both continuous and discrete cases, while PCA is only for the discrete case;

The transformation matrix of the K-L transform can take many forms, such as the second-order (autocorrelation) matrix or the covariance matrix (total scatter matrix), whereas the transformation matrix of PCA is the covariance matrix.

However, in some contexts there is no difference between the two, because in practical applications the choice between the covariance matrix and the autocorrelation matrix only amounts to translating the samples by their mean, and in PCA this translation does not affect the directions of the principal components. PCA therefore usually translates the samples first, so that the autocorrelation matrix becomes the covariance matrix.


Covariance matrix:

$$\Sigma = E\left[(x - \mu)(x - \mu)^H\right], \qquad \mu = E[x]$$

Autocorrelation matrix:

$$R = E\left[x x^H\right]$$

where the superscript H denotes the conjugate transpose; when x is a real vector, the conjugate transpose is equivalent to the ordinary transpose;

The relationship between the covariance matrix and the autocorrelation matrix:

$$\Sigma = R - \mu \mu^H$$
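For real-valued data (so the conjugate transpose reduces to the transpose), this relation is easy to verify numerically; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((100_000, 3)) + np.array([1.0, -2.0, 0.5])  # nonzero mean

mu = X.mean(axis=0)
R = X.T @ X / len(X)                      # autocorrelation matrix E[x x^T]
Sigma = (X - mu).T @ (X - mu) / len(X)    # covariance matrix E[(x-mu)(x-mu)^T]

print(np.allclose(Sigma, R - np.outer(mu, mu)))  # Sigma = R - mu mu^T holds
```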

Reference: Wikipedia




