PCA Essence and SVD

First, some basic concepts

Linear dependence: one of the vectors can be written as a linear combination of the others.

Linearly independent: none of the vectors can be written as a linear combination of the others; equivalently, there is no nonzero x such that Ax = 0. If the columns of a matrix A are linearly independent, then Ax = 0 has only the zero solution, and (when A is square) A is invertible.

Rank: the maximum number of linearly independent column (or row) vectors of a matrix.

Basis: a set of linearly independent vectors that spans the space, so that every vector in the space can be written uniquely as a linear combination of them.

Eigenvectors: a vector x that, after being transformed by matrix A, remains collinear with the original x, i.e., Ax = λx; the eigenvalue λ represents the scaling of the vector. If the matrix is viewed as a linear transformation (rotation, stretching), then an eigenvector is a vector that, after this particular transformation, keeps the same direction and is only stretched in length. Conversely, the matrix of orthonormal eigenvectors is exactly the transformation that maps the matrix from one space (basis) to another.
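As a quick check of this definition, here is a minimal NumPy sketch (the matrix A below is made up purely for illustration): A maps its eigenvector onto a scaled copy of itself.

```python
import numpy as np

# A hypothetical 2x2 matrix, used only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns the eigenvalues and the eigenvectors (as columns).
eigenvalues, eigenvectors = np.linalg.eig(A)

v = eigenvectors[:, 0]   # first eigenvector
lam = eigenvalues[0]     # its eigenvalue

# A @ v points in the same direction as v; it is only scaled by lam.
print(np.allclose(A @ v, lam * v))   # True
```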

Eigendecomposition:

A square matrix with n linearly independent eigenvectors can be factored as A = Q Λ Q^(-1), where the columns of Q are the eigenvectors and Λ is the diagonal matrix of eigenvalues. If matrix A is a symmetric matrix, it admits a stronger eigendecomposition: A can be orthogonally diagonalized, A = U Λ U^T, where the eigenvectors u_i in U are mutually orthogonal (U^T U = I, i.e., U^T = U^(-1)).
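A minimal sketch of orthogonal diagonalization with NumPy (the symmetric matrix below is a made-up example); np.linalg.eigh is the routine for symmetric matrices:

```python
import numpy as np

# A hypothetical symmetric matrix (A == A.T), for illustration only.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# For symmetric matrices, np.linalg.eigh returns real eigenvalues
# and orthonormal eigenvectors as the columns of U.
lam, U = np.linalg.eigh(A)

print(np.allclose(U.T @ U, np.eye(3)))         # U is orthogonal: U^T U = I
print(np.allclose(U @ np.diag(lam) @ U.T, A))  # A = U Λ U^T
```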

Second, the essence of PCA (diagonalizing the covariance matrix, i.e., the eigendecomposition of a symmetric matrix)

When the dimensionality of the data is too high, it needs to be reduced. How? We must reduce the number of dimensions while retaining as much information as possible. Translated into mathematical terms: the variance of each row vector should be as large as possible (variance represents information), and the covariance between row vectors should be 0 (rows should be as uncorrelated as possible, so that the information is concentrated in a few independent variables). As an example, suppose the data we obtain form a matrix X, with each row a variable and each column an observation.

After the data are normalized (each row centered to zero mean), its covariance matrix is C_X = (1/m) X X^T, where m is the number of observations.

There are two steps to do.

1. We want the diagonal entries of the covariance matrix after dimensionality reduction to be as large as possible (enough information) and the off-diagonal entries to be 0 (each row uncorrelated with the others; if rows remain correlated, the reduction has not removed the redundancy), that is, we want it to become a diagonal matrix. So we need a linear transformation Y = QX of X such that the covariance matrix of the transformed matrix becomes diagonal. And how is this linear transformation found? Specifically, take Q = U^T, and C_Y will become a diagonal matrix. This requires the eigenvalue decomposition from above: because C_X is a symmetric matrix, it can be eigendecomposed as C_X = U Λ U^T, which gives C_Y = Q C_X Q^T = U^T (U Λ U^T) U = Λ.
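A minimal NumPy sketch of this step, on made-up data (3 variables as rows, 100 observations as columns), showing that Q = U^T diagonalizes the covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 3 variables (rows) x 100 observations (columns).
X = rng.normal(size=(3, 100))
X = X - X.mean(axis=1, keepdims=True)   # center each row (normalization)

m = X.shape[1]
C_x = X @ X.T / m                        # covariance matrix C_X = (1/m) X X^T

lam, U = np.linalg.eigh(C_x)             # C_X = U Λ U^T (symmetric)

Y = U.T @ X                              # linear transformation with Q = U^T
C_y = Y @ Y.T / m                        # covariance of the transformed data

print(np.allclose(C_y, np.diag(lam)))    # C_Y is diagonal (= Λ)
```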

2. Once C_Y has become a diagonal matrix, how do we reduce dimensions? At this point we sort the eigenvalues on the diagonal in decreasing order, and we can throw away the components corresponding to the small eigenvalues; at that point the goal of dimensionality reduction is reached. Each eigenvalue represents the amount of information (variance) carried by its component.
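Putting both steps together, here is a from-scratch PCA sketch on made-up data (the shapes and the choice k = 2 are arbitrary, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 5 variables (rows) x 200 observations (columns).
X = rng.normal(size=(5, 200))
X = X - X.mean(axis=1, keepdims=True)   # center each row

C_x = X @ X.T / X.shape[1]              # covariance matrix
lam, U = np.linalg.eigh(C_x)            # eigenvalues in ascending order

order = np.argsort(lam)[::-1]           # sort eigenvalues in decreasing order
lam, U = lam[order], U[:, order]

k = 2                                   # keep only the top-k components
Y = U[:, :k].T @ X                      # reduced data: k x 200

# Fraction of the total variance (information) retained.
print(lam[:k].sum() / lam.sum())
```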

In Python, this is available as scikit-learn's PCA:

http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html
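A minimal usage sketch (the random data and n_components=2 are arbitrary; note that scikit-learn expects samples as rows, the opposite of the row-variable convention used above):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))   # 200 samples x 5 features

pca = PCA(n_components=2)       # keep 2 principal components
Y = pca.fit_transform(X)        # reduced data: 200 x 2

print(Y.shape)
print(pca.explained_variance_ratio_)   # variance retained per component
```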

Third, SVD (singular value decomposition)

We saw above that a symmetric matrix can be orthogonally diagonalized (U^T = U^(-1)), but a symmetric matrix lives in R^(n×n). For an arbitrary matrix A of rank r in R^(m×n), can we find a similar decomposition? The answer is yes: this is the SVD, A = U Σ V^T, where U and V are orthogonal matrices and the singular values σ_1 ≥ σ_2 ≥ … ≥ σ_r on the diagonal of Σ are greater than 0.

What is the relationship between SVD and eigendecomposition?

The singular values of A are the square roots of the nonzero eigenvalues of A A^T (equivalently, of A^T A). In PCA applications, the covariance matrix is positive semi-definite, and the singular value decomposition of a positive semi-definite matrix (which is necessarily symmetric) is in essence equivalent to its eigendecomposition.
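A minimal NumPy check of this relationship on a made-up rectangular matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical rectangular matrix A in R^(4x6), for illustration.
A = rng.normal(size=(4, 6))

# SVD: A = U Σ V^T, singular values returned in descending order.
U, s, Vt = np.linalg.svd(A)

# Eigenvalues of the symmetric matrix A A^T, sorted in descending order.
lam = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]

# Singular values are the square roots of the nonzero eigenvalues of A A^T.
print(np.allclose(s, np.sqrt(lam)))   # True
```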

SVD can also be used for dimensionality reduction: keep only the largest k singular values and their singular vectors (a truncated SVD), and discard the rest.
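A minimal truncated-SVD sketch on made-up data (the shapes and k = 5 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 20))            # hypothetical data matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 5                                     # keep only the top-k singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]  # rank-k approximation of A

Z = U[:, :k] * s[:k]                      # reduced representation: 100 x k
print(Z.shape, np.linalg.norm(A - A_k))   # shape and approximation error
```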
