PCA and Singular Value Decomposition


Singular value decomposition (SVD) and principal component analysis (PCA)

[Reprint] Original Source: http://blog.sina.com.cn/s/blog_4b16455701016ada.html

 

The PCA problem is actually a change of basis that makes the transformed data have the largest possible variance. Variance describes how much information a variable carries. When we talk about the stability of a thing, we often say we want to reduce its variance: a model with large variance is an unstable model. For the data used in machine learning (mainly training data), however, we want the variance to be large. Otherwise, if all the input data were the same point, the variance would be 0, and many input points would be equivalent to a single data point. The following figure shows an example:

Suppose a camera records the motion of an object, and the points in the figure indicate the object's positions. If we want to fit these points with a straight line, what direction should we choose? Naturally, the line labeled "signal" in the figure. If we simply project these points onto the x or y axis, the variances along the two axes are similar (because the points trend at roughly 45 degrees, projecting onto x or y gives about the same spread), so looking at the points in the original x-y coordinate system does not easily reveal their true direction. However, if we change the coordinate system so that the horizontal axis points in the signal direction and the vertical axis points in the noise direction, it becomes easy to see which direction has large variance and which has small variance.

 

Generally, the direction of large variance is the signal direction, and the direction of small variance is the noise direction. In data mining and digital signal processing we often want to increase the ratio of signal to noise, that is, the signal-to-noise ratio. For example, if we keep only the data in the signal direction, we can still obtain a good approximation of the original data.
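To make this concrete, here is a small sketch in Python/NumPy (the data and variable names are invented for illustration, not taken from the original post): it generates noisy points along a roughly 45-degree line, finds the direction of largest variance, and compares the variance along the signal and noise directions.

    import numpy as np

    rng = np.random.default_rng(0)
    t = rng.normal(size=200)                          # spread along the 45-degree "signal" line
    noise = 0.1 * rng.normal(size=200)                # small spread perpendicular to it
    points = np.column_stack([t + noise, t - noise])  # 200 x 2 array of (x, y) positions

    centered = points - points.mean(axis=0)

    # Projected onto the original x and y axes, the variances are similar.
    print("variance along x, y:", centered.var(axis=0))

    # Eigenvectors of the covariance give the signal and noise directions.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    noise_dir, signal_dir = eigvecs[:, 0], eigvecs[:, 1]
    print("variance along signal:", (centered @ signal_dir).var())
    print("variance along noise: ", (centered @ noise_dir).var())

    # Keeping only the signal component still approximates the points well.
    approx = np.outer(centered @ signal_dir, signal_dir) + points.mean(axis=0)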

 

All that PCA does is to find, one by one, a set of orthogonal coordinate axes in the original space. The first axis is the direction of maximum variance; the second axis is the direction of maximum variance within the subspace orthogonal to the first axis; the third axis is the direction of maximum variance within the subspace orthogonal to the first two axes; and so on. In an n-dimensional space we can find n such axes. If we take the first r of them to approximate the space, we compress the n-dimensional space down to an r-dimensional one, and the r axes we selected are the ones that minimize the data loss caused by this compression. A minimal sketch of this procedure is shown below.
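The sketch assumes the usual covariance-eigenvector formulation of PCA; the function and variable names are illustrative. It sorts the eigenvectors of the covariance matrix by decreasing eigenvalue, and the first r of them are the r axes described above.

    import numpy as np

    def pca_axes(X, r):
        # Top-r orthogonal axes of maximum variance for data X (one sample per row).
        Xc = X - X.mean(axis=0)                  # center each feature
        cov = np.cov(Xc, rowvar=False)           # n x n covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
        order = np.argsort(eigvals)[::-1]        # re-sort by decreasing variance
        return eigvecs[:, order[:r]]             # n x r matrix, one axis per column

    # Example: approximate 5-dimensional samples in a 2-dimensional space.
    X = np.random.default_rng(1).normal(size=(100, 5))
    P = pca_axes(X, r=2)
    X_compressed = (X - X.mean(axis=0)) @ P      # 100 x 2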

 

Assume that each row of the matrix represents a sample and each column represents a feature. In matrix language, changing the coordinate axes of an m × n matrix A means multiplying it by a transformation matrix P that maps one n-dimensional space to another n-dimensional space; in that space the transformation amounts to something like a rotation and a stretch.
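In symbols, with A the m × n data matrix and P an n × n transformation matrix, this change of basis is:

$$ A_{m \times n} \, P_{n \times n} = \tilde{A}_{m \times n} $$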

If instead P has only r columns, the m × n matrix A is transformed into an m × r matrix: the n original features are turned into r features (r < n). These r features are in effect a refinement of the n features, and we call this a compression of the features. In mathematical language:
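With P now an n × r matrix whose columns are the r chosen axes:

$$ A_{m \times n} \, P_{n \times r} = \tilde{A}_{m \times r} $$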

But how does this relate to SVD? The singular vectors obtained earlier from SVD are arranged in descending order of the singular values. From the PCA point of view, the coordinate axis with the largest variance corresponds to the first singular vector, the axis with the second-largest variance corresponds to the second singular vector, and so on. Let's take a look at the SVD formula:
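Here, keeping only the top r singular values, A is m × n, U is m × r with orthonormal columns, Σ is the r × r diagonal matrix of singular values, and V is n × r with orthonormal columns:

$$ A_{m \times n} \approx U_{m \times r} \, \Sigma_{r \times r} \, V^{T}_{r \times n} $$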

Multiply both sides of this equation on the right by V. Because V has orthonormal columns, V transposed times V gives the identity matrix I, so the expression can be converted into the following form:
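$$ A_{m \times n} \, V_{n \times r} \approx U_{m \times r} \, \Sigma_{r \times r} \, V^{T}_{r \times n} \, V_{n \times r} = U_{m \times r} \, \Sigma_{r \times r} $$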

Compare this with the earlier formula that turned the m × n matrix A, multiplied by P, into an m × r matrix: here V plays exactly the role of P, the transformation matrix. So we have compressed an m × n matrix into an m × r matrix, that is, we have compressed the columns. What if we want to compress rows instead? (From PCA's point of view, compressing rows can be understood as merging similar samples or removing samples of little value.) Similarly, we can write a generic formula for row compression:
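With P now an r × m transformation matrix acting on the left (the analogue of the column case above):

$$ P_{r \times m} \, A_{m \times n} = \tilde{A}_{r \times n} $$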

In this way we compress a matrix of m rows into a matrix of r rows. The same can be done with SVD: we multiply both sides of the SVD decomposition on the left by the transpose of U, written U^T (or U'):
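Because U^T U = I:

$$ U^{T}_{r \times m} \, A_{m \times n} \approx U^{T}_{r \times m} \, U_{m \times r} \, \Sigma_{r \times r} \, V^{T}_{r \times n} = \Sigma_{r \times r} \, V^{T}_{r \times n} $$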

In this way we get the row-compression formula. It can be seen that PCA is almost a wrapper around SVD: if SVD is implemented, PCA is implemented. What is even better is that with SVD we get PCA in both directions, whereas if we only compute the eigendecomposition of A^T A, we get PCA in one direction only.
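A small sketch of this two-way use of SVD in Python/NumPy (the data, names, and the choice of r are illustrative): a single SVD of the centered data gives both the column compression A V and the row compression U^T A, while the eigendecomposition of A^T A recovers only the column-direction result.

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.normal(size=(100, 20))                      # 100 samples (rows), 20 features (columns)
    Ac = A - A.mean(axis=0)                             # center the features

    U, s, Vt = np.linalg.svd(Ac, full_matrices=False)   # singular values come out in descending order
    r = 5
    Ur, Sr, Vr = U[:, :r], np.diag(s[:r]), Vt[:r, :].T

    # Column (feature) compression: A V  ->  100 x r
    features_compressed = Ac @ Vr                       # equals Ur @ Sr
    # Row (sample) compression:   U^T A  ->  r x 20
    samples_compressed = Ur.T @ Ac                      # equals Sr @ Vr.T

    # The eigendecomposition of A^T A recovers only V, i.e. PCA in the feature direction.
    eigvals, eigvecs = np.linalg.eigh(Ac.T @ Ac)
    V_from_eig = eigvecs[:, np.argsort(eigvals)[::-1][:r]]   # matches Vr up to column signs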

 

 

 
