[Repost] Singular value decomposition and its geometric significance

Source: Internet
Author: User

PS: I had long only half-understood SVD, so I translated this article. The original combines careful analysis with a large number of visual figures to demonstrate the meaning of SVD; it is not easy to explain the subject so clearly in such limited space. The original ends with a simple image-processing problem. I sincerely hope that readers passing by will explain, from their own angles, their understanding of the practical meaning of SVD, for example its applications in personalized recommendation, or in text and web mining, where SVD is often used.

English original: We Recommend a Singular Value Decomposition

Brief introduction

SVD was originally a tool of pure mathematics, but it has now spread into many different fields. The SVD process itself is not easy to grasp, because it is not very intuitive, but its effectiveness at matrix decomposition is excellent. For example, Netflix, a company offering online movie rentals, once offered a $1 million prize to anyone who could improve the prediction accuracy of its movie recommendation system by 10%. Surprisingly, this goal proved full of challenges, with teams from around the world applying many different techniques. The winning team, "BellKor's Pragmatic Chaos", used a core algorithm based on SVD.

SVD provides a very convenient way to decompose a matrix and to discover interesting latent patterns in the data. In this article, we will explain the geometry behind SVD along with some simple application examples.

The geometric meaning of linear transformations (The geometry of linear transformations)

Let's look at some simple examples of linear transformations, taking a 2 × 2 linear transformation matrix as the example. First, a more specific one, the diagonal matrix

    M = | 3  0 |
        | 0  1 |

Geometrically, M is a transformation matrix: it maps a point (x, y) of the two-dimensional plane, through a linear transformation, to another point, as shown in the figure below.

As the figure shows, the effect of the transformation is that the plane is stretched by a factor of 3 in the horizontal (x) direction, while the vertical direction is unchanged.
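As a quick check, here is a minimal NumPy sketch (the matrix entries follow the diagonal example above) showing that M triples the x coordinate of each point and leaves the y coordinate alone:

    import numpy as np

    # The diagonal transformation matrix from the example above
    M = np.array([[3.0, 0.0],
                  [0.0, 1.0]])

    # A few points (x, y) stored as columns
    points = np.array([[1.0, 0.0, 1.0],
                       [0.0, 1.0, 1.0]])

    print(M @ points)
    # [[3. 0. 3.]   <- x coordinates are stretched by 3
    #  [0. 1. 1.]]  <- y coordinates are unchanged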

Now look at another matrix:

    M = | 2  1 |
        | 1  2 |

This matrix produces the transformation effect shown in the figure below.

This transformation effect looks rather strange; in this form it is hard to describe any rule for the transformation (it is not clear what the rotation angle, the stretch ratios, and so on are). Staying with the symmetric matrix above, suppose we first rotate the plane by 45 degrees and then apply the linear transformation M; the result is shown in the figure below.

Does it look a little familiar? Yes: on this rotated grid, M acts just like the diagonal matrix we saw earlier, stretching the grid by a factor of 3 in one direction.

M here is a special case because it is symmetric. The non-special cases are the asymmetric, non-diagonal matrices we often meet in practical applications. As the figure shows, for a 2 × 2 symmetric matrix M, if we first rotate the grid plane by a suitable angle, the effect of M is simply a stretch along two perpendicular directions.

To say it more mathematically: given a symmetric matrix M, we can find a set of mutually orthogonal vectors vi such that M vi is just a stretch of vi along its own direction. The formula is as follows:

M vi = λi vi

Here λi is the stretch factor (a scalar). Geometrically, M stretches the vector vi without turning it. vi is called an eigenvector of the matrix M, and λi is called an eigenvalue of M. A very important theorem states that the eigenvectors of a symmetric matrix M are mutually orthogonal.
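This can be checked numerically. The sketch below (assuming the symmetric matrix from the example above) asks NumPy for the eigenvalues and eigenvectors, and confirms that the eigenvectors are orthogonal and point along the 45-degree directions:

    import numpy as np

    M = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

    # eigh is the eigensolver for symmetric matrices; eigenvalues come in ascending order
    lam, V = np.linalg.eigh(M)
    print(lam)                # [1. 3.] -> the stretch factors λ1, λ2
    print(V)                  # columns are the eigenvectors, at ±45 degrees
    print(V[:, 0] @ V[:, 1])  # 0.0 -> the eigenvectors are orthogonal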

If we align the grid plane with these eigenvector directions, then the linear transformation of the grid by M is nothing more than a stretch along each eigenvector direction.

For more general matrices, what can we do so that a grid of mutually perpendicular lines (an orthogonal grid) is linearly transformed into another perpendicular grid? PS: "perpendicular" here means that the two intersecting families of grid lines are perpendicular to each other.

Consider, for example, the non-symmetric matrix

    M = | 1  1 |
        | 0  1 |

Its effect on the unrotated grid is shown in the figure below.

As you can see, this does not achieve the effect we want. Now rotate the grid plane by 30 degrees first and then apply the same linear transformation; the effect is shown in the figure below.

Next, look at the effect when the grid plane is first rotated by 60 degrees.

Well, this looks pretty close. To be precise, the grid plane should be rotated by about 58.28 degrees to achieve the desired effect.
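Where does 58.28 degrees come from? It is the direction of the first right singular vector v1 of M. A small sketch, assuming the shear matrix written above:

    import numpy as np

    M = np.array([[1.0, 1.0],
                  [0.0, 1.0]])

    # The rows of Vt are the right singular vectors v1, v2
    U, S, Vt = np.linalg.svd(M)
    v1 = Vt[0]

    # Direction of v1, folded into [0, 180) to ignore the SVD's sign ambiguity
    angle = np.degrees(np.arctan2(v1[1], v1[0])) % 180
    print(S)      # [1.618 0.618] (the golden ratio and its reciprocal)
    print(angle)  # ~58.28 degrees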

The geometric meaning of singular value decomposition

This part explains two-dimensional SVD at the geometric level: for any 2 × 2 matrix, the SVD finds a mutually perpendicular grid (orthogonal grid) that is transformed into another perpendicular grid.

We can describe this fact with vectors: first, select two mutually orthogonal unit vectors v1 and v2 such that the vectors Mv1 and Mv2 are also orthogonal.

Let u1 and u2 be unit vectors in the directions of Mv1 and Mv2 respectively, so that Mv1 = σ1 u1 and Mv2 = σ2 u2. The scalars σ1 and σ2 are the lengths of the transformed vectors along these two directions, and are known as the singular values of the matrix M.

So we have the following relationship

M v1 = σ1 u1
M v2 = σ2 u2

We can now describe how an arbitrary vector x is transformed by M. Since v1 and v2 are orthogonal unit vectors, x can be expanded as follows:

x = (v1 · x) v1 + (v2 · x) v2

This means that:

M x = (v1 · x) Mv1 + (v2 · x) Mv2
M x = (v1 · x) σ1 u1 + (v2 · x) σ2 u2

The inner product of two vectors can be written using the transpose, as below:

v · x = v^T x

The final formulas are:

M x = u1 σ1 v1^T x + u2 σ2 v2^T x
M = u1 σ1 v1^T + u2 σ2 v2^T

These formulas are often expressed as

M = U Σ V^T

The column vectors of the matrix U are u1 and u2; Σ is a diagonal matrix whose diagonal elements are σ1 and σ2; and the column vectors of the matrix V are v1 and v2. The superscript T denotes the transpose of the matrix V.

This means that any matrix M can be decomposed into a product of three matrices: V describes an orthonormal basis of the original domain, U describes an orthonormal basis of the codomain after the transformation M, and Σ describes how much the vectors in V are stretched to give the vectors in U. (V describes an orthonormal basis in the domain, U describes an orthonormal basis in the co-domain, and Σ describes how much the vectors in V are stretched to give the vectors in U.)
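A minimal NumPy check of this factorization (using the symmetric example matrix from earlier; any matrix would do):

    import numpy as np

    M = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

    # numpy returns U, the singular values, and V^T directly
    U, S, Vt = np.linalg.svd(M)

    print(S)                                    # [3. 1.]
    print(np.allclose(M, U @ np.diag(S) @ Vt))  # True: M = U Σ V^T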

How to find the singular value decomposition? (How do we find the singular decomposition?)

In fact, the singular value decomposition of any matrix can be found. How do we do it? Suppose we have a unit circle in the original domain, as shown in the figure below. After the transformation M, the unit circle becomes an ellipse in the codomain; its major axis (Mv1) and minor axis (Mv2) are exactly the images of the two orthogonal unit vectors we are after, and they are the longest and shortest vectors within the ellipse.

In other words, the function |Mx|, defined on the unit circle, attains its maximum in the direction v1 and its minimum in the direction v2. This reduces the search for the singular value decomposition to optimizing the function |Mx| over the unit circle. It turns out (the detailed derivation is omitted here) that the vectors at which this function attains its extreme values are exactly the eigenvectors of the matrix M^T M. Since M^T M is a symmetric matrix, eigenvectors belonging to different eigenvalues are mutually orthogonal; we write vi for these eigenvectors of M^T M. The singular values are σi = |M vi|, and ui is the unit vector in the direction of M vi. But why are the ui orthogonal to each other?

The derivation is as follows:

Here σi and σj denote two different singular values, with

M vi = σi ui
M vj = σj uj

Let's first look at the inner product Mvi · Mvj, assuming the corresponding singular values are both nonzero. On the one hand, the value of this expression is 0, derived as follows:

Mvi · Mvj = vi^T M^T M vj = vi · (M^T M vj) = λj (vi · vj) = 0

On the other hand, we have

Mvi · Mvj = σi σj (ui · uj) = 0

Therefore, since σi and σj are nonzero, ui and uj must be orthogonal. In practice, however, this is not the method actually used to solve for singular values, because its efficiency is very low. How to solve for singular values efficiently is not the main topic of this article; for convenience of demonstration, the second matrix above was used.
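Still, the derivation above translates directly into a (numerically naive) recipe: diagonalize M^T M to get the vi and the σi, then normalize the vectors M vi to get the ui. Here is a sketch for illustration only, assuming all singular values are nonzero; real libraries use far more stable algorithms:

    import numpy as np

    def svd_via_eigh(M):
        """Naive SVD following the derivation above (assumes nonzero singular values)."""
        # Eigen-decompose the symmetric matrix M^T M
        lam, V = np.linalg.eigh(M.T @ M)
        # Sort descending; singular values are the square roots of the eigenvalues
        order = np.argsort(lam)[::-1]
        V = V[:, order]
        sigma = np.sqrt(np.clip(lam[order], 0.0, None))
        # u_i is the unit vector in the direction of M v_i
        U = (M @ V) / sigma
        return U, sigma, V.T

    M = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    U, S, Vt = svd_via_eigh(M)
    print(S)                                    # ~[1.618 0.618]
    print(np.allclose(M, U @ np.diag(S) @ Vt))  # True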

Application examples (Another example)

Now let's look at a few examples.

Example One

After the transformation by this matrix, the effect is as shown in the figure below.

In this example, the second singular value is 0, so only one term is needed to express the transformed result:

M = u1 σ1 v1^T

In other words, if some singular values are very small, the corresponding terms can simply be dropped from the decomposition of matrix M. From this we can also see that the rank of the matrix M equals the number of nonzero singular values.
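This observation is exactly how the rank is computed numerically: count the singular values above a small tolerance. A sketch, with a hypothetical rank-1 matrix as input:

    import numpy as np

    def rank_via_svd(M, tol=1e-10):
        """Rank = the number of singular values above the tolerance."""
        s = np.linalg.svd(M, compute_uv=False)
        return int(np.sum(s > tol))

    # Rank-1 example: the second column is twice the first
    M = np.array([[1.0, 2.0],
                  [2.0, 4.0]])
    print(np.linalg.svd(M, compute_uv=False))  # [5. 0.] -> one nonzero singular value
    print(rank_via_svd(M))                     # 1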

Example Two

Let's take a look at the application of singular value decomposition in data representation. Suppose we have the following 15 × 25 image data.

As the figure shows, the image is mainly composed of the following three parts.

We represent the image as a 15 × 25 matrix whose elements correspond to the pixels of the image: a white pixel is 1 and a black pixel is 0. This gives a matrix with 375 elements, as shown in the figure below.

If we compute the singular value decomposition of the matrix M, the nonzero singular values are

σ1 = 14.72
σ2 = 5.22
σ3 = 3.31

The matrix M can be expressed as

M = u1 σ1 v1^T + u2 σ2 v2^T + u3 σ3 v3^T

Each σi is a single number, each ui has 15 elements, and each vi has 25 elements, so the three terms together need 3 × (1 + 15 + 25) = 123 numbers. As the figure shows, we can therefore use 123 numbers to represent image data that originally required 375 elements.
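The following sketch replays this count on a stand-in 15 × 25 binary image (constructed here from three independent column patterns; it is not the article's actual figure): three nonzero singular values reproduce the matrix exactly, using 123 numbers instead of 375.

    import numpy as np

    # A stand-in 15 x 25 binary image built from three independent column patterns
    col_a = np.ones(15)
    col_b = np.r_[np.ones(5), np.zeros(10)]
    col_c = np.r_[np.zeros(10), np.ones(5)]
    M = np.column_stack([col_a] * 5 + [col_b] * 10 + [col_c] * 10)

    U, S, Vt = np.linalg.svd(M)
    k = int(np.sum(S > 1e-10))
    approx = U[:, :k] @ np.diag(S[:k]) @ Vt[:k]

    print(k)                                               # 3 nonzero singular values
    print(np.allclose(M, approx))                          # True: three terms reproduce the image
    print(k * (1 + 15 + 25), "numbers instead of", M.size) # 123 instead of 375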

Example Three

Noise Reduction (Noise reduction)

In the previous examples the singular values were either zero or relatively large. Let's now explore the case of singular values that are zero or very small. In general, the larger singular values correspond to the parts that carry more information. For example, suppose we have a scanned image contaminated with noise, as shown in the figure below.

We process the scanned image in the same way as in Example Two and obtain the singular values of the image matrix:

σ1 = 14.15
σ2 = 4.67
σ3 = 3.00
σ4 = 0.21
σ5 = 0.19
...
σ15 = 0.05

Obviously, the first three singular values are much larger than those that follow, so we can keep only the corresponding three terms in the decomposition of the matrix M:

M ≈ u1 σ1 v1^T + u2 σ2 v2^T + u3 σ3 v3^T

After this truncated singular value decomposition, we obtain a noise-reduced image.
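The same truncation can be packaged as a small helper. A sketch on synthetic data (the noise level and the threshold separating "signal" singular values from "noise" ones are assumptions, chosen here by eye):

    import numpy as np

    def denoise_by_truncated_svd(M, threshold):
        """Keep only the SVD terms whose singular value exceeds the threshold."""
        U, S, Vt = np.linalg.svd(M, full_matrices=False)
        k = int(np.sum(S > threshold))
        return U[:, :k] @ np.diag(S[:k]) @ Vt[:k]

    rng = np.random.default_rng(0)
    clean = np.outer(np.ones(15), np.r_[np.ones(10), np.zeros(15)])  # a rank-1 "image"
    noisy = clean + rng.normal(scale=0.05, size=clean.shape)         # simulated scanner noise

    denoised = denoise_by_truncated_svd(noisy, threshold=1.0)
    print(np.abs(noisy - clean).max())     # error before truncation
    print(np.abs(denoised - clean).max())  # smaller error after truncation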

Example Four

Data analysis

The data we collect always contains noise: no matter how sophisticated the equipment or how good the method, there will always be some error. If you remember the point above, that the large singular values correspond to the main information in the matrix, then it is quite natural to use SVD to analyze data and extract its main part.

As an example, suppose the data we collect is as shown below:

We represent the data in the form of a matrix:

After the singular value decomposition, we get

σ1 = 6.04
σ2 = 0.22

Since the first singular value is so much larger than the second, we can conclude that the second singular value is due to noise in the data, and the part of the decomposition of the original matrix corresponding to it can be omitted. After this SVD truncation, the main trend of the sample points is retained, as shown in the figure below.

In terms of preserving the main part of the sample data, this process is closely connected with PCA (principal component analysis); indeed, PCA uses the SVD to detect dependencies and redundant information in the data.
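To make the connection concrete, here is a PCA-flavored sketch on synthetic two-dimensional points (the data is generated here, not taken from the article): center the data, take the SVD, observe one dominant singular value, and project onto its direction to keep the main trend.

    import numpy as np

    rng = np.random.default_rng(1)
    # 50 points lying roughly along a line, plus a little noise
    t = rng.uniform(-1.0, 1.0, size=50)
    data = np.column_stack([t, 0.5 * t]) + rng.normal(scale=0.03, size=(50, 2))

    mean = data.mean(axis=0)
    centered = data - mean                  # PCA operates on centered data
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    print(S)                                # the first singular value dominates the second

    # Keep only the first singular direction: the "main" part of the samples
    main = np.outer(centered @ Vt[0], Vt[0]) + mean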

Summary (Summary)

This article explains the meaning of SVD very clearly, not only from the mathematical point of view, but also through several vivid application examples of how SVD finds the main information in data. In the Netflix Prize, many teams used matrix factorization techniques; these come from the decomposition idea behind SVD and are variants of it, but the idea is the same. I had used matrix factorization in personalized recommendation systems before, but my understanding was not intuitive; after reading the original article, things suddenly became clear. Starting from the idea of finding the main information in data, I want to think from several angles about how to exploit the latent relationships in data in personalized recommendation systems. I also hope that readers passing by will share their own thoughts.
