Common Machine Learning Algorithms: Principles + Practice, Series 2 (SVD)


SVD: Singular Value Decomposition

With singular value decomposition (SVD), we can represent the original dataset with a much smaller set of data, which can be understood as removing noise and redundant information. Suppose A is an m*n matrix; SVD decomposes it into the product of the following three matrices:

A(m×n) = U(m×m) · Σ(m×n) · V^T(n×n)
Here U is an m*m matrix whose column vectors are orthogonal; the vectors in U are called the left singular vectors. Σ is an m*n diagonal matrix: all entries off the diagonal are 0, and the diagonal entries are the singular values, sorted from largest to smallest. V^T (the transpose of V) is an n*n matrix whose vectors are also orthogonal; the vectors in V are called the right singular vectors. So how do singular values correspond to eigenvalues? First, multiply the transpose of A by A itself: A^T·A is an n*n square matrix, and solving its eigenvalue problem gives:

(A^T · A) · v_i = λ_i · v_i
The v_i obtained here are the right singular vectors described above. In addition, we can also derive:

σ_i = sqrt(λ_i)
u_i = A · v_i / σ_i
The σ_i here are the singular values mentioned above, and the u_i are the left singular vectors mentioned above. It is common practice to arrange the singular values from large to small, so Σ is uniquely determined by A. Singular values behave much like eigenvalues: in Σ they are arranged from largest to smallest, and they decay particularly fast. In many cases the sum of the largest 10% or even 1% of the singular values accounts for more than 99% of the sum of all singular values. In other words, we can approximate the matrix using only the largest r singular values, which defines the partial (truncated) singular value decomposition:

A(m×n) ≈ U(m×r) · Σ(r×r) · V^T(r×n)
Here r is a number much smaller than m and n, so the multiplication on the right involves much smaller matrices. Their product is a matrix close to A, and the closer r is to n, the closer the product is to A. Meanwhile, the combined size of the three matrices (from a storage point of view, smaller matrices mean less storage) is much smaller than the original matrix A. For example, with m = 1000, n = 500, and r = 20, storing U, Σ, and V takes 1000·20 + 20 + 500·20 = 30,020 values instead of 500,000. So if we want a compressed representation of the original matrix A, we simply store the three matrices U, Σ, and V.

Some SVD-related applications in Python are shown below.

1. The linalg module in NumPy can perform SVD decomposition directly:
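A minimal sketch of the call (the 2*2 matrix is just illustrative data):

    import numpy as np

    # Any m*n array works; this small matrix is only for illustration.
    A = np.array([[1.0, 1.0],
                  [7.0, 7.0]])

    # np.linalg.svd returns U (m*m), the singular values as a 1-D
    # vector sorted from largest to smallest, and V^T (n*n).
    U, Sigma, VT = np.linalg.svd(A)

    print(U.shape, Sigma, VT.shape)
    # Sigma is a vector, approximately [10. 0.], not a diagonal matrix.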

To save space, Sigma is returned as a vector holding only the diagonal values, rather than as a full matrix.

The following example illustrates that keeping only the top singular values, which contribute nearly all of the total, reconstructs the original matrix essentially without loss.
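A sketch of that check (the small low-rank matrix is assumed data for illustration):

    import numpy as np

    # A rank-2 matrix: the top 2 singular values carry all the energy.
    A = np.array([[2.0, 2.0, 0.0],
                  [4.0, 4.0, 0.0],
                  [1.0, 1.0, 0.0],
                  [0.0, 0.0, 3.0],
                  [0.0, 0.0, 6.0]])

    U, Sigma, VT = np.linalg.svd(A)
    print(Sigma)  # the third singular value is numerically zero

    # Rebuild A from only the top r singular values.
    r = 2
    A_approx = U[:, :r] @ np.diag(Sigma[:r]) @ VT[:r, :]
    print(np.allclose(A, A_approx))  # True: essentially loss-free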

2. Using SVD in recommendation systems

There are many approaches to recommendation systems, and collaborative filtering is a mainstream algorithm. It comes in two flavors: item-based recommendation and user-based recommendation. In item-based recommendation, suppose we know the similarity between all movies; if a user has not seen a movie, we can recommend the one with the highest similarity to what they have watched. User-based recommendation is analogous: if we know the similarity between all users, and user A has seen a movie that a very similar user B has not, we can recommend it to B. The similarity measures typically used are Euclidean distance, the Pearson correlation coefficient, and cosine similarity.
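As a sketch, the three similarity measures might look like this in Python (rescaling each into the 0-to-1 range is a common convention, not something the text fixes):

    import numpy as np

    def euclid_sim(a, b):
        # Map Euclidean distance into (0, 1]; identical vectors give 1.
        return 1.0 / (1.0 + np.linalg.norm(a - b))

    def pearson_sim(a, b):
        # Pearson correlation coefficient, shifted from [-1, 1] into [0, 1].
        if len(a) < 3:
            return 1.0  # too few points for a meaningful correlation
        return 0.5 + 0.5 * np.corrcoef(a, b)[0, 1]

    def cosine_sim(a, b):
        # Cosine of the angle between a and b, shifted into [0, 1].
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        return 0.5 + 0.5 * cos

    a = np.array([4.0, 4.0, 2.0])
    b = np.array([4.0, 3.0, 2.0])
    print(euclid_sim(a, b), pearson_sim(a, b), cosine_sim(a, b))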

Suppose we build a matrix A of size m*n, where the m rows are all users and the n columns are all items; each element of the matrix represents a user's rating of an item. Item-based or user-based recommendation then amounts to computing the similarity between all columns or all rows, respectively. In real life this matrix is very sparse.

Topic: recommend the top-N items for a user to buy

Let matrix C be an m*n matrix where each row represents a user and each column represents an item, so element C(i,j) is user i's rating of item j. Suppose the user has not rated some items; how do we recommend items to this user? In effect, we predict the user's score for each unrated item and then return the top N. The core algorithm: for an unrated item j, take each item j' that the user has rated; find all users who rated both j and j'; from their scores form two vectors (the ratings of item j and of item j', pulled from the original large sparse matrix); compute the similarity p of the two vectors; and finally predict the user's rating for item j as a weighted sum: score_pre = p1*score_user_j1 + p2*score_user_j2 + ...
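A sketch of that prediction loop (the function names and toy rating matrix are mine; cosine similarity stands in for whichever measure is chosen, and the weighted sum is normalized by the total similarity so the prediction stays on the rating scale):

    import numpy as np

    def cosine_sim(a, b):
        # Cosine similarity shifted into [0, 1].
        return 0.5 + 0.5 * np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    def estimate_score(data, user, item):
        # Predict `user`'s score for the unrated `item` as a
        # similarity-weighted average over the items `user` did rate.
        sim_total, weighted_total = 0.0, 0.0
        for j in range(data.shape[1]):
            user_rating = data[user, j]
            if user_rating == 0 or j == item:
                continue
            # All users who rated both `item` and item j.
            both = np.nonzero((data[:, item] > 0) & (data[:, j] > 0))[0]
            if len(both) == 0:
                continue
            p = cosine_sim(data[both, item], data[both, j])
            sim_total += p
            weighted_total += p * user_rating
        return 0.0 if sim_total == 0 else weighted_total / sim_total

    # Rows are users, columns are items, 0 means "not rated".
    data = np.array([[4, 4, 0, 2, 2],
                     [4, 0, 0, 3, 3],
                     [4, 0, 0, 1, 1],
                     [1, 1, 1, 2, 0],
                     [2, 2, 2, 0, 0],
                     [1, 1, 1, 0, 0],
                     [5, 5, 5, 0, 0]], dtype=float)
    print(estimate_score(data, user=2, item=1))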

Suppose the original sparse matrix is reduced via SVD as follows (note the dimensions U(m×m) · Σ(m×m) · V^T(m×n), which differ from the original full decomposition):

xformedItems = dataMat.T * U[:, :4] * Sig4.I, i.e. the m*n matrix becomes (n*m) * (m*4) * (4*4) = an n*4 matrix
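A sketch of that projection with NumPy arrays (the rank-4 cutoff and the toy matrix follow the example above; np.linalg.inv plays the role of the matrix-class .I):

    import numpy as np

    # dataMat: the m*n user-item rating matrix (same toy data as above).
    dataMat = np.array([[4, 4, 0, 2, 2],
                        [4, 0, 0, 3, 3],
                        [4, 0, 0, 1, 1],
                        [1, 1, 1, 2, 0],
                        [2, 2, 2, 0, 0],
                        [1, 1, 1, 0, 0],
                        [5, 5, 5, 0, 0]], dtype=float)

    U, Sigma, VT = np.linalg.svd(dataMat)

    # Keep the top 4 singular values as a 4*4 diagonal matrix.
    Sig4 = np.diag(Sigma[:4])

    # Project the items into the 4-dimensional singular-value space:
    # (n*m) @ (m*4) @ (4*4) -> n*4.
    xformedItems = dataMat.T @ U[:, :4] @ np.linalg.inv(Sig4)
    print(xformedItems.shape)  # (5, 4)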

