A latent factor model based on matrix factorization



Recommender systems are a widely used form of data analysis. Familiar examples include "people you follow also follow…", "users who liked this item also liked…", and "you might like…".


Recommender systems are commonly divided into content-based recommendation and history-based recommendation.

Content-based recommendation extracts useful user and item information as feature vectors, then applies classification or regression.

History-based recommendation records user behavior such as ratings, clicks, and favorites, and makes predictions from those records.

Content-based recommendation demands extensive information about users and items, and in many cases it is hard to collect enough useful information. By contrast, history-based methods only need ordinary historical records, which are much easier to gather.

Collaborative filtering is widely used in recommender systems. The general approach is to use a similarity metric to find sets of similar users, or sets of similar items, and then make recommendations accordingly.

Amazon's book recommendation system is based on item similarity: "customers who bought this item also bought…".

Plain collaborative filtering, however, does not perform very well. We either cluster users, giving user-based collaborative filtering, or cluster items, giving item-based collaborative filtering.
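As a concrete sketch of item-based collaborative filtering (not taken from the original post; the ratings, function names, and weighting scheme here are illustrative assumptions): item similarity can be computed with cosine similarity over the rating columns, and a missing score predicted as a similarity-weighted average of the user's other ratings.

```python
import numpy as np

# Toy user-item rating matrix: rows = users, columns = items, 0 = unrated.
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [1.0, 0.0, 5.0, 4.0],
    [0.0, 1.0, 4.0, 5.0],
])

def item_cosine_similarity(R):
    """Cosine similarity between the item columns of the rating matrix."""
    norms = np.linalg.norm(R, axis=0)
    return (R.T @ R) / np.outer(norms, norms)

sim = item_cosine_similarity(R)

def predict(R, sim, user, item):
    """Similarity-weighted average of the user's ratings on other items."""
    rated = R[user] > 0                  # items this user has actually rated
    weights = sim[item, rated]
    return weights @ R[user, rated] / weights.sum()

# Items 0 and 1 are co-rated highly by the same users, so they come out
# more similar to each other than to item 2.
```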


This motivated the latent factor model (LFM) based on matrix factorization (SVD).

By assuming a hidden factor space, the latent factor model maps users and items to factor matrices, and then obtains the final result by multiplying those matrices. In practice, LFM performs better than ordinary collaborative filtering algorithms.

1. Basic methods of LFM

Let i index users and j index items, and let R_ij be user i's rating of item j, i.e. the degree of preference. We then have a two-dimensional user-item matrix like the R below.


In a real system R is a very sparse matrix, because no user can rate all items. Our goal is to use the known entries of the sparse R to fill in a complete matrix R'.

In collaborative filtering we usually assume that certain users, or certain items, belong to a type, and recommend by type. Here we likewise assume such classes, called factors. Each user has a certain degree of preference for each factor, and each item contains each factor to a certain degree.

For example, if a user's taste for (comedy, martial arts) is (1, 5), and a film's content in (comedy, martial arts) is (5, 1), the film emphasizes exactly what the user cares least about, so we can roughly judge that the user will not like the film.
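This dot-product reasoning can be checked with a few lines of Python, using the made-up factor values from the example above:

```python
# Hidden-factor axes: (comedy, martial arts).
user = [1, 5]           # dislikes comedy, loves martial arts
comedy_film = [5, 1]    # heavy on comedy, light on martial arts
wuxia_film = [1, 5]     # heavy on martial arts

def preference(p, q):
    """User-item preference as the dot product of the factor vectors."""
    return sum(pi * qi for pi, qi in zip(p, q))

print(preference(user, comedy_film))  # 1*5 + 5*1 = 10
print(preference(user, wuxia_film))   # 1*1 + 5*5 = 26, a much better match
```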

That is, we abstract a hidden factor space and project both users and items into it, so that user-item preference can be read off directly.

A simple two-dimensional implicit factor space diagram is as follows:


In the figure above, two hidden factors (e.g. a male-female taste axis) form the dimensions, and users and movies are both projected onto this two-dimensional space.

Described mathematically, the problem above is written as the matrix factorization R ≈ P·Qᵀ:

P denotes the users' preferences for the hidden factors, and Q denotes the degree to which items contain those factors. Multiplying the matrices then yields the user-item preferences.

As noted above, R is sparse. We fit P and Q from the known entries of R, then multiply them to fill in the missing entries and obtain a complete matrix R'.
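A minimal numpy sketch of this filling step, with illustrative dimensions and values: given P (users × factors) and Q (items × factors), the filled matrix R' is the product P·Qᵀ.

```python
import numpy as np

# 3 users, 4 items, 2 hidden factors (all values illustrative).
P = np.array([[1.0, 0.1],   # user-factor matrix: users x factors
              [0.2, 1.0],
              [0.9, 0.8]])
Q = np.array([[1.0, 0.0],   # item-factor matrix: items x factors
              [0.8, 0.3],
              [0.1, 1.0],
              [0.5, 0.5]])

# Every user-item preference at once: R' = P Q^T.
R_full = P @ Q.T
print(R_full.shape)  # (3, 4)

# A single entry is just the dot product of one user row and one item row.
assert np.isclose(R_full[0, 1], P[0] @ Q[1])
```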


The latent factor model is thus transformed into a matrix factorization problem, for which SVD and related methods are commonly used.

The specific methods are described below.

2. Batch Learning of SVD

Let V be the known rating matrix and I the indicator matrix, where I(i,j) = 1 means the corresponding entry of V is known. U and M denote the user-factor and item-factor matrices, respectively.

So we first factor V into U·M; the objective function is as follows:

E = (1/2) Σ_{i,j} I_ij (V_ij − p(U_i, M_j))² + (k_u/2) Σ_i ||U_i||² + (k_m/2) Σ_j ||M_j||²

The first term is the least-squares error; p(U_i, M_j) can simply be understood as the dot product U_i·M_j.

The second and third terms are regularization, which prevents overfitting.

This optimization problem can be solved with gradient descent. The negative gradient directions are:

−∂E/∂U_i = Σ_j I_ij (V_ij − U_i·M_j) M_j − k_u·U_i
−∂E/∂M_j = Σ_i I_ij (V_ij − U_i·M_j) U_i − k_m·M_j

In each iteration we first compute the negative gradient directions for U and M, then update U and M, and repeat until convergence.
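A compact numpy sketch of the batch update (the toy ratings and the hyperparameters `lr`, `k_u`, `k_m` are assumptions, not values from the post): every known residual is accumulated before U and M are updated, once per iteration.

```python
import numpy as np

rng = np.random.default_rng(0)
V = np.array([[5.0, 3.0, 0.0],     # toy rating matrix, 0 = unknown
              [4.0, 0.0, 1.0],
              [0.0, 2.0, 5.0]])
I = (V > 0).astype(float)          # indicator matrix I(i,j)
U = 0.1 * rng.standard_normal((3, 2))   # user-factor matrix, f = 2
M = 0.1 * rng.standard_normal((3, 2))   # item-factor matrix
lr, k_u, k_m = 0.05, 0.02, 0.02         # assumed hyperparameters

def loss(U, M):
    err = I * (V - U @ M.T)
    return (0.5 * (err ** 2).sum()
            + 0.5 * k_u * (U ** 2).sum()
            + 0.5 * k_m * (M ** 2).sum())

before = loss(U, M)
for _ in range(200):
    err = I * (V - U @ M.T)        # residuals on known entries only
    dU = err @ M - k_u * U         # negative gradient w.r.t. U
    dM = err.T @ U - k_m * M       # negative gradient w.r.t. M
    U += lr * dU                   # update all of U and M at once
    M += lr * dM
after = loss(U, M)
```

Here `before` and `after` just verify that the loss drops on this toy example.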

The disadvantage of this method is that on large sparse matrices the updates have large variance, and only a small learning rate can guarantee convergence, which makes convergence slow.

Improvement: add a momentum term to accelerate convergence. With learning rate η and momentum coefficient μ, each step reuses a decayed copy of the previous step:

Δ_t = μ·Δ_{t−1} + η·(−∇E), then update the parameters by Δ_t.
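The momentum variant can be sketched the same way (the coefficient `mu` and all other values are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
V = np.array([[5.0, 3.0, 0.0],     # toy rating matrix, 0 = unknown
              [4.0, 0.0, 1.0],
              [0.0, 2.0, 5.0]])
I = (V > 0).astype(float)
U = 0.1 * rng.standard_normal((3, 2))
M = 0.1 * rng.standard_normal((3, 2))
lr, mu, k_u, k_m = 0.01, 0.8, 0.02, 0.02   # assumed hyperparameters

dU = np.zeros_like(U)                       # momentum buffers
dM = np.zeros_like(M)
before = (I * (V - U @ M.T) ** 2).sum()
for _ in range(300):
    err = I * (V - U @ M.T)
    # decayed previous step plus the current scaled negative gradient
    dU = mu * dU + lr * (err @ M - k_u * U)
    dM = mu * dM + lr * (err.T @ U - k_m * M)
    U += dU
    M += dM
after = (I * (V - U @ M.T) ** 2).sum()
```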


3. Incomplete incremental learning of SVD

The above batch method is not well suited to large sparse matrices, so we refine the solution process.

The improved objective function optimizes against one row of V at a time. For user i:

E_i = (1/2) Σ_j I_ij (V_ij − U_i·M_j)² + (k_u/2) ||U_i||² + (k_m/2) Σ_j ||M_j||²

That is, we update using the known entries of one row of V at a time, which reduces the variance of batch learning.

Negative Gradient direction:

4. Complete incremental learning of SVD

Similarly, following the incremental-learning idea of reducing variance, we can refine the solution process once more.

Here the update is performed per known element of V.

The optimization objective function, per known element (i, j), is:

E_ij = (1/2)(V_ij − U_i·M_j)² + (k_u/2) ||U_i||² + (k_m/2) ||M_j||²

In each iteration we loop over the known elements of V; for each element we compute the negative gradient direction and update U_i and M_j:

−∂E_ij/∂U_i = (V_ij − U_i·M_j) M_j − k_u·U_i
−∂E_ij/∂M_j = (V_ij − U_i·M_j) U_i − k_m·M_j
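A sketch of the fully incremental, element-at-a-time update, which is effectively stochastic gradient descent over the known ratings (toy data and hyperparameters are assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
V = np.array([[5.0, 3.0, 0.0],     # toy rating matrix, 0 = unknown
              [4.0, 0.0, 1.0],
              [0.0, 2.0, 5.0]])
pairs = list(zip(*np.nonzero(V)))  # (i, j) indices of known ratings
known = V > 0
U = 0.1 * rng.standard_normal((3, 2))
M = 0.1 * rng.standard_normal((3, 2))
lr, k_u, k_m = 0.05, 0.02, 0.02    # assumed hyperparameters

before = ((V - U @ M.T)[known] ** 2).sum()
for _ in range(300):
    for i, j in pairs:                     # one known element per update
        e = V[i, j] - U[i] @ M[j]          # residual for this single entry
        # update U_i and M_j simultaneously, using the old values of both
        U[i], M[j] = (U[i] + lr * (e * M[j] - k_u * U[i]),
                      M[j] + lr * (e * U[i] - k_m * M[j]))
after = ((V - U @ M.T)[known] ** 2).sum()
```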

The latent factor model has other variants as well, such as compound SVD and implicit-feedback SVD; these are left for the next post.

References: A Guide to Singular Value Decomposition for Collaborative Filtering

