Recommendation systems: matrix decomposition with hidden (latent) factors

Source: Internet
Author: User


When newcomers first encounter the recommendation system field, the first thing they run into is usually collaborative filtering. Searching Baidu for it mostly turns up the singular value decomposition (SVD) method. SVD factors a matrix into the product of three matrices. To use it in a recommendation system, we first represent the training set as a matrix. Take the MovieLens dataset as an example: it contains users' ratings of movies, and the matrix looks roughly like this:

        Movie1  Movie2  Movie3  Movie4
User1   1
User2   2                       3
User3           5       4
User4   2                       4

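The values 1 to 5 are the ratings the corresponding user gave the movie. A blank cell means the dataset holds no rating for that user-movie pair. To apply SVD, the blanks are usually filled with 0. Calling this matrix V, SVD gives

                          V = U Σ W^T

where U and W are orthogonal matrices and Σ is the diagonal matrix of singular values (the right factor is written W so that it is not confused with the rating matrix V).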

In practice, only the largest singular values are kept, so the equality above does not hold exactly, only approximately. Multiplying the three truncated matrices back together gives V' (note that it differs from V). Entries that were blank (that is, 0) in V are generally no longer 0 in V', and these values serve as the predictions for those user-movie pairs. This is the main principle of the SVD approach. Because many existing SVD routines compute the decomposition directly, without hand-written iteration, it is easy to use.
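The SVD approach described above can be sketched with NumPy as follows. This is a minimal illustration on the small example table, not the code used by the author; the function name `truncated_svd_predict` and the choice k=2 are assumptions made for the example.

```python
import numpy as np

# Rating matrix from the example table; 0 marks a missing rating.
V = np.array([
    [1, 0, 0, 0],
    [2, 0, 0, 3],
    [0, 5, 4, 0],
    [2, 0, 0, 4],
], dtype=float)

def truncated_svd_predict(ratings, k):
    """Rebuild the matrix from its k largest singular values (hypothetical helper)."""
    U, s, Wt = np.linalg.svd(ratings, full_matrices=False)
    # Keep only the first k singular values and the matching vectors.
    return U[:, :k] @ np.diag(s[:k]) @ Wt[:k, :]

V_approx = truncated_svd_predict(V, k=2)
# Cells that were 0 in V may now be non-zero; they act as predictions.
print(np.round(V_approx, 2))
```

Keeping all singular values reproduces V exactly, zeros included, so the predictions come only from the truncation.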

 

However, this method has a fatal flaw: it sets every unknown rating to 0. That is extremely unreasonable, because a missing rating does not mean the user dislikes the movie (0 points); the user may simply not have watched it. Filling in 0 injects our own subjective assumption into the data and ends up causing errors. The solution is to use the hidden-factor (latent factor) matrix decomposition method, also called matrix factorization (MF). Note that this matrix decomposition method is not the same as the SVD method. It is described in detail below.

The matrix decomposition method assumes that every user has a feature vector of length k and that every movie has a feature vector of the same length (k usually has to be chosen by the practitioner). All user feature vectors are stacked into a matrix U of dimension UserNum * k, where the vector for user i is Ui. All movie feature vectors are stacked into a matrix M of dimension MovieNum * k, where the vector for movie j is Mj. The predicted rating of user i for movie j is then V'ij = <Ui, Mj> (<> denotes the dot product). The ratings of all users for all movies are obtained by multiplying the two matrices:

                          V' = U M^T

Note that this is V', not V. So the question is: how do we determine U and M? A natural idea is to make V' as close to V as possible. This raises another problem: V (that is, the dataset) has no rating in many positions, so how do we measure closeness there? We do not. The MSE is computed only over the entries that have a rating. This way no extra information is introduced, and the two matrices are still pulled together wherever data exists. This naturally leads to our loss function:
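Restricting the error to rated entries can be sketched as below. The function name `masked_mse` and the convention that 0 marks a missing rating are assumptions for illustration; the regularization terms are left out of this sketch.

```python
import numpy as np

def masked_mse(V, U, M):
    """Mean squared error over rated entries only (0 = missing rating)."""
    mask = V > 0                 # True where the dataset has a rating
    pred = U @ M.T               # V' = U M^T
    diff = (V - pred)[mask]      # unrated cells are ignored entirely
    return float(np.mean(diff ** 2))

V = np.array([
    [1, 0, 0, 0],
    [2, 0, 0, 3],
    [0, 5, 4, 0],
    [2, 0, 0, 4],
], dtype=float)

rng = np.random.default_rng(0)
k = 2                            # feature vector length, chosen by hand
U = rng.normal(scale=0.1, size=(V.shape[0], k))
M = rng.normal(scale=0.1, size=(V.shape[1], k))
print(masked_mse(V, U, M))
```

Only the 7 rated cells contribute here; the 9 blank cells have no influence on the error, whatever V' predicts for them.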

      

Here Iij indicates whether user i has a rating record for movie j. The last two terms of the loss are penalty (regularization) terms that prevent overfitting. Using gradient descent, U and M are then obtained iteratively from the partial derivatives of the loss with respect to U and M.
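One common form of this loss and its gradients, consistent with the description above, is sketched here. The factor 1/2, the regularization weights k_u and k_m, and the learning rate \mu are assumed notation, not taken from the original text.

```latex
E = \frac{1}{2} \sum_{i} \sum_{j} I_{ij} \left( V_{ij} - \langle U_i, M_j \rangle \right)^2
    + \frac{k_u}{2} \sum_{i} \lVert U_i \rVert^2
    + \frac{k_m}{2} \sum_{j} \lVert M_j \rVert^2

\frac{\partial E}{\partial U_i} = -\sum_{j} I_{ij} \left( V_{ij} - \langle U_i, M_j \rangle \right) M_j + k_u U_i

\frac{\partial E}{\partial M_j} = -\sum_{i} I_{ij} \left( V_{ij} - \langle U_i, M_j \rangle \right) U_i + k_m M_j

U_i \leftarrow U_i - \mu \frac{\partial E}{\partial U_i}, \qquad
M_j \leftarrow M_j - \mu \frac{\partial E}{\partial M_j}
```

The first term is the squared error over rated entries only, and the other two are the penalty terms on U and M.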

This completes the basic matrix decomposition method. To get better results, we also need to account for rating biases: some users tend to rate high and others tend to rate low, and the same holds for movies and for the ratings as a whole. The prediction formula therefore becomes:

Vij = <Ui, Mj> + overall_mean + ai + bj

overall_mean is the average of all ratings, ai reflects the average rating level of user i, and bj reflects the average rating level of movie j. overall_mean is a constant, while ai and bj are parameters to be optimized. Their derivation is not given here. An algorithm in matrix form, for use in an actual implementation, is to be added. (INCOMING)
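As a rough illustration of the biased model, here is a full-batch gradient descent sketch in matrix form. It is not the author's algorithm: the function names, the hyperparameters (`lr`, `reg`, `epochs`), the shared regularization weight, and the choice to regularize the bias terms are all assumptions made for this example.

```python
import numpy as np

def train_biased_mf(V, k=2, lr=0.01, reg=0.05, epochs=2000, seed=0):
    """Gradient descent for V'ij = <Ui, Mj> + overall_mean + ai + bj."""
    rng = np.random.default_rng(seed)
    mask = (V > 0).astype(float)       # Iij: 1 where a rating exists
    mean = V[V > 0].mean()             # overall_mean, kept constant
    n_users, n_movies = V.shape
    U = rng.normal(scale=0.1, size=(n_users, k))
    M = rng.normal(scale=0.1, size=(n_movies, k))
    a = np.zeros(n_users)              # user bias terms
    b = np.zeros(n_movies)             # movie bias terms
    for _ in range(epochs):
        pred = U @ M.T + mean + a[:, None] + b[None, :]
        err = mask * (V - pred)        # zero error on unrated cells
        # Gradients of the regularized squared error, computed before any update.
        grad_U = -err @ M + reg * U
        grad_M = -err.T @ U + reg * M
        grad_a = -err.sum(axis=1) + reg * a
        grad_b = -err.sum(axis=0) + reg * b
        U -= lr * grad_U
        M -= lr * grad_M
        a -= lr * grad_a
        b -= lr * grad_b
    return U, M, a, b, mean

def predict(U, M, a, b, mean):
    """Predicted ratings for every user-movie pair."""
    return U @ M.T + mean + a[:, None] + b[None, :]

V = np.array([
    [1, 0, 0, 0],
    [2, 0, 0, 3],
    [0, 5, 4, 0],
    [2, 0, 0, 4],
], dtype=float)

U, M, a, b, mean = train_biased_mf(V)
print(np.round(predict(U, M, a, b, mean), 2))
```

On real data, stochastic updates per rating are more common than full-batch updates; the batch form is used here only because it maps directly onto matrix operations.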

 

I wrote a Python version myself. If you are interested, see https://github.com/ccienfall/RecommandSystem/blob/master/script/Factorize.py.

Both SVD and matrix decomposition (MF) were used in the 2016 Byte Cup International Machine Learning competition. In the final results, the MF method did about 20% better than the SVD method.

References: A Guide to Singular Value Decomposition for Collaborative Filtering
