When the novice touches the field of recommendation systems, the first thing to understand is that the collaborative filtering method is the most difficult one. Then if Baidu at this time, get the most is the singular value decomposition method, that is (SVD). The role of SVD is roughly to decompose a matrix into three matrix multiplication forms. If used in recommender systems, first we represent our training set as a matrix, and here we take the Movielen dataset as an example. This data set contains the user's rating of the movie. Then the matrix form is roughly:
|
Movie1 |
Movie2 |
Movie3 |
Moive4 |
User1 |
1 |
|
|
|
User2 |
2 |
|
|
3 |
User3 |
|
5 |
4 |
|
User4 |
2 |
|
|
4 |
One of them is the corresponding user rating of the film. The spare place indicates that there is no information for users and movies in the dataset. If we want to use SVD, we usually fill the blanks with 0. Assume that this matrix is v. Then using SVD can get
v = u σvt
But the above equals sign cannot be taken, only about equals. Then we will get the three matrices multiplied to get V ' (note differs from V). Then the original space (that is, 0 places) may no longer be 0, then this is the prediction of the user-moive pair. This is the main principle of SVD. Because SVD has a lot of out-of-the-box algorithms, and can be directly obtained without iteration, so it is more convenient to use.
But we will see that the above method has a fatal flaw, that is, the unknown score is all set to 0. This is actually extremely unreasonable, because the user does not give a film rating is not very dislike (0 points), but may have not seen the film. This adds to our judgmental message and finally makes a mistake. The solution is to use the matrix decomposition method of the implicit factor. Note that the matrix decomposition method and SVD have similar places also have different places, below I will be detailed introduction to the matrix decomposition method.
in matrix decomposition, there is a hypothesis that every user has a feature vector of length k, and each movie also has a feature vector of the same length (k generally needs to be specified by the user). Then all the user's eigenvectors are arranged into a matrix U with a dimension of Usernum * k. Where user I corresponds to the vector for the UI. The feature vectors of all movies are arranged in a matrix M dimension of Moivenum * k. Where the film J corresponds to a vector of UJ. Then the user I rated movie J Vij = <ui, mj> (<> for Point multiplication). Then the score between all users and all movies can be multiplied by two matrices:
v " = u mt
< Span style= "Font-family:arial, Helvetica, Sans-serif" > note that this is V ' instead of v. So the question comes, how do we determine this u and M? A natural idea is to make V ' and V as equal as possible. Then there is a problem, that is V (that is, the data set) there are many places without scoring, how to determine and V ' is equal? So here we only calculate the MSE with a rating. This does not use the additional information, but also to determine whether the two are close. Then naturally the introduction of our lost Function:
Here iij indicates that user I has scored records on movie J. The following two items are penalty factors to prevent overfitting. Then using the gradient descent method, we can get the values of U and M by iteration. Its pair of U and M are as follows:
Here we have completed the basic matrix decomposition method. So further, in order to achieve better results, we have to consider the impact of each individual scoring, because some users scored high, some users scored low. Same for the movie and all the ratings. So our scoring formula should be changed to:
Vij = <Ui,Mj> + Overall_mean + ai +BJ
Where Overall_mean is the average of all scores, AI is the average of user I scores, and BJ is the average of the movie J scores. Where Overall_mean is considered a constant, and both AI and BJ are parameters that need to be optimized. Here we do not give a derivative of their formula, I directly give the matrix form of the algorithm, easy to achieve concrete implementation. (INCOMING)
I wrote a python version of myself, interested to refer to https://github.com/ccienfall/RecommandSystem/blob/master/script/Factorize.py
References: "A Guide to Singular Value decomp osition for collaborative Filtering"
Recommendation system--matrix decomposition method of hidden factor