Collaborative filtering for implicit feedback data: notes on "Collaborative Filtering for Implicit Feedback Datasets"
This article contains my notes on the paper "Collaborative Filtering for Implicit Feedback Datasets", which presents a collaborative filtering algorithm for implicit feedback data. The model is a latent factor model (LFM), and it is solved with alternating least squares (ALS).
Explicit feedback is behavior through which the user directly expresses a preference for an item.
Implicit feedback is behavior that does not directly express the user's preference. The most representative implicit feedback is page-browsing behavior.
The need to consider implicit feedback:
Many application scenarios have no explicit feedback, because most users are silent: they will not tell the system "how much I like this item". The recommender system therefore has to infer users' preference values from the large amount of implicit feedback.
However, explicit feedback is not always available. Thus, recommenders can infer user preferences from the more abundant implicit feedback, which indirectly reflect opinion through observing user behavior.
Characteristics of implicit feedback:
1 No negative feedback. From implicit feedback we cannot infer whether the user dislikes an item, whereas explicit feedback clearly distinguishes like from dislike.
2 Inherently noisy. That a user bought an item does not mean he likes it: it may have been a gift, or he may have regretted the purchase.
3 The value of explicit feedback indicates preference, while the value of implicit feedback indicates confidence. The value of implicit feedback is usually the frequency of an action, and a higher frequency does not imply a larger preference value. For example, a user may often watch a TV series that he likes only moderately; a new episode airs every week, so the action frequency is high. Conversely, a user may love a movie but watch it only once, so the low frequency says little about the preference value. From the fact that the user often watches the series, we can only conclude with high confidence that he likes it; how much he likes it, we cannot evaluate.
4 Evaluating implicit feedback requires appropriate measures.
In explicit feedback, r_ui denotes the preference value. Here we use r_ui to denote the action frequency in implicit feedback.
We reserve special indexing letters for distinguishing users from items: for users u, v, and for items i, j. The input data associate users and items through r_ui values, which we henceforth call observations. For explicit feedback datasets, those values would be ratings that indicate the preference by user u of item i, where high values mean stronger preference. For implicit feedback datasets, those values would indicate observations for user actions. For example, r_ui can indicate the number of times u purchased item i or the time u spent on webpage i.
The judgment of the preference value
The paper binarizes the observations r_ui into a preference p_ui:
p_ui = 1 if r_ui > 0, and p_ui = 0 if r_ui = 0.
The meaning of this expression: if user u has performed the implicit-feedback action on item i at least once, we take it that u likes i, and the more frequent the action, the higher the confidence. If there is no action, we take u's preference for i to be 0; but because the action count is 0, our confidence in "u's preference for i is 0" is very low. After all, besides disliking it or not being interested, there are many other possible reasons for taking no action, e.g., u may not even know that i exists.
Measurement of the confidence level
c_ui = 1 + α·r_ui
The paper mentions that α = 40 gives good results.
Of course, there are other ways to measure confidence; in general, the higher the action frequency, the higher the confidence.
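As a small sketch (the toy matrix and variable names are mine, not the paper's), the preference and confidence definitions above can be computed directly with NumPy:

```python
import numpy as np

alpha = 40.0  # the paper reports alpha = 40 works well

# Hypothetical toy observation matrix: rows = users, columns = items,
# entries r_ui = how many times the user acted on the item.
R = np.array([[3.0, 0.0, 1.0],
              [0.0, 5.0, 0.0]])

P = (R > 0).astype(float)  # p_ui = 1 if r_ui > 0 else 0
C = 1.0 + alpha * R        # c_ui = 1 + alpha * r_ui
```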
In general, as r_ui grows, we have a stronger indication that the user indeed likes the item.
The objective function to optimize:
min_{x*, y*} Σ_{u,i} c_ui · (p_ui − x_u^T y_i)² + λ · ( Σ_u ||x_u||² + Σ_i ||y_i||² )
This is similar to the objective optimized in matrix factorization for explicit feedback, but with two differences:
1 We need to take the confidence level c_ui into account.
2 The optimization runs over all possible (u, i) pairs, not just the observed data. In matrix factorization for explicit feedback, missing data (unknown ratings) is not fed into the model; only the known ratings are optimized over. Here, with implicit feedback, all possible (u, i) pairs are used, so there are m·n terms in total, where m is the number of users and n is the number of items. There is no "missing data": if u has taken no action on i, we simply take the preference value to be 0, just with a low confidence level.
This is similar to matrix factorization techniques which are popular for explicit feedback data, with two important distinctions: (1) we need to account for the varying confidence levels, (2) optimization should account for all possible u, i pairs, rather than only those corresponding to observed data.
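The objective over all m·n pairs can be written down directly. Below is a minimal sketch (the function name and shape conventions are my own), with X holding the user factors row-wise and Y the item factors row-wise:

```python
import numpy as np

def implicit_cost(X, Y, P, C, lam):
    """sum_{u,i} c_ui * (p_ui - x_u . y_i)^2  +  lam * (||X||^2 + ||Y||^2).

    X: (m, f) user factors, Y: (n, f) item factors,
    P: (m, n) preferences, C: (m, n) confidences.
    """
    err = P - X @ Y.T  # prediction error on every (u, i) pair
    return float(np.sum(C * err ** 2) + lam * (np.sum(X ** 2) + np.sum(Y ** 2)))
```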
To optimize this objective, the procedure is the same as in ALS matrix factorization for explicit feedback:
Fix the item factors, take the partial derivative with respect to the user factors, set it to 0, and update the user factors.
Fix the user factors, take the partial derivative with respect to the item factors, set it to 0, and update the item factors.
Setting the derivatives to zero gives the closed-form updates
x_u = (Y^T C^u Y + λI)^{-1} Y^T C^u p(u)
y_i = (X^T C^i X + λI)^{-1} X^T C^i p(i)
where C^u is the diagonal matrix with entries c_ui and p(u) is the vector of preferences p_ui for user u (and analogously for C^i and p(i)).
This approach is called alternating least squares.
The paper also introduces some techniques to speed up the matrix computations above; refer to the paper for the details.
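As a minimal sketch of one ALS sweep (variable names are mine, not the paper's), using the speed-up the paper describes: Y^T C^u Y = Y^T Y + Y^T (C^u − I) Y, so Y^T Y is computed once per sweep instead of once per user:

```python
import numpy as np

def als_sweep(X, Y, P, C, lam):
    """One sweep: update all user factors X, then all item factors Y in place."""
    f = X.shape[1]
    YtY = Y.T @ Y  # precomputed once and shared by all users
    for u in range(X.shape[0]):
        cu = C[u]  # the diagonal of C^u as a vector
        A = YtY + (Y.T * (cu - 1.0)) @ Y + lam * np.eye(f)  # Y^T C^u Y + lam*I
        b = Y.T @ (cu * P[u])                               # Y^T C^u p(u)
        X[u] = np.linalg.solve(A, b)
    XtX = X.T @ X  # the same trick for the item step
    for i in range(Y.shape[0]):
        ci = C[:, i]
        A = XtX + (X.T * (ci - 1.0)) @ X + lam * np.eye(f)
        b = X.T @ (ci * P[:, i])
        Y[i] = np.linalg.solve(A, b)
    return X, Y
```

Each half-step solves its least-squares subproblem exactly, so the regularized objective never increases from one sweep to the next.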
"Collaborative Filtering for Implicit Feedback Datasets"
Recommendation System Practice
Http://mahout.apache.org/users/recommender/intro-als-hadoop.html
The author of this article: linger
This article link: http://blog.csdn.net/lingerlanlan/article/details/46917601