Collaborative filtering for implicit feedback data: notes on "Collaborative Filtering for Implicit Feedback Datasets"
This article is my notes on the paper "Collaborative Filtering for Implicit Feedback Datasets", which presents a collaborative filtering algorithm for implicit feedback data. It uses a latent factor model (LFM), solved with alternating least squares (ALS).
Explicit feedback refers to behavior in which the user explicitly expresses a preference for an item.
Implicit feedback refers to behavior that does not explicitly express the user's preference; the most representative example is page browsing.
The need to consider implicit feedback:
In many application scenarios, no explicit feedback exists at all. Most users are silent users: they never explicitly tell the system "this is my preference for this item". The recommender system must therefore infer users' preference values from the large amount of implicit feedback.
However, explicit feedback is not always available. Thus, recommenders can infer user preferences from the more abundant implicit feedback, which indirectly reflect opinion through observing user behavior.
Characteristics of implicit feedback:
1. No negative feedback. From implicit feedback alone we cannot tell whether a user dislikes an item; with explicit feedback, like and dislike are clearly distinguishable.
2. Inherently noisy. A user buying an item does not mean he likes it: it may be a gift, or he may have discovered after the purchase that he does not like it.
3. The numerical value of explicit feedback indicates preference, while the numerical value of implicit feedback indicates confidence. The value of implicit feedback is usually the frequency of an action, and a higher frequency does not necessarily mean a higher preference. For example, a user may watch a TV series often simply because a new episode airs every week, so the frequency is large; whereas a movie he loves may have been watched only once, so the action frequency says little about the preference value. From the frequent viewing of the series, we can only infer with high confidence that the user likes it; how much he likes it we cannot evaluate.
4. Evaluating an implicit-feedback recommender requires appropriate measures.
In explicit feedback, the preference value is denoted r_ui. Here, we use r_ui to denote the frequency of the implicit-feedback action.
We reserve special indexing letters for distinguishing users from items: for users u, v, and for items i, j. The input data associate users and items through r_ui values, which we henceforth call observations. For explicit feedback datasets, those values would be ratings that indicate the preference by user u of item i, where high values mean stronger preference. For implicit feedback datasets, those values would indicate observations for user actions. For example, r_ui can indicate the number of times u purchased item i or the time u spent on webpage i.
Inference of preference values
If user u has performed the implicit-feedback action on item i one or more times, we assume u likes i; the more frequent the action, the higher the confidence in this assumption. If there is no action, we take the u-i preference value to be 0, but because the action count is 0, the confidence in the assumption "the u-i preference value is 0" is very low. Besides dislike or lack of interest, there are many other reasons for the absence of an action, for example u may not know that i exists.
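The rule above amounts to binarizing the observation matrix into preferences; a minimal sketch in NumPy (the function name `preference` is my own):

```python
# Turn observations r_ui into binary preferences p_ui, following the rule
# described above: p_ui = 1 if r_ui > 0 (any action observed), else 0.
import numpy as np

def preference(r):
    """Binarize an observation matrix into preferences p_ui."""
    return (np.asarray(r) > 0).astype(float)

# Example: two users, three items; entries are counts of an implicit action.
R = np.array([[3, 0, 1],
              [0, 5, 0]])
P = preference(R)   # [[1, 0, 1], [0, 1, 0]]
```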
Measurement of confidence level
c_ui = 1 + α·r_ui
The paper mentions that α = 40 works well.
Of course, there are other ways to measure confidence; in general, the greater the frequency of the action, the higher the confidence.
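A minimal sketch of this confidence measure (the helper name `confidence` is mine; α = 40 is the value the paper reports, but it is a tunable parameter):

```python
# Confidence c_ui = 1 + alpha * r_ui for implicit observations r_ui.
import numpy as np

def confidence(r, alpha=40.0):
    """Confidence in the inferred preference: c_ui = 1 + alpha * r_ui."""
    return 1.0 + alpha * np.asarray(r, dtype=float)

# Zero observations still get confidence 1: a weak "preference = 0" signal.
c0 = confidence(0)   # 1.0
c2 = confidence(2)   # 81.0
```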
In general, as r_ui grows, we have a stronger indication that the user indeed likes the item.
The optimization objective:
This is similar to the objective used in matrix factorization with explicit feedback, but with two differences:
1. We need to account for the confidence level.
2. The optimization ranges over all possible (u, i) pairs, not just the observed data. In explicit-feedback matrix factorization, missing entries (unknown ratings) are not entered into the model; only the known ratings are optimized. With implicit feedback, all possible (u, i) pairs are used, so there are m*n data points in total, where m is the number of users and n the number of items. There is no "missing data": if u has performed no action on i, we take the preference value to be 0, but with low confidence.
This is similar to matrix factorization techniques which are popular for explicit feedback data, with two important distinctions: (1) we need to account for the varying confidence levels, (2) optimization should account for all possible u, i pairs, rather than only those corresponding to observed data.
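The objective itself is not reproduced in these notes; reconstructed from the paper (x_u and y_i are the user and item factor vectors, λ the regularization weight):

```latex
\min_{x_*,\, y_*} \;\sum_{u,i} c_{ui}\,\bigl(p_{ui} - x_u^{\top} y_i\bigr)^2
\;+\; \lambda \Bigl( \sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2 \Bigr)
```

Note that the sum runs over all m·n pairs (u, i), with each squared error weighted by its confidence c_ui.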
When optimizing this objective function, the procedure is the same as ALS for explicit-feedback matrix factorization:
Fix the item factors, take the partial derivative with respect to the user factors, set it to 0, and update the user factors.
Fix the user factors, take the partial derivative with respect to the item factors, set it to 0, and update the item factors.
This approach is called alternating least squares.
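A sketch of these alternating updates in NumPy, under the objective described above. The closed-form solve per user is x_u = (YᵀCᵘY + λI)⁻¹ YᵀCᵘp(u), and symmetrically for items; the code also uses the rewrite YᵀCᵘY = YᵀY + Yᵀ(Cᵘ − I)Y, which is the kind of speed-up the paper describes, with YᵀY precomputed once per sweep. All function names here are mine:

```python
# Implicit-feedback ALS: alternate closed-form solves for user and item factors.
import numpy as np

def als_implicit(R, k=2, alpha=40.0, lam=0.1, iters=10, seed=0):
    """Factor an m x n observation matrix R into user factors X and item factors Y."""
    R = np.asarray(R, dtype=float)
    m, n = R.shape
    rng = np.random.default_rng(seed)
    X = 0.01 * rng.standard_normal((m, k))   # user factors x_u
    Y = 0.01 * rng.standard_normal((n, k))   # item factors y_i
    P = (R > 0).astype(float)                # preferences p_ui
    C = 1.0 + alpha * R                      # confidences c_ui

    def solve_side(F, Cmat, Pmat):
        # Update one side's factors with the other side's factors F held fixed.
        FtF = F.T @ F                        # precomputed once per sweep
        out = np.empty((Cmat.shape[0], F.shape[1]))
        for u in range(Cmat.shape[0]):
            cu = Cmat[u]                     # diagonal entries of C^u
            # F^T C^u F = F^T F + F^T (C^u - I) F
            A = FtF + (F.T * (cu - 1.0)) @ F + lam * np.eye(F.shape[1])
            b = (F.T * cu) @ Pmat[u]         # F^T C^u p(u)
            out[u] = np.linalg.solve(A, b)
        return out

    for _ in range(iters):
        X = solve_side(Y, C, P)              # fix item factors, update users
        Y = solve_side(X, C.T, P.T)          # fix user factors, update items
    return X, Y
```

The predicted preference of user u for item i is then the dot product x_u · y_i.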
The paper also introduces some techniques that speed up the matrix computations above; see the paper for details.
"Collaborative Filtering for Implicit Feedback Datasets"
Recommendation System Practice
http://mahout.apache.org/users/recommender/intro-als-hadoop.html
Author: linger
Link: http://blog.csdn.net/lingerlanlan/article/details/46917601
Copyright notice: this is the blogger's original article; do not reproduce without the blogger's permission.