Item-based collaborative filtering algorithm (ItemCF) for recommender systems

Source: Internet
Author: User

The item-based collaborative filtering algorithm (ItemCF) is the most widely used algorithm in industry. Its main idea is to use a user's past behavior to recommend items that are similar to the items the user liked before.

The item-based collaborative filtering algorithm consists of two main steps:

1) Calculate the similarity between items.

2) Generate a recommendation list based on the item similarities and the user's historical behavior.

The key to the first step is computing the similarity between items. This does not use content-based similarity; instead, it counts how many of the users who like item i also like item j. The underlying assumption is that a user's interests are fairly stable and do not change easily, so when one user likes two items, those two items probably belong to the same category. Let N(i) denote the set of users who purchased item i, so that |N(i)| is the number of such users. The similarity between item i and item j can then be computed as $w_{ij} = \frac{|N(i) \cap N(j)|}{|N(i)|}$.
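The basic similarity can be sketched directly from its definition. This is a minimal illustration, not a production implementation; the `purchases` dictionary (user to set of purchased items), the function name `basic_similarity`, and the sample data are all assumptions made for the example.

```python
def basic_similarity(purchases, i, j):
    """Return |N(i) & N(j)| / |N(i)|, where N(x) is the set of users who bought x."""
    n_i = {user for user, items in purchases.items() if i in items}
    n_j = {user for user, items in purchases.items() if j in items}
    if not n_i:
        return 0.0  # nobody bought item i, so no evidence either way
    return len(n_i & n_j) / len(n_i)


# Hypothetical purchase history: user -> set of purchased items.
purchases = {
    "alice": {"data_mining", "machine_learning"},
    "bob": {"data_mining", "machine_learning", "cooking"},
    "carol": {"data_mining", "cooking"},
    "dave": {"cooking"},
}

print(basic_similarity(purchases, "data_mining", "machine_learning"))
print(basic_similarity(purchases, "machine_learning", "data_mining"))
```

Note that this measure is not symmetric: the two calls above divide the same overlap by different denominators.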


Reducing the time complexity of the first step: as with UserCF, we can build a user-item inverted table, that is, for each user the list of items that user has interacted with. Similarity is then accumulated only over pairs of items that appear together in some user's list. This guarantees that every computation contributes to a nonzero similarity, and no work is spent on the zero entries of the item-item matrix (which is necessarily sparse).
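One way to sketch this idea is to walk each user's item list once and count only the item pairs that actually co-occur. The function name `cooccurrence_counts` and the `purchases` structure (user to set of items) are assumptions made for this illustration.

```python
from collections import defaultdict
from itertools import permutations


def cooccurrence_counts(purchases):
    """Count item popularity and item-item co-occurrence from per-user item lists.

    Only pairs bought together by at least one user are ever touched,
    so the zero entries of the sparse item-item matrix cost nothing.
    """
    item_count = defaultdict(int)                      # |N(i)|
    co_count = defaultdict(lambda: defaultdict(int))   # |N(i) & N(j)|
    for user, items in purchases.items():
        for i in items:
            item_count[i] += 1
        # Every ordered pair (i, j) with i != j in this user's list.
        for i, j in permutations(items, 2):
            co_count[i][j] += 1
    return item_count, co_count


# Hypothetical data: user -> set of purchased items.
purchases = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"c"},
}
item_count, co_count = cooccurrence_counts(purchases)
```

The cost is proportional to the sum of squared list lengths over users, rather than to the square of the number of items.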

Similarity improvement 1 for the first step: with the formula above, the similarity between any item i and a popular item j comes out very high, because almost everyone buys a popular item. Such highly popular items discriminate poorly between interests, so the weight of the popular item j needs to be penalized: $w_{ij} = \frac{|N(i) \cap N(j)|}{\sqrt{|N(i)| \, |N(j)|}}$, where N(i) is the set of users who purchased item i.
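The effect of the penalty can be seen on a small example. This is a sketch under assumed data: the function name `penalized_similarity` and the `purchases` dictionary are hypothetical, with one item that every user bought.

```python
import math


def penalized_similarity(purchases, i, j):
    """Return |N(i) & N(j)| / sqrt(|N(i)| * |N(j)|)."""
    n_i = {user for user, items in purchases.items() if i in items}
    n_j = {user for user, items in purchases.items() if j in items}
    if not n_i or not n_j:
        return 0.0
    return len(n_i & n_j) / math.sqrt(len(n_i) * len(n_j))


# Hypothetical data: everyone bought "bestseller", only u1 bought "niche".
purchases = {
    "u1": {"bestseller", "niche"},
    "u2": {"bestseller"},
    "u3": {"bestseller"},
    "u4": {"bestseller"},
}

# Without the penalty this pair would score 1/1 = 1.0;
# with it the score is 1 / sqrt(1 * 4).
print(penalized_similarity(purchases, "niche", "bestseller"))  # 0.5
```

Unlike the unpenalized formula, this version is symmetric in i and j.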

Similarity improvement 2 for the first step: user activity needs to be penalized. If a user is relatively inactive and buys only a handful of books, those books very likely fall within one or two areas of interest, so they are useful evidence for item similarity. But suppose a bookseller takes advantage of a discount to buy 90% of the books on Amazon and resell them at a profit. That user's behavior is useless for computing item similarity, because 90% of the catalog necessarily covers a very wide range of topics. Highly active users should therefore be penalized, in the same spirit as improvement 1.
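The text does not give a formula for this penalty. The sketch below assumes one commonly used choice, an inverse-user-frequency weight of 1 / log(1 + number of items the user bought), so that each co-purchase by a very active user counts for less. The weight, the function name `iuf_similarity`, and the sample data are assumptions for illustration only.

```python
import math
from collections import defaultdict
from itertools import permutations


def iuf_similarity(purchases):
    """Item similarity where each user's contribution shrinks with their activity."""
    item_count = defaultdict(int)
    co_weight = defaultdict(lambda: defaultdict(float))
    for user, items in purchases.items():
        if not items:
            continue  # avoids log(1) == 0 in the denominator
        # Assumed penalty: the more items a user bought, the smaller the weight.
        weight = 1.0 / math.log(1 + len(items))
        for i in items:
            item_count[i] += 1
        for i, j in permutations(items, 2):
            co_weight[i][j] += weight
    return {
        i: {j: w / math.sqrt(item_count[i] * item_count[j]) for j, w in related.items()}
        for i, related in co_weight.items()
    }


# Hypothetical data: a focused reader and a bulk reseller.
purchases = {
    "reader": {"a", "b"},
    "reseller": {"c", "d", "e", "f", "g", "h"},
}
sim = iuf_similarity(purchases)
```

In this example the pair bought together by the focused reader ends up with a higher similarity than any pair bought together only by the reseller.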

Similarity improvement 3 for the first step: normalize the item similarities. Normalization not only increases recommendation accuracy, it also increases coverage and diversity. On Amazon, for example, a user's interests usually span several categories and are rarely concentrated in a single one. Suppose there are two classes of items, A and B: the similarity between items within class A is 0.5, the similarity between items within class B is 0.8, and the similarity between an A item and a B item is 0.2. A user has bought 5 books of class A and 5 books of class B. If we rank candidates by raw similarity as before, the recommendations will all be class B items, because even a low-ranked B item still scores higher than any A item. The similarities should therefore be normalized per class, so that the within-class similarity becomes 1 for A and 1 for B. The ranked recommendations then contain both A and B items, which greatly improves accuracy, coverage, and diversity.
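The A/B example can be reproduced with a small sketch. It assumes that normalization means dividing each item's similarities by that item's largest similarity, which matches the outcome described above (0.5 and 0.8 both become 1) but is not spelled out in the text. The function name `normalize_similarity` and the nested-dictionary layout are hypothetical.

```python
def normalize_similarity(sim):
    """Divide every row of the similarity table by that row's maximum value."""
    normalized = {}
    for i, related in sim.items():
        max_w = max(related.values(), default=0.0)
        normalized[i] = {
            j: (w / max_w if max_w > 0 else 0.0) for j, w in related.items()
        }
    return normalized


# Two items per class: within A = 0.5, within B = 0.8, across classes = 0.2.
sim = {
    "a1": {"a2": 0.5, "b1": 0.2, "b2": 0.2},
    "a2": {"a1": 0.5, "b1": 0.2, "b2": 0.2},
    "b1": {"b2": 0.8, "a1": 0.2, "a2": 0.2},
    "b2": {"b1": 0.8, "a1": 0.2, "a2": 0.2},
}
normalized = normalize_similarity(sim)
```

After normalization, `normalized["a1"]["a2"]` and `normalized["b1"]["b2"]` are both 1, so class A items are no longer crowded out by class B items.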

The second step is simpler: for each candidate item, compute its score as the weighted sum of its similarities to the items the user has purchased, then sort the candidates by score and take the top N.
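The second step can be sketched as follows, assuming the item similarities are already available as a nested dictionary `sim[i][j]` and that every purchase counts equally. The function name `recommend`, the `top_n` parameter, and the sample values are hypothetical.

```python
def recommend(user_items, sim, top_n=3):
    """Score unseen items by summing their similarity to the user's purchases."""
    scores = {}
    for i in user_items:
        for j, w in sim.get(i, {}).items():
            if j in user_items:
                continue  # do not recommend what the user already owns
            scores[j] = scores.get(j, 0.0) + w
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


# Hypothetical precomputed similarities.
sim = {
    "data_mining": {"machine_learning": 0.8, "cooking": 0.1},
    "statistics": {"machine_learning": 0.6, "data_mining": 0.5},
}
user_items = {"data_mining", "statistics"}
top = recommend(user_items, sim, top_n=2)
```

Here "machine_learning" collects similarity from both purchased books and therefore ranks first, ahead of "cooking".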


In real systems ItemCF is the more widely used approach, and it has two main advantages:

1) The item-item table is usually much smaller than the user-user table, which grows with the square of the number of users, so it is easier to handle.

2) ItemCF makes it easy to explain a recommendation, for example, recommending "Machine Learning" because you previously bought "Data Mining". This builds trust, improves the interaction between the user and the recommender system, and in turn further improves personalized recommendation.
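An explanation of this kind can be derived from the same similarity table used for scoring. The sketch below picks the purchased item that contributes the most similarity to a recommended item; the function name `explain` and the sample data are assumptions made for illustration.

```python
def explain(user_items, sim, candidate):
    """Return the purchased item most similar to the recommended candidate."""
    contributions = {}
    for i in user_items:
        w = sim.get(i, {}).get(candidate, 0.0)
        if w > 0:
            contributions[i] = w
    if not contributions:
        return None  # nothing in the user's history supports this candidate
    return max(contributions, key=contributions.get)


# Hypothetical precomputed similarities.
sim = {
    "data_mining": {"machine_learning": 0.8},
    "cooking": {"machine_learning": 0.1},
}
reason = explain({"data_mining", "cooking"}, sim, "machine_learning")
print(f"Recommended 'machine_learning' because you bought '{reason}'")
```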
