Recommender Systems: Collaborative Filtering

Source: Internet
Author: User

1. Overview

Collaborative filtering methods are based on collecting and analyzing a large amount of information on users' behaviors, activities, or preferences, and predicting what users will like based on their similarity to other users.

In other words, by collecting and analyzing a large volume of user behaviors, activities, and rating records, we can find other users whose interests are similar to a given user's, and then use those users' behavior records to predict what the given user will like.

A key advantage of the collaborative filtering approach is that it does not rely on machine-analyzable content, and therefore it is capable of accurately recommending complex items such as movies without requiring an "understanding" of the item itself.

The biggest advantage of collaborative filtering is that it infers a user's likes and dislikes from the user's behavior, without the algorithm needing to "understand" what the item actually is.

Many algorithms have been used in measuring user similarity or item similarity in recommender systems, for example the k-nearest neighbor (k-NN) approach and the Pearson correlation.

Many algorithms can be used to measure how similar the interests of two users are, such as the k-NN algorithm and the Pearson correlation coefficient.
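
As a minimal sketch of these two ideas, the Python snippet below computes the Pearson correlation between two users over the items both have rated, then picks the k most similar users. The data layout (a dict of per-user rating dicts), the function names, and the sample ratings are assumptions made for illustration, not part of the original text.

```python
from math import sqrt

def pearson(ratings_a, ratings_b):
    """Pearson correlation computed over the items both users have rated."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0  # too little overlap to measure a correlation
    xs = [ratings_a[i] for i in common]
    ys = [ratings_b[i] for i in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a user who rates everything the same has no correlation
    return cov / sqrt(var_x * var_y)

def nearest_neighbors(target, all_ratings, k):
    """Return the k users most similar to `target` as (user, similarity) pairs."""
    scored = [
        (pearson(all_ratings[target], other_ratings), user)
        for user, other_ratings in all_ratings.items()
        if user != target
    ]
    # Highest similarity first; ties are broken by user name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(user, sim) for sim, user in scored[:k]]

# Hypothetical sample data: user -> {item: rating}
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob": {"m1": 4, "m2": 2, "m3": 3},
    "carol": {"m1": 1, "m2": 5, "m3": 2},
}

print(nearest_neighbors("alice", ratings, k=1))
```

In this sample, bob's ratings track alice's exactly one point lower, so he comes out as her nearest neighbor, while carol's ratings run in the opposite direction and correlate negatively.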

2. Data collection

To perform collaborative filtering, you need to collect user data. There are two ways to do so: explicit data collection and implicit data collection. That is, one is done in the open and the other happens behind the scenes.

Explicit methods include:

Asking a user to rate an item on a sliding scale.
Asking a user to rank a collection of items from favorite to least favorite.
Presenting two items to a user and asking him/her to choose the better one of them.
Asking a user to create a list of items that he/she likes.

Implicit methods include:
Observing the items that a user views in an online store.
Analyzing item/user viewing times [12]
Keeping a record of the items that a user purchases online.
Obtaining a list of items that a user has listened to or watched on his/her computer.
Analyzing the user's social network and discovering similar likes and dislikes.
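
As a rough sketch of how implicit signals like these might be turned into data a recommender can use, the following Python snippet aggregates logged events into per-user item scores. The event names and their weights are assumptions chosen for illustration; they do not come from any particular system.

```python
from collections import defaultdict

# Hypothetical weights: how strongly each kind of event suggests interest.
EVENT_WEIGHTS = {"view": 1.0, "listen": 2.0, "purchase": 5.0}

def implicit_scores(events):
    """Sum event weights into a user -> {item: score} mapping."""
    scores = defaultdict(lambda: defaultdict(float))
    for user, item, event in events:
        # Unknown event types contribute nothing.
        scores[user][item] += EVENT_WEIGHTS.get(event, 0.0)
    return {user: dict(items) for user, items in scores.items()}

# Hypothetical event log: (user, item, event type)
events = [
    ("alice", "book", "view"),
    ("alice", "book", "purchase"),
    ("alice", "lamp", "view"),
    ("bob", "lamp", "view"),
]

print(implicit_scores(events))
```

The resulting scores play the role that explicit ratings play elsewhere, although they measure observed interest rather than stated preference.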

I personally prefer the implicit methods, since they require no additional work from the user. However, they often raise privacy issues, so they have drawbacks of their own.

3. Data Analysis Methods

The recommender system compares the collected data to similar and dissimilar data collected from others and calculates a list of recommended items for the user.

That is, the data collected for user A is compared with the data of other users, both similar and dissimilar to user A, to produce a list of recommended items. Examples:

One of the most famous examples of collaborative filtering is item-to-item collaborative filtering (people who buy X also buy Y), an algorithm popularized by Amazon.com's recommender system.

A famous example of collaborative filtering is item-to-item collaborative filtering, that is, "users who buy A usually also buy B". This algorithm was popularized by Amazon.
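
The "users who buy A also buy B" idea can be illustrated with a simple co-purchase count. The sketch below is a deliberately simplified stand-in, not Amazon's actual algorithm: it only counts how often two items appear in the same basket. The basket data and function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

def co_purchase_counts(baskets):
    """Count, for every item, how often each other item was bought with it."""
    counts = defaultdict(Counter)
    for basket in baskets:
        # Deduplicate so an item repeated within one basket is counted once.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts

def also_bought(item, counts, n=2):
    """Return up to n items most often bought together with `item`."""
    ranked = sorted(counts[item].items(), key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in ranked[:n]]

# Hypothetical purchase baskets
baskets = [
    ["camera", "sd_card", "tripod"],
    ["camera", "sd_card"],
    ["camera", "bag"],
    ["tripod", "bag"],
]

counts = co_purchase_counts(baskets)
print(also_bought("camera", counts))
```

Because the item-to-item table depends only on purchase history, it can be computed ahead of time and looked up when a page is served. Raw counts favor popular items, so a production system would typically normalize them, for example with cosine similarity between item vectors.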

Other examples include:
Last.fm recommends music based on a comparison of the listening habits of similar users.

Last.fm recommends music to users by comparing the listening histories of similar users.
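
The user-based idea behind this kind of recommendation, scoring items a user has not tried by how similar users rated them, can be sketched as follows. This is a generic illustration and not a description of any particular service's system. The similarity values are assumed to be precomputed, and this sketch ignores users whose similarity is zero or negative.

```python
def predict(user, item, ratings, similarity):
    """Similarity-weighted average of other users' ratings for `item`."""
    numerator = 0.0
    denominator = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        sim = similarity[user].get(other, 0.0)
        if sim <= 0:
            continue  # this sketch only uses positively similar users
        numerator += sim * their_ratings[item]
        denominator += sim
    return numerator / denominator if denominator else None

def recommend(user, ratings, similarity, n=3):
    """Rank the items `user` has not rated by predicted rating."""
    seen = set(ratings[user])
    candidates = {i for r in ratings.values() for i in r} - seen
    scored = []
    for item in candidates:
        score = predict(user, item, ratings, similarity)
        if score is not None:
            scored.append((item, score))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:n]

# Hypothetical ratings and precomputed user-user similarities
ratings = {
    "alice": {"m1": 5},
    "bob": {"m1": 4, "m2": 5, "m3": 2},
    "carol": {"m1": 1, "m2": 1, "m3": 5},
}
similarity = {"alice": {"bob": 0.9, "carol": 0.1}}

print(recommend("alice", ratings, similarity))
```

Here bob counts nine times as much as carol when predicting for alice, so the item bob rated highly (m2) is ranked ahead of the one carol rated highly (m3).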


Facebook, MySpace, LinkedIn, and other social networks use collaborative filtering to recommend new friends, groups, and other social connections (by examining the network of connections between a user and their friends).

Social networking services such as Facebook use collaborative filtering to recommend new friends: they examine a user's circle of friends to find groups of similar users to recommend.

4. Problems with collaborative filtering

Collaborative filtering approaches often suffer from three problems: cold start, scalability, and sparsity.

Refer: Sanghack Lee, Jihoon Yang, and Sung-Yong Park, Discovery of Hidden Similarity on Collaborative Filtering to Overcome Sparsity Problem, Discovery Science, 2007.

Cold start: These systems often require a large amount of existing data on a user in order to make accurate recommendations.

Cold start: Recommender systems generally require a large amount of existing data to make accurate recommendations. Wikipedia's definition of cold start includes the following:

"It concerns the issue that the system cannot draw any inferences for users or items about which it has not yet gathered sufficient information."

The solution for cold start in a recommender system is described as follows:

"In recommender systems, the cold start problem is often reduced by adopting a hybrid approach between content-based matching and collaborative filtering. New items (which have not yet received any ratings from the community) would be assigned a rating automatically, based on the ratings assigned by the community to other similar items. Item similarity would be determined according to the items' content-based characteristics."

That is, when an item has no user ratings yet, it is automatically given a score derived from similar items. Which items count as similar is determined with a content-based algorithm. In this way, collaborative filtering and content-based methods are combined.

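The hybrid approach to cold start can be sketched in a few lines of Python. In this illustration an unrated new item receives the similarity-weighted average of the community ratings of existing items, with similarity measured on content alone. Using tag sets and Jaccard similarity as the content-based measure is an assumption made here; the quoted text does not prescribe a specific measure, and the catalog data is hypothetical.

```python
def jaccard(tags_a, tags_b):
    """Content similarity: size of the tag overlap divided by size of the union."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cold_start_rating(new_item_tags, catalog_tags, average_ratings):
    """Estimate a rating for an unrated item from content-similar rated items."""
    numerator = 0.0
    denominator = 0.0
    for item, tags in catalog_tags.items():
        if item not in average_ratings:
            continue  # only items the community has already rated can help
        sim = jaccard(new_item_tags, tags)
        numerator += sim * average_ratings[item]
        denominator += sim
    return numerator / denominator if denominator else None

# Hypothetical catalog: content tags and community average ratings
catalog_tags = {
    "m1": {"sci-fi", "action"},
    "m2": {"romance", "drama"},
    "m3": {"sci-fi", "drama"},
}
average_ratings = {"m1": 4.0, "m2": 2.0, "m3": 3.0}

estimate = cold_start_rating({"sci-fi", "action", "drama"}, catalog_tags, average_ratings)
print(round(estimate, 2))
```

Once real ratings for the new item arrive, they can gradually replace this content-based estimate.
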
Scalability: In many of the environments that these systems make recommendations in, there are millions of users and products. Thus, a large amount of computation power is often necessary to calculate recommendations.

Scalability: A recommender system usually operates on data for a very large number of users and products, so computing the recommendation lists requires a huge amount of computing power.

Sparsity: The number of items sold on major e-commerce sites is extremely large. The most active users will only have rated a small subset of the overall database. Thus, even the most popular items have very few ratings.

Sparsity: Major e-commerce sites sell a huge number of products. Even very active users rate only a small fraction of them, so the overall rating coverage of items is very low. The rating matrix is therefore sparse, which complicates computation.
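
To make the sparsity problem concrete, the snippet below measures what fraction of a user-item rating matrix is actually filled in. The numbers are hypothetical. Storing only the observed ratings in nested dicts, as done here, is itself a simple sparse representation.

```python
def density(ratings, n_items):
    """Fraction of all user-item pairs that have an observed rating."""
    n_users = len(ratings)
    if n_users == 0 or n_items == 0:
        return 0.0
    observed = sum(len(user_ratings) for user_ratings in ratings.values())
    return observed / (n_users * n_items)

# Hypothetical data: 3 users, a catalog of 1000 items, 6 ratings in total
ratings = {
    "alice": {"item_1": 5, "item_2": 3},
    "bob": {"item_2": 4, "item_7": 1, "item_9": 2},
    "carol": {"item_1": 2},
}

print(density(ratings, n_items=1000))
```

Only 6 of the 3000 possible user-item pairs are known in this example, so a dense 3 x 1000 matrix would consist almost entirely of missing values.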

A particular type of collaborative filtering algorithm uses matrix factorization, a low-rank matrix approximation technique.

Accordingly, one particular type of collaborative filtering algorithm uses matrix factorization, a low-rank matrix approximation technique.
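
A minimal sketch of matrix factorization is shown below. It learns one short factor vector per user and per item so that their dot product approximates the observed ratings, trained with plain stochastic gradient descent on squared error. This is one common formulation and not the specific method of any reference listed here. The hyperparameters (k, lr, reg, epochs) and the sample ratings are assumptions chosen for illustration.

```python
import random
from math import sqrt

def rmse(ratings, P, Q):
    """Root mean squared error over the observed (user, item, rating) triples."""
    squared_error = 0.0
    for u, i, r in ratings:
        prediction = sum(pf * qf for pf, qf in zip(P[u], Q[i]))
        squared_error += (r - prediction) ** 2
    return sqrt(squared_error / len(ratings))

def factorize(ratings, n_users, n_items, k=2, lr=0.02, reg=0.01, epochs=2000, seed=0):
    """Learn rank-k user factors P and item factors Q by SGD."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    P = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            for f in range(k):
                pf, qf = P[u][f], Q[i][f]  # use the old values for both updates
                P[u][f] += lr * (err * qf - reg * pf)
                Q[i][f] += lr * (err * pf - reg * qf)
    return P, Q

# Hypothetical observed ratings: (user index, item index, rating)
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 1), (2, 2, 5)]

P, Q = factorize(ratings, n_users=3, n_items=3)
print("training RMSE:", rmse(ratings, P, Q))
# Predicted rating for a pair that was never observed (user 0, item 2):
print(sum(pf * qf for pf, qf in zip(P[0], Q[2])))
```

Only the observed entries are visited during training, which is what makes this approach workable on sparse data: the cost of one pass grows with the number of ratings, not with the number of users times the number of items.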

Refer:

I. Markovsky, Low-Rank Approximation: Algorithms, Implementation, Applications, Springer, 2012, ISBN 978-1-4471-2226-5.

Takács, G.; Pilászy, I.; Németh, B.; Tikk, D. (March 2009). "Scalable Collaborative Filtering Approaches for Large Recommender Systems". Journal of Machine Learning Research 10: 623-656.

Rennie, J.; Srebro, N. (2005). "Fast Maximum Margin Matrix Factorization for Collaborative Prediction". In Luc De Raedt, Stefan Wrobel (eds.). Proceedings of the 22nd Annual International Conference on Machine Learning. ACM Press.

 
