Principles and examples of user-based and item-based collaborative filtering

Last Update:2015-04-27 Source: Internet

Author: User

1. User-based collaborative filtering

The user-based collaborative filtering algorithm first finds other users who are similar to the target user, based on the users' historical behavior, and then predicts which items the target user might like from the ratings those similar users gave to other items. Given the user rating matrix R, the algorithm needs a similarity function s: U × U → ℝ to compute the similarity between users; the recommendations are then computed from the rating data and the similarity matrix.

An important step in collaborative filtering is choosing an appropriate similarity measure. Two commonly used measures are the Pearson correlation coefficient and cosine similarity. The Pearson correlation coefficient is computed as follows:
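In its standard form, with the sums taken over the items that both users have rated (an assumption, since variants of the formula differ on this point), the coefficient reads:

```latex
s(u, v) = \frac{\sum_{i \in I_u \cap I_v} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}
               {\sqrt{\sum_{i \in I_u \cap I_v} (r_{u,i} - \bar{r}_u)^2}\,
                \sqrt{\sum_{i \in I_u \cap I_v} (r_{v,i} - \bar{r}_v)^2}}
```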

where i denotes an item, such as a product; I_u is the set of items rated by user u; I_v is the set of items rated by user v; r_{u,i} is user u's rating of item i; r_{v,i} is user v's rating of item i; and r̄_u and r̄_v are the average ratings of users u and v, respectively.

The cosine similarity is computed as follows:
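A common form, treating each user's ratings as a vector in which unrated items count as zero (an assumption not stated in the text), is:

```latex
s(u, v) = \frac{\sum_{i \in I_u \cap I_v} r_{u,i}\, r_{v,i}}
               {\sqrt{\sum_{i \in I_u} r_{u,i}^2}\;\sqrt{\sum_{i \in I_v} r_{v,i}^2}}
```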

Another important step is to predict user u's rating for items the user has not yet rated. First, using the similarities computed in the previous step, find the neighbor set N ⊆ U of user u, where N is the neighbor set and U is the set of all users. Then, using the rating data, predict user u's rating of item i as follows:
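A widely used mean-centered form is sketched here. The normalization by the sum of absolute similarities is an assumption, and the symbol p_{u,i} for the predicted rating is borrowed from the item-based section:

```latex
p_{u,i} = \bar{r}_u + \frac{\sum_{u' \in N} s(u, u')\,(r_{u',i} - \bar{r}_{u'})}
                           {\sum_{u' \in N} \lvert s(u, u') \rvert}
```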

where s(u, u') is the similarity between user u and user u'.

Suppose we are given an e-commerce rating data set, shown in Table 3-6, and want to predict user C's rating for item 4.

Table 3-6 e-commerce website user ratings data set

In the table, "?" indicates that the rating is unknown. Following the steps of the user-based collaborative filtering algorithm, user C's rating for item 4 is calculated as shown below.

(1) Find the neighbors of user C

As the data set shows, only user A and user D have rated item 4, so there are only two candidate neighbors: user A and user D. User A has an average rating of 4, user C has an average rating of 3.667, and user D has an average rating of 3. According to the Pearson correlation coefficient formula, the similarity between user C and user A is:

Similarly, s(C, D) = -0.515.

(2) Predict user C's rating for item 4

Using the rating prediction formula above, user C's rating for item 4 is calculated as follows:

The other unknown ratings can be calculated in the same way.
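The two steps above can be sketched in Python. The ratings below are illustrative and are not the original Table 3-6; they are an assumption, chosen only so that the user averages match those quoted in the text (A = 4, C = 3.667, D = 3). The function names, the use of co-rated items in the Pearson sums, and the normalization by absolute similarities are likewise assumptions of this sketch.

```python
from math import sqrt

# Illustrative ratings (an assumption, not the original Table 3-6), chosen so
# that the user averages match the text: A = 4, C = 3.667, D = 3.
ratings = {
    "A": {1: 4, 3: 3, 4: 5},
    "C": {1: 5, 2: 4, 3: 2},
    "D": {1: 2, 2: 4, 4: 3},
}

def mean(user):
    """Average of all ratings the user has given."""
    vals = ratings[user].values()
    return sum(vals) / len(vals)

def pearson(u, v):
    """Pearson correlation, summed over the items rated by both u and v."""
    common = ratings[u].keys() & ratings[v].keys()
    if not common:
        return 0.0
    mu, mv = mean(u), mean(v)
    num = sum((ratings[u][i] - mu) * (ratings[v][i] - mv) for i in common)
    den_u = sqrt(sum((ratings[u][i] - mu) ** 2 for i in common))
    den_v = sqrt(sum((ratings[v][i] - mv) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0
    return num / (den_u * den_v)

def predict(u, item):
    """Mean-centered, similarity-weighted prediction of u's rating for item."""
    # Step (1): the candidate neighbors are the other users who rated the item.
    neighbors = [v for v in ratings if v != u and item in ratings[v]]
    sims = {v: pearson(u, v) for v in neighbors}
    norm = sum(abs(s) for s in sims.values())
    if norm == 0:
        return mean(u)  # no usable neighbor: fall back to the user's average
    # Step (2): add the weighted neighbor deviations to u's own average.
    weighted = sum(s * (ratings[v][item] - mean(v)) for v, s in sims.items())
    return mean(u) + weighted / norm

print(round(pearson("C", "D"), 3))
print(round(predict("C", 4), 3))
```

With this illustrative data, the similarity between C and D agrees with the value -0.515 quoted above to within rounding, and the predicted rating for item 4 comes out at roughly 4.27. Because user D rated item 4 exactly at D's own average, D contributes nothing to the weighted sum in this particular case.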

2. Item-based collaborative filtering

The item-based collaborative filtering algorithm is another common approach. Unlike the user-based algorithm, it predicts user ratings from the similarity between items. Because the item similarities can be computed in advance, this can improve performance. The item-based algorithm predicts a user's rating of a target item from the user rating data and the precomputed item similarity matrix.

As in the user-based algorithm, the similarity between items must be computed first. The Pearson correlation coefficient or cosine similarity can again be used. Here we give a similarity measure commonly used in e-commerce systems, which computes the similarity between items based on conditional probability. The formula is as follows:
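One common conditional-probability form is given here. Writing the damping factor as α and applying it as an exponent on Freq(j) are assumptions, based on how this measure is usually stated:

```latex
s(i, j) = \frac{\mathrm{Freq}(ij)}{\mathrm{Freq}(i) \times \mathrm{Freq}(j)^{\alpha}}
```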

where s(i, j) is the similarity between items i and j; Freq(ij) is the frequency with which i and j occur together; Freq(i) is the frequency with which i occurs; Freq(j) is the frequency with which j occurs; and α is a damping factor, used mainly to control the influence of popular items, such as best-selling products in e-commerce.

Next, using the item similarity matrix computed above, unknown ratings are predicted from the user's known ratings. The prediction formula is as follows:
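A common weighted-average form is sketched here (the normalization by the sum of absolute similarities is an assumption):

```latex
p_{u,i} = \frac{\sum_{j \in S} s(i, j)\, r_{u,j}}{\sum_{j \in S} \lvert s(i, j) \rvert}
```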

where p_{u,i} is the predicted rating of user u for item i; S is the set of items similar to item i; s(i, j) is the similarity between items i and j; and r_{u,j} is user u's rating of item j.
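The item-based procedure can be sketched in Python as follows. Everything specific here is an assumption made for illustration: the ratings are hypothetical, the damping factor is set arbitrarily to 0.5 and applied as an exponent on Freq(j), frequencies are counted as the number of users who rated an item, and the similar-item set S is simply every item the user has rated that co-occurs with the target item.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical user ratings (illustrative data, not from the article).
ratings = {
    "u1": {"i1": 5, "i2": 3, "i3": 4},
    "u2": {"i1": 4, "i2": 2},
    "u3": {"i2": 4, "i3": 5},
    "u4": {"i1": 3, "i3": 4},
    "u5": {"i1": 4},
}

ALPHA = 0.5  # damping factor; an arbitrary value chosen for this sketch

def item_similarities(ratings, alpha=ALPHA):
    """Precompute s(i, j) = Freq(ij) / (Freq(i) * Freq(j) ** alpha)."""
    freq = defaultdict(int)     # Freq(i): number of users who rated item i
    co_freq = defaultdict(int)  # Freq(ij): number of users who rated both
    for items in ratings.values():
        for i in items:
            freq[i] += 1
        for i, j in combinations(sorted(items), 2):
            co_freq[(i, j)] += 1
            co_freq[(j, i)] += 1
    return {
        (i, j): c / (freq[i] * freq[j] ** alpha)
        for (i, j), c in co_freq.items()
    }

def predict(user, item, sims):
    """Similarity-weighted average of the user's ratings on similar items."""
    num = den = 0.0
    for j, r in ratings[user].items():
        s = sims.get((item, j), 0.0)  # 0 if the items never co-occur
        num += s * r
        den += abs(s)
    return num / den if den else None

# The similarity table is built once and reused for every prediction.
sims = item_similarities(ratings)
print(round(predict("u2", "i3", sims), 3))
```

Note that this similarity is not symmetric: because i1 is rated by more users than i2 in the sample data, s(i1, i2) and s(i2, i1) differ, which is how the damping factor adjusts for item popularity.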

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
