The difference between cosine similarity, the Pearson coefficient, and adjusted cosine similarity in item-based collaborative filtering

Source: Internet
Author: User

Suppose the data is as follows, where each row represents a user and each column represents a rated item:


Let's look at the three formulas first.

Cosine similarity (cosine-based similarity):
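
A standard way to write this measure is sketched below. It assumes r_{u,i} is the rating user u gives item i, U is the set of all users, and an unrated item counts as r_{u,i} = 0 (the convention this article describes for cosine similarity); the symbol U is introduced here for illustration.

```latex
\[
\mathrm{sim}(i,j) = \cos(\vec{i},\vec{j})
= \frac{\sum_{u \in U} r_{u,i}\, r_{u,j}}
       {\sqrt{\sum_{u \in U} r_{u,i}^{2}}\;\sqrt{\sum_{u \in U} r_{u,j}^{2}}}
\]
```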


Pearson coefficient (Pearson correlation):
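
A standard way to write this measure is sketched below. It assumes r_{u,i} is the rating user u gives item i, U_{ij} is the set of users who rated both i and j, and \bar{r}_i is the mean rating of item i taken over U_{ij}; the symbols U_{ij} and \bar{r}_i are introduced here for illustration.

```latex
\[
\mathrm{sim}(i,j)
= \frac{\sum_{u \in U_{ij}} (r_{u,i} - \bar{r}_i)(r_{u,j} - \bar{r}_j)}
       {\sqrt{\sum_{u \in U_{ij}} (r_{u,i} - \bar{r}_i)^{2}}\;
        \sqrt{\sum_{u \in U_{ij}} (r_{u,j} - \bar{r}_j)^{2}}}
\]
```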


Adjusted cosine similarity:
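
A standard way to write this measure is sketched below. It assumes r_{u,i} is the rating user u gives item i, U_{ij} is the set of users who rated both i and j, and \bar{r}_u is the mean of user u's ratings over the items that user actually rated; the symbols U_{ij} and \bar{r}_u are introduced here for illustration.

```latex
\[
\mathrm{sim}(i,j)
= \frac{\sum_{u \in U_{ij}} (r_{u,i} - \bar{r}_u)(r_{u,j} - \bar{r}_u)}
       {\sqrt{\sum_{u \in U_{ij}} (r_{u,i} - \bar{r}_u)^{2}}\;
        \sqrt{\sum_{u \in U_{ij}} (r_{u,j} - \bar{r}_u)^{2}}}
\]
```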


Here r_{u,i} denotes the rating that user u gives to item i.


1. Cosine similarity compared with the other two

Cosine similarity is computed over all users for items i and j. This includes users who did not rate one of the items; a missing rating is treated as 0.

The Pearson coefficient and adjusted cosine similarity are computed only over the users who have rated both i and j.

Summary: cosine similarity differs from the other two in the set of users over which the formula is computed.
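
The difference in user sets can be sketched in a few lines of Python. The ratings table below is hypothetical (it is not this article's data), and the function names are chosen only for illustration. Rows are users, columns are items, and None marks an unrated item.

```python
import math

# Hypothetical ratings, NOT the article's table.
# Rows are users, columns are items; None means "not rated".
ratings = [
    [5, 3, 4],
    [4, 2, None],
    [1, 5, 2],
    [None, 4, 5],
]

def item_column(ratings, i, fill=0):
    """Ratings of item i over ALL users, with missing ratings replaced by `fill`."""
    return [row[i] if row[i] is not None else fill for row in ratings]

def cosine_similarity(ratings, i, j):
    """Cosine similarity of items i and j over all users, unrated counted as 0."""
    a = item_column(ratings, i)
    b = item_column(ratings, j)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an item nobody rated has no direction
    return dot / (norm_a * norm_b)

def co_rating_users(ratings, i, j):
    """Indices of users who rated BOTH item i and item j."""
    return [u for u, row in enumerate(ratings)
            if row[i] is not None and row[j] is not None]

print(cosine_similarity(ratings, 0, 1))
print(co_rating_users(ratings, 0, 1))
```

With these hypothetical ratings the cosine similarity of items 0 and 1 comes to roughly 0.59. User 3 never rated item 0 but still contributes to item 1's norm, whereas the co-rating set used by the other two measures contains only users 0, 1 and 2.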


2. The Pearson coefficient compared with adjusted cosine similarity

From the formulas, the two differ in the mean that is subtracted from each rating: the item average in one case, the user average in the other.

In the Pearson coefficient, the mean is the average rating of item i over all users who have rated both i and j. That is, when computing the Pearson coefficient, first keep only the rows of users who rated both i and j, then take the average of column i over those rows.

In adjusted cosine similarity, the mean is the average of the ratings user u has given. Items the user has not rated are not counted as 0 when computing this average; they are simply ignored.

Summary: the Pearson coefficient and adjusted cosine similarity differ in how the ratings are centered. The Pearson coefficient centers on the item average, and adjusted cosine similarity centers on the user average.
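
The two ways of centering can be compared in a short Python sketch. The ratings table is hypothetical (it is not this article's data), the function names are chosen only for illustration, and the Pearson item average is taken over co-rating users, as described in this section.

```python
import math

# Hypothetical ratings, NOT the article's table.
# Rows are users, columns are items; None means "not rated".
ratings = [
    [5, 3, 4],
    [4, 2, None],
    [1, 5, 2],
    [None, 4, 5],
]

def co_rating_users(ratings, i, j):
    """Indices of users who rated both item i and item j."""
    return [u for u, row in enumerate(ratings)
            if row[i] is not None and row[j] is not None]

def centered_cosine(pairs):
    """Cosine of already-centered (x, y) pairs; 0.0 if either side is all zeros."""
    num = sum(x * y for x, y in pairs)
    den = (math.sqrt(sum(x * x for x, _ in pairs))
           * math.sqrt(sum(y * y for _, y in pairs)))
    return num / den if den else 0.0

def pearson(ratings, i, j):
    """Center each item on its average over the co-rating users."""
    users = co_rating_users(ratings, i, j)
    if not users:
        return 0.0
    mean_i = sum(ratings[u][i] for u in users) / len(users)
    mean_j = sum(ratings[u][j] for u in users) / len(users)
    return centered_cosine(
        [(ratings[u][i] - mean_i, ratings[u][j] - mean_j) for u in users])

def user_mean(row):
    """Average of the items this user rated; unrated items are ignored, not 0."""
    rated = [r for r in row if r is not None]
    return sum(rated) / len(rated)

def adjusted_cosine(ratings, i, j):
    """Center each rating on the average of the user who gave it."""
    pairs = []
    for u in co_rating_users(ratings, i, j):
        m = user_mean(ratings[u])
        pairs.append((ratings[u][i] - m, ratings[u][j] - m))
    return centered_cosine(pairs)

print(pearson(ratings, 0, 1))
print(adjusted_cosine(ratings, 0, 1))
```

Both functions sum over the same co-rating users (0, 1 and 2 for items 0 and 1) and differ only in the mean that is subtracted. With these hypothetical ratings the Pearson coefficient is roughly -0.84 and the adjusted cosine similarity is roughly -0.99.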


References:

1. http://www.zhihu.com/question/21824291

2. http://www10.org/cdrom/papers/519/node11.html

3. http://guidetodatamining.com/assets/guidechapters/datamining-ch3.pdf


If you find errors or have suggestions, please let me know. O(∩_∩)O Thank you.
