Paper Reading Notes: Personalized News Recommendation Based on Click Behavior

Source: Internet
Author: User

Note: This paper describes Google's system for recommending news based on users' click behavior. The overall process is as follows: the system first analyzes a large volume of click logs, then predicts each user's news preferences within a Bayesian framework. The prediction combines content-based recommendation with collaborative filtering to produce personalized news recommendations.

The main points are as follows:

1. Limitations of Collaborative Filtering Alone

A system that relies only on collaborative filtering can recommend a news item only after other users have clicked on it. News is updated continuously, however, so a newly published article has not yet collected the clicks needed to recommend it in time. In addition, users differ from one another, and plain collaborative filtering does not take the individual differences between users into account.

2. Log Analysis

Reading news differs from searching. A search has a specific goal, whereas a news reader is simply looking for something interesting, and those interests change constantly. For the log analysis, news is divided into a set of categories C = {c_1, c_2, ..., c_n}, and D(u, t) denotes the click distribution of user u in month t:

$$D(u, t) = \left( \frac{N_1}{N_{total}}, \frac{N_2}{N_{total}}, \dots, \frac{N_n}{N_{total}} \right)$$

Here N_i is the number of times the user clicked on news in category c_i during month t, and N_total is the total number of clicks by the user during that month. D(u, t) therefore gives the proportion of the user's clicks that falls in each news category.
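As a minimal sketch of the definition above, the click distribution for one user and one month can be computed from a list of clicked categories. The function name, the category names, and the sample click log are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def click_distribution(clicked_categories, categories):
    """Return D(u, t): the share of one user's clicks per category in one month."""
    counts = Counter(clicked_categories)
    # N_total: clicks that fall in one of the known categories
    n_total = sum(counts[c] for c in categories)
    if n_total == 0:
        # No clicks this month: return an all-zero distribution
        return {c: 0.0 for c in categories}
    return {c: counts[c] / n_total for c in categories}


# Hypothetical click log: one category label per click
clicks = ["sports", "sports", "world", "tech"]
d_ut = click_distribution(clicks, ["world", "sports", "tech", "business"])
```

Returning a dictionary keyed by category keeps the distributions of different months easy to compare even if a category receives no clicks in one of them.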

3. User Preferences

If a user's preferences stayed the same, the user's click distribution would also stay the same from month to month, so the change in preferences can be measured by comparing the user's click distributions across months.

The results show that user preferences change considerably over time, which means that a user's old click records are of little value for predicting future behavior.
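One way to quantify such a change is sketched below. The notes do not state which measure the paper uses, so the total variation distance (half the L1 distance) is an assumption chosen here for illustration, and the function name and sample distributions are hypothetical.

```python
def distribution_shift(d_prev, d_curr):
    """Half the L1 distance between two click distributions.

    0.0 means the distributions are identical; 1.0 means they share no category.
    This metric is an illustrative assumption, not the paper's measure.
    """
    categories = set(d_prev) | set(d_curr)
    return 0.5 * sum(
        abs(d_prev.get(c, 0.0) - d_curr.get(c, 0.0)) for c in categories
    )


# Hypothetical click distributions of one user in two consecutive months
january = {"world": 0.5, "sports": 0.5, "tech": 0.0}
february = {"world": 0.25, "sports": 0.25, "tech": 0.5}
shift = distribution_shift(january, february)
```

A value near zero for most users would indicate stable preferences, while consistently large values would match the conclusion above that preferences drift over time.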

4. Changes in News Trends

D(t) denotes the click distribution of all users in a country during period t. The news trend shifts sharply when a major event occurs, and news trends differ between countries because readers in different countries have different preferences. News trends may influence individual preferences, and users in the same region tend to have similar preferences.

5. Four Conclusions

- A user's personal preferences do change over time.

- The click distribution of the general public reflects the news trend, which is usually tied to major events.

- Different news trends exist in different places.

- To a certain extent, the news preferences of an individual user are consistent with the local news trend.

6. Bayesian Framework

User preferences are split into two parts: the user's genuine preferences and the influence of the local news trend. The genuine preferences are long-term and tied directly to the user, reflecting the user's personal characteristics. The influence of the local news trend is short-term and changes rapidly. The method has three steps. First, the news trend is factored out and the user's genuine preferences are estimated from the click records of each period. Second, the estimates from all periods are combined into a more accurate estimate of the user's genuine preferences. Finally, the user's current preferences are predicted from the genuine preferences and the current local news trend.

Applying Bayes' rule to the clicks in period t gives:

$$P^t(\text{click} \mid \text{category} = c_i) = \frac{P^t(\text{category} = c_i \mid \text{click}) \, P^t(\text{click})}{P^t(\text{category} = c_i)}$$

P^t(category = c_i | click) is the probability that a click by the user falls in category c_i, which can be obtained from D(u, t). P^t(category = c_i) is the prior probability that a news item belongs to category c_i, and P^t(click) is the prior probability that the user clicks. P^t(click | category = c_i) is the probability that the user clicks on a news item of category c_i, which gives the user's preference for news of category c_i in period t.


We assume that the prior probability of a click by the user, P(click), stays the same across periods.

Next, consider the influence of the local news trend. The click distribution of local users over a short recent period reflects that trend and is written P^0(category = c_i).

The probability of the user clicking on news can again be assumed to be fixed.

To smooth the estimate, a number of virtual clicks is added.
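The steps of this section can be written out as follows. This is a reconstruction from the definitions given here, not a quotation from the paper. Three symbols are assumptions introduced for the derivation: N^t is the user's total number of clicks in period t, G is the number of virtual clicks, and r^t(c_i) is shorthand for the ratio P^t(category = c_i | click) / P^t(category = c_i). The virtual clicks are assumed to follow the current news trend, so each contributes a ratio of 1. The block requires the amsmath package.

```latex
\begin{align*}
% 1. Combine the per-period estimates, weighting period t by its click count N^t
\mathrm{interest}(c_i)
  &= \frac{\sum_t N^t \, P^t(\mathrm{click} \mid \mathrm{category} = c_i)}{\sum_t N^t}
   = \frac{\sum_t N^t \, r^t(c_i) \, P^t(\mathrm{click})}{\sum_t N^t} \\
% 2. Constant click prior: P^t(click) = P(click) for every t
  &= P(\mathrm{click}) \, \frac{\sum_t N^t \, r^t(c_i)}{\sum_t N^t} \\[1ex]
% 3. Current period (superscript 0): Bayes' rule with the local trend as the prior
P^0(\mathrm{category} = c_i \mid \mathrm{click})
  &= \frac{P^0(\mathrm{click} \mid \mathrm{category} = c_i) \, P^0(\mathrm{category} = c_i)}{P^0(\mathrm{click})} \\
% 4. Substitute interest(c_i) for P^0(click | category = c_i) and P(click) for P^0(click)
  &\approx P^0(\mathrm{category} = c_i) \, \frac{\sum_t N^t \, r^t(c_i)}{\sum_t N^t} \\[1ex]
% 5. Smoothing: add G virtual clicks, each with ratio r = 1
P^0(\mathrm{category} = c_i \mid \mathrm{click})
  &\propto P^0(\mathrm{category} = c_i) \, \frac{\sum_t N^t \, r^t(c_i) + G}{\sum_t N^t + G}
\end{align*}
```

With no click history the fraction in the last line equals 1, so the prediction reduces to the local news trend P^0(category = c_i).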

The advantage of this approach is that when a user has few recorded clicks, the system bases its recommendations mainly on the current news trend. At the same time, the estimate of the user's preferences can be updated continuously as new clicks arrive.
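The prediction described in this section can be sketched in a few lines of Python. This is an illustrative reading of the notes rather than the paper's implementation: the function name, the data layout, and the default of 10 virtual clicks are assumptions, and the virtual clicks are assumed to follow the current news trend.

```python
def predict_interest(history, trend_now, virtual_clicks=10):
    """Score each category in proportion to P^0(category = c | click).

    history: list of (n_clicks, user_dist, population_dist), one per past period t
        n_clicks            the user's total clicks in period t (N^t)
        user_dist[c]        P^t(category = c | click) for this user, i.e. D(u, t)
        population_dist[c]  P^t(category = c)
    trend_now[c]: P^0(category = c), the current local news trend
    virtual_clicks: G, smoothing clicks assumed to follow the current trend
    """
    total_clicks = sum(n for n, _, _ in history)
    scores = {}
    for category, p_trend in trend_now.items():
        weighted = 0.0
        for n, user_dist, population_dist in history:
            p_category = population_dist.get(category, 0.0)
            if p_category > 0:
                # N^t * P^t(category | click) / P^t(category)
                weighted += n * user_dist.get(category, 0.0) / p_category
        scores[category] = (
            p_trend * (weighted + virtual_clicks) / (total_clicks + virtual_clicks)
        )
    return scores


# Hypothetical data: one past period in which the user favored sports
trend = {"sports": 0.5, "world": 0.5}
history = [(20, {"sports": 0.8, "world": 0.2}, {"sports": 0.4, "world": 0.6})]
scores = predict_interest(history, trend)
new_user_scores = predict_interest([], trend)
```

For a user without any history the smoothing term dominates and the scores equal the trend itself, which mirrors the fallback behavior described above. The scores are unnormalized; dividing each by their sum would turn them into a probability distribution.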
