Ranking algorithms based on user voting (I): Delicious and Hacker News

Author: Ruan Yi Feng

Date: February 24, 2012

The advent of the Internet brought an "information explosion".

Users no longer worry about having too little information, but about having too much. How to find the most important content quickly and effectively in a mass of information has become a core problem of the Internet.

Ranking algorithms are currently one of the main means of filtering information. Ranking means ordering information by importance and keeping that order up to date. The ordering can be based on characteristics of the information itself, or on user votes, that is, letting users decide which information is ranked first.

Below, I will collect and analyze some ranking algorithms based on user voting. I plan to split this into six parts; today's is the first.

First, Delicious

The simplest and most intuitive algorithm is to rank items by the number of user votes they receive per unit of time. The items with the most votes naturally rank first.

The old version of Delicious had a "top bookmarks" leaderboard that was computed this way.

It ranked items by the number of times they were bookmarked in the last 60 minutes, and the count was refreshed every 60 minutes.

The advantages of this algorithm are that it is simple, easy to deploy, and quick to surface new content. It has two disadvantages. First, ranking changes are not smooth: content at the top in one hour often plummets in the next. Second, it lacks a mechanism for automatically retiring old items, so some popular content may stay near the top of the leaderboard for a long time.
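As a rough illustration of counting bookmarks within a 60-minute window, here is a minimal Python sketch. The function name `top_bookmarks`, the `(item, saved_at)` event format, and the sample data are assumptions made for this example; they are not taken from Delicious itself.

```python
from collections import Counter
from datetime import datetime, timedelta


def top_bookmarks(events, now, window=timedelta(minutes=60), limit=10):
    """Rank items by how many times they were saved in the window ending at `now`.

    `events` is an iterable of (item, saved_at) pairs (a hypothetical format).
    """
    cutoff = now - window
    counts = Counter(
        item for item, saved_at in events if cutoff < saved_at <= now
    )
    # most_common sorts by count, highest first.
    return counts.most_common(limit)


now = datetime(2012, 2, 24, 12, 0)
events = [
    ("a", now - timedelta(minutes=5)),
    ("a", now - timedelta(minutes=30)),
    ("b", now - timedelta(minutes=10)),
    ("c", now - timedelta(minutes=90)),  # outside the window, so ignored
]
print(top_bookmarks(events, now))
```

Because only the current window is counted, an item that was popular an hour ago contributes nothing now, which is one way to see why rankings under this scheme can change abruptly from one hour to the next.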

Second, Hacker News

Hacker News is an online community where users can post links or discuss topics.

In front of each post is an upward-pointing triangle; if you think the content is good, you click it to vote. The system automatically ranks the top articles based on the vote counts. However, the article with the most votes is not necessarily ranked first: time is also taken into account, so that a new article can reach a good ranking more easily than an old one.

Hacker News is written in Arc, a language developed by Paul Graham, and its source code can be downloaded from arclanguage.org. Its ranking algorithm is implemented in that source code.

Reduced to a mathematical formula, the ranking code computes:

Score = (P - 1) / (T + 2)^G

where

P is the number of votes the post has received; 1 is subtracted to ignore the submitter's own vote.

T is the time since the post was submitted (in hours); 2 is added to keep the denominator from being too small for the newest posts (2 was possibly chosen because it takes about two hours for an article to travel from its original publication, via other sites, to Hacker News).

G is the "gravity factor" (the "gravityth power"), the force that pulls a post's ranking downward. Its default value is 1.8; this value is discussed in detail later in this article.
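Going by the definitions of P, T, and G above, the score is (P - 1) divided by (T + 2) raised to the power G. A minimal Python sketch of that calculation follows; the function name `hn_score` and its parameter names are chosen for this example and are not the names used in the Arc source.

```python
def hn_score(votes, age_hours, gravity=1.8):
    """Score = (P - 1) / (T + 2)^G.

    votes     -> P, total votes including the submitter's own
    age_hours -> T, hours since the post was submitted
    gravity   -> G, defaulting to 1.8 as stated in the article
    """
    return (votes - 1) / (age_hours + 2) ** gravity


# A post with only the submitter's own vote scores zero at any age.
print(hn_score(1, 3))

# With equal votes, the newer post scores higher.
print(hn_score(60, 1) > hn_score(60, 5))
```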

Judging from this formula, there are three factors that determine the ranking of posts:

The first factor is the number of votes P.

Other things being equal, the more votes a post has, the higher it ranks.

In the figure for this case, three posts submitted at the same time have 200, 60, and 30 votes (199, 59, and 29 after subtracting 1), shown in yellow, purple, and blue respectively. At every point in time, the yellow curve is on top and the blue curve is at the bottom.

If you don't want the gap between high-vote and low-vote posts to be too large, you can apply an exponent smaller than 1 to the vote count, for example (P-1)^0.8.
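To see the effect of an exponent below 1, the following Python sketch compares the ratio between a 200-vote post and a 30-vote post with and without the 0.8 exponent. The helper name `damped_votes` is hypothetical, and the guard for non-positive values is an assumption added here to avoid raising a negative number to a fractional power.

```python
def damped_votes(votes, exponent=0.8):
    """Return (P - 1)^exponent, leaving non-positive values unchanged."""
    base = votes - 1
    # Assumption: only positive bases are damped; in Python a negative base
    # with a fractional exponent would yield a complex number.
    return base ** exponent if base > 0 else base


high, low = 200, 30
raw_ratio = (high - 1) / (low - 1)
damped_ratio = damped_votes(high) / damped_votes(low)

# The damped ratio is smaller, so the high-vote post leads by less.
print(round(raw_ratio, 2), round(damped_ratio, 2))
```

Since both posts share the same denominator when they have the same age, this ratio of numerators is also the ratio of their scores.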

The second factor is the time T since the post was submitted.

Other things being equal, the newer the post, the higher it ranks. Put another way, a post's ranking keeps falling as time passes.

As the previous figure shows, after 24 hours the scores of all the posts are essentially below 1, which means they all sink to the bottom of the leaderboard. This guarantees that the top-ranked items are always newer content.
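The decay over time can be checked numerically. The sketch below assumes the score is (P - 1) / (T + 2)^G with G = 1.8, and prints the scores of posts with 200, 60, and 30 votes at a few sample ages; the function name `score_at` and the sample ages are chosen for illustration.

```python
def score_at(votes, age_hours, gravity=1.8):
    """(P - 1) / (T + 2)^G for a post with `votes` votes at `age_hours` hours."""
    return (votes - 1) / (age_hours + 2) ** gravity


ages = (0, 6, 12, 24)
for votes in (200, 60, 30):
    scores = [score_at(votes, t) for t in ages]
    print(votes, [round(s, 2) for s in scores])

# At 24 hours, every one of the three example posts is below 1.
print(all(score_at(v, 24) < 1 for v in (200, 60, 30)))
```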

The third factor is the gravity factor G.

Its value determines how quickly rankings fall over time.

In the figure for this case, the three curves share all other parameters, and G takes the values 1.5, 1.8, and 2.0 respectively. The larger the value of G, the steeper the curve and the faster the ranking falls, which means the leaderboard turns over more quickly.

Once you understand how the algorithm is composed, you can tune the parameter values to suit your own application.
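One way to compare gravity values is to ask how long a post takes to drop below a given score. The Python sketch below assumes the score is (P - 1) / (T + 2)^G; the helper `hours_until_below` and the threshold of 1 are choices made for this example, not part of the Hacker News code.

```python
def hn_score(votes, age_hours, gravity):
    """(P - 1) / (T + 2)^G."""
    return (votes - 1) / (age_hours + 2) ** gravity


def hours_until_below(votes, gravity, threshold=1.0):
    """Smallest whole number of hours at which the score is below `threshold`.

    Assumes gravity > 0, so the score eventually falls under any positive
    threshold and the loop terminates.
    """
    hours = 0
    while hn_score(votes, hours, gravity) >= threshold:
        hours += 1
    return hours


# A larger G means a 200-vote post falls below the threshold sooner.
for g in (1.5, 1.8, 2.0):
    print(g, hours_until_below(200, g))
```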
