Zheng Yi 20090905
In social media, whenever search results or page listings are not hand-picked by editors but produced by machine intelligence, they have to be ranked by some rule.
Beyond sorting by time or by number of votes, what other effective ranking modes are there?
Here are the ones I have seen:
Mode 1: Reddit Mode
An article explaining Reddit's sorting algorithm says that Reddit takes the following factors into account:
- Freshness of the article;
- Upvotes and downvotes;
- The discoverer and follower effect (the votes of followers carry less weight).
Figure 1 Reddit sorting example
This shows how important it is to let fresh articles that have not yet collected many votes break into the list quickly.
The SPEAR mode described in "Five techniques for finding experts from the massive data volumes of social media" holds that "experts should be discoverers, not followers of trends. Experts should be among the first to collect and mark high-quality articles, and so draw in other users of the community. The earlier a user discovers high-quality content, the higher that user's level of expertise. We therefore need to distinguish 'discoverers' from 'followers'." Reddit takes log10 of the vote count so that earlier votes (the discoverers) carry greater weight. For example, the first 10 votes together weigh as much as votes 11 through 100.
As you may know, playgroup SR uses the same sorting rules when listing popular links, although I simplified the algorithm.
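The log10 weighting and the freshness factor described above can be sketched as follows. This is a minimal illustration, not Reddit's actual code: the function names are hypothetical, and the 45000-second divisor is the constant commonly quoted from Reddit's published ranking code, used here as an assumption.

```python
import math


def vote_weight(net_votes: int) -> float:
    """Log10-compressed vote score: votes 1-10 add as much as votes 11-100."""
    magnitude = max(abs(net_votes), 1)
    sign = (net_votes > 0) - (net_votes < 0)
    return sign * math.log10(magnitude)


def hot_score(upvotes: int, downvotes: int, age_seconds: float,
              decay_seconds: float = 45000.0) -> float:
    """Higher is better. Each decay_seconds of age costs one log10 unit,
    so an old article needs ten times the net votes to hold its place."""
    return vote_weight(upvotes - downvotes) - age_seconds / decay_seconds


# A one-hour-old article with 5 votes versus a 100-hour-old one with 500.
fresh = hot_score(upvotes=5, downvotes=0, age_seconds=3600)
stale = hot_score(upvotes=500, downvotes=0, age_seconds=100 * 3600)
print(fresh > stale)
```

Because the vote term grows only logarithmically while the age penalty grows linearly, a fresh article with few votes can outrank an older article with many.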
Mode 2: OneRiot PulseRank Mode
The real-time search engine OneRiot built PulseRank to take social factors fully into account, so that search results can be sorted by social relevance.
PulseRank considers the following factors:
- Freshness;
- Domain authority: teams may disagree on whether the domain of a traditional portal deserves more weight or the domain of an independent blog is more valuable;
- Recommender weight: the system must be able to tell whether a recommender is a spammer. It needs to notice recommenders who always recommend links under the same link or domain name (if you recommend links from one website day after day, your weight should be reduced), and it also needs to notice people whose recommendations consistently achieve a wider range of "secondary dissemination";
- Propagation acceleration: mainly looks at the rate of recommendations, to distinguish new pages from pages that are already well known and popular.
It also considers the number of recommendations from Twitter, Digg, and OneRiot share.
The more recommendations a page has, the more likely it is to rank first in the Pulse search results; the fresher the page, the less noticeable the effects of the other factors. This is an enhanced version of the Reddit mode: it aggregates recommendation counts across different social sites and adds several factors.
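To make the interaction of these factors concrete, here is a rough sketch of a composite score. It is not OneRiot's formula, which the source does not give: the function name, the half-life, the one-hour window, the damping constant, and the way the terms are combined are all assumptions chosen only to reproduce the qualitative behavior described above.

```python
import math
from dataclasses import dataclass


@dataclass
class Recommendation:
    recommender_weight: float  # hypothetical 0..1 trust; low for suspected spammers
    age_hours: float           # hours since this recommendation was made


def social_rank(recs, page_age_hours, domain_authority,
                half_life_hours=6.0, window_hours=1.0, damping=0.8):
    """Illustrative score combining freshness, authority, trust and velocity."""
    # Freshness decays from 1 toward 0 as the page ages.
    freshness = 0.5 ** (page_age_hours / half_life_hours)
    # Recommendation volume, with each recommender scaled by trust.
    total = sum(r.recommender_weight for r in recs)
    # Acceleration proxy: trusted recommendations per hour in the recent window.
    recent_rate = sum(r.recommender_weight for r in recs
                      if r.age_hours <= window_hours) / window_hours
    social = (0.5 + domain_authority) * (math.log10(1 + total)
                                         + math.log10(1 + recent_rate))
    # The fresher the page, the more the social term is damped.
    return freshness + (1 - damping * freshness) * social


trusted = [Recommendation(1.0, 0.5)] * 3
spammy = [Recommendation(0.1, 0.5)] * 3
print(social_rank(trusted, 2, 0.5) > social_rank(spammy, 2, 0.5))
```

In this sketch a recommender who keeps pushing one domain would simply be given a low `recommender_weight`; how that weight is learned is left out.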
References:
1. Ranking algorithm for the realtime Web: OneRiot "Pulse Rank" update
Mode 3: Digg Mode
Digg weighs many things:
1. Speed of voting: for example, if an article quickly collects 40~50 votes in its first half hour, it does not matter who voted; the article goes to the homepage.
2. Level of the voting users. However, Digg's "A couple updates" post announced that top users are often associated with abnormal and abusive behavior, so the weight of this factor will keep decreasing. If you have a lot of friends, the articles you submit need more diggs to reach the homepage, usually 2~3 times as many.
3. Number of comments and their ratings. If an article has 40 comments, 20 of which are rated below -4, the article will not go to the homepage.
4. Number of buries. The type of bury is also considered, such as duplicate story, spam, or wrong category. If an article in the upcoming queue gets three buries, it stays buried for good. If the article is on the homepage and has 1000 diggs, about 10~15 buries can make it disappear (disappearing means that it can only be reached through its final page and no longer appears on any category navigation page).
5. Popular ratio of the voting users. If 10~15 users with a popular ratio above 70% have dugg an article, it easily reaches the homepage. You can check each user's popular ratio on the Digg user page.
Digg's algorithm has been tested over a long time and is constantly revised. It makes full use of all the information it can collect and is worth learning from.
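The rules above can be read as a small decision procedure. The sketch below is only an illustration of that reading, not Digg's implementation: the `Story` fields, the `decide` function, the order in which rules are checked, and the exact cut-offs (40 early votes, 10 homepage buries, 10 high-ratio voters, half of the comments rated below -4) are assumptions taken from the lower end of the ranges quoted in the list.

```python
from dataclasses import dataclass


@dataclass
class Story:
    votes_first_30_min: int = 0
    comments: int = 0
    comments_rated_below_minus_4: int = 0
    buries: int = 0
    diggs: int = 0
    on_homepage: bool = False
    high_ratio_voters: int = 0  # voters whose popular ratio exceeds 70%


def decide(story: Story) -> str:
    """Return 'buried', 'keep', 'hold', 'promote' or 'wait'."""
    if story.on_homepage:
        # Around 10 buries can remove even a story with 1000 diggs.
        return "buried" if story.buries >= 10 else "keep"
    if story.buries >= 3:
        return "buried"  # three buries in the upcoming queue
    badly_rated = story.comments_rated_below_minus_4 * 2 >= story.comments
    if story.comments >= 40 and badly_rated:
        return "hold"  # heavily down-rated discussion blocks promotion
    if story.votes_first_30_min >= 40:
        return "promote"  # fast early voting, regardless of who voted
    if story.high_ratio_voters >= 10:
        return "promote"  # enough voters with a high popular ratio
    return "wait"


print(decide(Story(votes_first_30_min=45)))
print(decide(Story(votes_first_30_min=45, buries=3)))
```

Checking the negative signals (buries, badly rated comments) before the positive ones is a design choice of this sketch; the source does not say how Digg orders them.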
Like Digg, Newsvine weighs many factors:
- User reputation;
- Reputation of the user's friends;
- Comments;
- Domain name weight;
- Page views and time spent on the page.
References:
1. The Digg algorithm: unofficial FAQ
2. Newsvine algorithm and potential ranking factors for exposure
Mode 4: Seeds Mode
This is an exploratory, statistical method that lets a third-party application dig into a social media site. Select a set of key users in advance (such as the founders and other core users, known as "seeds"), then crawl the social graph outward from these users, counting inbound links and friends, to obtain rankings on different metrics for the site being crawled. This is the method used by Spinn3r Rank. The mode is not limited to computing top users.
Two techniques are often used:
- Crawling from approved sources: a good algorithm should of course start from good sources, as Techmeme and playlist SR do;
- Traversing friendships: spammers and low-level users are unlikely to get connections from the seeds.
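The seed-based crawl can be sketched as a breadth-first traversal. This is a minimal illustration under assumptions, not Spinn3r's implementation: the `rank_from_seeds` function, the `follows` dictionary shape, and the depth limit are all hypothetical.

```python
from collections import Counter, deque


def rank_from_seeds(follows, seeds, max_depth=2):
    """Breadth-first crawl from trusted seeds, counting inbound links.

    follows maps each user to the users they link to or befriend.
    Only users reachable from the seeds are ever expanded, so links
    that come solely from unreachable accounts are never counted.
    """
    depth = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    inbound = Counter()
    while queue:
        user = queue.popleft()
        if depth[user] >= max_depth:
            continue  # do not expand beyond the depth limit
        for friend in follows.get(user, ()):
            inbound[friend] += 1
            if friend not in depth:
                depth[friend] = depth[user] + 1
                queue.append(friend)
    return inbound.most_common()


follows = {
    "founder": ["alice", "bob"],
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "spammer": ["spammer2", "alice"],
    "spammer2": ["spammer"],
}
print(rank_from_seeds(follows, seeds=["founder"]))
```

In the sample graph the two spam accounts link to each other, but nobody reachable from the seed links to them, so they never enter the ranking.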
Those are the sorting rules I have observed in social media. If you have anything to add, please leave a comment or follow me.
Zheng, Beijing, 20090905
Other references:
1. What is InfluenceRank?
2. Ranking by semantic similarity
We recommend that you read the following articles:
1. Four modes of developing value-added social media;
2. Analyze the network trajectory and fragmentation of a person;
3. [semantic] sentiment analysis direction: 0908;
4. Five techniques for finding experts from the massive data volumes of social media.