A New Way to Discover Hot Spots: the Techmeme of China


Written by: Zheng
Date: 2007-6-15
Keywords: Meme, Hotspot, Tipping Point, Techmeme

In September 2005, Memeorandum, the predecessor of Techmeme, launched and quickly caused a stir in North America. We ranked it alongside famous sites such as Slashdot and Digg and, by analogy with the Slashdot effect, proposed the term "Memeorandum effect."

Techmeme is a hot-topic engine run by Gabe Rivera. It monitors a self-defined list of blogs in real time, mines the URL links in blog posts and news media to uncover the conversations between bloggers, and presents those conversations on its front page. It has become a very effective content filter that tells us what's hot and what's not.

This link-mining approach to finding hot spots does not work in China, for a simple reason:

Chinese bloggers rarely embed URL links in their posts.

For the same reason, Google's PageRank algorithm is of little value on Chinese blogs.
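To make the point concrete, here is a minimal Python sketch of link-based hot spot ranking. It is not Techmeme's actual algorithm, which is not public in this article; the function name, the URL pattern, and the sample posts are all assumptions made for illustration. It simply counts how many distinct blogs link to each URL, and shows that the ranking comes back empty when posts contain no links.

```python
import re
from collections import defaultdict

# Simplified URL matcher (an assumption; real crawlers parse HTML anchors).
URL_PATTERN = re.compile(r"https?://[^\s<>()]+")


def rank_by_inbound_links(posts):
    """Rank URLs by the number of distinct blogs that link to them.

    posts: dict mapping a blog name to a list of post texts.
    Returns a list of (url, linking_blog_count), most-linked first.
    """
    linkers = defaultdict(set)
    for blog, texts in posts.items():
        for text in texts:
            for url in URL_PATTERN.findall(text):
                # Drop trailing punctuation picked up from the sentence.
                linkers[url.rstrip(".,;")].add(blog)
    return sorted(
        ((url, len(blogs)) for url, blogs in linkers.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical posts that link to what they discuss.
posts_with_links = {
    "blog_a": ["Great news at http://example.com/story today."],
    "blog_b": ["Replying to http://example.com/story and http://example.org/other"],
    "blog_c": ["My take: http://example.com/story"],
}

# Hypothetical posts on the same topic, but with no embedded links.
posts_without_links = {
    "blog_a": ["Great news about the big story today."],
    "blog_b": ["Replying to the big story everyone is discussing."],
    "blog_c": ["My take on the big story."],
}

print(rank_by_inbound_links(posts_with_links))
print(rank_by_inbound_links(posts_without_links))
```

In the second data set all three blogs discuss the same story, yet the link counter has nothing to rank, which is the situation described above for Chinese blogs.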

In fact, since the second half of 2006 we have been quietly developing a content engine whose automatic hot spot discovery feature (Hot Point) covers the same ground as Techmeme.

In the blink of an eye it is 2007, and the media have recently been mentioning Techmeme.

For example, on May 25 Sina published a translation of a Read/WriteWeb article under a headline that translates as "Famous Tech Blogs: The Big Battle Between Google News and Techmeme."

In June 2007, the Economic Watch published a business review from The Economist that featured online communities. It said: "Slashdot's influence began to weaken after the first dot-com bust. In recent years, the emerging Techmeme has begun to take over its former status."

Alex Barnett, who has been named one of Microsoft's ten hottest bloggers, also published the post "How I find stuff I like" on May 23, saying Techmeme was one of his three content filters: "The three main methods I use to find content I'll be interested in are: 2. Techmeme - two or three times daily. Tells me what's hot and what's not."

Intro

In January 2006, I wrote and published the Meme Engine discussion series, parts one, two, and three (the full text was also offered as a PDF download). I noted at the time that several people had announced in the media that they wanted to replicate Techmeme, but nothing further came of it. Perhaps that is because Techmeme's link-analysis algorithm simply cannot be moved to China.

Always on the road

In March 2006, I started looking for a meme engine with Chinese characteristics, and soon found that only text-mining algorithms could do the job.

Text mining of blog content still poses big problems in China. Blogs are much more complicated than news:

Text style: blog writing styles vary widely. Bloggers often do not play by the rules and write freely, which makes posts far harder to analyze than the standardized writing of news.

Scope: blogs talk about anything, from affairs of state to personal feelings, and even day-to-day running accounts.

Scattered sources: China has hundreds of large and small blog service providers (BSPs) and millions of blogs publishing posts, so it is difficult to collect them promptly and to run large-scale computation on them quickly.

In September 2006, I teamed up with Dr. Zhang Junlinghan of the Institute of Software, Chinese Academy of Sciences (CAS), to create a network of games, aiming at the broad future direction of information filters and human filters.

In October 2006, Dr. Zhang introduced the "hot spot auto-discovery" algorithm. At that time, however, the algorithm was not very mature: it performed poorly on categories that are not event-driven or news-driven, such as the Internet and gender, while performing well on news-driven categories such as celebrities and society. For this reason it was not opened to the public.

After we developed the "topic cluster aggregation" and "topic time context" algorithms for the content engine, we went back and re-optimized the hot spot auto-discovery algorithm. This time the accuracy rose to a new level, and it can truly deliver the following:

From the crawler's fetch to the output of hot spots in each area, the entire process requires no manual work, and the results can be released directly to ordinary users without editorial review.
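The article does not describe how the hot spot auto-discovery algorithm works internally, so the following Python sketch is not Dr. Zhang's method. It shows one common text-mining stand-in, term burst detection, under stated assumptions: posts have already been crawled and segmented into word tokens (Chinese text needs a word segmenter first), and a term counts as "hot" when it appears in a much larger share of today's posts than of earlier posts. The function name, parameters, and sample data are all hypothetical.

```python
from collections import Counter


def hot_terms(today_docs, baseline_docs, min_today=2, top_n=5):
    """Score terms by how much more often they appear today than before.

    today_docs, baseline_docs: lists of posts, each post a list of tokens.
    min_today: ignore terms seen in fewer than this many of today's posts.
    Returns up to top_n (term, burst_score) pairs, highest score first.
    """
    # Document frequency: in how many posts does each term appear?
    today_df = Counter(term for doc in today_docs for term in set(doc))
    base_df = Counter(term for doc in baseline_docs for term in set(doc))

    scored = []
    for term, count in today_df.items():
        if count < min_today:
            continue
        today_rate = count / len(today_docs)
        # Add-one smoothing so unseen terms do not divide by zero.
        base_rate = (base_df[term] + 1) / (len(baseline_docs) + 1)
        scored.append((term, today_rate / base_rate))

    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


# Hypothetical, already-segmented posts from earlier days.
baseline_docs = [
    ["weather", "blog"],
    ["blog", "music"],
    ["music", "weather"],
    ["blog", "travel"],
]

# Hypothetical posts from today; none of them contains a URL.
today_docs = [
    ["iphone", "launch", "blog"],
    ["iphone", "price"],
    ["blog", "iphone", "launch"],
]

for term, score in hot_terms(today_docs, baseline_docs):
    print(term, round(score, 2))
```

A score above 1 means the term is more frequent today than in the baseline. Everyday words such as "blog" score below 1, while terms tied to a new event rise to the top, without relying on any links between posts.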

 
