The third law of search engines - search engine technology
Source: Internet
Author: User
Today, search engines stand at a turning point, building on the past and opening up the future. To explain the third law, let's first review the first and second laws.
........................................ ........................................ .....
■ Law 1: the law of relevance
This may sound like an academic paper. Although the first and second laws have never been stated as such before, their content has long been recognized by industry and academia. In fact, the first law was widely studied by academia before the Internet emerged: it is the so-called law of relevance. At that time, the field was called information retrieval, or full-text retrieval.
At that time, relevance was based on term-frequency statistics. That is, when a user enters a keyword, the search engine looks at how often that keyword appears in each article (web page) and whether it appears in the more important positions, adds a weighting for how common the search term is, and finally outputs a result (the search results page). Early search engines such as Infoseek, Excite, and Lycos ranked their results according to this first law. They basically followed the results of academic research from before the Internet age. The industry focused mainly on handling heavy traffic and large volumes of data, and made no breakthrough in relevance ranking.
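The term-frequency ranking described above can be sketched roughly as follows. This is only an illustrative sketch, not the formula used by any of the engines named: the function names, the `title_boost` parameter, the choice of the title as the "important position", and the smoothed inverse-document-frequency weight are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevance_score(query, title, body, doc_freq, num_docs, title_boost=2.0):
    """Score one page using term frequency, position, and term rarity.

    title_boost is a hypothetical weight: a hit in the title counts
    more than a hit in the body.
    """
    title_counts = Counter(tokenize(title))
    body_counts = Counter(tokenize(body))
    score = 0.0
    for term in tokenize(query):
        # Term frequency, with title occurrences weighted more heavily.
        tf = title_boost * title_counts[term] + body_counts[term]
        # Common terms get a smaller weight than rare ones (smoothed IDF).
        idf = math.log((1 + num_docs) / (1 + doc_freq.get(term, 0))) + 1.0
        score += tf * idf
    return score


def rank(query, pages):
    """pages maps page id -> (title, body); returns ids, best match first."""
    doc_freq = Counter()
    for title, body in pages.values():
        # Count each term once per page to get its document frequency.
        doc_freq.update(set(tokenize(title + " " + body)))
    scores = {
        page_id: relevance_score(query, title, body, doc_freq, len(pages))
        for page_id, (title, body) in pages.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

Note that a scorer of this kind looks only at the text of each page, so a page that simply repeats the query term many times will rise in the ranking.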
Term-frequency statistics do not actually use any network-related features; they are a technology from before the Internet era. In the Internet age, however, most documents exist as web pages, and almost anyone can post whatever content they like online. Two web pages with the same term frequency can differ greatly in quality, yet under the first law of search engines they would be ranked equally. To get into the top few results for certain searches, many authors of web content racked their brains to stuff keywords into their pages, and search engines suffered badly for it. This situation began to change in 1996.
........................................ ........................................ .....
■ Law 2: the law of popularity and quality
In April 1996, I went to Las Vegas to attend an academic conference on information retrieval. The content of the conference was as dull as the Las Vegas weather. But being far away from the company gave me a rare chance to calm down and think seriously. While listening to an unrelated paper presentation, I suddenly connected the science citation index mechanism with hyperlinks on the Web. I owe this to Peking University, which taught me the science citation index mechanism in my junior year. I doubt any university in the United States teaches this in its undergraduate courses.
The mechanism of the science citation index, put bluntly, is this: whoever is cited many times is regarded as authoritative, and a paper that is cited many times is a good paper. Transplanted to the Internet, the idea becomes: a web page that many other pages link to is considered high in quality and popularity. Combined with analysis of the link text, this can be used to rank search results. This leads to the second law of search engines: the law of popularity and quality. Under this law, the relevance ranking of search results no longer depends entirely on term-frequency statistics, but depends more on hyperlink analysis.
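As a rough illustration of the idea, the sketch below ranks pages by how many distinct pages link to them with link text that matches the query, and breaks ties by their total number of incoming links. It is a simplified assumption made for this example, not the author's actual system and not a full link-analysis algorithm; the function name `rank_by_links`, the `(source, target, anchor_text)` input format, and the tie-breaking rule are all hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def rank_by_links(query, links):
    """Rank target pages by incoming links, citation-index style.

    links is an iterable of (source_page, target_page, anchor_text).
    Returns target pages with at least one matching anchor, best first.
    """
    inlinks = defaultdict(set)      # target -> distinct pages linking to it
    anchor_hits = defaultdict(set)  # target -> pages whose link text matches
    terms = set(tokenize(query))
    for source, target, anchor in links:
        if source == target:
            continue  # a page citing itself says nothing about its authority
        inlinks[target].add(source)
        if terms & set(tokenize(anchor)):
            anchor_hits[target].add(source)
    # Primary key: number of matching anchors; secondary key: total inlinks.
    scores = {t: (len(anchor_hits[t]), len(inlinks[t])) for t in anchor_hits}
    return sorted(scores, key=lambda t: scores[t], reverse=True)
```

Because the score comes from what other pages say about a page rather than from the page's own text, piling keywords onto a page does not by itself raise its rank here.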