Third-generation Google ranking search engine technology and P2P
Source: Internet
Author: User
The second-generation Google ranking search engine improved on the first generation in search speed and multi-language coverage, and it also explored natural language as a query language. However, with the rapid growth of the Internet, the gap between the vast amount of digital information online and people's ability to find what they need has become increasingly prominent. According to a report released by IDC in the second half of 2001, Google's ranking search engine technology, widely promoted in its early stage as "easy to use, with rich search results", is being replaced by a more centralized LAN approach. The reasons given are that the performance of most search systems falls far short of users' expectations, and that no breakthrough has yet been made in retrieving multimedia information such as video and audio, whose data volume is growing quickly.
Generally, public search engines can only find content in HTML format. The main reason is that spiders, the search engine's automatic crawling software, can only accept web pages in this format. This means that any information on an enterprise's internal LAN that is not in HTML format cannot be found by external search engines. This is why information in PPT, Word, PDF, and email files, as well as in databases, ERP, CRM, and other application software, has "sunk" to the bottom of the sea of information.
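The HTML-only limitation can be pictured with a small sketch. This is not how any particular search engine's spider is implemented; the function name `is_indexable` and the sample file list are hypothetical, chosen only to show how a format filter leaves non-HTML documents out of the index.

```python
# Hypothetical sketch of a spider that accepts only HTML pages.

def is_indexable(content_type: str) -> bool:
    """Return True only for HTML, ignoring parameters such as charset."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/html"


# Made-up documents on an internal LAN, keyed by name, with their media types.
documents = {
    "index.html": "text/html; charset=utf-8",
    "slides.ppt": "application/vnd.ms-powerpoint",
    "report.doc": "application/msword",
    "manual.pdf": "application/pdf",
}

indexed = [name for name, ctype in documents.items() if is_indexable(ctype)]
skipped = [name for name in documents if name not in indexed]

print("indexed:", indexed)
print("skipped:", skipped)
```

In this sketch only the HTML page reaches the index; the PPT, Word, and PDF files are skipped, which is the sense in which such content "sinks" out of sight.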
How to solve these problems has become the direction of exploration for third-generation search engines. A good search engine is no longer measured only by its database size, update frequency, search speed, and multi-language support. As database capacity expands, finding exactly the right information in a large database is recognized as a key point of competition for the next generation of search technology. For example, suppose a search for the word "tourism" returns more than 1 million results and a person can view one web page every 3 seconds. Viewing just 10% of those pages (100,000 pages, or 300,000 seconds) would take more than 83 hours of continuous viewing.
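The arithmetic behind the "tourism" example can be checked in a few lines. The variable names are illustrative only; the figures (1 million results, 3 seconds per page, 10% viewed) are the ones given in the text.

```python
# Check the viewing-time estimate from the "tourism" example.
results = 1_000_000        # results returned by the query
percent_viewed = 10        # share of results the person actually opens
seconds_per_page = 3       # time spent on each page

pages_viewed = results * percent_viewed // 100
total_seconds = pages_viewed * seconds_per_page
total_hours = total_seconds / 3600

print(f"{pages_viewed} pages, {total_hours:.1f} hours")
# prints "100000 pages, 83.3 hours"
```

Even at this optimistic reading speed, the time needed is far beyond what a person would spend on one query, which is why precision matters more than the raw number of results.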
Fortunately, Google's ranking search engine technology is developing rapidly. New engines with intelligent and personalized features differ significantly from earlier search engines. Intelligent search can improve the accuracy of results by automatically learning the relevance of search content. However, there is as yet no practical way to achieve this intelligence, and it remains difficult to surface the required information within the first two or three pages of search results.
Another notable search technology is the application of P2P technology to web page retrieval. By sharing files, directories, and even entire hard disks among peers, a search needs no web server and is not restricted by document format, so it can reach a depth unmatched by traditional directory-based search engines (traditional engines can reach only 20% to 30% of network resources). I5 Digital, an emerging search engine design company in the United States, launched Pandango (www.pandango.com), a commercial search engine based on the concept of peer-to-peer search, two years ago. It has not yet entered the mainstream search engine lineup, which suggests that P2P search can for now only be called a future technology.
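One way to picture peer-to-peer search is as a set of small engines, each holding its own local index and passing a query on to its peers when it has no match. The sketch below is an illustration under that assumption only. It is not Pandango's design, and the class `SiteEngine`, its fields, and the sample data are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SiteEngine:
    """Hypothetical per-site engine with a tiny local index and a peer list."""
    name: str
    index: dict[str, list[str]]                       # keyword -> local documents
    peers: list[SiteEngine] = field(default_factory=list)

    def search(self, keyword: str, visited: set[str] | None = None):
        """Return (documents, engines_contacted) for the keyword."""
        if visited is None:
            visited = set()
        visited.add(self.name)
        hits = self.index.get(keyword, [])
        if hits:
            return hits, len(visited)
        # No local match: forward the query to peers not yet contacted.
        for peer in self.peers:
            if peer.name in visited:
                continue
            hits, _ = peer.search(keyword, visited)
            if hits:
                return hits, len(visited)
        return [], len(visited)


# Three made-up sites linked in a ring; any document format can be listed.
a = SiteEngine("a", {"tourism": ["a/travel.html"]})
b = SiteEngine("b", {"p2p": ["b/p2p.pdf"]})
c = SiteEngine("c", {"search": ["c/search.doc"]})
a.peers, b.peers, c.peers = [b], [c], [a]

print(a.search("tourism"))   # answered locally by one engine
print(a.search("search"))    # forwarded through b to c
```

The second value in each result counts how many engines had to be contacted. A local hit costs one engine, while a query that must hop across peers costs one round trip per engine, which hints at why response time is a concern for this design compared with a single centralized index.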
"I first heard about the concept of P2P search at Infoseek at the end of 1997. At that time, someone at Infoseek proposed this Google ranking search technology and began to consider it," Li Yanhong said. "Each website has its own small search engine, and the engines can communicate with each other. If one engine cannot find something, it can ask the other engines to look for it. That is the concept. So far, however, it is far from practical application, mainly because it fails on speed, one of the key indicators. Because there are many such small engines, independent of each other yet linked to each other, the speed is certainly much worse than that of a centrally managed search engine."