Third-generation search engine technology and P2P-search engine technology

Source: Internet
Author: User
Compared with the first generation, second-generation search engines improved search speed, added support for multilingual information, and began to explore natural language as a query language. However, as the Internet has grown rapidly, the gap between the enormous volume of digital information online and people's ability to find what they need has become increasingly prominent. According to a report released by IDC in the second half of 2001, the search engine technology that was promoted early on as "easy to use, with rich search results" is being displaced by more focused search on local area networks (LANs), because most search systems perform far below user expectations, and retrieval of fast-growing multimedia content such as video and audio has still not seen a breakthrough.
Generally, public search engines can only find content in HTML format. The main reason is that spiders, the search engine's automatic crawling software, can only accept web pages in this format. This means that any information on an enterprise's internal LAN that is not in HTML cannot be found by external search engines. This is why PPT, Word, PDF, and email files, as well as information held in the databases of ERP, CRM, and other application software, have "sunk" to the bottom of the sea of information.
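The format restriction described above can be pictured with a minimal sketch. It assumes a hypothetical spider that decides what to index from the media type alone; the names `INDEXABLE_TYPES`, `is_indexable`, and the sample file list are illustrative and are not taken from any real search engine.

```python
# Hypothetical sketch: a spider that only accepts HTML documents.
# Assumption: the decision is made from the media type alone.
INDEXABLE_TYPES = {"text/html"}


def is_indexable(content_type: str) -> bool:
    """Return True if the media type (ignoring parameters such as charset) is HTML."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in INDEXABLE_TYPES


# Illustrative documents on an internal LAN, mapped to their media types.
documents = {
    "index.html": "text/html; charset=utf-8",
    "report.pdf": "application/pdf",
    "slides.ppt": "application/vnd.ms-powerpoint",
    "notes.doc": "application/msword",
}

indexed = [name for name, ctype in documents.items() if is_indexable(ctype)]
skipped = [name for name in documents if name not in indexed]

print("indexed:", indexed)
print("skipped:", skipped)
```

Under this assumption, everything in `skipped` stays invisible to the engine no matter how relevant its content is.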
How to solve these problems has become the direction explored by third-generation search engines. A good search engine is no longer measured only by database size, update frequency, search speed, and multilingual support. As databases grow, precisely finding the right information in a huge database is recognized as the key point of competition for the next generation of search technology. For example, suppose a query for the word "tourism" returns more than 1 million results and a person can view one web page every 3 seconds. Even if only 10% of the pages are viewed, that is 100,000 pages, which takes more than 80 hours of continuous reading.
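The estimate above can be checked with a few lines of arithmetic. The figures (1 million results, 10% viewed, 3 seconds per page) come from the example; the variable names are only illustrative.

```python
# Worked arithmetic for the "tourism" example.
results = 1_000_000      # results returned by the query
percent_viewed = 10      # share of results the user actually opens
seconds_per_page = 3     # time spent on each page

pages_viewed = results * percent_viewed // 100
total_seconds = pages_viewed * seconds_per_page
total_hours = total_seconds / 3600

print(f"{pages_viewed} pages, {total_seconds} seconds, about {total_hours:.1f} hours")
```

With these inputs the total is 300,000 seconds, or roughly 83 hours of uninterrupted reading, which is why ranking the right pages first matters more than returning more of them.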
Fortunately, search engine technology is developing rapidly, and new engines with intelligent and personalized features differ greatly from earlier ones. Intelligent search can improve the accuracy of results by automatically learning the relevance of search content. However, there is not yet a practical way to achieve this intelligence, and it remains difficult to get the required information to appear on the first two or three pages of results.
Another notable search technology is the application of P2P technology to web retrieval. Because peers share files, directories, or even entire hard disks with one another, searching does not require a Web server and is not restricted by document format, so it can reach a depth that traditional directory-based search engines cannot match (traditional engines reach only 20% to 30% of network resources). I5 Digital, an emerging search engine design company in the United States, launched Pandango (www.pandango.com), a commercial search engine based on the concept of peer-to-peer search, two years ago. It has not yet entered the mainstream search engine lineup, which suggests that P2P search can still only be called a future technology.
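One way to picture peer-to-peer search is the sketch below. It is a simplified illustration, not Pandango's design: it assumes every peer keeps its own small index and forwards a query to the peers it knows only when it cannot answer locally. `Peer`, `p2p_search`, and `max_hops` are hypothetical names, and the in-memory calls stand in for what would be network round trips in practice.

```python
from collections import deque


class Peer:
    """A site with its own small local index and links to other peers."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = documents  # maps a title to its text, in any format
        self.neighbors = []

    def local_search(self, term):
        """Return the titles of local documents whose text contains the term."""
        term = term.lower()
        return [t for t, text in self.documents.items() if term in text.lower()]


def p2p_search(start, term, max_hops=3):
    """Forward the query breadth-first; a peer forwards only if it has no local hit.

    Returns (hits, contacted), where contacted counts the peers that were asked.
    """
    hits = []
    contacted = 0
    visited = {start.name}
    queue = deque([(start, 0)])
    while queue:
        peer, hops = queue.popleft()
        contacted += 1  # in a real network this would be one round trip
        local = peer.local_search(term)
        if local:
            hits.extend((peer.name, title) for title in local)
            continue  # answered locally, so no need to forward
        if hops < max_hops:
            for neighbor in peer.neighbors:
                if neighbor.name not in visited:
                    visited.add(neighbor.name)
                    queue.append((neighbor, hops + 1))
    return hits, contacted


# Illustrative network of three peers.
a = Peer("a", {"home": "welcome page"})
b = Peer("b", {"trips": "tourism guide for Beijing"})
c = Peer("c", {"slides.ppt": "quarterly tourism report"})
a.neighbors = [b, c]
b.neighbors = [a]
c.neighbors = [a]

hits, contacted = p2p_search(a, "tourism")
print(hits, contacted)
```

In this toy network the query from peer `a` has to reach all three peers before it is answered. Each extra peer adds a round trip, which hints at why response time is the weak point of this approach compared with a single central index.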
"I first heard about the concept of P2P search at Infoseek at the end of 1997. At that time, someone in Infoseek proposed and began to consider this search technology," Li Yanhong said, "Each website has its own small search engine. You can communicate with each other. If this engine cannot be found, you can use other engines to check the engine. This is a concept. However, so far, it is far from the actual application, mainly because it violates the speed issue in key indicators. Because there are many such small engines that are independent from each other and linked to each other, the speed is certainly much worse than the centralized management search engine ".
