Third-Generation Search Engine Technology and Peer-to-Peer
Source: Internet
Author: User
Second-generation search engines improved on the first generation in search speed and in their coverage of information in multiple languages, and they made some early attempts at using natural language as the query language. However, as the Internet has grown, the gap between the enormous volume of digital information online and people's ability to find what they need has become increasingly prominent. A report published by IDC in the second half of 2001 shows that search engine technology, once advertised as "easy to use and rich in search results", is being replaced by more centralized LANs, because most search systems perform far below users' expectations. For example, retrieving multimedia information such as video and audio, whose data volume is growing rapidly, remains an unsolved problem.
General public search engines can find only documents in HTML format, mainly because the spider programs they use to crawl and index pages automatically accept only Web pages in that format. This means that any information on an intranet that is not in HTML format is unavailable to external search engines. This is why information in PowerPoint, Word, PDF, and e-mail files, as well as in the databases of applications such as ERP and CRM, can stay "sunk" at the bottom of the sea of information for a long time.
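The format restriction described above can be pictured with a minimal sketch. It assumes a hypothetical crawler that decides what to index from each document's media type; the names `INDEXABLE_TYPES`, `is_indexable`, and `filter_indexable` and the example URLs are illustrative and do not come from any real search engine.

```python
# Hypothetical sketch: a spider that indexes only HTML documents.
# The set of accepted types is an assumption made for illustration.
INDEXABLE_TYPES = {"text/html"}


def is_indexable(content_type: str) -> bool:
    """Return True if the media type (ignoring parameters) is accepted."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in INDEXABLE_TYPES


def filter_indexable(documents):
    """Split (url, content_type) pairs into indexed and skipped URL lists."""
    indexed, skipped = [], []
    for url, content_type in documents:
        if is_indexable(content_type):
            indexed.append(url)
        else:
            skipped.append(url)
    return indexed, skipped


# Made-up intranet documents in the formats the article mentions.
documents = [
    ("http://intranet.example/index.html", "text/html; charset=utf-8"),
    ("http://intranet.example/plan.ppt", "application/vnd.ms-powerpoint"),
    ("http://intranet.example/report.pdf", "application/pdf"),
    ("http://intranet.example/memo.doc", "application/msword"),
]

indexed, skipped = filter_indexable(documents)
print("indexed:", indexed)
print("skipped:", skipped)
```

Under this assumption only the HTML page is indexed; the PowerPoint, PDF, and Word documents are skipped and therefore never appear in search results.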
Solving these problems has become the direction of exploration for third-generation search engines. A good search engine is no longer measured only by basic characteristics such as database size, update frequency, retrieval speed, and support for multiple languages. As database capacity keeps expanding, accurately finding the right data in a vast data set is recognized as the focus of competition in the next generation of search technology. For example, querying a search engine for the word "travel" returns more than 1 million results. Assuming a person takes 3 seconds to view a Web page, looking at only 10% of those pages without a break would take more than 80 hours (100,000 pages at 3 seconds each is about 83 hours).
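The "travel" estimate above can be checked with a few lines of arithmetic. This is only a worked calculation using the figures from the example (1 million results, 10% viewed, 3 seconds per page); the variable names are illustrative.

```python
# Worked arithmetic for the "travel" example.
results = 1_000_000      # results returned for the query
percent_viewed = 10      # share of results actually viewed, in percent
seconds_per_page = 3     # assumed viewing time per page

pages_viewed = results * percent_viewed // 100
total_seconds = pages_viewed * seconds_per_page
total_hours = total_seconds / 3600

print(f"{pages_viewed} pages, {total_hours:.1f} hours")
# prints "100000 pages, 83.3 hours"
```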
Search engine technology is developing rapidly, and new engines with features such as intelligence and personalization differ greatly from earlier search engines. Intelligent search can improve the accuracy of search results by automatically learning the relevance of search content. However, there is still no practical way to make a search engine intelligent enough that the information a user needs reliably appears within the first two or three pages of search results.
Another popular search technique applies peer-to-peer technology to Web page retrieval. By sharing files, directories, or even entire hard drives, users can search without going through a Web server and are not limited by document format, reaching a depth that traditional catalog search engines cannot (traditional engines can reach only 20% to 30% of network resources). i5 Digital, a new American search engine design company, formally launched Pandango (www.pandango.com), a commercial search engine based on peer-to-peer search, two years ago, but it has yet to enter the mainstream search engine lineup. This suggests that, for now, peer-to-peer search can only be called a technology of the future.
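As a rough illustration of peer-to-peer retrieval, the sketch below assumes that each peer keeps a small local index and forwards queries it cannot answer to its neighbors, up to a hop limit. The `PeerEngine` class, the `ttl` parameter, and the sample documents are hypothetical; the sketch does not describe how Pandango or any other real system works.

```python
class PeerEngine:
    """A small per-peer search engine that can forward queries to neighbors."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = documents  # dict: document title -> text
        self.neighbors = []

    def connect(self, other):
        """Link two peers so each can forward queries to the other."""
        self.neighbors.append(other)
        other.neighbors.append(self)

    def local_search(self, term):
        """Return (peer name, title) pairs for local documents containing term."""
        term = term.lower()
        return [(self.name, title)
                for title, text in self.documents.items()
                if term in text.lower()]

    def search(self, term, ttl=2, visited=None):
        """Answer locally if possible; otherwise ask neighbors, up to ttl hops."""
        visited = set() if visited is None else visited
        visited.add(self.name)
        hits = self.local_search(term)
        if hits or ttl == 0:
            return hits
        for peer in self.neighbors:
            if peer.name not in visited:
                hits = peer.search(term, ttl - 1, visited)
                if hits:
                    return hits
        return []


# Three peers in a chain: a <-> b <-> c. Only c holds a matching document.
a = PeerEngine("a", {"notes": "quarterly sales figures"})
b = PeerEngine("b", {"slides": "product roadmap"})
c = PeerEngine("c", {"guide": "Travel tips for Beijing"})
a.connect(b)
b.connect(c)

print(a.search("travel"))         # found on c, two hops away
print(a.search("travel", ttl=1))  # hop limit too small, nothing found
```

Each forwarded query costs an extra hop, so in this sketch the time to answer grows with the distance between the peer that asks and the peer that holds the document.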
"The concept of Peer-to-peer search I was first heard in InfoSeek at the end of 1997, when InfoSeek had already proposed and began to consider the search technology," Li said, "Each site has a small search engine, we can communicate with each other, If this engine is not found, it can be checked by other engines, which is the concept. But so far, it has been very far from the actual application, mainly in violation of the key indicators related to the speed of the problem. Because there are many of these small independent and interconnected engine, the speed and centralized management of the search engine will certainly be much worse.
There is always a certain distance between business applications and academic research, but this does not mean that businesses neglect the pursuit of technology, especially companies like Google that are already at the top of the pyramid in their field. Google has an open database of more than 100 projects to be implemented in the future, and these projects will be driven by 50 computer scientists with PhDs. In June 2002, Google set up a "lab" to showcase its latest Internet search technology, publishing it online (labs.google.com) for public trial and to collect extensive user feedback. Projects already shown in the lab include keyboard search and voice retrieval.
Some people may think that these experimental projects show no sign of a major conceptual change in search engine technology. In fact, search engine technology has developed gradually over more than 8 years. "A search engine is not liked by the public because it is good in just one respect; it has to do well in every respect," Li says. This is why mainstream search engines now pay more attention to details.
In any case, leaders in search technology, including Google's Larry Page, believe that the ultimate search engine will be intelligent and able to understand everything in the world. Page, an active participant in Web services technology, is trying to apply Web services technologies to search in order to address cross-platform, multiple-format information retrieval. What we see now is that mainstream search technology focuses on improving search quality and expanding its scope of application, for example by supporting image retrieval and mobile handheld devices such as PDAs. These will be necessary steps on the way to the next generation of technology.