Three Laws of Search Engine Ranking (1): Search Engine Technology

Search engine technology and trends
[From search engine Express]
With the rapid development of the Internet and the growth of web information, finding what you need in this ocean of data is like finding a needle in a haystack. Search engine technology solves this problem by providing information retrieval services to users. As a result, search engine technology has become a focus of research and development in both the computer industry and academia.
Driven by the rapid increase of web information, search engines have been developing steadily since 1995. According to a July 1999 article on web information accessibility published in Science, there were estimated to be more than 800 million web pages and more than 9 TB of useful data in the world, with the volume doubling roughly every four months. Without effective tools, users trying to find information in this vast ocean inevitably come back empty-handed.
Search engines are the technology that emerged to solve this "lost at sea" problem. A search engine collects and discovers information on the Internet according to certain policies; understands, extracts, organizes, and processes that information; and provides users with retrieval services, thereby serving the purpose of information navigation. The navigation service provided by search engines has become one of the most important network services on the Internet, and search engine sites are also known as "network portals". This article gives a brief introduction to the key technologies behind search engines, as a reference for the reader.

I. Classification
Based on different information collection methods and service provision methods, search engine systems can be divided into three categories:
1. Directory-based search engines: these collect information manually or semi-automatically. Editors review each item, write an abstract by hand, and place the item into a predetermined classification framework. Most of the indexed information describes whole websites, and the engine offers both directory browsing and direct retrieval. Because human editors are involved, the information is accurate and the navigation quality is high. The disadvantages are that manual intervention is required, the maintenance burden is heavy, the amount of information is small, and updates are not timely. Examples include Yahoo, LookSmart, Open Directory, and Go Guide.

2. Robot search engines: a robot program, called a spider, automatically searches for and discovers information on the Internet according to a certain policy. An indexing tool builds an index over the collected information, and a searcher queries the index database based on the user's input and returns the results to the user. The service provided is full-text search over web pages. This type of search engine offers a large amount of information, timely updates, and no need for manual intervention. The disadvantage is that it returns too much information, much of it irrelevant, so users must filter the results themselves. Representative engines include AltaVista, Northern Light, Excite, Infoseek, Inktomi, FAST, Lycos, and Google; domestic representatives include "Skynet", Youyou, and OpenFind.
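As a rough illustration of this collect-index-retrieve pipeline, the sketch below crawls a tiny hypothetical in-memory "web" (the `PAGES` dictionary and its URLs are invented for the example) the way a spider would, and builds the inverted index that a searcher component could then query. Real spiders fetch pages over HTTP and obey crawl policies; this is only the skeleton of the idea.

```python
from collections import deque

# Hypothetical in-memory "web": page URL -> (page text, outgoing links).
PAGES = {
    "a.html": ("search engine technology", ["b.html"]),
    "b.html": ("robot spider program", ["a.html", "c.html"]),
    "c.html": ("index and retrieval", []),
}

def crawl_and_index(seed):
    """Breadth-first crawl from a seed page, building an inverted index
    that maps each word to the set of pages containing it."""
    index, seen, queue = {}, {seed}, deque([seed])
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        for word in text.split():
            index.setdefault(word, set()).add(url)
        for link in links:            # discover new pages via links
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index

index = crawl_and_index("a.html")
# The searcher answers a query by looking its words up in the index:
print(sorted(index["spider"]))  # ['b.html']
```

In a production engine the index would also record term positions and frequencies so results can be ranked, but the word-to-pages mapping is the core structure.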

3. Meta-search engines: these engines have no data of their own. Instead, they submit the user's query to multiple search engines at the same time, post-process the returned results, for example by removing duplicates and re-ranking them, and return the merged list to the user as their own result. The service provided is web-oriented full-text retrieval. The advantage of this type of engine is that the returned results are larger and more comprehensive; the disadvantages are that the full capabilities of the underlying engines cannot be exploited, and users still need to do more filtering. Representative engines include WebCrawler and InfoMarket.
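The merge step described above can be sketched as follows. The engine names, URLs, and the reciprocal-rank scoring rule are all assumptions made for illustration, not how any particular meta-search engine actually works:

```python
# Hypothetical ranked result lists returned by three underlying engines.
RESULTS = {
    "EngineA": ["url1", "url2", "url3"],
    "EngineB": ["url2", "url4"],
    "EngineC": ["url1", "url4", "url5"],
}

def merge_results(results):
    """Merge the ranked lists by summing reciprocal-rank scores
    (rank 1 scores 1.0, rank 2 scores 0.5, ...), which also removes
    duplicates, then return one re-ranked list."""
    scores = {}
    for ranking in results.values():
        for rank, url in enumerate(ranking, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    return sorted(scores, key=scores.get, reverse=True)

print(merge_results(RESULTS))
# url1 appears at rank 1 twice, so it ends up first in the merged list
```

A URL returned by several engines accumulates score from each list, which is why agreement between engines pushes a result toward the top.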
II. Performance indicators
We can regard web information search as an information retrieval problem: retrieving the documents relevant to a user's query from a document collection made up of web pages. We can therefore use the traditional information retrieval measures of Recall (the fraction of all relevant documents that are retrieved) and Precision (the fraction of retrieved documents that are relevant) to evaluate the performance of a search engine.
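A minimal sketch of how these two measures are computed (the document IDs below are invented for the example):

```python
def precision_recall(retrieved, relevant):
    """Precision = relevant retrieved / total retrieved;
       Recall    = relevant retrieved / total relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    return hits / len(retrieved), hits / len(relevant)

# The engine returns 10 pages; 4 of them are among the 8 relevant pages.
p, r = precision_recall(retrieved=range(10),
                        relevant=[0, 1, 2, 3, 10, 11, 12, 13])
print(p, r)  # 0.4 0.5
```

The two measures pull in opposite directions: returning more results tends to raise recall but lower precision, which is why both are reported together.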

