Uncovering the Mystery: An Analysis of Search Engine Principles

Source: Internet
Author: User

On the vast Internet, and on the Web (World Wide Web) in particular, it is hard to find anything without searching. Are you familiar with search engines? How do they work? Which search engines do you use? Today, I will talk about search engines.

I. Search Engine Classification

Any system that gathers webpage information from websites, builds a database from it, and provides a query interface can be called a search engine. By working principle, search engines fall into two basic categories: full-text search engines and classification directories.

A full-text search engine builds its database with software called a "web robot" or "web spider" (also known as a web crawler). This software automatically collects large amounts of webpage information by following links across the network, and the collected pages are then analyzed and organized according to specific rules. Google and Baidu are typical full-text search engines.
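As a rough illustration of how a web robot moves from page to page along links, here is a minimal sketch in Python. It is not how Google or Baidu actually crawl. The `FAKE_WEB` dictionary, the `example.test` URLs, and the `crawl` function are all hypothetical; a real robot would download pages over HTTP instead of reading them from memory.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical in-memory "web" (URL -> HTML), used instead of real HTTP requests.
FAKE_WEB = {
    "http://example.test/": '<a href="/a.html">A</a> <a href="/b.html">B</a>',
    "http://example.test/a.html": '<a href="/b.html">B</a>',
    "http://example.test/b.html": '<a href="/">home</a> <a href="/c.html">C</a>',
    "http://example.test/c.html": "no links here",
}


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl starting at seed; returns {url: html} in visit order."""
    seen = {seed}
    queue = deque([seed])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page could not be fetched, skip it
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:  # never queue the same page twice
                seen.add(target)
                queue.append(target)
    return pages


pages = crawl("http://example.test/", FAKE_WEB.get)
for url in pages:
    print(url)
```

Starting from a single seed page, the sketch reaches all four pages, including `c.html`, which is linked only from `b.html`. The `seen` set keeps the robot from looping forever when pages link back to each other.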

Classification directories are compiled manually into databases, such as the directories of Yahoo China and of the Chinese portals Sohu, Sina, and NetEase. In addition, some navigation sites on the Internet can be regarded as a primitive form of classification directory, such as "home" (http://www.hao123.com/).

Full-text search engines and classification directories each have their strengths and weaknesses. Because a full-text search engine relies on software, its database is very large, but its query results are often imprecise. A classification directory relies on people to collect and sort websites, so it can return more accurate results, but the content it covers is very limited. To complement each other, many search engines now offer both types of query. Generally, a query against the full-text search engine is labeled as a search of "All websites", for example Google's full-text search (http://www.google.com/intl/zh-CN/). A query against the classification directory is labeled as a search of the "Classification directory" or of "Classification sites", such as Sina search (http://dir.sina.com.cn/) and Yahoo China search (http://cn.search.yahoo.com/dirsrch).

The integration of these two types of search engines has also produced other search services on the Internet. Here we treat them as search engines as well. The main ones are:

Meta search engine. Such search engines generally have no web robot and no database of their own. Instead, they call, control, and optimize the search results of several other independent search engines and display them in a unified format on a single interface. Although a meta search engine has no "web robot" or "web spider" and no independent index database, it has its own specialized meta search technology for submitting retrieval requests, proxying the retrieval interface, and displaying retrieval results. For example, the "metafisher Meta Search Engine" (http://www.hsfz.net/fish/) calls and integrates data from multiple search engines, such as Google, Yahoo, AlltheWeb, Baidu, and Openfind.

Integrated search engine (all-in-one search page). An integrated search engine uses web technology to link multiple independent search engines on a single webpage. When querying, you can click or select the search engine you want, or query several search engines at the same time. The results are displayed on separate pages, one per search engine. An example is the Internet Swiss Army Knife (http://free.okey.net/%7Efree/search1.htm).
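To make the meta search idea more concrete, the following Python sketch merges ranked results from several engines into one list. It is only an illustration under stated assumptions: `engine_a` and `engine_b` are hypothetical stand-ins that return fixed result lists, and the merge rule (each engine contributes 1/rank to a URL's score) is one simple choice, not the method used by metafisher or any other real meta search engine.

```python
from collections import defaultdict


def engine_a(query):
    """Hypothetical engine: returns URLs ranked best first."""
    return ["http://a.test/1", "http://b.test/2", "http://c.test/3"]


def engine_b(query):
    """Hypothetical second engine with its own ranking."""
    return ["http://b.test/2", "http://d.test/4", "http://a.test/1"]


def meta_search(query, engines):
    """Send the query to every engine and merge the ranked lists.

    Each engine adds 1/rank to a URL's score, so a URL ranked highly
    by several engines ends up near the top of the merged list.
    """
    scores = defaultdict(float)
    for engine in engines:
        for rank, url in enumerate(engine(query), start=1):
            scores[url] += 1.0 / rank
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))


results = meta_search("search engine", [engine_a, engine_b])
for url in results:
    print(url)
```

Here `http://b.test/2` comes first because one engine ranks it first and the other second, while pages returned by only one engine fall to the bottom. Duplicates are removed automatically because scores are keyed by URL.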

II. How Search Engines Work

The "web robot" or "web spider" of a full-text search engine is software that runs on the network. It traverses the web space, can scan websites within a certain range of IP addresses, and follows links from one webpage to the next to collect webpage information. To keep the collected data up to date, it also revisits webpages it has already crawled. The webpages collected by the web robot or web spider are then analyzed by other programs, which perform a large number of calculations according to certain algorithms to build a webpage index before the pages are added to the index database. The full-text search engine we usually see is really just the retrieval interface of the search engine system. When you enter keywords, the search engine finds the index entries of all webpages that match those keywords in its large database and presents them according to certain ranking rules. Different search engines have different index databases and different ranking rules, so the same keyword produces different results on different search engines.
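The indexing and query steps described above can be sketched in a few lines of Python. This is a toy model under loud assumptions: the `PAGES` data is made up, and the ranking rule (count how often the query words occur on a page) is chosen only for simplicity. Real search engines use far more elaborate indexes and ranking rules, which is exactly why their results differ.

```python
import re
from collections import Counter, defaultdict

# Hypothetical crawled pages (URL -> page text).
PAGES = {
    "http://example.test/a": "search engine index search",
    "http://example.test/b": "web spider crawls the web",
    "http://example.test/c": "search engine spider",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: word -> {url: occurrences on that page}."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word][url] += 1
    return index


def search(index, query):
    """Return pages containing every query word, most occurrences first."""
    words = tokenize(query)
    if not words:
        return []
    # Keep only pages that contain all of the query words.
    candidates = set(index.get(words[0], {}))
    for word in words[1:]:
        candidates &= set(index.get(word, {}))

    def score(url):
        return sum(index[word][url] for word in words)

    # Highest score first; ties are broken alphabetically by URL.
    return sorted(candidates, key=lambda url: (-score(url), url))


index = build_index(PAGES)
print(search(index, "search engine"))
print(search(index, "spider"))
```

The query never scans the page text itself. It looks each keyword up in the index built beforehand, which is what lets a search engine answer quickly over a large database. Swapping in a different `score` function changes the order of the results without changing the index.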
[from: http://www.pconline.com.cn/pcedu/soft/wl/#/0408/437918.html]
