The Beauty of Mathematics, Chapter 8: Boolean Algebra and Search Engines

Source: Internet
Author: User

Technology can be divided into two levels, technique ("shu") and principle ("dao"): the specific way of doing something is the technique, while the underlying principles and rationale are the way.

The principle behind a search engine is actually very simple. Building one roughly requires doing the following things:

Automatically download as many pages as possible;

Establish a fast and efficient index;

Rank web pages fairly and accurately based on relevance.

1 Boolean algebra

The theory of yin and yang in ancient China can be considered the earliest form of a binary system.

In 1854, George Boole's book An Investigation of the Laws of Thought first showed people how to solve logic problems with mathematical methods.

In an AND operation, if either of the two operands is 0, the result is always 0; the result is 1 only when both operands are 1.

In an OR operation, if either of the two operands is 1, the result is always 1; the result is 0 only when both operands are 0.

The NOT operation turns 1 into 0 and 0 into 1.
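
The three operations above can be checked with a few lines of Python. This is only a minimal sketch: the function names (bool_and, bool_or, bool_not, truth_table) are made up for illustration, and Boolean values are represented as the integers 0 and 1.

```python
# Boolean values are represented as the integers 0 and 1.
# All function names here are illustrative, not from the original text.

def bool_and(a: int, b: int) -> int:
    """AND: the result is 0 if either operand is 0."""
    return a & b


def bool_or(a: int, b: int) -> int:
    """OR: the result is 1 if either operand is 1."""
    return a | b


def bool_not(a: int) -> int:
    """NOT: turns 1 into 0 and 0 into 1."""
    return 1 - a


def truth_table():
    """Return rows of (a, b, a AND b, a OR b) for all input pairs."""
    rows = []
    for a in (0, 1):
        for b in (0, 1):
            rows.append((a, b, bool_and(a, b), bool_or(a, b)))
    return rows


if __name__ == "__main__":
    print("a b AND OR")
    for a, b, and_ab, or_ab in truth_table():
        print(a, b, and_ab, "  ", or_ab)
    print("NOT 0 =", bool_not(0), " NOT 1 =", bool_not(1))
```

A search query such as "atomic AND energy" is evaluated with exactly these operations, applied once per document.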

The significance of Boolean algebra to mathematics is comparable to that of quantum mechanics to physics: both extended our understanding of the world from continuous states to discrete states.

2 Index

Each website is like a book in a library. We cannot look for it shelf by shelf; instead, we find its location through the card catalog and then go straight to the right shelf to fetch it.

The simplest index structure uses one very long binary number per keyword, in which each bit indicates whether that keyword appears in the corresponding document.
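
As a rough illustration of this structure, the sketch below stores one Python integer per keyword and uses bit i to record whether the keyword appears in document i. The sample documents and the function names (build_bitmap_index, matching_docs) are assumptions made up for this example. A Boolean query then becomes a bitwise operation on these numbers.

```python
# Hypothetical sample documents; the list position is the document number.
documents = [
    "atomic energy research",          # document 0
    "applications of atomic energy",   # document 1
    "search engine index",             # document 2
    "energy policy",                   # document 3
]


def build_bitmap_index(docs):
    """Map each keyword to an integer whose bit i is 1 if document i contains it."""
    index = {}
    for doc_id, text in enumerate(docs):
        for word in set(text.split()):
            index[word] = index.get(word, 0) | (1 << doc_id)
    return index


def matching_docs(bitmap, num_docs):
    """Decode a bitmap back into a list of document numbers."""
    return [i for i in range(num_docs) if (bitmap >> i) & 1]


index = build_bitmap_index(documents)
n = len(documents)
all_docs = (1 << n) - 1  # mask with one bit set per document

# "atomic AND energy": bitwise AND of the two keyword bitmaps.
both = index.get("atomic", 0) & index.get("energy", 0)

# "energy AND NOT atomic": NOT is a bitwise complement, masked to n bits.
energy_not_atomic = index.get("energy", 0) & ~index.get("atomic", 0) & all_docs

if __name__ == "__main__":
    print(matching_docs(both, n))               # [0, 1]
    print(matching_docs(energy_not_atomic, n))  # [3]
```

Real indexes hold far more documents than this, so the numbers are extremely long and mostly zeros; this sketch only shows the Boolean idea, not a practical storage format.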

Early search engines could index only the most important keywords because they were limited by the speed and capacity of computers. Even today, many academic journals still ask authors to provide 3-5 keywords.

The index is very large, so it is stored on many servers in a distributed way. The common practice is to divide the index into many parts according to page number and store them on different servers. Each time a query arrives, it is distributed to a large number of servers, which process the request in parallel and send their results to the primary server. The primary server merges the results and returns them to the user.
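
The query flow described above can be sketched in a single process, with threads standing in for servers. Everything specific here is an assumption for illustration: the shard count, the rule of assigning a page to shard number (page number modulo shard count), and the function names build_shards, search_shard, and search. A real system would use separate machines and network calls.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 3  # assumed number of index servers

# Hypothetical pages, keyed by page number.
pages = {
    0: "boolean algebra and logic",
    1: "search engine index",
    2: "boolean search index",
    3: "inverted index for search engine",
    4: "library card catalog",
}


def build_shards(docs, num_shards):
    """Split the index by page number: page i goes to shard i % num_shards."""
    shards = [dict() for _ in range(num_shards)]
    for doc_id, text in docs.items():
        shard = shards[doc_id % num_shards]
        for word in set(text.split()):
            shard.setdefault(word, set()).add(doc_id)
    return shards


def search_shard(shard, words):
    """One 'server': return pages in this shard that contain every query word."""
    postings = [shard.get(w, set()) for w in words]
    return set.intersection(*postings) if postings else set()


def search(shards, query):
    """'Primary server': send the query to all shards in parallel, then merge."""
    words = query.split()
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda s: search_shard(s, words), shards)
        merged = set().union(*partials)
    return sorted(merged)


shards = build_shards(pages, NUM_SHARDS)

if __name__ == "__main__":
    print(search(shards, "search index"))  # [1, 2, 3]
    print(search(shards, "boolean"))       # [0, 2]
```

Because each page lives entirely in one shard, every shard can answer the AND query on its own, and the merge step is a simple union of the partial results.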

Indexes of different tiers, such as a frequently used index and a rarely used one, need to be built according to the importance, quality, and access frequency of web pages. The frequently used index needs faster access, holds more information, and is updated more quickly; the rarely used index has lower requirements.
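
One way to picture two tiers is a lookup that consults the frequently used index first and falls back to the rarely used one only when needed. The text does not say how the tiers are queried, so the fallback rule, the min_results threshold, and the names hot_index, cold_index, and search_tiered below are all assumptions for illustration.

```python
# Hypothetical two-tier index: keyword -> set of page numbers.
hot_index = {"index": {1, 2}, "boolean": {1}}    # important, frequently accessed pages
cold_index = {"index": {7}, "boolean": {8, 9}}   # less important, rarely accessed pages


def search_tiered(hot, cold, word, min_results=2):
    """Search the hot tier first; consult the cold tier only if results are scarce.

    The fallback policy is an assumption for this sketch, not taken from the text.
    """
    results = sorted(hot.get(word, set()))
    if len(results) < min_results:
        extra = cold.get(word, set()) - set(results)
        results += sorted(extra)
    return results


if __name__ == "__main__":
    print(search_tiered(hot_index, cold_index, "index"))    # [1, 2]
    print(search_tiered(hot_index, cold_index, "boolean"))  # [1, 8, 9]
```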
