"The Beauty of Mathematics", Chapter 8: The Beauty of Simplicity: Boolean Algebra and Search Engines

Source: Internet
Author: User

Technology has two levels, shu (technique) and Tao (the way): the specific methods of doing things are shu, while the principles behind doing things are Tao.

The principle of a search engine is actually very simple. Building one roughly requires doing the following things:

Automatically download as many pages as possible;

Establish a fast and effective index;

Rank pages fairly and accurately by relevance.

1 Boolean algebra

The theory of yin and yang in ancient China can be regarded as the earliest binary model.

In 1854, George Boole's "An Investigation of the Laws of Thought" showed for the first time how to solve logic problems with mathematical methods.

AND: if either of the two operands is 0, the result is always 0; the result is 1 only when both operands are 1.

OR: if either of the two operands is 1, the result is always 1; the result is 0 only when both operands are 0.

NOT: the operation turns 1 into 0 and 0 into 1.
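The three operations above can be sketched in a few lines of Python. This is only an illustration: the function names `bool_and`, `bool_or`, and `bool_not` are made up for this example, and the values are restricted to the integers 0 and 1.

```python
from itertools import product

def bool_and(a: int, b: int) -> int:
    # 0 as soon as either operand is 0
    return a & b

def bool_or(a: int, b: int) -> int:
    # 1 as soon as either operand is 1
    return a | b

def bool_not(a: int) -> int:
    # flips 1 to 0 and 0 to 1
    return 1 - a

# Truth table rows: (a, b, a AND b, a OR b)
truth_table = [(a, b, bool_and(a, b), bool_or(a, b))
               for a, b in product((0, 1), repeat=2)]

for row in truth_table:
    print(row)
```

Only four input combinations exist, so the whole behavior of AND and OR fits in a four-row table.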

Boolean algebra is to mathematics what quantum mechanics is to physics: it extends our understanding of the world from the continuous to the discrete.

2 Index

Every web page is like a book in a library: we cannot walk along the shelves looking for it, so we look up its location in the card catalog and then go straight to the shelf to fetch it.

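The simplest index structure uses one very long binary number per keyword, with one bit per document: the bit is 1 if the keyword appears in that document and 0 otherwise. A query that combines several keywords then reduces to Boolean operations on these numbers.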
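As a rough illustration of this bit-per-document idea, the sketch below stores each keyword's binary number in a Python integer, where bit i stands for document i. The helper names `build_bitmap_index` and `matching_docs`, the whitespace tokenization, and the three sample documents are all assumptions made for this example, not part of the original text.

```python
def build_bitmap_index(documents: list[str]) -> dict[str, int]:
    """Map each keyword to an integer whose bit i is 1 if document i contains it."""
    index: dict[str, int] = {}
    for doc_id, text in enumerate(documents):
        for word in set(text.lower().split()):
            index[word] = index.get(word, 0) | (1 << doc_id)
    return index

def matching_docs(bitmap: int) -> list[int]:
    """Return the document ids whose bit is set in a non-negative bitmap."""
    ids = []
    doc_id = 0
    while bitmap:
        if bitmap & 1:
            ids.append(doc_id)
        bitmap >>= 1
        doc_id += 1
    return ids

# Hypothetical sample documents
documents = [
    "atomic energy applications",
    "atomic bomb history",
    "solar energy applications",
]
index = build_bitmap_index(documents)
all_docs = (1 << len(documents)) - 1  # mask with one bit per document

both = index["atomic"] & index["energy"]                           # AND query
either = index["atomic"] | index["energy"]                         # OR query
atomic_not_energy = index["atomic"] & ~index["energy"] & all_docs  # AND NOT query

print(matching_docs(both))               # [0]
print(matching_docs(either))             # [0, 1, 2]
print(matching_docs(atomic_not_energy))  # [1]
```

The mask `all_docs` is needed because Python's `~` on an integer produces a negative number; masking keeps only the bits that correspond to real documents.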

Early search engines were limited by the speed and capacity of computers and could index only the most important keywords. To this day, many academic journals still ask authors to provide 3-5 keywords.

Indexes are very large, so they are stored in a distributed fashion across many servers. The common practice is to split the index into many parts by page number and store each part on a different server. When a query arrives, it is dispatched to all of these servers, which process it concurrently and send their results to a master server; the master merges them and returns the final result to the user.

Indexes of different tiers, such as frequently used and rarely used, need to be built according to each page's importance, quality, and access frequency. Frequently used indexes need fast access, more additional information, and fast updates; the requirements for rarely used indexes are much lower.
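The dispatch-and-merge flow described above can be mimicked on a single machine. In the sketch below, threads stand in for index servers, and pages are assigned to parts by `page_id % num_shards`; that assignment rule, the function names, and the sample pages are assumptions for illustration only, not how any particular search engine does it.

```python
from concurrent.futures import ThreadPoolExecutor

def build_shards(pages: dict[int, str], num_shards: int) -> list[dict[str, set[int]]]:
    """Split the index into parts by page number (here: page_id modulo num_shards)."""
    shards: list[dict[str, set[int]]] = [{} for _ in range(num_shards)]
    for page_id, text in pages.items():
        shard = shards[page_id % num_shards]
        for word in set(text.lower().split()):
            shard.setdefault(word, set()).add(page_id)
    return shards

def search_shard(shard: dict[str, set[int]], keywords: list[str]) -> set[int]:
    """One 'server': return its pages that contain every keyword."""
    postings = [shard.get(word, set()) for word in keywords]
    return set.intersection(*postings) if postings else set()

def search(shards: list[dict[str, set[int]]], keywords: list[str]) -> list[int]:
    """The 'master': query all shards concurrently, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial_results = pool.map(lambda s: search_shard(s, keywords), shards)
        merged = set().union(*partial_results)
    return sorted(merged)

# Hypothetical sample pages keyed by page number
pages = {
    0: "boolean algebra basics",
    1: "search engine index",
    2: "boolean search index",
    3: "index of algebra books",
    4: "boolean index structures",
}
shards = build_shards(pages, num_shards=3)
print(search(shards, ["boolean", "index"]))  # [2, 4]
```

Because each page lives in exactly one part, a multi-keyword query can be answered inside each part independently, and the master only has to take the union of the partial results. A real engine would also rank the merged pages before returning them.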
