Google search result ranking algorithm - search engine technology

Last Update:2017-01-13 Source: Internet

Author: User

Matt Cutts is a software engineer in Google's quality management department. His job is to evaluate websites and develop techniques that keep fake or junk websites out of Google's search results.

One of the questions librarians ask most often is: "Which results should appear at the top of the search list, and how does Google choose?" Here, quality engineer Matt Cutts gives a quick introduction to how Google crawls and indexes the web and how it ranks search results. Matt also offers school librarians advice on how to coach students.

Crawling and indexing

Before you see a page of Google search results, a lot has already happened. The first step is crawling and indexing the billions of pages on the World Wide Web. Crawling is done by Googlebot, which connects to web servers around the world to collect files. The crawler does not actually roam the internet. Instead, it requests a specific page from a web server, scans that page for hyperlinks to follow, and assigns each page it fetches a number. Crawling collects a large number of files, but those files cannot be searched directly.
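The fetch, scan, and number loop described above can be sketched in a few lines of Python. This is only an illustration, not Googlebot's actual design: the `FAKE_WEB` dictionary is a hypothetical stand-in for real web servers, and the names `LinkExtractor` and `crawl` are invented for this example.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical stand-in for the web: URL -> HTML a server would return.
FAKE_WEB = {
    "http://example.com/": '<a href="http://example.com/a">A</a> '
                           '<a href="http://example.com/b">B</a>',
    "http://example.com/a": '<a href="http://example.com/b">B</a>',
    "http://example.com/b": '<a href="http://example.com/">home</a>',
}


def crawl(seed, fetch):
    """Breadth-first crawl from seed; returns {page_number: (url, html)}."""
    numbers = {seed: 1}      # URL -> number assigned when first discovered
    pages = {}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        html = fetch(url)    # ask the "server" for one specific page
        if html is None:
            continue
        pages[numbers[url]] = (url, html)
        parser = LinkExtractor()
        parser.feed(html)    # scan the page for hyperlinks
        for link in parser.links:
            if link not in numbers:
                numbers[link] = len(numbers) + 1
                queue.append(link)
    return pages


pages = crawl("http://example.com/", FAKE_WEB.get)
for number, (url, _) in sorted(pages.items()):
    print(number, url)
```

Each page is numbered the first time a link to it is discovered, so a page that is linked from several places is fetched only once.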

Without an index, Google's servers would have to read the full content of every file each time you searched for something like "civil war". The second step is therefore to build an index, which means "converting" the crawled data: instead of a list of files with the words in each one, the data is reorganized into a list of words with the files that contain each one (commonly called an inverted index). A query then no longer has to scan every word in every file. For example, suppose the word "civil" appears in files 3, 8, 22, 56, 68, and 92, and the word "war" appears in files 2, 8, 15, 22, 68, and 77.
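As a rough sketch of this conversion, the Python below builds a word-to-files index from a handful of made-up files. The file contents and the name `build_index` are assumptions for illustration; real indexing also handles punctuation, word forms, and far larger data.

```python
from collections import defaultdict

# Hypothetical crawled files: file number -> text.
documents = {
    2: "the war of 1812",
    3: "civil engineering basics",
    8: "the american civil war",
    22: "civil war battles",
}


def build_index(docs):
    """Map each word to the sorted list of file numbers that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            postings[word].add(doc_id)
    return {word: sorted(ids) for word, ids in postings.items()}


index = build_index(documents)
print(index["civil"])  # [3, 8, 22]
print(index["war"])    # [2, 8, 22]
```

Looking up a word is now a single dictionary access rather than a scan of every file.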

Once the index is built, Google can rank files and determine their relevance. For example, when someone enters "civil war" into Google, presenting and scoring the results takes two steps: first, find the pages that contain the user's query words; second, order the matching pages by relevance. Google has developed an interesting technique to speed up the first step: instead of storing the whole index on one computer, it spreads the index across hundreds of computers. Because the work is shared among many computers, the query returns results faster.
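The two steps can be sketched with the file lists from the "civil" and "war" example. The article does not say how Google matches or scores pages, so both parts here are assumptions: the first step is shown as a standard merge of two sorted lists, and the `relevance` scores are invented placeholders, not real ranking signals.

```python
def intersect(a, b):
    """Return the values common to two sorted lists, in sorted order."""
    i = j = 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


civil = [3, 8, 22, 56, 68, 92]
war = [2, 8, 15, 22, 68, 77]

# Step 1: files that contain both query words.
matches = intersect(civil, war)
print(matches)  # [8, 22, 68]

# Step 2: order the matches by relevance (scores are made up).
relevance = {8: 0.4, 22: 0.9, 68: 0.7}
ranked = sorted(matches, key=relevance.get, reverse=True)
print(ranked)  # [22, 68, 8]
```

Because both lists are kept sorted, the merge visits each entry at most once, which is why the index stores file numbers in order.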

To picture this more vividly, imagine a book whose index runs to 30 pages. If one person has to look through the whole index, each lookup takes at least a few seconds. But what if each page of the index were handed to a different person? Thirty people, each searching their own part of the index, would finish much faster than one person working alone. Similarly, Google distributes its data across many computers so that files can be searched more quickly.
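The same idea can be sketched in Python with a thread pool standing in for the separate computers. The split shown here, where each shard holds the index entries for a subset of the files, is an assumption chosen for illustration; the article does not say how Google divides its index, and `shards`, `search_shard`, and `search` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each holds index entries for a subset of the files.
shards = [
    {"civil": [3, 8], "war": [2, 8]},
    {"civil": [22, 56], "war": [15, 22]},
    {"civil": [68, 92], "war": [68, 77]},
]


def search_shard(shard, words):
    """Return the files in one shard that contain every query word."""
    lists = [set(shard.get(word, [])) for word in words]
    return sorted(set.intersection(*lists)) if lists else []


def search(words):
    """Query every shard in parallel and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda s: search_shard(s, words), shards))
    return sorted(doc for part in partials for doc in part)


print(search(["civil", "war"]))  # [8, 22, 68]
```

Each worker only looks at its own small slice, like one of the thirty readers with a single page of the book's index, and the final merge combines their answers.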

This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
