I have read a lot of SEO reference books, and I always feel that they explain how search engines index pages in terms too general to be well understood. Today I spent a day sorting out the principles of search engine indexing. If anything here is wrong, I hope the SEO experts will correct me. Many thanks.
What are search engine spiders, crawlers, and robots? - How search engine indexing works
To make its database strong and comprehensive enough, a search engine has to look for new and more reliable information on the network day and night. With the arrival of the Internet era and the explosion of online information, it is impossible to accomplish this task entirely by hand. So search engine owners developed a set of programs that fetch information around the clock, organize and sort it, and finally index it into the search engine's own database.
The program that crawls website information around the clock goes by many names, such as spider, crawler, robot, and detector. A search engine generally sends out many crawlers at the same time. They follow URLs from page to page, collecting each site's title, description, images, page content, and so on, and then bring the captured information back to a dedicated store, where it waits to be indexed.
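To make the capture step concrete, here is a minimal sketch of what a crawler might pull out of a single page. It uses only Python's standard `html.parser`; the `PageExtractor` class, the `extract` function, and the `SAMPLE` page are my own illustrative names and data, not how any real search engine does it.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collects the title, meta description, image URLs, and links of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.images = []
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])
        elif tag == "a" and attrs.get("href"):
            # Links are what the crawler would follow to reach the next page.
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract(html):
    """Parse one HTML document and return the captured fields as a dict."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "description": parser.description,
        "images": parser.images,
        "links": parser.links,
    }


# Hypothetical page used only for illustration.
SAMPLE = """<html><head><title>Example page</title>
<meta name="description" content="A short summary."></head>
<body><img src="/logo.png"><a href="/about">About</a>
<a href="/contact">Contact</a></body></html>"""

if __name__ == "__main__":
    print(extract(SAMPLE))
```

The `links` list is the part that keeps the crawl going: each collected URL becomes a candidate for the next fetch.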
However, a website designer cannot guarantee that the site is perfect, and many problems can arise. For example, dead links or too many pages can prevent the crawler from accurately crawling the whole site. Perhaps the crawler fetches only the top of a page and, partway through the body, finds it does not have enough room left to store the information, so it has to leave. We should keep these issues in mind when designing a site, and I recommend that web designers make their pages easy for crawlers to process.
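The two problems above, dead links and too many pages, can be illustrated with a small sketch. It assumes a hypothetical in-memory site (`SITE`) instead of real HTTP requests, and a made-up `budget` parameter standing in for whatever limit makes a real crawler give up; real crawlers are far more elaborate.

```python
from collections import deque

# Hypothetical site: each URL maps to the links found on that page.
# "/missing" is linked from the home page but does not exist (a dead link).
SITE = {
    "/": ["/a", "/b", "/missing"],
    "/a": ["/c"],
    "/b": ["/c", "/d"],
    "/c": [],
    "/d": ["/e"],
    "/e": [],
}


def crawl(site, start="/", budget=4):
    """Breadth-first crawl that stops once `budget` pages have been fetched."""
    seen = {start}
    queue = deque([start])
    fetched, dead = [], []
    while queue and len(fetched) < budget:
        url = queue.popleft()
        if url not in site:
            dead.append(url)  # dead link: nothing to fetch
            continue
        fetched.append(url)
        for link in site[url]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    skipped = sorted(queue)  # discovered, but the budget ran out first
    return fetched, dead, skipped


if __name__ == "__main__":
    fetched, dead, skipped = crawl(SITE)
    print("fetched:", fetched)
    print("dead links:", dead)
    print("left behind:", skipped)
```

With the small budget, the crawler leaves before it reaches the deeper pages, which is why a shallow, well-linked site structure is easier for a crawler to cover.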
Introduction to Google's two crawlers
Here we take Google, the best search engine, as an example to analyze how a search engine crawls information and how it processes that information.
Google has two types of crawlers: the refresh crawler and the deep crawler. The refresh crawler works around the clock, writing what it captures to a dedicated database, and it supplies search results alongside the main index. Sometimes you will find that an update to your page suddenly appears in the search results and then, after a while, suddenly disappears. This is because the refresh crawler is constantly capturing and rewriting information. My feeling is that the refresh crawler's storage works rather like a stack data structure: first in, last out. Do not worry when your page drops out during this period. As before, if you keep updating, the page will gradually settle into the search results after about a month, although nowadays you may not have to wait that long.
If your page is already in the search engine's index, then once the refresh crawler finds your update, the update will show up quickly. It is still not stable at that point, though. Your page becomes stable only after the deep crawler has updated the main index.
Here is a simple flow that outlines the indexing process:
Crawler crawls --> finds information --> fetches information --> stores it in a dedicated database --> waits for indexing --> index is organized (the deep crawler feeds the main index) --> indexing completed, keyword rankings calculated --> waits for user searches --> returns the results.
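The later stages of this flow (store, index, rank, search) can be sketched as a toy example. The pages in `CRAWLED` are made up, and ranking by how often a keyword occurs is purely my own stand-in assumption; real search engine ranking is far more complex than this.

```python
from collections import defaultdict

# Hypothetical output of the crawl stage: URL -> captured page text.
CRAWLED = {
    "/seo-basics": "search engine crawler basics",
    "/crawler-tips": "make your site easy for the crawler crawler friendly",
    "/about": "about this site",
}


def build_index(pages):
    """Indexing stage: build an inverted index, word -> {url: occurrences}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index


def search(index, keyword):
    """Search stage: URLs containing the keyword, most occurrences first."""
    hits = index.get(keyword.lower(), {})
    # Ties are broken alphabetically by URL so the order is deterministic.
    return sorted(hits, key=lambda url: (-hits[url], url))


if __name__ == "__main__":
    index = build_index(CRAWLED)
    print(search(index, "crawler"))
    print(search(index, "site"))
```

The point of the inverted index is that the expensive work happens once, at indexing time, so a user's search only has to look up one keyword rather than scan every stored page.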
How many kinds of results do search engines provide?
Search engines provide two kinds of search results, and I recommend that SEOers work on both. I am still learning myself and hope to get a pointer or two from the experts.
The two kinds of search results are: 1. content index results; 2. special index results. The former is a text index, stored in compressed form, built from keywords, page titles, descriptions, link anchor text, and other text. The latter includes the image index, the PDF file index, and other special indexes. I recommend that SEOers not underestimate the second kind of search result, as it can also bring considerable traffic.
Summary: That is basically how search engine indexing works. If you spot any mistakes, please point them out promptly and I will correct them.