Several crawl-related spider concepts that can help you optimize a Web site

Source: Internet
Author: User
Keywords: Spiders and SEO

Abstract: A search engine spider is a program run by the search engine itself. Its role is to visit Web sites; crawl page text, pictures, and other information; build a database; and feed the data back to the search engine, which filters and ranks it when users search.

A search engine spider is a program run by the search engine itself. Its role is to visit Web sites; crawl the text, pictures, and other information on their pages; build a database; and feed the data back to the search engine. When a user searches, the search engine filters the collected information and, through complex sorting algorithms, presents what it considers the most useful results. When analysing a site's SEO performance in depth, we generally consider the quality of search engine spider activity, and the following crawl-related concepts can help us optimize the site:

1. Crawl rate: the number of pages a spider fetches from a Web site in a given time.

2. Crawl frequency: how often the search engine launches a new crawl of a Web site or of a single page.

3. Crawl depth: how many levels deep a spider follows links from its starting position.

4. Crawl saturation: the number of unique pages fetched.

5. Crawl priority: which pages are most often used as the spider's entry points.

6. Crawl redundancy: how many spiders crawl the site at the same time.

7. Crawl mapping: reconstructing the path a spider followed through the site.
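As a rough illustration, several of these metrics can be estimated straight from a web server's access log. The sketch below is a minimal example, assuming the common Apache/Nginx "combined" log format and identifying spiders by user-agent substrings; the regex, field positions, and bot names are assumptions you would adapt to your own logs.

```python
import re
from collections import defaultdict

# Matches the Apache/Nginx "combined" log format (an assumption here);
# captures client IP, date, hour, request path, and user agent.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):(?P<hour>\d{2})[^\]]*\] '
    r'"(?:GET|POST|HEAD) (?P<path>\S+)[^"]*" \d+ \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

BOT_MARKERS = ("Googlebot", "bingbot", "Baiduspider")  # spiders of interest

def crawl_metrics(log_lines):
    """Estimate crawl rate (hits per day), saturation (unique pages),
    and depth (deepest URL path level fetched) for known spiders."""
    hits_per_day = defaultdict(int)
    unique_pages = set()
    max_depth = 0
    for line in log_lines:
        m = LOG_RE.match(line)
        if not m or not any(b in m.group("ua") for b in BOT_MARKERS):
            continue  # unparseable line or not a spider hit
        hits_per_day[m.group("day")] += 1
        path = m.group("path").split("?")[0]
        unique_pages.add(path)
        # "/" is depth 0; "/a/b/page.html" is depth 3
        max_depth = max(max_depth,
                        path.strip("/").count("/") + 1 if path != "/" else 0)
    return {
        "crawl_rate": dict(hits_per_day),  # spider hits per day
        "saturation": len(unique_pages),   # unique pages fetched
        "depth": max_depth,                # deepest path level reached
    }

# Tiny hand-made sample; in practice, read a real log file line by line.
sample = [
    '66.249.66.1 - - [10/Mar/2014:03:15:02 +0800] "GET /a/b/page.html HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '66.249.66.1 - - [10/Mar/2014:04:20:11 +0800] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '1.2.3.4 - - [10/Mar/2014:04:21:00 +0800] "GET /human.html HTTP/1.1" 200 128 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]
print(crawl_metrics(sample))
```

Note that user-agent strings can be spoofed; for a serious analysis you would also verify spider IPs, but the counting logic stays the same.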

These are some of the concepts we can use for data analysis. So how do we apply them to SEO? Let me briefly share some specific ideas.

1. Analyze crawl rate to validate fuzzy empirical theories

The first parameter to consider when analysing search engine spiders is the crawl volume. We usually measure it over a period of one day, so what we most often look at is the daily crawl rate. Of course, you can adjust the time window to your own needs, for example breaking it down by hour, to fully understand when the spider crawls and then make targeted adjustments. One kind of analysis in particular can bring a real sense of achievement: validating fuzzy empirical theories.

For example, we often hear advice like this: "When producing site content, update at fixed times to train the search engine spider's crawl habits; changing the update time at random may hurt how the spider crawls the site's content." Is this correct? You can use the site's logs to analyse the spider's crawl rate and find out. The specific method is to split a month of logs into hourly buckets for each day and tally the spider's activity in each bucket (pay attention to choosing a reasonable data sample), then analyse and compare the periods to find when the spider visits most frequently. Comparing that against your own content update times quickly yields a conclusion.
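The hourly split described above can be sketched as follows. This is a hypothetical example: the timestamp format and bot name are assumptions, and the sample entries stand in for a month of parsed log data.

```python
from collections import Counter

# Each entry: (timestamp string, user agent). In practice these would be
# parsed from a month of server logs; a tiny hand-made sample is used here.
entries = [
    ("10/Mar/2014:09:05:12", "Googlebot"),
    ("10/Mar/2014:09:41:02", "Googlebot"),
    ("11/Mar/2014:09:10:55", "Googlebot"),
    ("11/Mar/2014:21:30:00", "Googlebot"),
    ("11/Mar/2014:22:00:00", "SomeBrowser"),  # human visit, ignored
]

def hourly_crawl_profile(entries, bot="Googlebot"):
    """Count spider hits per hour of day (0-23) across the whole sample."""
    hours = Counter()
    for ts, ua in entries:
        if bot in ua:
            hour = int(ts.split(":")[1])  # "dd/Mon/yyyy:HH:MM:SS" -> HH
            hours[hour] += 1
    return hours

profile = hourly_crawl_profile(entries)
peak_hour, peak_hits = profile.most_common(1)[0]
print(f"busiest hour: {peak_hour:02d}:00 with {peak_hits} spider hits")
```

If the busiest hours line up with your publishing schedule, that is evidence for the "fixed update time" theory; if they do not, the theory deserves skepticism for your site.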

2. Improve crawl frequency to boost indexing

The spider's crawl frequency is largely determined by the quality of the site's content: only a site with fresher, better content will attract the spider to crawl it repeatedly. Many large content sites, for instance, publish so much every day that spiders stay on the site constantly, and page crawl frequency rises naturally. With a higher crawl frequency, updates to page content and links are picked up by the spider sooner, and more of the site's page content can be indexed.

Many friends report that their site's snapshot does not update, or lags by several days; personally I feel this is caused by insufficient crawl frequency. To get snapshots updated quickly, a new site in particular must invest in content construction early on. If a page's content never changes, the spider may not crawl and index it, or may crawl it without returning data, so that the next time a user searches, the search engine serves older data stored in its database.
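One way to put a number on crawl frequency is to measure the average interval between successive spider fetches of the same URL. The sketch below assumes (url, timestamp) pairs already extracted from logs; the data and format are illustrative.

```python
from datetime import datetime

# (url, fetch time) pairs for one spider, as might be extracted from logs.
fetches = [
    ("/news/1.html", "2014-03-10 08:00"),
    ("/news/1.html", "2014-03-11 08:30"),
    ("/news/1.html", "2014-03-13 09:00"),
    ("/about.html",  "2014-03-10 10:00"),
]

def average_recrawl_hours(fetches):
    """Average interval, in hours, between successive fetches of each URL.
    URLs fetched only once are omitted (no interval to measure)."""
    times = {}
    for url, ts in fetches:
        times.setdefault(url, []).append(datetime.strptime(ts, "%Y-%m-%d %H:%M"))
    result = {}
    for url, ts_list in times.items():
        ts_list.sort()
        gaps = [(b - a).total_seconds() / 3600
                for a, b in zip(ts_list, ts_list[1:])]
        if gaps:
            result[url] = sum(gaps) / len(gaps)
    return result

print(average_recrawl_hours(fetches))
```

Tracking this figure before and after a content push shows whether more frequent updates actually shortened the spider's recrawl interval.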

3. Study the spider's crawl habits and optimize for them

The remaining concepts (crawl depth, saturation, crawl priority, redundancy, and crawl mapping) all concern research into the spider's crawl habits and crawl strategy. Because I have not analysed them in targeted practice, I can only discuss my ideas in theory.

If the spider's crawl depth is not deep enough, it is mainly because the site's structure was laid out without considering whether the spider can crawl it fully, level by level. This involves the placement of internal-link entry points, and also which entrances the spider crawls with priority. In many diagnoses of large sites, the main strategy for lifting traffic and indexing has been to optimize the entrances the spider crawls first; one way to implement this is to use nofollow tags to screen out certain pages. Crawl saturation is also worth analysing: fetching a single page too many times wastes spider resources, and if we can allocate those resources properly, it will greatly help the crawling and indexing of our pages.
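To find candidates for that kind of screening, one simple heuristic is to flag pages that absorb a disproportionate share of the spider's fetches. The sketch below is a minimal illustration; the sample URLs and the 25% threshold are assumptions, not recommendations.

```python
from collections import Counter

# URLs fetched by the spider over some window (hypothetical sample).
crawled = [
    "/", "/", "/", "/", "/", "/",
    "/tag/seo", "/tag/seo", "/tag/seo", "/tag/seo",
    "/article/100.html",
    "/article/101.html",
]

def overcrawled(crawled, share=0.25):
    """Return pages taking more than `share` of total spider fetches.
    These are candidates for de-prioritising, e.g. by linking to them
    with rel="nofollow" so the spider's budget reaches other pages."""
    counts = Counter(crawled)
    total = len(crawled)
    return {url: n for url, n in counts.items() if n / total > share}

print(overcrawled(crawled))
```

Here the home page and a tag page together absorb most of the crawl, while article pages get one fetch each; shifting some of that budget toward under-crawled content pages is the goal the section describes.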

As for crawl redundancy and crawl mapping, they call for deeper analysis; if I write an in-depth article later, I will discuss them together with spider crawling.
