Search engine spider behavior, as seen in spider access logs

Source: Internet
Author: User


I wanted to better understand how spiders crawl a website, but the server I rent does not provide access logs. So I spent a good deal of time writing a PHP program specifically to record and analyze spider visits. After three months of observing several target sites, I have a few small observations to share. My research is limited, so there are bound to be gaps or mistakes; please go easy on me.
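The author's PHP program is not shown, so the following is only a minimal Python sketch of the first step such a tool needs: recognizing which spider made a request from its User-Agent header. The function name `classify_spider` and the `SPIDER_TOKENS` table are hypothetical. The tokens are substrings these crawlers commonly send, and matching on them is an assumption about how one might do it, not the author's method.

```python
# Substrings commonly found in each crawler's User-Agent header.
# The table and its labels are illustrative assumptions.
SPIDER_TOKENS = {
    "Baidu": "baiduspider",
    "Google": "googlebot",
    "Soso": "sosospider",
    "Sogou": "sogou",
    "Youdao": "youdaobot",
    "Yahoo": "yahoo! slurp",
    "MSN": "msnbot",
}


def classify_spider(user_agent):
    """Return the spider label for a User-Agent string, or None for other clients."""
    ua = user_agent.lower()
    for name, token in SPIDER_TOKENS.items():
        if token in ua:
            return name
    return None


if __name__ == "__main__":
    print(classify_spider("Baiduspider+(+http://www.baidu.com/search/spider.htm)"))
    print(classify_spider("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
    print(classify_spider("Mozilla/5.0 (Windows NT 10.0)"))
```

User-Agent strings can be forged, so a count built this way is an upper bound on real spider traffic unless the client IP is also verified.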

I. Baidu Spider

During this period I launched two new websites and found that Baidu Spider generally reaches the home page within 1 to 3 days. At first it crawls very aggressively, and this lasts from about two days to a week. Roughly three days after launch, the home page shows up in Baidu's index. Although Baidu Spider crawls tens of thousands of pages, often only a few of them are indexed.

After two weeks, Baidu crawls only the home page once or twice a day and rarely touches other pages. This phase lasts anywhere from a few days to a few months, although the number of pages Baidu has indexed does increase during it. This may be an evaluation period. During this phase one of my sites was K'd (dropped from the index) by Baidu, and the spider stopped coming.

After this phase, Baidu Spider's visits stabilize. Two of my sites are crawled only 200 to 300 times a day, with little variation. Another of my sites, shop.hhbmw.com, is visited more diligently, probably because it has more external links: for nearly a month it has received 20,000 to 80,000 visits a day, with fairly large fluctuations. However, a site: query shows that Baidu has not indexed many of its pages; the effect may only show up after Baidu's next major update.
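Figures such as "200 to 300 times a day" come from counting visits per spider per day. A minimal sketch of that aggregation follows, assuming each logged hit has already been reduced to a `(day, spider)` pair. The function names and the sample records are made up for illustration and are not the author's data.

```python
from collections import Counter, defaultdict


def daily_counts(hits):
    """Count visits per spider per day from (day, spider) pairs."""
    counts = defaultdict(Counter)
    for day, spider in hits:
        counts[day][spider] += 1
    return counts


def daily_range(counts, spider):
    """Return (min, max) visits per day for one spider across all logged days."""
    per_day = [counts[day][spider] for day in counts]
    return (min(per_day), max(per_day)) if per_day else None


# Hypothetical sample records, only to show the shape of the input.
sample_hits = [
    ("day-1", "Baidu"), ("day-1", "Baidu"), ("day-1", "Google"),
    ("day-2", "Baidu"),
]

if __name__ == "__main__":
    counts = daily_counts(sample_hits)
    print(dict(counts["day-1"]))
    print(daily_range(counts, "Baidu"))
```

Tracking the minimum and maximum per day makes it easy to tell a stable crawl rate from one that fluctuates widely.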

When Baidu Spider visits a target page, it converts the percent-encoded characters in the URL into Chinese characters. For example, http://shop.hhbmw.com/proview/%E9%99%86%E5%BB%BA%E5%86%9B88/ 6c318ea2660bcc4b73b220e16edf96b3.htm becomes http://shop.hhbmw.com/proview/陆建军88/ 6c318ea2660bcc4b73b220e16edf96b3.htm; that is, "%e9%99%86%e5%bb%ba%e5%86%9b88" is converted to "陆建军88" (Lu Jianjun 88). This creates a problem: if the host does not support Chinese characters in URLs, Baidu's indexing of the site may suffer.
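The conversion described here is ordinary percent-decoding of UTF-8 bytes. The sketch below reproduces it with Python's standard library and shows one way a server-side script might treat both forms of a path as the same page. The helper name `normalize_path` is hypothetical, and normalizing this way is an assumption, not something the author describes doing.

```python
from urllib.parse import quote, unquote

encoded = "%E9%99%86%E5%BB%BA%E5%86%9B88"

# unquote() decodes percent-escapes as UTF-8 by default.
decoded = unquote(encoded)


def normalize_path(path):
    """Map the encoded and the decoded form of a path to one canonical encoded form."""
    return quote(unquote(path), safe="/")


if __name__ == "__main__":
    print(decoded)
    print(normalize_path("/proview/陆建军88/"))
    print(normalize_path("/proview/%e9%99%86%e5%bb%ba%e5%86%9b88/"))
```

Both calls to `normalize_path` yield the same encoded path, so a request arriving with raw Chinese characters can be matched against the stored, encoded URL.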

Baidu Spider also follows certain patterns when it visits a site; in many cases it visits pages in an order sorted by Chinese characters.

II. Google Spider

Google's spider finds new sites quickly, but its index grows at a relatively steady pace, and the number of pages it crawls each day is also fairly stable. The higher a site's PR and the more external links it has, the faster it is updated; conversely, sites with a low Google PR are updated more slowly.

III. Soso, Sogou, and Youdao Spiders

These update quickly but are not very stable. Daily visits fluctuate considerably, and their behavior is harder to predict than Baidu's. One of my sites was K'd by both Soso and Sogou, leaving only the home page indexed.

IV. Yahoo and MSN

Yahoo updates quickly but indexes few pages; MSN updates extremely slowly.

As for robots.txt, in my logs the spiders of Baidu, Google, Soso, Sogou, Yahoo, and MSN all supported it fairly well, and they also appeared to respect the Crawl-delay directive.

The Youdao spider, by contrast, basically ignored the Crawl-delay directive in robots.txt.
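One way to check whether a spider honors Crawl-delay is to compare the delay declared in robots.txt with the smallest gap between that spider's consecutive requests. The sketch below is a minimal illustration of that idea using Python's standard `urllib.robotparser`. The robots.txt content, the 5-second value, and the helper `min_gap_seconds` are assumptions made up for the example; they are not the author's configuration or measurements.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, for illustration only.
ROBOTS_TXT = """\
User-agent: Baiduspider
Crawl-delay: 5
Disallow: /admin/

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def min_gap_seconds(timestamps):
    """Smallest gap between consecutive request times (seconds since any epoch)."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return min(gaps) if gaps else None


if __name__ == "__main__":
    delay = parser.crawl_delay("Baiduspider")
    print(delay, parser.can_fetch("Baiduspider", "/admin/page.htm"))
    # Made-up request times: a spider honoring the delay never requests faster than it.
    gap = min_gap_seconds([0, 7, 12, 30])
    print(gap is not None and gap >= delay)
```

A spider whose smallest observed gap is consistently shorter than the declared delay is not honoring the directive, which is the kind of behavior reported above for Youdao.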

[Screenshot: today's access log]
