Do you know why spiders don't visit the site?

Source: Internet
Author: User
Tags: log

I believe many webmasters, like me, are in the habit of checking the site log every day and analyzing it to understand how spiders crawl the site. In practice, most of us just look at how many pages the spiders crawl each day, since that number gives an intuitive picture of the site's health. Many webmasters feel their site is in perfect shape and yet spiders do not come to crawl it, which is undoubtedly a heavy blow. So today I have put together several reasons why spiders may not visit a website, and I share them below.
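The daily check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log is assumed to be in the common "combined" access log format, and the spider names in `SPIDER_TOKENS` as well as the function name `spider_hits_per_day` are made up for this example, not taken from any particular tool.

```python
import re
from collections import Counter

# Assumed user-agent substrings for the spiders of interest; extend as needed.
SPIDER_TOKENS = ("Googlebot", "Baiduspider", "bingbot")

# Assumed layout: combined log format, i.e.
# ip - - [day/Mon/year:time zone] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def spider_hits_per_day(lines):
    """Count requests per (day, spider) pair; other lines are ignored."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent").lower()
        for token in SPIDER_TOKENS:
            if token.lower() in agent:
                counts[(match.group("day"), token)] += 1
                break  # count each request for one spider only
    return counts
```

Passing an open log file (or any iterable of lines) to `spider_hits_per_day` returns a counter keyed by day and spider name, which is enough to spot a day on which a spider stopped coming.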

(i) Too much Flash and too many images on the site.

It has to be said that the major search engines have become quite intelligent, especially in recent years as they keep updating their algorithms. For example, in 2011 Google launched its search by image feature, which lets us find the source pages of an image. Still, a search engine is only a search engine, and gaps remain in what it can understand. Many webmasters, especially those running company sites, insert large amounts of Flash and images to highlight their products, yet search engine spiders cannot read the content inside Flash and image files, so however good the content is, the spider cannot crawl it. If your site contains a lot of images and Flash, I suggest using a spider simulation tool to test it and see whether this is why spiders are not visiting.
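As a rough idea of what such a spider simulation does, the sketch below uses Python's standard `html.parser` to collect the text a crawler could read from a page and to count the media it could not. The class name `SpiderView` and the rule that an image only contributes its `alt` text are assumptions for illustration; real spiders are more sophisticated.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Collect readable text and count media a text-only crawler cannot read."""

    SKIPPED = {"script", "style"}  # content of these tags is not page text

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.images_without_alt = 0
        self.embedded_objects = 0  # <object> and <embed> tags, e.g. Flash
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "img":
            alt = (attrs.get("alt") or "").strip()
            if alt:
                self.text_parts.append(alt)  # alt text is readable
            else:
                self.images_without_alt += 1
        elif tag in ("object", "embed"):
            self.embedded_objects += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def spider_view(html):
    """Parse an HTML string and return the populated SpiderView."""
    parser = SpiderView()
    parser.feed(html)
    parser.close()
    return parser
```

A page whose `text_parts` list is nearly empty while `images_without_alt` and `embedded_objects` are high is the kind of page this section warns about.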

(ii) There are a lot of dead links on the website.

Imagine a spider happily visiting our website and crawling through the links in our pages, expecting to find fresh, good content, only to find a pile of dead links waiting for it. If this happens once, twice, three times, every time it visits, do you think the spider will rate your site well and come back to crawl it again? My own site, www.qqya.cc, is an example: spiders crawled a lot of 404 pages, and as a result the content of that section was not indexed by search engines. It only dawned on me later, when I noticed in the site log that the pages the spider crawled included many with a 404 status code. I immediately cleaned up and blocked those 404 pages and built a lot of external links during that period, after which spider crawling returned to normal.
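Finding the 404 pages that spiders hit, as described above, could look roughly like this. The sketch assumes the combined access log format and a hand-picked list of spider names; `spider_404_paths` is a hypothetical helper name, not part of any existing tool.

```python
import re
from collections import Counter

# Assumed spider names, lowercase for case-insensitive matching.
SPIDER_TOKENS = ("googlebot", "baiduspider", "bingbot")

# Assumed layout: combined log format; only path, status and agent are kept.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def spider_404_paths(lines):
    """Return (path, hits) pairs for 404 responses served to spiders,
    most frequently hit paths first."""
    dead = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None or match.group("status") != "404":
            continue
        agent = match.group("agent").lower()
        if any(token in agent for token in SPIDER_TOKENS):
            dead[match.group("path")] += 1
    return dead.most_common()
```

The paths at the top of the returned list are the first candidates to fix, redirect, or block.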

(iii) External links and nofollow tags.

A large part of the reason spiders visit our website often is that external links attract them, so we need to keep observing how spider-friendly those links are, lest the work be done in vain. Many webmasters will ask at this point: why, and how do we judge that? Here is what I do. Every day I check not only how many times spiders visit the site but also the entrances they come in through, and from those entrances I judge whether the external links we built are valuable. With a daily summary, I record the effective links, and over time this builds up our own database of external link resources. The same data also shows which external links are useless, for example links carrying the nofollow tag (such as those on Baidu Experience). We will not build links of that kind in the future, because it is futile.
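Whether an external link carries nofollow can also be checked directly on the page that hosts it. The following is a minimal sketch: it takes the HTML of the linking page as a string and splits the links pointing at our domain into followed and nofollowed ones. The names `LinkCollector` and `classify_backlinks` are invented for this example, and page-level rules such as a robots meta tag are not considered.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (href, is_nofollow) for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        rel_tokens = (attrs.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def classify_backlinks(html, my_domain):
    """Split links to my_domain (or its subdomains) into two lists:
    (followed, nofollowed)."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    followed, nofollowed = [], []
    for href, is_nofollow in collector.links:
        host = urlparse(href).hostname or ""
        if host == my_domain or host.endswith("." + my_domain):
            (nofollowed if is_nofollow else followed).append(href)
    return followed, nofollowed
```

A link that ends up in the second list is of the kind the paragraph above calls futile.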

(iv) Bloated code and an overly complex site structure.

As we all know, a spider visits our website through its source code, so we need to optimize that code. Lengthy, redundant code serves no purpose, slows down page loading, and puts spiders off. An overly complex structure is also bad for crawling. A spider usually visits the home page first, then the column pages, and then the content pages, and it usually limits this to about 3 levels. A site that is too complex and does not carry much weight is therefore hard for spiders to crawl.
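The "about 3 levels" rule of thumb can be checked with a breadth-first search over the site's internal links. The sketch below assumes the link structure has already been gathered into a dictionary mapping each page to the pages it links to; the function names and the default `max_depth=3` are assumptions chosen to mirror the paragraph above, not a documented limit of any search engine.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search from the home page.

    links maps each page to the list of pages it links to.
    Returns {page: minimum number of clicks from home}."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_deeper_than(links, home, max_depth=3):
    """Return reachable pages that need more than max_depth clicks."""
    depths = click_depths(links, home)
    return sorted(page for page, depth in depths.items() if depth > max_depth)
```

Pages returned by `pages_deeper_than` are candidates for extra links from the home page or a column page, so that a spider reaches them within the assumed limit.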

(v) An incorrect site map.

A site map is a page that lets spiders quickly understand the whole structure of our site, so a good site map gives spiders a fast entrance for crawling. A faulty site map with a large number of dead links, on the other hand, will undoubtedly damage the crawl state of the entire site, so we must be careful when building it.
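If the site map is published as an XML file in the sitemaps.org format (an assumption; the article may equally mean an HTML page), its entries can be checked for dead links along these lines. The `status_of` argument is a hypothetical callable supplied by the caller that returns the HTTP status code for a URL, for example by issuing a request or by looking the URL up in the access log.

```python
import xml.etree.ElementTree as ET

# XML namespace used by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Return every <loc> URL listed in a sitemap XML string."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def dead_sitemap_entries(xml_text, status_of):
    """Return sitemap URLs whose status code, as reported by the
    caller-supplied status_of(url), is 400 or higher."""
    return [url for url in sitemap_urls(xml_text) if status_of(url) >= 400]
```

Any URL this returns should be removed from the site map or restored before spiders find it.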

Summary: The reasons above are not the only ones why spiders may not visit a site. There are many others: illegal content, large amounts of scraped content, pornographic or malicious information, and so on can all cause search engine spiders to stop crawling a site. The specific problem has to be diagnosed from our own website logs. That is all for today. This article was originally shared by the webmaster of the game name site http://www.name2012.com/game; if you reprint it, please include the link. Thank you.


