A few days ago I was chatting with a friend, and we traded views on the Baidu algorithm update at the end of August. Toward the end of the conversation he asked me to help analyze his new site. The problem: the Baidu snapshot was stuck on August 15, and for nearly 20 days Baidu had not indexed any new content. From our brief conversation I learned that the site had been online for two months and that he had been adding original content and external links every day. At first, Baidu indexed new articles the day after they were published, and the long-tail keyword rankings were good. Then, on August 19, the snapshot rolled back to August 15, and it has not been updated since. I went through the site's overall structure and content: the structure is clear, with no serious structural problems; the articles are illustrated and well written; and the long-tail rankings really are good, which is a strong result for a new site.
So why had the snapshot stopped updating, and why was new content no longer being indexed? Was it a problem on Baidu's side, or was there a cause that could not be seen with the naked eye? That is when I thought of log analysis; sometimes a problem can only be seen from the inside. I got the previous day's site log from my friend. The image below is a screenshot of the spider summary produced by a log analysis tool. It shows, for the three major search engine spiders, the number of visits, the time spent on the site, and the total number of pages crawled.
My friend's site is new, with few external links and not much content overall. In my experience, for a site like this an average of 80 to 100 pages crawled per spider visit (total pages crawled divided by the number of visits) is already a very good figure. So why was the new content not being indexed?
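The per-visit figure can also be computed directly from the raw log, without a dedicated tool. The following is only a minimal sketch under several assumptions that do not come from the original analysis: the log is in Apache/Nginx Combined Log Format, spiders are recognized by the user-agent substrings listed in `SPIDERS`, and a new "visit" starts whenever a spider has been silent for more than 30 minutes. The log analysis tool used here may define a visit differently.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed user-agent substrings; adjust to the spiders that appear in your log.
SPIDERS = ("Baiduspider", "Googlebot", "Sogou")

# Combined Log Format: ip - - [time] "request" status size "referer" "agent"
LINE_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def spider_summary(lines, gap=timedelta(minutes=30)):
    """Return visits, pages crawled, and pages per visit for each spider."""
    hits = defaultdict(list)
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        name = next((s for s in SPIDERS if s in m["agent"]), None)
        if name is None:
            continue  # not one of the spiders we track
        hits[name].append(datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z"))

    summary = {}
    for name, times in hits.items():
        times.sort()
        # Assumption: a silence longer than `gap` starts a new visit.
        visits = 1 + sum(1 for a, b in zip(times, times[1:]) if b - a > gap)
        summary[name] = {
            "visits": visits,
            "crawled": len(times),
            "per_visit": len(times) / visits,
        }
    return summary
```

Calling `spider_summary(open("access.log"))` (the file name is a placeholder) returns one entry per spider; `per_visit` is the number to compare against the 80 to 100 rule of thumb.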
The second step was to look at how the spiders crawled the site's directories. The image below is a screenshot of the top three directories crawled by each of the three main spiders. It shows that the Archiver directory was crawled far more often than any other directory on the site. That number made me a little uneasy.
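A per-directory breakdown like this is easy to reproduce from the raw log. The sketch below is an illustration, not the tool used above. It assumes each log line contains the request in the usual `"GET /path HTTP/1.1"` form and that a spider can be identified by a substring of the line, such as `Baiduspider`.

```python
import re
from collections import Counter

# Pulls the requested path out of the quoted request field.
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+)')

def top_directories(lines, spider="Baiduspider", n=3):
    """Count one spider's requests per top-level directory; return the top n."""
    counts = Counter()
    for line in lines:
        if spider not in line:
            continue
        m = REQUEST_RE.search(line)
        if not m:
            continue
        path = m["path"].split("?", 1)[0]  # ignore the query string
        segments = [s for s in path.split("/") if s]
        if len(segments) >= 2 or (segments and path.endswith("/")):
            top = "/" + segments[0] + "/"
        else:
            top = "/"  # files that sit directly in the site root
        counts[top] += 1
    return counts.most_common(n)
```

A directory whose count dwarfs the rest, as the Archiver directory does here, is the one to inspect first.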
On my friend's site, this is the daily archive directory. As the image shows, the archive page for September 5, 2010 comes back empty (the site was built only two months ago).
Webmaster tools show that this URL returns a 200 status code. By this point I had a fairly good idea of what was going on: the spiders must have run into trouble while crawling this directory.
To verify my judgment, I opened the log file in EditPlus. As expected, the major spiders had become trapped in an endless loop while crawling the archive directory.
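The same check can be scripted rather than done by eye. The log lines themselves are not reproduced here, so the following is only a sketch of one plausible signal: a spider requesting daily archive pages for dates earlier than the site's launch, which can only be empty. The URL pattern `/archiver/YYYY-MM-DD` and the function name are hypothetical; substitute whatever the plug-in actually generates.

```python
import re
from datetime import date

# Hypothetical URL pattern for daily archive pages, e.g. /archiver/2010-09-05.html
ARCHIVE_RE = re.compile(r"/archiver/(\d{4})-(\d{2})-(\d{2})")

def dates_before_launch(paths, launch):
    """Return the sorted archive dates crawled that predate the site launch."""
    flagged = set()
    for path in paths:
        m = ARCHIVE_RE.search(path)
        if not m:
            continue
        try:
            day = date(*map(int, m.groups()))
        except ValueError:
            continue  # digits that do not form a real calendar date
        if day < launch:
            flagged.add(day)
    return sorted(flagged)
```

If the returned list is long and runs day by day into the past, the spider is most likely following "previous day" links without end.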
With the cause found, the next question was how to deal with it. The archive feature is a plug-in my friend paid for, so he wanted to keep the daily archive section if he could; having spent the money, he wanted to use it, and even if the section brings no traffic, it is a useful supplement. At first I considered using nofollow to keep the spiders out, but on reflection that would not work: the pages that have already been indexed still give the spiders a way in, and they would still end up in the endless loop.
In the end I gave my friend two suggestions:
1. Contact the plug-in developer to fix this bug.
2. Delete the entire archive directory so that it returns a 404 status code, and block the archive directory in robots.txt.
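For the robots.txt part of the second suggestion, the rule can be checked before it is deployed. The sketch below uses Python's standard `urllib.robotparser`. The directory path `/archiver/` is assumed from the directory name in the crawl report, `example.com` is a placeholder domain, and `is_blocked` is a helper name made up for this illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule; the real directory path must match the site's actual URLs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archiver/
"""

def is_blocked(robots_txt, url, agent="Baiduspider"):
    """Return True if the given robots.txt text forbids `agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)

# Archive pages should be blocked, ordinary pages should not.
print(is_blocked(ROBOTS_TXT, "http://example.com/archiver/2010-09-05.html"))
print(is_blocked(ROBOTS_TXT, "http://example.com/index.html"))
```

This only confirms how a standards-following parser reads the rule; how quickly each search engine picks up the change is up to the engine.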
That is the whole diagnostic process. When something goes wrong with a website, open the server logs and carefully compare and analyze the data in them; it will often go a long way toward showing where the problem lies.
This article was originally contributed by the slimming drug ranking list site Www.shou68.net. Reprints are welcome; please keep this link when reprinting. Thank you for your cooperation!