Hello everyone, I am Chengzhou. SEO diagnosis has long been a job I insist on, and I have shared many diagnosis ideas before, drawn both from friends' inquiries and from problems I have found and studied on my own sites. Today I bring you a diagnosis case. The main problems are that the site's pages are not being indexed promptly and that its Baidu snapshot is not updating in time.
Yesterday a friend asked me to help diagnose his site's symptoms. I first talked with him to understand the situation: his site is updated every day, but Baidu does not release the new pages the next day; instead, it releases a large batch of pages only during the weekly or monthly updates. The snapshot also updates very slowly, lagging behind even that slow indexing, and this situation has lasted a month. Below is my line of thinking; I hope it is of some help.
First of all, I suggested that this friend check the site's logs, because the logs reflect how Baidu's spider crawls inside the site. As far as I know, many friends currently have no habit of checking the logs, or feel helpless when they do look at them. This friend was the same: he said he had viewed the logs but did not know how to analyze them. Here is a brief introduction to my analysis approach.
1. View the summary analysis of search engine spider crawling to understand each spider's total stay time, total crawl volume, and proportion of activity. Below is the summary analysis for this friend's site (viewed with the Guangnian (Light Year) log analysis tool). You can see clearly that the Baidu spider's crawl volume is still decent: 292 pages crawled across 126 visits, with a total stay time of 8.873 hours, accounting for 41.011% of all spider activity.
(Figure: summary analysis of the site log)
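If you do not have a log tool at hand, a rough version of the request counts and proportions above can be pulled straight from the raw access log. Below is a minimal Python sketch; it assumes the common Apache/Nginx "combined" log format and a file named access.log, both of which may differ on your server. (Stay time needs requests grouped into visits by timestamp, which tools like Guangnian do for you; this sketch only tallies request volume and proportion.)

    from collections import Counter

    SPIDERS = ("Baiduspider", "Googlebot", "Sogou", "360Spider")
    counts = Counter()

    # In the combined log format, the user agent is the
    # last quoted field on each line.
    with open("access.log", encoding="utf-8", errors="ignore") as f:
        for line in f:
            parts = line.rsplit('"', 2)
            if len(parts) < 3:
                continue
            ua = parts[-2]
            for spider in SPIDERS:
                if spider in ua:
                    counts[spider] += 1
                    break

    total = sum(counts.values())
    for spider, n in counts.most_common():
        print(f"{spider}: {n} requests, {n / total:.3%} of all spider hits")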
Some friends may have doubts here: since the Baidu spider's total stay time is so long, and its single-visit stay time is not low either (PS: single stay time = total stay time / number of visits = 8.873 hours / 126 visits = 0.0704 hours per visit, about 4.225 minutes), why can the site's indexing still not keep up? With that question in mind, let us analyze how the Baidu spider crawls the individual directories.
2. Look at how the Baidu spider crawls the site's directories. From the crawl figure below you can see very clearly that the spider crawls the home and product directories quite a lot, but crawls another important directory, news, very little, even though news is the directory where most of the site's daily update work is done. The spider also crawls some background file directories such as upload, files, and img.
(Figure: crawl volume by site directory)
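The same breakdown can be produced by grouping Baiduspider requests by top-level directory. A small sketch, again under the assumptions above (combined log format, file named access.log):

    import re
    from collections import Counter

    dir_hits = Counter()
    # The request path sits inside the quoted request field.
    req_re = re.compile(r'"(?:GET|POST|HEAD) (/[^\s"]*)')

    with open("access.log", encoding="utf-8", errors="ignore") as f:
        for line in f:
            if "Baiduspider" not in line:
                continue
            m = req_re.search(line)
            if m:
                # First path segment, e.g. /news/123.html -> news
                top = m.group(1).split("/")[1].split("?")[0] or "(root)"
                dir_hits[top] += 1

    for d, n in dir_hits.most_common():
        print(f"/{d}: {n} fetches")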
From the above analysis we can see that the Baidu spider's crawling inside the site is unreasonable. The main problems are: 1. the internal link structure gives the news column too little crawl strength; 2. the home directory (which serves as a community section) and the product directory (the product center) disperse the spider's crawl resources; 3. the site does not properly restrict directories that do not need to be crawled. To solve this, we need to start from these three aspects.
1. Guide the spider to crawl more pages under the news directory, for example by creating more link entrances to newly updated pages: interlink them within the site, and publish external links pointing to them.
2. Looking at the site, the community section has had basically no updates, so you can consider screening the home directory out and letting the weight and the spider flow into the news column instead. The product directory is an important directory, but the spider should be guided more toward products that are not yet indexed or are newly added. Looking at the product pages, I found the related-product recommendations are insufficient; this area can be improved.
3. Use robots.txt or the nofollow tag to restrict the site's unimportant directories and pages, as well as some of the background files (see the sketch after this list).
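As a minimal sketch, assuming the background directories really are upload/, files/, and img/ as the log suggested (verify against your own site structure before deploying), the robots.txt restriction could look like this:

    User-agent: *
    Disallow: /upload/
    Disallow: /files/
    Disallow: /img/

For individual links you do not want to pass weight through, the nofollow attribute does the same job at link level, e.g. <a href="/login.html" rel="nofollow">login</a> (the path here is only an illustration).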
In addition, one more thought: the Baidu spider stays a long time but the crawl volume is not especially large, so you also need to look at the percentages of the status codes it receives, namely 200, 304, and 404. If 304 responses dominate, consider whether spider resources can be reallocated toward pages that have not yet been crawled. If there are many 404s, consider whether the spider has been led into traps: find the pages producing the 404 status codes and correct them.
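Those percentages can also be read straight off the log. A short sketch under the same assumptions as above (combined log format, file named access.log):

    from collections import Counter

    codes = Counter()
    with open("access.log", encoding="utf-8", errors="ignore") as f:
        for line in f:
            if "Baiduspider" not in line:
                continue
            # The status code is the field right after the quoted request.
            try:
                status = line.split('"')[2].split()[0]
            except IndexError:
                continue
            codes[status] += 1

    total = sum(codes.values())
    if total:
        for code in ("200", "304", "404"):
            print(f"{code}: {codes[code] / total:.1%} of spider fetches")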
The above covers how crawl resources are allocated to the spider inside the site. But if the guidance for spiders from outside the site is not strong enough, you still cannot do a good job of indexing and weight promotion (PS: the site snapshot is a reflection of the site's weight). You can try the following methods:
1. Rebuild the site map, in both HTML and XML formats, and write a crawl rule in the robots.txt file to guide the spider to the sitemap. The wording is as follows:
Sitemap: http://www.xxx.com/sitemap.html
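Since both formats are mentioned, the XML map would be declared the same way; the path below is only an assumption, so point it at wherever your XML map actually lives:

    Sitemap: http://www.xxx.com/sitemap.xml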
2. Build more links, internal and external, giving every page of the site as many spider entrances as possible, so that page-level crawling improves as far as it can. For external link building in particular, you can consider going to some high-weight platforms immediately after the site updates, such as forums and blogs, and publishing article or directory links there to attract the spider to crawl.
These are just some simple, surface-level thoughts on analyzing site logs from my own point of view; I hope they are of some help.
This article was published by the QQ Personality Signature network (http://www.yy521.com/qq/). Everyone is welcome to reprint it; please keep this link when reprinting. Thanks for your cooperation!