Getting a site indexed has always been a big problem. To solve it, first find the root cause, and the place to look is the IIS log. The IIS log records how search engine spiders crawl the site. From it you can clearly see the total time a spider spent crawling, the time spent on each page, the crawl depth, whether many URLs were crawled repeatedly, and so on. With that information we can apply the right remedy and solve the indexing problem at its root. The analysis below covers the following aspects.
First, the problem of over-crawled URLs
First check whether some URLs are being crawled excessively. This is simple: open the IIS log in DW, copy a URL, and use Find All to count its occurrences, or use a more advanced IIS log analysis tool that shows the counts directly. If many URLs have been visited by spiders many times, it is most likely because they are the home page or pages only a few clicks away from the home page. The usual adjustment is to reduce the number of links pointing to these URLs. Over-crawled URLs waste the time the spider has for crawling the whole site.
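The counting step can also be scripted instead of done by hand. The following is a minimal sketch, assuming the log is in the W3C extended format with a "#Fields:" header that includes cs-uri-stem and cs(User-Agent). The spider names in SPIDER_MARKERS and the sample log lines are illustrative assumptions, not taken from a real site.

```python
from collections import Counter

# Assumed user-agent fragments; adjust to the spiders you care about.
SPIDER_MARKERS = ("baiduspider", "googlebot", "bingbot")


def count_spider_hits(lines):
    """Count spider requests per URL path in W3C extended log lines."""
    fields = []
    hits = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns of the lines that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        parts = line.split()
        if len(parts) != len(fields):
            continue  # skip malformed lines
        row = dict(zip(fields, parts))
        agent = row.get("cs(User-Agent)", "").lower()
        if any(marker in agent for marker in SPIDER_MARKERS):
            hits[row.get("cs-uri-stem", "-")] += 1
    return hits


# Hypothetical sample data for illustration only.
SAMPLE_LOG = [
    "#Fields: date time cs-method cs-uri-stem cs-uri-query cs(User-Agent) sc-status time-taken",
    "2013-05-01 00:00:01 GET / - Mozilla/5.0+(compatible;+Baiduspider/2.0) 200 120",
    "2013-05-01 00:00:09 GET / - Mozilla/5.0+(compatible;+Baiduspider/2.0) 200 110",
    "2013-05-01 00:01:00 GET /news/1.html - Mozilla/5.0+(compatible;+Googlebot/2.1) 200 90",
    "2013-05-01 00:02:00 GET /about.html - Mozilla/5.0+(Windows+NT+6.1) 200 80",
]

hits = count_spider_hits(SAMPLE_LOG)
for url, count in hits.most_common():
    print(count, url)
```

URLs at the top of this list are the candidates for having their inbound links reduced. For a real log, pass an open file object instead of SAMPLE_LOG.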
Second, whether there is duplicate content
The first step can also reveal another problem: duplicate content. If some URLs are crawled by spiders many times, they may be different URLs serving the same content, such as static and dynamic versions of one page, or sorted list pages and feature pages whose content differs very little although the URLs differ. Block the duplicates with robots.txt.
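Before uploading a robots.txt rule, it can help to check that it blocks the duplicate URL and nothing else. Here is a small sketch using Python's standard urllib.robotparser. The rule, the /list.php path, and the example.com URLs are hypothetical. Note that this parser does plain prefix matching, so wildcard rules that some search engines accept are not evaluated here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule: block the dynamic list page, keep the static one.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /list.php",
]


def is_blocked(url, agent="Baiduspider"):
    """Return True if the rules above forbid the agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_LINES)
    return not parser.can_fetch(agent, url)


print(is_blocked("http://www.example.com/list.php?sort=price"))
print(is_blocked("http://www.example.com/list.html"))
```

The first URL is the dynamic duplicate and should be reported as blocked, while the static page stays crawlable.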
Third, URLs the spider has not crawled
This is done with a script: collect all the URLs of your website, collect the URLs that spiders have crawled, and compare the two lists to find the URLs that spiders have never crawled. Then analyze the reason: no links pointing to them, directories that are too deep, or too many URL parameters. Fix the cause and keep watching whether the pages get indexed.
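The comparison itself is a set difference. A minimal sketch follows, assuming you already have both lists, for example one from a sitemap and one from the IIS log. The function name and sample URLs are made up for illustration, and treating a trailing slash as insignificant is an assumption that may not hold on every site.

```python
def never_crawled(all_urls, crawled_urls):
    """Return the URLs from all_urls that do not appear in crawled_urls."""

    def norm(url):
        # Assumption: "/news/" and "/news" are the same page.
        return url.strip().rstrip("/") or "/"

    crawled = {norm(url) for url in crawled_urls}
    return sorted(url for url in all_urls if norm(url) not in crawled)


# Hypothetical sample data.
site_urls = ["/", "/a.html", "/deep/x/y/z.html", "/list.php?p=9"]
spider_urls = ["/", "/a.html"]

missing = never_crawled(site_urls, spider_urls)
for url in missing:
    print(url)
```

The URLs that remain are the ones to inspect for missing links, deep directories, or excessive parameters.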
Fourth, how good the overall internal link structure is
Look at the overall structure of your site and click through it yourself to see how many clicks it takes to get from the home page to the inner pages. If some inner pages take many clicks to reach, the crawler also needs more time to get from the home page to them, and more of its time is wasted. So adjust the internal link structure, and link to more content internally so that spiders can crawl it better.
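Click depth can be computed rather than tested by hand if you have a map of which page links to which. The sketch below uses a breadth-first search from the home page. The link map is a hypothetical example, and building it from a real site (for instance by crawling your own pages) is left out.

```python
from collections import deque


def click_depths(links, home="/"):
    """Return the minimum number of clicks from home to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                # First visit in a breadth-first search is the shortest path.
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link map: page -> pages it links to.
LINKS = {
    "/": ["/news/", "/about.html"],
    "/news/": ["/news/1.html"],
    "/news/1.html": ["/news/archive/2012.html", "/"],
}

depths = click_depths(LINKS)
for page, depth in sorted(depths.items(), key=lambda item: item[1]):
    print(depth, page)
```

Pages with a large depth, or pages missing from the result because nothing links to them, are the ones that need more internal links.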
Fifth, how fast the access speed is
Access speed is affected by many factors. Check carefully from the server to the back end to the front end for room to optimize. Reduce the overall size of the HTML code while preserving how the page looks. Load JS and CSS from separate files so that the HTML stands alone. On careful consideration, static URLs are also necessary, because overly long dynamic URLs affect transmission speed.
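One way to find where to start optimizing is to read the response times that IIS already records. This is a rough sketch, assuming the log includes the time-taken field (which IIS reports in milliseconds) and the cs-uri-stem field. The sample lines are invented for illustration.

```python
from collections import defaultdict


def average_time_taken(lines):
    """Return the average time-taken in milliseconds per URL path."""
    fields = []
    totals = defaultdict(lambda: [0, 0])  # url -> [total ms, request count]
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            row = dict(zip(fields, line.split()))
            if row.get("time-taken", "").isdigit():
                entry = totals[row.get("cs-uri-stem", "-")]
                entry[0] += int(row["time-taken"])
                entry[1] += 1
    return {url: total / count for url, (total, count) in totals.items()}


# Hypothetical sample data.
SAMPLE_TIMES = [
    "#Fields: date time cs-uri-stem sc-status time-taken",
    "2013-05-01 00:00:01 / 200 100",
    "2013-05-01 00:00:05 / 200 300",
    "2013-05-01 00:00:09 /a.html 200 50",
]

averages = average_time_taken(SAMPLE_TIMES)
for url, ms in sorted(averages.items(), key=lambda item: -item[1]):
    print(round(ms), url)
```

The slowest URLs appear first, which gives a shortlist of pages whose server, back end, or page weight deserves a closer look.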
The five points above are a summary based on my own experience. If you have more ways to find and fix site indexing problems, you are welcome to share them. This article is from: Hemorrhoids folk prescription, URL: http://www.cqtaihai.com. When reproducing it, please keep the link. Thank you!