For website optimization, search engine log analysis is an essential part of the job, whether your site has a few hundred pages indexed or is a large site with millions. To do SEO well, you must analyze logs scientifically. The log records every event on the server, including user visits and search engine crawls. On some large sites the daily log runs to several gigabytes, and we can split it apart with Linux commands. On large websites the log files are often treated as confidential and ordinary staff never see them, because visitor trends and regional patterns can be read out of them. For SEO we do not need all of that data; it is enough to analyze the search engine crawl records, so once filtered the volume is not especially large, and with hard drives as cheap as they are, keeping the log files is well worth considering. So what data should we focus on in the logs?
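As a minimal sketch of that splitting step, assuming a standard combined-format access log named access.log (both the file name and the field layout are assumptions, not details from this article), one Linux command is enough to pull the crawl records out of a large log:

    # Keep only search engine spider requests; everything else is user traffic.
    grep -E "Baiduspider|Googlebot|Sogou web spider" access.log > spider.log

The user-agent substrings shown are the commonly seen ones for Baidu, Google, and Sogou; verify them against the user agents that actually appear in your own logs.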
1. Each search engine's overall crawl volume (and its trend)
The log file clearly records each search engine's crawl volume, for example the fetch records of Baidu, Google, and Sogou, and we can tally them with DOS or Linux commands. How much a search engine indexes is determined by its crawl volume together with article quality: with article quality unchanged, the more the spiders fetch, the more will be indexed. In log analysis we must know exactly what each day's crawl volume looks like, and record it every day. The absolute value by itself may not mean much; what matters is the trend, and when the daily crawl trend is falling, we need to find out why.
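One possible way to take that daily tally, under the same access.log assumption:

    # Count one day's fetches per search engine spider.
    for bot in Baiduspider Googlebot "Sogou web spider"; do
        printf "%s: " "$bot"
        grep -c "$bot" access.log
    done

Run it against each day's log in turn and keep the numbers in a spreadsheet, so that the trend, rather than the absolute value, is what you watch.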
2. Record each search engine spider's unique (deduplicated) crawl volume
In the previous step we tallied the spiders' raw crawl data; next we deduplicate it to get each search engine's unique crawl volume. For indexing purposes, most pages only need to be fetched once, but in practice many pages are fetched over and over. Google's technology is somewhat more advanced and its repeat-fetch rate may be lower, but for Baidu and other engines the repeat rate is very high: the logs may show that out of a million fetches in a day, tens of thousands were fetches of the home page alone. You must dig into this data, and once you analyze it you will see how serious the problem is.
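A sketch of the deduplication, assuming the requested URL is the seventh whitespace-separated field, as in the common combined log format:

    # Total Baiduspider fetches versus unique URLs fetched.
    grep "Baiduspider" access.log | awk '{print $7}' | wc -l
    grep "Baiduspider" access.log | awk '{print $7}' | sort -u | wc -l

    # The most repeatedly fetched URLs, worst offenders first.
    grep "Baiduspider" access.log | awk '{print $7}' | sort | uniq -c | sort -rn | head

A large gap between the first two numbers is exactly the repeat-crawl problem described above.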
3. Each search engine's crawl volume per directory
The previous two steps recorded the overall crawl volume and the unique crawl volume; now we analyze how each search engine crawls each directory, which helps with optimizing the site section by section. For example, when your site's traffic is rising, you can find out which directory's traffic rose and then work backwards to see which directory's crawl volume went up, which directory's went down, and why, and from that analysis make appropriate adjustments to the site's internal link structure, such as applying the nofollow attribute.
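A rough per-directory breakdown under the same field-layout assumption (the home page "/" will show up as an empty directory name):

    # Baiduspider fetches per top-level directory, busiest first.
    grep "Baiduspider" access.log | awk '{print $7}' | cut -d/ -f2 | sort | uniq -c | sort -rn

Swap the spider name to build the same table for each search engine.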
4. Tally the status codes returned to search engine crawls
When a search engine fetches your pages it does not just take the content; every fetch also returns a status code, and these we should record, especially codes such as 301, 404, and 500. From these status codes we can uncover some of the site's latent problems. For example, why are there so many 404 pages: is it a bug in the program, or did the search engine extract a broken link? You can also see this data in Google Webmaster Tools, which will show you where the erroneous 404 pages appear. The 301 codes also deserve attention: check whether those redirects behave the way we intend. A site should use as few redirects as possible, because redirecting a page tends to lengthen its load time. The most common 301 is probably the redirect from a URL without a trailing "/" to the version with it, and when building a site we should try to avoid this situation.
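A sketch of the status-code tally, assuming the status code is field 9 of the combined format:

    # Distribution of status codes returned to Baiduspider.
    grep "Baiduspider" access.log | awk '{print $9}' | sort | uniq -c | sort -rn

    # The URLs behind the 404s, for follow-up.
    grep "Baiduspider" access.log | awk '$9 == 404 {print $7}' | sort -u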
5. Count each search engine spider's visits and time on site
With a log analysis tool such as the Light Year log analyzer, we can set a standard and count how many times each search engine's spiders visit per day, how long in total they stay on our site in a day, and whether any spider IP crawls the site around the clock. The more such spiders, the better; it is often a sign that your site's weight is improving. Record this data every day and compare it over a period of time to see whether the time spent and the number of visits are increasing; from that you can judge whether the site's weight is rising or falling.
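For a quick look without a dedicated tool, a rough sketch under the same assumptions (field 1 is the client IP, field 4 the timestamp; the IP below is a hypothetical placeholder, and real spider IPs should be verified, for example via reverse DNS):

    # Hits per spider IP, a rough proxy for visit frequency.
    grep "Baiduspider" access.log | awk '{print $1}' | sort | uniq -c | sort -rn | head

    # First and last request time seen for one spider IP.
    grep "Baiduspider" access.log | awk '$1 == "123.125.71.1" {print $4}' | sed -n '1p;$p'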
Of course there is much more SEO guidance data you can read from the logs; I have only listed this much for now and hope it proves useful. You can extend the list yourself. In everyday SEO data analysis work, make a habit of analyzing the logs; when you have spare time, look through the log file, for example at the trajectory the spiders take across your pages, and see what patterns emerge. This will be very helpful for your future SEO work.