IIS Log Analysis: A Must-Read for Novice Webmasters

Source: Internet
Author: User

Baidu now changes its rules very quickly, and its mechanisms keep getting more refined. Many sites that ranked yesterday have no ranking today, and many that had none yesterday now appear on the first page, which gives a lot of webmasters a headache. SEO used to be done according to Baidu's rules, and Baidu's recent changes have left many SEO practitioners unable to figure things out. In fact it is not that difficult: analyze your IIS logs carefully and you will still find plenty of patterns. As someone who has just built a new site, I will use my own site to show how to analyze spider crawling in the IIS logs.

I suggest blocking Baidu's spider before the site goes live, because a new site needs a lot of testing. Before you open the site to the spider, check it thoroughly, and above all check for dead links. Many sites are built on someone else's program, which is bound to contain plenty of links left over from that software, and those must be changed. Only when you are truly satisfied that the site has no problems should you open it to Baidu's spider. My site only went live on April 1, after half a month of debugging.

  

You can look at this picture: before April 1 there were no spiders at all. I opened the site to Baidu's spider on April 1. Baidu now gives new sites a certain amount of weight, so as long as the site has been checked properly it will be indexed very quickly. Some people will ask why there seem to be crawl records from before that, in March. Those are not spider visits: they come from my own queries with third-party tools. This is the IIS log, and anything you do on the site leaves a record in it.
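
Since every request leaves a record, spider visits can be counted per day straight from the log. The following is a minimal Python sketch, assuming the log uses the W3C extended format (a "#Fields:" header followed by space-separated rows) and treating any user agent that contains "baiduspider" as the spider. The function names and the sample lines are invented for illustration and are not taken from a real log.

```python
from collections import Counter


def parse_iis_log(lines):
    """Yield one dict per request from W3C extended log lines."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns of the rows that follow.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            values = line.split()
            if len(values) == len(fields):
                yield dict(zip(fields, values))


def spider_hits_per_day(lines, marker="baiduspider"):
    """Count requests per date whose user agent contains the marker."""
    counts = Counter()
    for row in parse_iis_log(lines):
        agent = row.get("cs(User-Agent)", "").lower()
        if marker in agent:
            counts[row["date"]] += 1
    return counts


# Made-up sample: one ordinary visitor in March, two spider hits on April 1.
SAMPLE_LOG = """\
#Fields: date time cs-method cs-uri-stem c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status
2013-03-28 09:00:01 GET /index.html 10.0.0.5 Mozilla/5.0+(Windows+NT+6.1) 200 0 0
2013-04-01 02:10:11 GET /index.html 220.181.108.95 Mozilla/5.0+(compatible;+Baiduspider/2.0;++http://www.baidu.com/search/spider.html) 200 0 0
2013-04-01 02:15:42 GET /news/1.html 123.125.71.106 Mozilla/5.0+(compatible;+Baiduspider/2.0;++http://www.baidu.com/search/spider.html) 200 0 0
""".splitlines()

if __name__ == "__main__":
    for day, hits in sorted(spider_hits_per_day(SAMPLE_LOG).items()):
        print(day, hits)
```

A user agent string can be faked, so a count like this only shows who claims to be the spider. The March line in the sample is skipped because its user agent does not mention the spider.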

  

You can look at this picture. A lot of spiders came on April 1. Here I need to explain how to analyze some specific spider IPs. Many people will certainly ask: how do you know it is Baidu's spider? You can find a lot of related material by searching online, although much of it is reposted and not very complete. I will use my own site's log to explain the spider's IPs in detail.
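
One common way to check whether an IP really is Baidu's spider, which this article does not cover, is a reverse DNS lookup on the visiting IP. The Python sketch below assumes that genuine Baiduspider hosts resolve to names ending in .baidu.com or .baidu.jp; that suffix list is an assumption to confirm against Baidu's own webmaster documentation. The function names and the fake lookup in the example are hypothetical.

```python
import socket

# Assumption: genuine Baiduspider hostnames end with one of these suffixes.
TRUSTED_SUFFIXES = (".baidu.com", ".baidu.jp")


def is_baidu_hostname(hostname):
    """Return True if the hostname ends with a trusted suffix."""
    return hostname.lower().rstrip(".").endswith(TRUSTED_SUFFIXES)


def looks_like_baiduspider(ip, lookup=socket.gethostbyaddr):
    """Reverse-resolve the IP and check the resulting hostname.

    `lookup` is injectable so the check can be tried without network access.
    """
    try:
        hostname = lookup(ip)[0]
    except OSError:
        # No reverse record, or the lookup failed: treat as unverified.
        return False
    return is_baidu_hostname(hostname)


if __name__ == "__main__":
    # Hypothetical lookup result, used only to show the call shape.
    def fake_lookup(ip):
        return ("baiduspider-220-181-108-95.crawl.baidu.com", [], [ip])

    print(looks_like_baiduspider("220.181.108.95", lookup=fake_lookup))
```

A stricter check would also resolve the returned hostname forward again and confirm that it maps back to the same IP.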

From the different IPs we can work out what state the site is in. The following uses the Baidu spider IPs in my IIS log as examples:

123.125.68.* If this spider comes often while the others rarely come, the site may be about to enter the sandbox or be demoted.

220.181.68.* If this is the only IP segment whose visits increase every day, the site is likely to enter the sandbox or be de-indexed.

220.181.7.*, 123.125.66.* These indicate that Baidu's spider is visiting and getting ready to crawl your content.

121.14.89.* This IP segment is associated with getting a new site through its evaluation period.

203.208.60.* This IP segment appears on new sites and on sites showing abnormal behavior.

210.72.225.* This IP segment patrols sites continuously.

125.90.88.* Guangdong Maoming Telecom. It also belongs to the Baidu spider IPs and is mostly seen on newly launched sites, or after using webmaster tools or a comprehensive SEO check.

220.181.108.95 This is Baidu's dedicated IP for crawling the home page. If visits come from the 220.181.108 segment, your site's snapshot will basically be updated overnight. You can't go wrong with this one, I promise.

220.181.108.92 Same as above: 98% of its requests crawl the home page, and it may also crawl other things (not inner pages). The 220.181 range is a high-weight IP segment, and an article or home page it has crawled is basically released within 24 hours.

123.125.71.106 Crawls inner pages for indexing. Its weight is low, and inner-page articles crawled by this segment are not released quickly, because they are not original or are scraped articles.

220.181.108.91 General purpose: mainly crawls the home page and inner pages, or other content. It belongs to the high-weight IP segment, and an article or home page it has crawled is basically released within 24 hours.

220.181.108.75 Focuses on updated articles: inner pages account for 90%, the home page for 8%, and other content for 2%. High-weight IP segment; a crawled article or home page is basically released within 24 hours.

220.181.108.86 Dedicated to crawling the home page, high-weight IP segment. The usual return code is 304 0 0, which means nothing has been updated.

123.125.71.95 Crawls inner pages for indexing. Its weight is low, and inner-page articles crawled by this segment are not released quickly, because they are not original or are scraped articles.

123.125.71.97 Crawls inner pages for indexing. Its weight is low, and inner-page articles crawled by this segment are not released quickly, because they are not original or are scraped articles.

220.181.108.89 Dedicated to crawling the home page, high-weight IP segment. The usual return code is 304 0 0, which means nothing has been updated.

220.181.108.94 Dedicated to crawling the home page, high-weight IP segment. The usual return code is 304 0 0, which means nothing has been updated.

220.181.108.97 Dedicated to crawling the home page, high-weight IP segment. The usual return code is 304 0 0, which means nothing has been updated.

220.181.108.80 Dedicated to crawling the home page, high-weight IP segment. The usual return code is 304 0 0, which means nothing has been updated.

220.181.108.77 Dedicated to crawling the home page, high-weight IP segment. The usual return code is 304 0 0, which means nothing has been updated.

123.125.71.117 Crawls inner pages for indexing. Its weight is low, and inner-page articles crawled by this segment are not released quickly, because they are not original or are scraped articles.

220.181.108.83 Dedicated to crawling the home page, high-weight IP segment. The usual return code is 304 0 0, which means nothing has been updated.

Note: there are many more last octets than the ones listed above, but IPs in the same 123.125.71.* segment all crawl inner pages for indexing and carry relatively low weight. This may be because your articles are scraped or pieced together, so they are indexed temporarily but not released. (Meaning to be determined.)

The 220.181.108.* segment mainly crawls the home page, which accounts for 80%, with inner pages accounting for 30%. An article or home page crawled by this segment is definitely released within 24 hours, with an overnight snapshot. This I can guarantee!
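
The segment list above can be turned into a small lookup so that each IP in a log is labelled automatically. This is only a sketch in Python: the notes paraphrase the claims made in this article and are not official Baidu data, and the names SEGMENT_NOTES, describe_ip and summarize are made up for the example.

```python
from collections import Counter

# Prefix -> note, paraphrasing this article's claims (not official data).
# Each prefix ends with a dot so "220.181.7." cannot match "220.181.70.x".
SEGMENT_NOTES = {
    "220.181.108.": "high weight, mainly home page",
    "123.125.71.": "low weight, inner pages",
    "123.125.68.": "warning: possible sandbox or demotion",
    "220.181.68.": "warning: possible sandbox or de-indexing",
    "220.181.7.": "regular visit, preparing to crawl",
    "123.125.66.": "regular visit, preparing to crawl",
    "121.14.89.": "new-site evaluation period",
    "203.208.60.": "new site or abnormal site",
    "210.72.225.": "continuous patrol",
    "125.90.88.": "new site, webmaster tools or SEO checks",
}


def describe_ip(ip):
    """Return the article's note for the segment this IP falls in."""
    for prefix, note in SEGMENT_NOTES.items():
        if ip.startswith(prefix):
            return note
    return "not in the list"


def summarize(ips):
    """Count how many visits fall under each note."""
    return Counter(describe_ip(ip) for ip in ips)


if __name__ == "__main__":
    visits = ["220.181.108.95", "220.181.108.86", "123.125.71.106", "8.8.8.8"]
    for note, hits in summarize(visits).most_common():
        print(hits, note)
```

Fed with the client IPs from one day's log, a summary like this shows at a glance whether the high-weight or the low-weight segment dominates.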

In general, a successful crawl returns 200 0 0. A return of 304 0 0 means the spider came but the site had not been updated. If you see 200 0 64, don't worry: this does not mean the site has been de-indexed. It may be because the site is dynamic, which is why this code is returned.
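
In the IIS W3C log format, the three numbers are the sc-status, sc-substatus and sc-win32-status fields. The short Python sketch below spells such a triple out in words. The value 64 in the last position is the Windows error code ERROR_NETNAME_DELETED, which generally indicates that the connection was closed before the response finished. That fits the reading that it is not a penalty. The function name and the wording of the messages are illustrative only.

```python
# HTTP status meanings for the codes discussed in this article.
HTTP_MEANINGS = {
    200: "OK, page served",
    304: "not modified since the last visit",
    404: "page not found",
}


def explain_status(triple):
    """Explain an 'sc-status sc-substatus sc-win32-status' triple."""
    status, substatus, win32 = (int(part) for part in triple.split())
    meaning = HTTP_MEANINGS.get(status, f"HTTP status {status}")
    if substatus:
        meaning += f" (IIS substatus {substatus})"
    if win32 == 0:
        extra = "no Windows-level error"
    elif win32 == 64:
        extra = "connection closed before the response finished (Win32 error 64)"
    else:
        extra = f"Win32 error {win32}"
    return f"{meaning}; {extra}"


if __name__ == "__main__":
    for triple in ("200 0 0", "304 0 0", "200 0 64"):
        print(triple, "->", explain_status(triple))
```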

Take a closer look at these IPs and compare them against the screenshots of my site to see whether these IP segments appear. Anyone who runs a site should make a habit of checking the logs, so that problems can be found and dealt with in a targeted way. I opened the site to spiders on April 1, and by April 4 it already had very good rankings: Mianyang Fu Lok Net ranked first, which is inseparable from my daily attention to the IIS logs. You can look at the following figure:

  

Baidu's visits to the home page keep increasing every day.

  

These IPs are Baidu's high-weight spider IPs. If your site has a lot of these spiders crawling it, congratulations: your site will soon be ranked, and its snapshot will definitely be the latest one. By "latest snapshot" I mean this: a normal snapshot is dated the previous day, whereas the latest snapshot can sometimes be more recent than that. This has a lot to do with the articles published on the site. In the 20 days I have run this site, I have never built a single external link and have only worked on the site's own articles, which is why the rankings rose so quickly. Baidu now attaches great importance to user experience.

  

Some people will ask what this kind of IP is for. It means Baidu is already paying attention to your site, but the site is still in its evaluation period and its weight is relatively low. Don't worry. This kind of IP mostly shows up when inner-page articles are being indexed. Baidu is not as smart as Google, so you cannot expect it to index your articles within seconds. Even if an article is completely original, Baidu will still store it first and generally release it about 7 days later.

To sum up, anyone running a site should look at the IIS logs often and use them to analyze problems. For example, my site has a lot of 404 errors, and I am still looking for the cause.

  

With this kind of 404 error, all you can do is look for the cause slowly. I also don't know why the number suddenly becomes very large and then suddenly very small. This is something I still need to discuss with more experienced people.
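
One way to start narrowing down 404 errors is to count them per day and per URL, so that a sudden spike can be tied to specific missing pages. The Python sketch below assumes a W3C extended log with the date, cs-uri-stem and sc-status fields enabled. The function name and the sample lines are invented for illustration.

```python
from collections import Counter


def count_404s(lines):
    """Return (404s per date, 404s per URL) from W3C extended log lines."""
    fields = []
    by_day, by_url = Counter(), Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line.split()[1:]  # drop the "#Fields:" token itself
            continue
        if not line or line.startswith("#"):
            continue
        row = dict(zip(fields, line.split()))
        if row.get("sc-status") == "404":
            by_day[row.get("date", "unknown")] += 1
            by_url[row.get("cs-uri-stem", "unknown")] += 1
    return by_day, by_url


# Made-up sample lines, only to show the shape of the input.
SAMPLE_404_LOG = """\
#Fields: date time cs-uri-stem c-ip sc-status
2013-04-10 01:00:00 /old/page.html 123.125.71.106 404
2013-04-10 01:05:00 /index.html 220.181.108.95 200
2013-04-10 02:00:00 /old/page.html 123.125.71.95 404
2013-04-11 03:00:00 /missing.gif 123.125.71.97 404
""".splitlines()

if __name__ == "__main__":
    per_day, per_url = count_404s(SAMPLE_404_LOG)
    print("404s per day:", dict(per_day))
    print("Most requested missing URLs:", per_url.most_common(3))
```

If most of the 404s on a spike day point at a handful of URLs, those are the dead links to fix or redirect first.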

This is just my small experience of running a site, shared here for everyone. We all sit in front of the computer day and night for the sake of rankings, and it is hard work, so I hope we can exchange ideas and make progress together. This article was written by Sky Nebula of the Mianyang Fu Lok network, http://www.myfule.com. If you reprint it, please retain the source and respect the original.
