An indispensable link in site analysis: IIS log analysis

Source: Internet
Author: User
Tags: ftp, IIS, log

Every optimizer needs a certain capacity for analysis: analyzing users' search behavior, analyzing the site's traffic data, and so on. Only by analyzing this data sensibly can we formulate a better optimization strategy. One indispensable part of analyzing a site is analyzing the crawling behavior of search engine spiders. Spiders are invisible to our eyes, so how do we analyze their crawling behavior? We can analyze our site's IIS logs.

One: What information analyzing the IIS logs can give us about our site

1: In building external links, we know that every external link is a portal through which search engine spiders enter our site. By observing the spiders' visits in the log, we can see from another angle which of our external links attract spiders more effectively, and so develop a more rational link-building strategy.

2: Hosting problems are a thorny issue for many webmasters: if the site cannot be opened, it may well be set back to square one overnight. So how do we discover such problems as early as possible? Here too we can analyze the spiders' crawling situation in the log, because apart from us, the first to react to a problem with the site's hosting is the search engine spider.

3: Through the log we can also analyze how the spiders crawl our page content, and learn which parts of our site the search engines like. Based on this data we can adjust or fine-tune the layout of our content, making the search engines like it even more.

Two: How to obtain our site's IIS logs, and how to configure IIS logging

First, our hosting plan must support downloading the site's logs. When buying hosting space, we can first ask the provider whether this function is supported. If it is, the log files are generally kept in the weblog folder, and we can download them to the local machine directly over FTP. As for the IIS logging settings, I think that if the site has a lot of content and a complex structure, the log can be set to roll over once an hour, while a site with less content can be set to roll over once a day; this way we avoid the problem of overly large IIS log files.
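As a minimal sketch of the download step (the host name, credentials, and folder below are hypothetical placeholders, not values from this article; substitute whatever your hosting provider gives you), Python's standard ftplib can fetch the newest log file:

    from ftplib import FTP

    # Hypothetical connection details -- replace with your provider's values.
    HOST = "ftp.example.com"
    USER = "username"
    PASSWORD = "password"
    LOG_DIR = "weblog"   # folder where many hosts keep the IIS logs

    with FTP(HOST) as ftp:
        ftp.login(USER, PASSWORD)
        ftp.cwd(LOG_DIR)
        names = ftp.nlst()          # list the log files
        latest = sorted(names)[-1]  # IIS names logs by date, e.g. ex120413.log
        with open(latest, "wb") as f:
            ftp.retrbinary("RETR " + latest, f.write)
        print("Downloaded", latest)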

Three: How to analyze our site's IIS logs

1: Analyzing the IIS log file section by section

After downloading the log locally over FTP, we can open the file with Notepad and search for the names of the main search engine spiders: Baidu's spider is named Baiduspider, and Google's is named Googlebot, as shown in the figures below.

[Figure: Baiduspider entries in the IIS log]

[Figure: Googlebot entries in the IIS log]
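When the log is too large to search comfortably in Notepad, a short script can pull out just the spider entries. This is a minimal sketch; the file name ex120413.log is a hypothetical example following the usual IIS exYYMMDD.log naming pattern:

    # Print only the lines recorded for the major search engine spiders.
    SPIDERS = ("Baiduspider", "Googlebot")

    with open("ex120413.log", encoding="utf-8", errors="ignore") as log:
        for line in log:
            if any(name in line for name in SPIDERS):
                print(line.rstrip())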

We can break a sample log entry down field by field:

2012-04-13 06:47:10 is the specific date and time at which the spider crawled the page.

116.205.156.37 is the IP address of the server where our site is located.

GET represents the request event; the parameter that follows it is the page the spider crawled, and "/" represents the site's homepage.

220.125.51.130 is the IP address of the search engine spider's server. Of course, this IP is not necessarily a real spider server, because some people impersonate search engine spiders in order to scrape your site's content. Occasional fake visits have little impact, but at a high frequency they consume the site's resources. So how do we tell the two apart? The author offers a small method of his own: open the command prompt and run nslookup followed by the IP address. A genuine spider resolves to a host name on the search engine's own spider servers, so we can screen out the IPs of fake spiders and deal with them, as shown in the figures below (a scripted version of the same check follows them).

[Figure: nslookup result for a real spider IP]

[Figure: nslookup result for a fake spider IP]
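The nslookup check can also be scripted. This is a minimal sketch using Python's standard socket module; it relies on the fact that a genuine spider IP reverse-resolves to a host name under the search engine's own domain, and that this host name resolves forward to the same IP. The domain suffixes listed are the ones commonly documented for Googlebot and Baiduspider, and the IP is the one from the sample log entry above:

    import socket

    # Host-name suffixes commonly documented for the two spiders.
    SPIDER_DOMAINS = (".googlebot.com", ".google.com", ".baidu.com", ".baidu.jp")

    def looks_like_real_spider(ip):
        # Reverse lookup (what nslookup does): a fake spider usually has
        # no PTR record or resolves to an unrelated host name.
        try:
            host, _, _ = socket.gethostbyaddr(ip)
        except socket.herror:
            return False
        if not host.endswith(SPIDER_DOMAINS):
            return False
        # Forward-confirm: the host name must resolve back to the same IP.
        try:
            forward_ips = socket.gethostbyname_ex(host)[2]
        except socket.gaierror:
            return False
        return ip in forward_ips

    print(looks_like_real_spider("220.125.51.130"))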

2: As mentioned above, the parameter following GET is the page the search engine spider crawled. From this information we can analyze which content on our site the spiders favor most, and then fine-tune the site's content accordingly; a short counting script is sketched below.
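As a minimal sketch of that analysis (again using the hypothetical file name ex120413.log, and assuming the default W3C extended log format, in which the page field cs-uri-stem comes right after the method field GET), we can count how often each page is crawled:

    from collections import Counter

    crawled = Counter()
    with open("ex120413.log", encoding="utf-8", errors="ignore") as log:
        for line in log:
            # Only count visits from the major search engine spiders.
            if "Baiduspider" not in line and "Googlebot" not in line:
                continue
            fields = line.split()
            if "GET" in fields:
                i = fields.index("GET")
                if i + 1 < len(fields):
                    crawled[fields[i + 1]] += 1  # cs-uri-stem, e.g. "/"

    # The ten most-crawled pages, most favored first.
    for page, hits in crawled.most_common(10):
        print(hits, page)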

3: Through the log we can also find that the spiders crawl our site's pages roughly in descending order of page weight; the general order is the homepage first, then category pages, then content pages.

Analysis is an essential part of our optimization work, and there is plenty of usable data all around us; making reasonable use of it will surely help our optimization a great deal. This article was contributed exclusively by Taobao Crown Shop http://www.jgdq.org; when reprinting, please keep the link. Thank you!


