Hello everyone, I am Wei Dong! For many webmasters, IIS log analysis is important but rarely given the attention it deserves. Tools such as Google Analytics, Baidu Tongji, and the various webmaster platforms give you a site's basic statistics, but they cannot show you certain details, such as exactly how spiders crawl your site. Through the IIS log, we can see how search engine spiders actually crawl the site. So why must we analyze web logs? You may have noticed that search engines usually do not index every content page on your site, and one reason for that can be the way spiders crawl it.
What can the IIS log do for us?
1. Indirectly gauge the effectiveness of the site's external links.
2. Judge whether our hosting space is reliable.
3. See which pages the spiders prefer and which pages they ignore.
4. See when spiders visit the site most often, and therefore when we should update the site's content.
Usually when we look at the IIS log, we only glance at the number of 200 status codes, but the log offers far more than that. The following points explain why an SEOer should form the habit of reading logs.
First, what are the important functions of the IIS log?
1. Through the IIS log we can indirectly see how often spiders crawl the site. Because spiders often reach your content by following external links, the crawl frequency indirectly reflects how well your external-link building is going, and the log also reveals the spiders' crawl paths and trajectories.
2. How often the IIS log grows is related to how often the site's content is updated, and also to any fine-tuning of the site. Both can be observed through the log.
3. The IIS log can reveal problems with our hosting space, and reveal them early, like a warning system. By analyzing the log over time we can indirectly judge the stability of a site's hosting and work out which hosting providers are actually good.
4. The log shows which pages the spiders crawl most frequently. If those frequently crawled pages are of little value to you, they are a serious waste of bandwidth, so analyze them carefully; pages that are crawled constantly but do you no good can be blocked through the site's robots.txt.
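As a minimal sketch of that robots.txt blocking, assuming (hypothetically) that the worthless but heavily crawled pages are a tag directory and a print-view URL parameter, the entries might look like this (note that the `*` wildcard in Disallow is an extension honored by Baidu and Google, not part of the original robots.txt standard):

```
User-agent: *
Disallow: /tag/        # hypothetical directory of low-value tag pages
Disallow: /*?print=1   # hypothetical print-view duplicate URLs
```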
Second, how to download the log, and notes on log settings
1. First, download the IIS log files from the log folder of your hosting space to your local machine via FTP, then look for patterns with a common log-analysis tool; I recommend the Guangnian (Light-Year) log analysis tool.
2. For a large website a single IIS log file can be enormous, and opening it in a tool may hang the machine; for small sites this is not a problem, but for big sites it really is. One answer is to download the IIS log in real time, in smaller pieces, which solves the problem nicely; many recently released log-analysis tools can also handle large files, so analyze your specific situation.
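When a tool cannot open the whole file, a simple alternative is to stream the log line by line and keep only the spider hits, so the full file never has to fit in memory. A minimal sketch (the file names here are hypothetical):

```python
# Sketch: stream a large IIS log one line at a time, copying only the lines
# that contain a spider marker into a much smaller file for analysis.

def extract_spider_lines(src_path, dst_path, marker="Baiduspider"):
    """Copy only the lines containing `marker` into dst_path; return the count."""
    kept = 0
    with open(src_path, "r", encoding="utf-8", errors="replace") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:          # iterates lazily, never loads the whole file
            if marker in line:
                dst.write(line)
                kept += 1
    return kept
```

The same filter works for `"Googlebot"` or any other user-agent substring.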
Third, specific analysis of the IIS log.
1. The log file has the .log suffix; open it with Notepad and turn on Word Wrap so it is easier to read, then look for the Baiduspider and Googlebot entries.
Baidu spider:
2012-03-13 00:47:10 w3svc177 116.255.169.37 GET / – 80 – 220.181.51.144 baiduspider-favo+(+baidu/search/spider) 200 0 0 15256 197 265
Googlebot:
2012-03-13 08:18:48 w3svc177 116.255.169.37 GET /robots.txt – 80 – 222.186.24.26 googlebot/2.1+(+Google/bot) 200 0 0 985 200 31
Let's explain it field by field.
2012-03-13 00:47:10 — the time the spider visited your site.
w3svc177 — the machine code (the site's instance identifier on the server).
116.255.169.37 — the server's IP address.
GET — the request method; what follows GET is the page the spider crawled, and a bare slash (/) means the home page.
80 — the port.
220.181.51.144 — this IP is the spider's IP. Here is a way to tell a real Baidu spider from a fake one: click Start, type cmd in Run to open a command prompt, enter nslookup followed by a space and the spider's IP, and press Enter. A genuine Baidu spider's IP reverse-resolves to Baidu's own server hostname; a fake spider's does not.
If you find that spiders pretending to be Baiduspider are visiting your site frequently, block those IPs promptly; they are scraping your site's content excessively.
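The nslookup check can also be scripted. A minimal sketch in Python, assuming (as is publicly documented for Baidu's crawler) that genuine Baiduspider IPs reverse-resolve to hostnames under baidu.com or baidu.jp:

```python
import socket

def reverse_dns(ip):
    """Return the PTR hostname for `ip`, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None

def looks_like_baidu_host(hostname):
    """Check a resolved hostname against Baidu's known crawler domains."""
    if not hostname:
        return False
    return hostname.endswith(".baidu.com") or hostname.endswith(".baidu.jp")

# Example (requires network access):
# looks_like_baidu_host(reverse_dns("220.181.51.144"))
```

IPs whose user-agent claims Baiduspider but whose hostname fails this check are candidates for blocking.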
200 0 0 — the 200 status code means the spider crawled the page normally.
197 265 — the trailing numbers record the data transferred for the request (in a default IIS W3C log these final fields are typically bytes sent, bytes received, and time taken).
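Assuming the field order shown in the sample lines above (a real log declares its own order in a "#Fields:" header, which should be checked first), a minimal parser for one line might look like this:

```python
# Sketch: split one IIS W3C log line into named fields, assuming the
# field order of the sample lines above. "-" marks an empty field.

def parse_iis_line(line):
    parts = line.split()
    return {
        "date": parts[0],
        "time": parts[1],
        "site": parts[2],        # machine code, e.g. w3svc177
        "server_ip": parts[3],
        "method": parts[4],      # e.g. GET
        "uri": parts[5],         # "/" means the home page
        "query": parts[6],       # "-" if none
        "port": parts[7],
        "client_ip": parts[9],   # the spider's IP (parts[8] is the username, "-")
        "user_agent": parts[10],
        "status": parts[11],     # e.g. 200
    }
```

Applied to the Baidu sample line, this yields the spider IP 220.181.51.144 and status 200.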
2. When analyzing, look at the status codes first: 200 means the page was fetched successfully, 304 means the page has not been modified, and 500 means a server error. The other codes are easy to look up; detailed descriptions are all over the Internet.
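Tallying the status codes across the whole log makes these patterns obvious at a glance. A minimal sketch, assuming the column layout of the sample lines above (status code as the 12th whitespace-separated field):

```python
from collections import Counter

def status_counts(lines):
    """Count HTTP status codes, assuming the sample layout above."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) > 11 and parts[11].isdigit():
            counts[parts[11]] += 1
    return counts

sample = [
    "2012-03-13 00:47:10 w3svc177 116.255.169.37 GET / - 80 - 220.181.51.144 Baiduspider-favo 200 0 0 15256 197 265",
    "2012-03-13 08:18:48 w3svc177 116.255.169.37 GET /robots.txt - 80 - 222.186.24.26 Googlebot/2.1 200 0 0 985 200 31",
    "2012-03-13 09:00:00 w3svc177 116.255.169.37 GET /old.html - 80 - 220.181.51.144 Baiduspider-favo 304 0 0 211 180 12",
]
```

A sudden jump in 5xx counts is exactly the kind of hosting-space warning sign mentioned earlier.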
3. From the pages the spiders favor, we can estimate what content they really like, and then decide how we should write our site's content in the future.
4. Sometimes the log shows that a problem arose while the spider was crawling our site, so we can apply the right remedy promptly.
5. From the crawl frequency we can work out in which time periods the spiders usually come, and then push our site updates out just before those windows, so the search engine sees our site's newest content first.
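Finding those time windows is just a matter of bucketing visits by hour. A minimal sketch, assuming each log line starts with "YYYY-MM-DD HH:MM:SS" as in the samples above:

```python
from collections import Counter

def visits_per_hour(lines):
    """Count spider visits per hour of day ("00".."23")."""
    hours = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) >= 2 and ":" in parts[1]:
            hours[parts[1][:2]] += 1   # first two chars of the time field
    return hours
```

The hours with the highest counts are the windows to publish updates just before.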
6. Spiders crawl our pages by grade, in descending order of weight: generally the home page first, then category pages, then inner pages.
7. Spiders coming from different IPs crawl at different frequencies.
A qualified SEOer should form the habit of reading the logs; in them, everything about our site can be seen very clearly!
When reprinting, please keep the link: http://www.weidongdong.com/seoer-kan-rizhi.html.