Many people talk about original content: they collect articles every day, modify them, and post them on their own sites. But have you ever thought about this question: do the articles we post have any value, why are they not indexed, and how does a spider decide that an article is worthless?
Before answering that, we need to know where to find the server log. It is typically in a log folder in the FTP space, stored as a compressed archive (a .tar-style suffix). After downloading and decompressing it we get two files. Take the one that is not empty, change its file extension so that it opens as plain text, and open it. You will see a large block of text. Don't worry about where to look; we will analyze it line by line. First, though, we need to know what the spider names and IP ranges mean. I have listed them below.
Baiduspider = Baidu spider
Googlebot = Google spider
Sogou = Sogou spider
Yahoo = Yahoo spider
360Spider = 360 spider
123.125.68. = Sandbox
220.181.68. = Sandbox
220.181.7. = Ready to crawl
123.125.66. = Ready to crawl
121.14.89. = New-site inspection period
203.208.60. = Site exception
210.72.225. = Patrol
123.125.71.106 = Low weight
123.125.71.95 = Low weight
123.125.71.97 = Low weight
123.125.71.117 = Low weight
123.125.71. = Low weight (summary of the range)
220.181.108.95 = Alternate-day snapshot
220.181.108.92 = Weight crawl
220.181.108.91 = Comprehensive weight
220.181.108.75 = Inner page weight
220.181.108.86 = Home page weight
220.181.108.89 = Home page weight
220.181.108.94 = Home page weight
220.181.108.97 = Home page weight
220.181.108.80 = Home page weight
220.181.108.77 = Home page weight
220.181.108.83 = Home page weight
220.181.108. = Weight spider (summary of the range)
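To look up an IP from the log against a list like this without scanning it by eye, a small lookup can help. The following is only a sketch: the function name classify_ip and the dictionary IP_LABELS are made up for illustration, the labels are copied from the list in this article (not from any official source), and only a few of the single-IP entries are included.

```python
# Labels taken from the list in this article; they are the article's
# interpretation, not an official mapping.
IP_LABELS = {
    "123.125.68.": "sandbox",
    "220.181.68.": "sandbox",
    "220.181.7.": "ready to crawl",
    "123.125.66.": "ready to crawl",
    "121.14.89.": "new-site inspection",
    "203.208.60.": "site exception",
    "210.72.225.": "patrol",
    "123.125.71.": "low weight",
    "220.181.108.95": "alternate-day snapshot",
    "220.181.108.92": "weight crawl",
    "220.181.108.91": "comprehensive weight",
    "220.181.108.75": "inner page weight",
    "220.181.108.": "weight spider",
}


def classify_ip(ip):
    """Return the label of the most specific matching entry, or None."""
    best = None
    for key, label in IP_LABELS.items():
        if key.endswith("."):
            # Keys ending in a dot are range prefixes such as "123.125.68."
            matched = ip.startswith(key)
        else:
            # Keys without a trailing dot are complete addresses.
            matched = ip == key
        if matched and (best is None or len(key) > len(best[0])):
            best = (key, label)
    return best[1] if best else None


print(classify_ip("123.125.68.45"))
```

Because the longest matching key wins, a full address such as 220.181.108.95 gets its own label, while any other address in 220.181.108. falls back to the summary entry for the range.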
Let's analyze one entry:
123.125.68.45 - - [24/Jun/2014:15:12:04 +0800] "GET /xingyexinwen/129.html HTTP/1.1" 200 9107 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; ZH; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12 QQDownload/1.7"
Based on the definition above:
123.125.68.45 - - [24/Jun/2014:15:12:04 +0800] "GET /xingyexinwen/129.html
This part means that on June 24, 2014, at 15:12:04, a spider crawled the URL /xingyexinwen/129.html on my site and put it in the sandbox (123.125.68.45 falls in the 123.125.68. range, which represents a sandbox spider).
HTTP/1.1" 200 9107 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; ZH; rv:1.9.2.12)
As for this part, 200 is the return code, which means the request was served normally. Other return codes include 301, 302, 304, 404 and so on; you can look up what each one means. The rest describes the properties of the visiting client, such as the Windows operating system.
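The fields described above can also be pulled apart with a regular expression. This is a minimal sketch that assumes the log uses the common "combined" access log layout (IP, two placeholders, timestamp in brackets, quoted request, status, size, quoted referer, quoted user agent). The names LOG_PATTERN and parse_line are hypothetical, and the sample line is a shortened version of the entry discussed here.

```python
import re

# One named group per field of a combined-format access log line.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


sample = (
    '123.125.68.45 - - [24/Jun/2014:15:12:04 +0800] '
    '"GET /xingyexinwen/129.html HTTP/1.1" 200 9107 "-" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1)"'
)
entry = parse_line(sample)
print(entry["ip"], entry["time"], entry["path"], entry["status"])
```

If your host writes logs in a different layout, the pattern will return None for every line and needs to be adjusted to match.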
I don't know whether all of this has made you dizzy. In practice we usually analyze the log in an Excel spreadsheet: importing the TXT file into a table makes the log easier to read, shows which spiders crawl our site, and lets us work out what content they fetch. Log analysis software can also be used for this.
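As one way to get the log into a table, the sketch below converts log lines into CSV, which Excel can open directly. It assumes the combined access log layout. The names log_to_csv, spider_name and LINE_RE are made up for illustration, the spider names come from the list earlier in this article, and the demo lines are invented samples rather than real log data.

```python
import csv
import io
import re

# Spider names from the list in this article, matched case-insensitively.
SPIDER_NAMES = ["Baiduspider", "Googlebot", "Sogou", "Yahoo", "360Spider"]

# Captures ip, time, path, status and user agent from a combined-format line.
LINE_RE = re.compile(
    r'(\S+) \S+ \S+ \[([^\]]+)\] "\S+ (\S+) [^"]*" (\d{3}) \S+ '
    r'"[^"]*" "([^"]*)"'
)


def spider_name(agent):
    """Return the first known spider name found in the user agent, else ''."""
    lowered = agent.lower()
    for name in SPIDER_NAMES:
        if name.lower() in lowered:
            return name
    return ""


def log_to_csv(lines, out):
    """Write one CSV row per parsable log line; return the number of rows."""
    writer = csv.writer(out)
    writer.writerow(["ip", "time", "path", "status", "spider"])
    written = 0
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected layout
        ip, time, path, status, agent = match.groups()
        writer.writerow([ip, time, path, status, spider_name(agent)])
        written += 1
    return written


# Invented sample input for demonstration only.
demo_lines = [
    '220.181.108.95 - - [24/Jun/2014:15:12:04 +0800] '
    '"GET /index.html HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Baiduspider/2.0)"',
    'not a log line',
]
buffer = io.StringIO()
count = log_to_csv(demo_lines, buffer)
print(buffer.getvalue())
```

To produce a real file, open the output with open("log.csv", "w", newline="") and pass that file object instead of the in-memory buffer; the file name here is just an example.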
For more, you can also search for the Moon Bug blog at www.croelhui.com. If there is anything you would like me to cover, you are welcome to leave me a message.