A website log is a text record that the server generates automatically, recording the details of every visit to the site. As a webmaster, if you only need traffic statistics, a tool such as 51.la or Baidu Statistics is enough. But if you want to check whether search engine spiders are crawling your site on schedule, you need to learn to read the site's log files. Using the sales studio's newly launched Chengdu Ming Yang Technology website as an example, here is an introduction:

First, use an FTP tool to log on to the server. There is usually a logs folder under the server root directory, and that is where the website logs are kept. Of course, on a different type of server the folder name may not be the same as the one I describe, but that does not matter: the log files have the extension .log.
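If you would rather script this step than click through an FTP client, here is a minimal sketch using Python's standard ftplib. The host, user name, password and the folder name "logs" are all placeholders you would replace with your own; it simply assumes the log files can be recognized by their .log extension, as described above.

```python
from ftplib import FTP


def pick_log_files(names):
    """Return only the file names that end in .log, sorted."""
    return sorted(n for n in names if n.lower().endswith(".log"))


def download_logs(host, user, password, folder="logs"):
    """Log in over FTP and download every .log file found in `folder`.

    All four arguments are placeholders: use your own server details.
    """
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(folder)
        for name in pick_log_files(ftp.nlst()):
            # Save each log under the same name in the current directory.
            with open(name, "wb") as out:
                ftp.retrbinary("RETR " + name, out.write)
```

Calling download_logs("ftp.example.com", "user", "secret") would fetch the logs; nothing touches the network until you call it.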


When you enter the logs folder, you will find that the log is saved as one file per day of access.


My server only keeps the log files for the last three days, which is really stingy. I have also used a foreign server, where the logs are kept by month: at the end of each month that month's logs are packaged for download, and as long as you do not delete them they stay there forever. That is what I call user-friendly. But there is no way around it; we can only use domestic servers.

OK, enough complaining. Download the log file for any one day and open it with a Windows text editor, and you will see a pile of characters that look like code. How fast it opens depends on the size of the file.


Note the small highlighted part: I used the Ctrl+F search function to find Baiduspider. Why search for Baiduspider? Here is a little background on search engine spiders. The spider of each major search engine has its own name:

Baidu's is called Baiduspider;

Google's is called Googlebot;

Microsoft's is called Bingbot;

Sohu's is called Sogou web spider;

Tencent's is called Sosospider.
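Instead of searching for each name by hand, you can count them all in one pass. This is only a rough sketch: it looks for the five names listed above anywhere in each log line, ignoring upper and lower case, and the function name count_spider_hits is my own invention.

```python
from collections import Counter

# The spider names from the list above.
SPIDERS = ["Baiduspider", "Googlebot", "Bingbot", "Sogou web spider", "Sosospider"]


def count_spider_hits(lines):
    """Count how many log lines mention each known spider name."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        for name in SPIDERS:
            if name.lower() in lowered:
                counts[name] += 1
                break  # one line is one visit, so stop at the first match
    return counts
```

To use it, open the downloaded log file and pass the file object in, since a file can be read line by line. Lines from ordinary browsers are simply not counted.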

Because Baidu is the main optimization target in China, we will analyze the crawl records of the Baidu spider. Pick any Baidu spider record from the log:

[07/Sep/2012:19:16:21 +0800] "GET / HTTP/1.1" 200 5374 "" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"

How do you interpret this information? The fields are, in order: the spider's IP address, the access time, the request method and path, the HTTP status code (200), the number of bytes returned (5374), and the Baidu spider's identifier.
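The same field-by-field reading can be done with a regular expression. This sketch assumes the log is in the common Apache/Nginx "combined" format; if your server writes a different format (IIS, for example), the pattern would need adjusting. The IP address in the sample line is a made-up placeholder, and the names LOG_PATTERN and parse_line are hypothetical.

```python
import re

# One named group per field, in the order they appear in a combined-format line.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Split one log line into a dict of fields, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Sample record; 203.0.113.5 is a placeholder IP, not a real spider address.
SAMPLE = (
    '203.0.113.5 - - [07/Sep/2012:19:16:21 +0800] "GET / HTTP/1.1" 200 5374 "-" '
    '"Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"'
)

fields = parse_line(SAMPLE)
print(fields["path"], fields["status"], fields["size"])
```

All values come back as strings, so convert the status and size with int() if you want to do arithmetic on them.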

Of these, the request path and the HTTP status code are the most important pieces of information. Here 200 means the page was read normally, and 5,374 bytes were returned. Let's analyze another record:

[07/Sep/2012:09:54:15 +0800] "GET /product/disp.php?id=93 HTTP/1.1" 301 249 "" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"

Look carefully at the request path in this record. My site uses an old domain name, and under its original owner Baidu had indexed the path /product/disp.php?id=93, so the Baidu spider is still crawling it. My new site certainly does not have this page, which is why the HTTP status code is 301. A 301 means Moved Permanently: the requested data has a new location and the change is permanent. This is actually a good thing for me: the spider's crawl fails, it learns that this record is no longer valid, and the entry will gradually be deleted from Baidu's index. Baidu is still evaluating my new site and crawls it only 23 times a day, but that is also quite good.
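To see at a glance which old paths a spider is still requesting, you can tally the status codes of its visits. A rough sketch follows, again assuming the status code comes right after the quoted request as in the Apache/Nginx style of log; the function name spider_status_report and the sample IP addresses in the usage below are hypothetical.

```python
import re
from collections import Counter

# Pull the path and the three-digit status code out of a log line.
REQUEST_STATUS = re.compile(r'"(?:GET|POST|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3})')


def spider_status_report(lines, spider="Baiduspider"):
    """Return (status code counts, list of paths answered with 301) for one spider."""
    status_counts = Counter()
    redirected = []
    for line in lines:
        if spider.lower() not in line.lower():
            continue  # not a visit from this spider
        match = REQUEST_STATUS.search(line)
        if not match:
            continue  # line is not in the assumed format
        status = match.group("status")
        status_counts[status] += 1
        if status == "301":
            redirected.append(match.group("path"))
    return status_counts, redirected
```

The total of the counts is the number of times the spider crawled that day, and the second list shows the stale paths it is still asking for.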

By analogy, can't we learn to read the crawl records of other search engine spiders in the same way? Next time, the Flying Sales Studio will focus on HTTP status codes. Through these values we can learn about the health of our own site, which is very important.

Of course, some friends will ask why not use a log viewing tool, since checking by hand takes time and effort. Yes, there are some good tools that make this more convenient, but today's content from the sales studio is meant to teach you how to read your site's logs in the most basic way. I hope the above helps you a little.

This article is an original work of the Flying Sales Studio. If you reproduce it, please credit Chengdu Ming Yang Technology: http://www.cdmingyang.com/

