Wang Dajun: When did the search engine spider come to your website?

Keywords: search engine, website log

Site owners care a great deal about search engine rankings and indexing, and we often talk about spiders crawling our sites. So how can you tell whether a spider has actually visited your site?

The website's log usually holds this information. Take the Wang Dajun Network Marketing Blog as an example. The author uses virtual hosting, and the log folder in the website's root directory contains text files named "ex" plus the date, with the .log extension. Wang Dajun reminds you that if your log folder is empty, you may need to use the virtual host's control panel to export the log files to your FTP space, that is, to the log folder in your website's root directory. We download Ex101116.log to the local machine; this file is the blog's log for November 16, 2010. Open it in Notepad and search the log for the word "Spider". You may find the following identifiers:

Google Spider: Googlebot

Baidu Spider: Baiduspider

Yahoo Spider: slurp

Soso Spider: Sosospider

MSN Spider: msnbot

Youdao Spider: Yodaobot and Outfoxbot

Sogou Spider: Sougouspider

Of course, this assumes that these spiders have actually visited your site; otherwise the log will contain no such entries.

Let's pick a record containing the Baidu spider identifier "Baiduspider" and look at what is inside:
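The Notepad search described above can also be done with a short script. The following is a minimal sketch, not the author's method: it counts how many log lines mention each spider identifier from the list above. The function name count_spider_hits and the dictionary SPIDER_TOKENS are made up for this illustration, and the matching is a simple case-insensitive substring test.

```python
from collections import Counter

# Identifiers copied from the list above, lowercased for case-insensitive matching.
SPIDER_TOKENS = {
    "Google": ["googlebot"],
    "Baidu": ["baiduspider"],
    "Yahoo": ["slurp"],
    "Soso": ["sosospider"],
    "MSN": ["msnbot"],
    "Youdao": ["yodaobot", "outfoxbot"],
    "Sogou": ["sougouspider"],
}


def count_spider_hits(lines):
    """Return a Counter mapping search engine name to the number of matching log lines."""
    hits = Counter()
    for line in lines:
        lowered = line.lower()
        for engine, tokens in SPIDER_TOKENS.items():
            # A line counts once per engine, even if several tokens match.
            if any(token in lowered for token in tokens):
                hits[engine] += 1
    return hits


# Usage (the path is an assumption; point it at your own downloaded log):
# with open("Ex101116.log", errors="replace") as log_file:
#     for engine, count in count_spider_hits(log_file).most_common():
#         print(engine, count)
```

An engine that never visited simply does not appear in the result, which matches the point above: no visit, no entry.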

2010-11-15 18:18:10 GET /post/5.html - 80 - baiduspider+ (+ spider.htm) - 200 ....

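Let me explain this record:

1. 2010-11-15 18:18:10 is the date and time of the Baidu spider's visit.

2. GET /post/5.html is the page the Baidu spider requested. GET is the HTTP method and means the spider fetched the page.

3. 80 is the port.

4. The next field is the Baidu spider's IP address (not shown in the record above).

5. The string beginning baiduspider+ (+ is the user agent, which identifies the visitor as the Baidu spider.

6. 200 means the crawl succeeded. This is the status code the server returned to the Baidu spider.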
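The fields explained above can be split out programmatically. The sketch below assumes the log is in the W3C extended format, where a "#Fields:" header line names the columns; the source does not state the format, so treat this as an assumption. The function name parse_w3c_log, the sample header, the IP address 192.0.2.1 and the example.com URL are all made up for illustration.

```python
def parse_w3c_log(lines):
    """Yield one dict per record, keyed by the names in the '#Fields:' header."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header lists the column names, separated by spaces.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            values = line.split()
            # Skip records whose column count does not match the header.
            if len(values) == len(fields):
                yield dict(zip(fields, values))


# Hypothetical sample data (header, IP address and URL are invented).
SAMPLE_LOG = [
    "#Fields: date time cs-method cs-uri-stem cs-uri-query s-port "
    "cs-username c-ip cs(User-Agent) sc-status",
    "2010-11-15 18:18:10 GET /post/5.html - 80 - 192.0.2.1 "
    "Baiduspider+(+http://example.com/spider.htm) 200",
]

for record in parse_w3c_log(SAMPLE_LOG):
    if "baiduspider" in record["cs(User-Agent)"].lower():
        print(record["date"], record["time"], record["cs-uri-stem"], record["sc-status"])
```

Splitting on whitespace works here only because this format writes spaces inside the user agent as "+" signs; a log in another format would need a different parser.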

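Other status codes you may see:

2xx success

200 OK - the request succeeded.

201 Created - the request succeeded and a new resource was created.

202 Accepted - the request was accepted for processing, but processing has not yet completed.

203 Non-Authoritative Information - the returned information comes from a copy rather than from the origin server.

204 No Content - the request was received, but there is no information to send back.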

3xx redirect

301 Moved Permanently - the requested data has a new location and the change is permanent.

302 Found - the requested data temporarily resides under a different URI.

303 See Other - the response to the request can be found under another URI and should be retrieved using the GET method.

304 Not Modified - the document has not changed since the client's last request, so the cached copy can be used.

305 Use Proxy - the requested resource must be accessed through the proxy given in the Location field.

306 (Unused) - no longer in use.

4xx client error

400 Bad Request - the request has a syntax problem or cannot be satisfied.

401 Unauthorized - the client is not authorized to access the data.

402 Payment Required - reserved for future use.

403 Forbidden - the server refuses the request, and authorization will not help.

404 Not Found - the server could not find the given resource.

407 Proxy Authentication Required - the client must first authenticate itself with the proxy.

410 Gone - the requested web page no longer exists, and the change is permanent.

415 Unsupported Media Type - the server refuses the request because the format of the request entity is not supported.

5xx server error

500 Internal Server Error - the server cannot complete the request because of an unexpected condition.

501 Not Implemented - the server does not support the functionality needed to fulfill the request.

502 Bad Gateway - the server received an invalid response from the upstream server.

503 Service Unavailable - the server cannot process the request due to temporary overload or maintenance.

That is all on log analysis for today. If you have other views, you are welcome to discuss them.
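Since the first digit of a status code decides its class, a spider's responses can be summarized by class. This is a minimal sketch of that idea; the names STATUS_CLASSES, status_class and summarize_statuses are made up for this illustration, and the input list is invented sample data.

```python
from collections import Counter

# The first digit of the status code decides its class.
STATUS_CLASSES = {
    2: "success",
    3: "redirect",
    4: "client error",
    5: "server error",
}


def status_class(code):
    """Map a status code (int or numeric string) to its class name."""
    return STATUS_CLASSES.get(int(code) // 100, "other")


def summarize_statuses(codes):
    """Count how many of the given status codes fall into each class."""
    return Counter(status_class(code) for code in codes)


# Invented sample: status codes taken from a spider's records.
summary = summarize_statuses([200, 200, 301, 404, 503])
for name, count in summary.most_common():
    print(name, count)
```

A large share of 4xx or 5xx responses to a spider is worth investigating, since those pages were not crawled successfully.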

