How a novice webmaster can analyze website logs

Reading the site log is something every webmaster should know how to do. The log shows the status of the site, so it is an essential skill for each of us. Before I had built a site of my own, other people's website logs looked to me like strings of mysterious code that I could not make sense of, and I thought webmasters must be gods to understand something so complex; I really admired them. But once I built my own site and studied every aspect of running one, I found that reading the site log is actually a very simple basic skill: all you need to know are the names of the search engine spiders and a few returned status codes.

Earlier I covered the names of the major search engine spiders; today I will explain how to read the website log.

Before you can view the log, you first need to download it. Most virtual hosting providers now offer an "access log download" feature. As I said earlier when discussing how to choose a good virtual host, it is best to pick one with log download, which makes it convenient for us webmasters to view the logs and saves a lot of trouble.

Here is one line copied from my site's log for you to look at.

2012-02-08 09:05:25 get/default.asp-- http/1.1 mozilla/5.0+ (compatible;+baiduspider/2.0;++http:// 34499 421

This is one line of the log. We know that Baiduspider is the name of Baidu's spider, so from this line we can see what Baidu's spider crawled on the site.

2012-02-08 09:05:25 get/default.asp: this part should be clear at a glance. It means that the visitor named Baiduspider, that is, Baidu's spider, crawled the home page of our site, default.asp, at 2012-02-08 09:05:25. The "get" is the HTTP request method (GET) the search engine used to fetch the page. The IP address recorded in the line is that of the visiting client; in this entry it belongs to Baidu's spider. Anyone with experience building sites should be very familiar with this IP.

http/1.1 stands for the Hypertext Transfer Protocol. Anyone who works with networks knows that information is transmitted according to a network protocol, and http/1.1 is the version of the protocol used for this request. We do not need to pay much attention to it, though it is worth reading about if you are interested.

mozilla/5.0+ (compatible;+baiduspider/2.0;++ is the user-agent string. mozilla/5.0 is a compatibility token that browsers and many crawlers send, so on its own it does not identify a particular browser. The information in parentheses is what should excite the owner of a new site for a while: it is Baidu's spider. For webmasters who optimize for Baidu it is practically the hand that feeds them, something they both love and hate.
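
To make the spider check concrete, here is a minimal Python sketch that looks for a spider name in a log line. The function name detect_spider and the token table are my own assumptions for illustration; the table is not a complete list of spiders, and only the baiduspider token comes from the log line above.

```python
# Hypothetical helper: map lowercase user-agent tokens to search engines.
# The token list is an illustrative assumption, not a complete registry.
SPIDER_TOKENS = {
    "baiduspider": "Baidu",
    "googlebot": "Google",
    "bingbot": "Bing",
    "sogou": "Sogou",
    "360spider": "360",
}


def detect_spider(log_line):
    """Return the search engine name if a known spider token appears, else None."""
    lowered = log_line.lower()  # logs may record the user agent in any case
    for token, engine in SPIDER_TOKENS.items():
        if token in lowered:
            return engine
    return None


# The user-agent fragment quoted in this article.
sample = "mozilla/5.0+ (compatible;+baiduspider/2.0;++"
print(detect_spider(sample))
```

A plain substring match is enough for a first look at the log, but keep in mind that any visitor can put a spider's name in its user agent, so a match only tells you what the visitor claims to be.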

200 34499 421: here 200 is the status code returned for the spider's request, and 200 means the crawl succeeded. 34499 is the size of the page that was crawled.
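
If you want to pull these numbers out with a script, a rough Python sketch could look like this. It assumes the status code and page size are the third-to-last and second-to-last fields of the line, as in the example above; other servers may order their fields differently, and parse_tail is just a name I made up.

```python
def parse_tail(log_line):
    """Return (status, size) from the trailing fields of a log line, or None.

    Assumption: the line ends with "<status> <size> <one more number>",
    as in the "200 34499 421" example. Adjust the indexes for other formats.
    """
    fields = log_line.split()
    if len(fields) < 3:
        return None
    try:
        return int(fields[-3]), int(fields[-2])
    except ValueError:
        # Header or comment lines do not end in numeric fields.
        return None


# Tail of the example line, built from the fragments quoted in this article.
print(parse_tail("http/1.1 mozilla/5.0+ (compatible;+baiduspider/2.0;++ 200 34499 421"))
```

Returning None for lines that do not fit keeps the script from crashing on header lines that some servers write at the top of the log.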

The point to focus on here is the meaning of the returned status codes:

To summarize simply, 2xx codes mean the request succeeded:

200 OK - the request succeeded.

201 Created - the request succeeded and a new resource was created.

202 Accepted - the request was accepted for processing, but processing has not yet completed.

203 Non-Authoritative Information - the information returned comes from a copy rather than from the origin server.

204 No Content - the request was received, but there is no information to send back.

3xx codes mean a redirect:

301 Moved Permanently - the requested data has a new location and the change is permanent.

302 Found - the requested data temporarily has a different URI.

303 See Other - the response to the request can be found under another URI and should be retrieved using the GET method.

304 Not Modified - the document has not changed since the copy the client already has.

305 Use Proxy - the requested resource must be accessed through the proxy given in the Location field.

306 Unused - no longer in use.

4xx codes mean an error on the client side:

400 Bad Request - there is a syntax problem in the request, or the request cannot be satisfied.

401 Unauthorized - the client has not authenticated and is not allowed to access the data.

402 Payment Required - reserved for future use; originally intended for payment systems.

403 Forbidden - the server refuses access, and authenticating will not help.

404 Not Found - the server could not find the requested resource.

407 Proxy Authentication Required - the client must first authenticate itself with the proxy.

415 Unsupported Media Type - the server refuses the request because the format of the request body is not supported.

5xx codes mean an error on the server side:

500 Internal Server Error - the server cannot complete the request because of an unexpected condition.

501 Not Implemented - the server does not support the functionality the request requires.

502 Bad Gateway - the server received an invalid response from the upstream server.

503 Service Unavailable - the server cannot process the request because of temporary overload or maintenance.
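
The four groups above can be told apart by the first digit alone. Here is a small Python sketch of that idea; status_class is a hypothetical name, and the labels are only my own shorthand for the groups listed above.

```python
def status_class(code):
    """Map an HTTP status code to a rough class based on its first digit."""
    classes = {
        2: "success",
        3: "redirect",
        4: "client error",
        5: "server error",
    }
    # Integer division by 100 leaves just the leading digit (404 -> 4).
    return classes.get(code // 100, "other")


for code in (200, 301, 404, 503):
    print(code, status_class(code))
```

When skimming a log, this first-digit check is usually all you need: lots of 4xx or 5xx answers to a spider are the lines worth a closer look.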

In practice, the most important part of analyzing a site log is knowing what these status codes mean, so that you can tell how the pages of the site are being accessed.
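
To get an overview instead of reading line by line, you can count the status codes returned to one spider. The Python sketch below is only an illustration under assumptions: it expects the status code to be the third-to-last field, as in the example line in this article, and the second and third sample lines are hypothetical entries I made up for the demonstration.

```python
from collections import Counter


def tally_spider_statuses(lines, spider_token="baiduspider"):
    """Count status codes for log lines whose user agent mentions the spider."""
    counts = Counter()
    for line in lines:
        if spider_token not in line.lower():
            continue  # not a visit from the spider we care about
        fields = line.split()
        # Assumption: the status code is the third-to-last field.
        if len(fields) >= 3 and fields[-3].isdigit():
            counts[int(fields[-3])] += 1
    return counts


sample_lines = [
    # The article's example line, with the 200 it describes.
    "2012-02-08 09:05:25 get/default.asp-- http/1.1 mozilla/5.0+ (compatible;+baiduspider/2.0;++ 200 34499 421",
    # Hypothetical lines, invented for illustration only.
    "2012-02-08 09:06:10 get/missing.asp-- http/1.1 mozilla/5.0+ (compatible;+baiduspider/2.0;++ 404 1245 421",
    "2012-02-08 09:07:00 get/default.asp-- http/1.1 mozilla/5.0+ (windows+nt+6.1) 200 34499 421",
]

print(dict(tally_spider_statuses(sample_lines)))
```

For a real log you would pass an open file instead of the sample list, since a file object also yields one line at a time.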

There is now plenty of log analysis software that can analyze site logs directly and save the webmaster the trouble of reading them, but I personally think novices are still better off learning this basic skill, so that if the software fails one day or something else comes up, they can still analyze the log themselves.

These are the basic skills every webmaster needs to master.

