There are many tools for analyzing IIS logs on the Internet, but none met my requirements; most can only count the number of spider visits in the log. Here is a relatively simple and very practical method: with a few simple Excel steps you can extract a series of data points, such as crawl time, crawled pages, returned status codes, URL parameters, spider type, and spider IP, and then use that data to diagnose and correct problems on the site. First, you must have your own server, or at least permission to view the IIS logs, and download the log files to your local machine via FTP. To find where the logs are stored: open IIS, right-click the website you want to examine > Properties > Web Site tab > Properties, and you can see the log file directory. Some hosts place the log files in the root directory of the web space, as shown in the figure; if yours does not, you can ask your hosting provider for them.
Then find the corresponding folder based on that path. You will see many .log files inside; download them to your local machine via FTP.
If a file is too large, Notepad may not be able to open it; in that case use UltraEdit and filter out the data you want (look up the details yourself). If the file is not very large, you can open it in Notepad and copy the contents into Excel.
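If even UltraEdit struggles with the file size, a short script can pre-filter the log before Excel ever sees it. The sketch below is only an illustration: it assumes a standard W3C extended log format (header lines start with `#`) and that the spider's name, e.g. Baiduspider, appears somewhere on each matching line; it streams line by line so even very large files fit in memory.

```python
# Sketch: pre-filter a large IIS (W3C extended) log so only spider
# hits remain. The "Baiduspider" token is an illustrative default.
def filter_spider_lines(lines, token="Baiduspider"):
    """Yield non-comment log lines whose text contains the spider token."""
    for line in lines:
        if line.startswith("#"):   # W3C header/metadata lines begin with '#'
            continue
        if token in line:
            yield line.rstrip("\n")
```

To use it, open the downloaded .log file, pass the file object to `filter_spider_lines`, and write the yielded rows to a smaller file that Excel can handle.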
Then delete the first 4 lines, select column A, and click Data > Text to Columns > Delimited > Next > choose Space as the delimiter > Next > Finish. The first step is now complete.
Then select column A > right-click > Insert, and delete columns C, D, E, and I. Enter headers in the first row: date, time, page, parameter, port, IP, spider, status code. A note on the parameter column: for dynamic pages this is the part after the question mark (?). For example, if the path is http://www.huiwang.org/jiaju/chufang/5309_3.html and the parameter value is 3, then the URL the spider actually fetched is http://www.huiwang.org/jiaju/chufang/5309_3.html?3. Spiders can recognize parameters too; some sites append parameters like this to track advertising statistics, but the parameters are stripped after the page is crawled. So try not to use URLs of this kind on content pages.
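The split-and-label step above can be sketched in code as well. The field order here is an assumption matching the article's trimmed layout (date, time, page, parameter, port, IP, spider, status code); a real log may order its fields differently, so check the `#Fields:` header line of your own log first.

```python
# Assumed column layout after trimming, matching the article's headers.
FIELDS = ["date", "time", "page", "parameter", "port", "ip", "spider", "status"]

def parse_row(line):
    """Split one log line on whitespace and label the fields."""
    return dict(zip(FIELDS, line.split()))

def full_url(row):
    """Rebuild the URL the spider actually fetched: page + '?' + parameter.
    IIS writes '-' in the query-string field when there is no parameter."""
    if row["parameter"] not in ("-", ""):
        return row["page"] + "?" + row["parameter"]
    return row["page"]
```

This mirrors what Text to Columns plus the combined-URL reasoning in the paragraph above does by hand.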
Select column G > Data > Filter > click the arrow on column G > Text Filters > Contains.
Enter Baiduspider and click OK. You can now see all the rows for Baidu spider visits; the data is roughly laid out. If you want to look at Google instead, enter Googlebot in the filter. Then build a simple PivotTable report for easier analysis: Insert > PivotTable > PivotTable > OK, and on the right drag the fields in this order: page, spider, time. Then click the small triangle next to spider,
click Label Filters > Contains > enter Baiduspider. This filters down to detailed data on when the spider crawled each page; of course you can filter various other combinations of data the same way, which I won't demonstrate here.
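The PivotTable step amounts to a grouped count: how many times did a given spider fetch each page. As a hedged sketch, assuming each log row has already been parsed into a dict with "spider" and "page" keys (the "Baiduspider" default is illustrative):

```python
from collections import Counter

def crawl_counts(rows, spider="Baiduspider"):
    """Count fetches per page for one spider, like the PivotTable report."""
    return Counter(r["page"] for r in rows if spider in r["spider"])
```

`Counter.most_common()` then lists the most-crawled pages first, which is usually what you want to inspect.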
Original post from Luigi's blog; when reprinting, please cite http://www.itemseo.com/432.html. Thank you.