Automatically monitor Web pages with Python and HttpWatch

Source: Internet
Author: User

HttpWatch is an essential tool for monitoring Web access quality. It records everything that happens while a Web page loads, covering every element in the page, from the DNS lookup and the network connection to the time the first packet is sent (as shown), all in detail, which gives us a visual way to locate problems. Usually we only turn to it for analysis after a problem has occurred. But if it is used to track access to a Web page continuously and the records are stored, that data provides a baseline for later problem analysis, which is also very useful. Can HttpWatch meet this need? The answer is yes, and it is easy to implement with Python. The following code uses Python to read the pages to be monitored from an external file and print some timing figures for each; of course, you can build more powerful functions on top of it.

External file format:

http://www.cites.com/
http://www.cites2.com/page1.html
http://www.cites3.com/page2.html
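A minimal sketch of reading a file in this format, assuming one URL per line. The helper name `read_urls` is hypothetical and is not part of the original program; it strips surrounding whitespace and skips blank lines so that an empty line is never passed on to the browser.

```python
import os
import tempfile


def read_urls(filepath):
    """Return the non-empty lines of filepath as a list of URLs."""
    with open(filepath, "r", encoding="utf-8") as handle:
        # strip() removes the trailing newline and any stray spaces
        return [line.strip() for line in handle if line.strip()]


if __name__ == "__main__":
    # Write a small sample file, then read it back.
    sample = "http://www.cites.com/\n\nhttp://www.cites2.com/page1.html\n"
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(sample)
    print(read_urls(tmp.name))
    os.remove(tmp.name)
```

Skipping blank lines here is a design choice of this sketch; the program in this article keeps every line the file contains.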



By default HttpWatch supports automation from C# and Ruby. To call it from Python you need the win32com module, which is part of the pywin32 package; pywin32 can be downloaded from this address:

http://sourceforge.net/projects/pywin32/files/pywin32/
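Before running the monitoring program it can help to confirm that win32com is actually importable. The following is a small sketch using only the standard library; the function name `module_available` is hypothetical, and the `pip install pywin32` hint assumes a Windows machine with pip available.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called name can be found."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if module_available("win32com"):
        print("win32com found; HttpWatch can be driven through COM.")
    else:
        # Assumption: pywin32 is installed with pip on a Windows machine.
        print("win32com missing; try: pip install pywin32")
```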

Here is the program code:

# coding=utf-8
import win32com.client


### Define a function that reads the external file and returns the URLs to be checked as a list
def getcitetocheck(filepath):
    with open(filepath, 'r') as infile:
        cites = infile.readlines()
    return cites

def checkcite(cites):
    # Create an HttpWatch instance and open an IE process
    control = win32com.client.Dispatch('HttpWatch.Controller')
    plugin = control.IE.New()
    plugin.Log.EnableFilter(False)  # HttpWatch can filter out certain entries; here filtering is disabled
    plugin.Record()  # Start HttpWatch recording
    i = 1
    for domain in cites:
        # Addresses read from the file end with a newline, so it is removed here
        # (in testing, the page also opened without removing it)
        url = domain.strip()
        plugin.GotoURL(url)
        control.Wait(plugin, -1)
        # The log can be exported to an XML file
        logfilename = 'D:\\log' + str(i) + '.xml'
        plugin.Log.ExportXML(logfilename)
        # The log contents can also be read directly
        print(plugin.Log.Entries.Count)
        # plugin.Log.Entries is a collection; each element is an object that
        # corresponds to one of the URL elements contained in a page
        for s in plugin.Log.Entries:
            print(s.URL)
            print(s.Time)
            # s.Timings.Blocked returns a Timing object with three properties: Duration, Started, Valid
            # Duration is the time taken to download a URL element; Started is the start time
            # Timings contains the Blocked, CacheRead, Connect, DNSLookup, Network, Receive, Send, TTFB and Wait objects
            print('Blocked: ' + str(s.Timings.Blocked.Duration))
            print('CacheRead: ' + str(s.Timings.CacheRead.Duration))
            print('Connect: ' + str(s.Timings.Connect.Duration))
            print('DNSLookup: ' + str(s.Timings.DNSLookup.Duration))
            print('Network: ' + str(s.Timings.Network.Duration))
            print('Receive: ' + str(s.Timings.Receive.Duration))
            print('Send: ' + str(s.Timings.Send.Duration))
            print('TTFB: ' + str(s.Timings.TTFB.Duration))
            print('Wait: ' + str(s.Timings.Wait.Duration))
        i = i + 1
    plugin.Stop()
    plugin.CloseBrowser()
###########

cite_file = "Cite.txt"
cites = getcitetocheck(cite_file)
########
print(cites)
# Run the whole check four times
for i in [1, 2, 3, 4]:
    checkcite(cites)
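The program above only prints the timings. To keep them for later analysis, as the introduction suggests, one option is to write each entry as a CSV row. The sketch below is an assumption-laden illustration, not part of the original program: the helper names `entry_to_row`, `write_entries` and `fake_entry` are hypothetical, and `fake_entry` builds plain Python stand-ins so the example runs without HttpWatch installed. It relies only on the URL, Time and Timings properties already used in the program.

```python
import csv
import io
from types import SimpleNamespace

# Timing names listed in the program's comments.
TIMING_NAMES = ["Blocked", "CacheRead", "Connect", "DNSLookup",
                "Network", "Receive", "Send", "TTFB", "Wait"]


def entry_to_row(entry):
    """Flatten one log entry into a list: URL, total time, each duration."""
    row = [entry.URL, entry.Time]
    for name in TIMING_NAMES:
        row.append(getattr(entry.Timings, name).Duration)
    return row


def write_entries(entries, stream):
    """Write a header line and one CSV row per entry to a text stream."""
    writer = csv.writer(stream)
    writer.writerow(["URL", "Time"] + TIMING_NAMES)
    for entry in entries:
        writer.writerow(entry_to_row(entry))


def fake_entry(url, total, duration):
    """Stand-in for an HttpWatch entry, used only to exercise the sketch."""
    timings = SimpleNamespace(
        **{name: SimpleNamespace(Duration=duration) for name in TIMING_NAMES})
    return SimpleNamespace(URL=url, Time=total, Timings=timings)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_entries([fake_entry("http://www.cites.com/", 0.5, 0.1)], buffer)
    print(buffer.getvalue())
```

With a real HttpWatch session, `write_entries` would presumably be called with `plugin.Log.Entries` and a file opened with `newline=''`; that pairing has not been tested here.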

