1. A faulty example of declaring a variable with global
I ran across this warning:
#!/usr/bin/env python2.3
VAR = 'xxx'
if __name__ == '__main__':
    global VAR
    VAR = 'yyy'
---
OUTPUT:
./var.py:0: SyntaxWarning: name 'VAR' is assigned to before global declaration
----
But a small change quiets the warning, and I have no idea why:
#!/usr/bin/env python2.3
VAR = 'xxx'
def set_var():
    global VAR
    VAR = 'yyy'

if __name__ == '__main__':
    set_var()
---
No output.
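The global statement is normally used inside a function definition, so that the function can assign to names defined outside it (as in your second example). In your first example, global appears outside any function definition, where it has no effect: at module level every assignment already binds a global name. The SyntaxWarning appears because VAR is assigned earlier in the same scope than the global statement.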
2. feed() in HTMLParser
As it parses the text passed to it, HTMLParser's feed() method calls the handle_starttag(), handle_data(), and handle_endtag() methods.
#!/usr/bin/env python
# coding=utf-8
from htmlentitydefs import entitydefs
from HTMLParser import HTMLParser
import sys

class TitleParser(HTMLParser):
    def __init__(self):
        self.title = ''
        self.readingtitle = 0  # 1 while inside <title>...</title>
        HTMLParser.__init__(self)

    def handle_starttag(self, tag, attrs):
        if tag == 'title':
            self.readingtitle = 1

    def handle_data(self, data):
        # Collect text only while inside the <title> element
        if self.readingtitle:
            self.title += data

    def handle_endtag(self, tag):
        if tag == 'title':
            self.readingtitle = 0

    def handle_entityref(self, name):
        # Replace known entities with their text; pass unknown ones through
        if name in entitydefs:
            self.handle_data(entitydefs[name])
        else:
            self.handle_data('&' + name + ';')

    def gettitle(self):
        return self.title

fd = open(sys.argv[1])
tp = TitleParser()
tp.feed(fd.read())
print "Title is:", tp.gettitle()
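To see the order in which feed() fires the three callbacks, here is a small sketch. It assumes Python 3, where the module is called html.parser (the code in this note uses the Python 2 name HTMLParser). The class EventLogger and its events list are hypothetical names used only for this illustration.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Record every callback that feed() triggers, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


logger = EventLogger()
logger.feed("<p>Hello</p>")
logger.close()  # flush anything still buffered
for event in logger.events:
    print(event)
```

The events arrive in document order: start tag, then the text, then the end tag.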
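3. HTMLParser
HTMLParser is Python's module for parsing HTML. It picks out the tags, data, and other pieces of an HTML document, which makes it a convenient way to process HTML. HTMLParser is event-driven: when it encounters a particular kind of markup, it calls a user-defined method so that the program can handle it. The main callback methods all have names beginning with handle_ and are all methods of the HTMLParser class. To use it, derive a new class from HTMLParser and override whichever handle_ methods you need.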
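handle_startendtag handles a tag that opens and closes in one step, such as <xx/>
handle_starttag handles a start tag, such as <xx>
handle_endtag handles an end tag, such as </xx>
handle_charref handles numeric character references, which begin with &# and usually give a character by its code number
handle_entityref handles named entity references, which begin with & and have the form &name;
handle_data handles data, that is, the text between <xx> and </xx> in <xx>data</xx>
handle_comment handles comments
handle_decl handles declarations that begin with <!, such as <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
handle_pi handles processing instructions of the form <?instruction>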
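The handlers listed above can be exercised together in one small sketch. It assumes Python 3 and its html.parser module, not the Python 2 HTMLParser module used in this note. In Python 3 the constructor argument convert_charrefs must be set to False, otherwise handle_charref and handle_entityref are never called. The class name Recorder and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class Recorder(HTMLParser):
    """Override every handle_ callback and log which one fired."""

    def __init__(self):
        # convert_charrefs=False keeps handle_charref/handle_entityref active
        super().__init__(convert_charrefs=False)
        self.events = []

    def handle_startendtag(self, tag, attrs):
        self.events.append(("startend", tag))

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_charref(self, name):
        self.events.append(("charref", name))

    def handle_entityref(self, name):
        self.events.append(("entityref", name))

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_comment(self, data):
        self.events.append(("comment", data))

    def handle_decl(self, decl):
        self.events.append(("decl", decl))

    def handle_pi(self, data):
        self.events.append(("pi", data))


rec = Recorder()
rec.feed("<!DOCTYPE html><?instruction><!-- note --><p>a&amp;b&#65;</p><br/>")
rec.close()
for event in rec.events:
    print(event)
```

Each piece of markup is routed to the matching handler: the declaration to handle_decl, <?instruction> to handle_pi, &amp; to handle_entityref, &#65; to handle_charref, and <br/> to handle_startendtag.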
>>> import HTMLParser
>>> help(HTMLParser.HTMLParser.handle_endtag)
Help on method handle_endtag in module HTMLParser:
handle_endtag(self, tag) unbound HTMLParser.HTMLParser method
# Overridable -- handle end tag
>>> help(HTMLParser.HTMLParser.handle_data)
Help on method handle_data in module HTMLParser:
handle_data(self, data) unbound HTMLParser.HTMLParser method
# Overridable -- handle data
>>> help(HTMLParser.HTMLParser.handle_charref)
Help on method handle_charref in module HTMLParser:
handle_charref(self, name) unbound HTMLParser.HTMLParser method
# Overridable -- handle character reference
>>> help(HTMLParser.HTMLParser.handle_decl)
Help on method handle_decl in module HTMLParser:
handle_decl(self, decl) unbound HTMLParser.HTMLParser method
# Overridable -- handle declaration
>>> help(HTMLParser.HTMLParser.handle_startendtag)
Help on method handle_startendtag in module HTMLParser:
handle_startendtag(self, tag, attrs) unbound HTMLParser.HTMLParser method
# Overridable -- finish processing of start+end tag: <tag.../>
Python Study Notes 5