1. An example of misusing the global declaration
I ran across this warning:
    #!/usr/bin/env python2.3
    VAR = 'xxx'

    if __name__ == '__main__':
        global VAR
        VAR = 'yyy'

---
Output:
./var.py:0: SyntaxWarning: name 'VAR' is assigned to before global declaration
---
But a little twiddle quiets the warning, and I have no idea why:

    #!/usr/bin/env python2.3
    VAR = 'xxx'

    def set_var():
        global VAR
        VAR = 'yyy'

    if __name__ == '__main__':
        set_var()

---
No output.
global is normally used within a function definition to allow the function to assign to names defined outside it (as in your 2nd example). In your first example, global was outside any function definition and therefore not meaningful, and it also triggered the SyntaxWarning.
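To make the difference concrete, here is a minimal sketch, assuming Python 3 rather than the Python 2.3 of the post. The function shadow_var and the variables after_shadow and after_set are hypothetical names added for illustration; set_var follows the second example above.

```python
VAR = "xxx"  # module-level (global) name


def shadow_var():
    # No global declaration: this assignment creates a local VAR
    # and leaves the module-level VAR untouched.
    VAR = "zzz"
    return VAR


def set_var():
    # The global declaration makes the assignment rebind the
    # module-level VAR instead of creating a local one.
    global VAR
    VAR = "yyy"


shadow_var()
after_shadow = VAR  # module-level value after the local assignment
set_var()
after_set = VAR     # module-level value after the global assignment

print(after_shadow, after_set)
```

Only set_var changes the module-level name, which is why global belongs inside the function body.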
2. feed() in HTMLParser
The HTMLParser feed() method invokes the handle_starttag(), handle_data(), and handle_endtag() methods.
    #!/usr/bin/env python
    #coding=utf-8
    from htmlentitydefs import entitydefs
    from HTMLParser import HTMLParser
    import sys

    class TitleParser(HTMLParser):
        def __init__(self):
            self.title = ''
            self.readingtitle = 0
            HTMLParser.__init__(self)

        def handle_starttag(self, tag, attrs):
            # Start collecting text when <title> opens.
            if tag == 'title':
                self.readingtitle = 1

        def handle_data(self, data):
            if self.readingtitle:
                self.title += data

        def handle_endtag(self, tag):
            # Stop collecting text when </title> closes.
            if tag == 'title':
                self.readingtitle = 0

        def handle_entityref(self, name):
            # Replace known entities; pass unknown ones through unchanged.
            if name in entitydefs:
                self.handle_data(entitydefs[name])
            else:
                self.handle_data('&' + name + ';')

        def gettitle(self):
            return self.title

    fd = open(sys.argv[1])
    tp = TitleParser()
    tp.feed(fd.read())
    print "Title is:", tp.gettitle()
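The script above is Python 2. As a rough sketch of the same idea on Python 3, where the module is html.parser, the parser can be fed a string directly. The sample document and the attribute name reading_title are assumptions made for illustration. The handle_entityref override is left out because Python 3's parser converts character references by default.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.title = ""
        self.reading_title = False

    def handle_starttag(self, tag, attrs):
        # Called by feed() for each opening tag.
        if tag == "title":
            self.reading_title = True

    def handle_data(self, data):
        # Called by feed() for text between tags.
        if self.reading_title:
            self.title += data

    def handle_endtag(self, tag):
        # Called by feed() for each closing tag.
        if tag == "title":
            self.reading_title = False


# Hypothetical sample input, used here instead of reading sys.argv[1].
sample = "<html><head><title>Python Learning Note</title></head><body><p>text</p></body></html>"

tp = TitleParser()
tp.feed(sample)
tp.close()
print("Title is:", tp.title)
```

Text outside the title element is ignored because reading_title is only true between the opening and closing title tags.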
3. HTMLParser
HTMLParser is a Python standard library module for parsing HTML. It can pick out the tags, data, and other parts of an HTML document, which makes it a simple way to process HTML. HTMLParser is event-driven: when the parser finds a specific construct, such as a tag, it calls a user-defined method to notify the program. These callbacks are methods of the HTMLParser class whose names begin with handle_. To use the module, derive a new class from HTMLParser and override the handle_ methods you need.
handle_startendtag handles self-closing start+end tags, such as <xx/>
handle_starttag handles start tags, such as <xx>
handle_endtag handles end tags, such as </xx>
handle_charref handles numeric character references, that is, strings beginning with &#, which usually give a character by its code
handle_entityref handles named character entities, which begin with &
handle_data handles data, that is, the text in the middle of <xx>data</xx>
handle_comment handles comments
handle_decl handles declarations beginning with <!, such as <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
handle_pi handles processing instructions, such as <?instruction>
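The callbacks listed above can be seen in action with a small sketch. It assumes Python 3's html.parser. EventLogger and the sample string are hypothetical names used only for illustration. convert_charrefs=False is passed so that handle_entityref and handle_charref are called instead of the references being converted inside the data.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Hypothetical helper that records each callback as a (kind, value) pair."""

    def __init__(self):
        # convert_charrefs=False keeps entity and character references
        # as separate events rather than folding them into the data.
        super().__init__(convert_charrefs=False)
        self.events = []

    def handle_decl(self, decl):
        self.events.append(("decl", decl))

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_startendtag(self, tag, attrs):
        self.events.append(("startend", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        self.events.append(("entityref", name))

    def handle_charref(self, name):
        self.events.append(("charref", name))

    def handle_comment(self, data):
        self.events.append(("comment", data))

    def handle_pi(self, data):
        self.events.append(("pi", data))


sample = "<!DOCTYPE html><p>a &amp; b &#62; c<br/><!-- note --></p><?instruction>"

logger = EventLogger()
logger.feed(sample)
logger.close()

for kind, value in logger.events:
    print(kind, repr(value))
```

Printing the recorded pairs shows which handler fires for each piece of the input and in what order.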
>>> import HTMLParser
>>> help(HTMLParser.HTMLParser.handle_endtag)
Help on method handle_endtag in module HTMLParser:
handle_endtag(self, tag) unbound HTMLParser.HTMLParser method
    # Overridable -- handle end tag
>>> help(HTMLParser.HTMLParser.handle_data)
Help on method handle_data in module HTMLParser:
handle_data(self, data) unbound HTMLParser.HTMLParser method
    # Overridable -- handle data
>>> help(HTMLParser.HTMLParser.handle_charref)
Help on method handle_charref in module HTMLParser:
handle_charref(self, name) unbound HTMLParser.HTMLParser method
    # Overridable -- handle character reference
>>> help(HTMLParser.HTMLParser.handle_decl)
Help on method handle_decl in module HTMLParser:
handle_decl(self, decl) unbound HTMLParser.HTMLParser method
    # Overridable -- handle declaration
>>> help(HTMLParser.HTMLParser.handle_startendtag)
Help on method handle_startendtag in module HTMLParser:
handle_startendtag(self, tag, attrs) unbound HTMLParser.HTMLParser method
    # Overridable -- finish processing of start+end tag: <tag.../>
Python Learning Note 5