Python network programming practice: a first look at HTMLParser

Source: Internet
Author: User

HTML or XHTML may be the markup language that computer users run into most often. When you marvel at how powerful search engines such as Google, Bing, and Baidu are (and, by the way, my instructor's antu search), have you ever thought about writing one yourself?
The code below is only a test. It has plenty of problems, both on the surface and internally, and is for reference only.
Code that extracts image URLs from a web page:
[Python]
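from html.parser import HTMLParser  # Python 3; in Python 2 the module is named HTMLParser


class ImgParser(HTMLParser):
    def __init__(self):
        self.tag = ''               # text collected while inside an <img> element
        self.attrs = ''
        self.readingtitle = False   # True between an <img> start tag and its end tag
        HTMLParser.__init__(self)

    def handle_starttag(self, tag, attrs):
        if tag == 'img':
            self.readingtitle = True
            # attrs is a list of (name, value) pairs; the image URL is the src value
            for name, value in attrs:
                if name == 'src':
                    print(value)

    def handle_data(self, data):
        if self.readingtitle:
            self.tag += data

    def handle_endtag(self, tag):
        if tag == 'img':
            self.readingtitle = False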

The HTMLParser module deserves a closer look here (it is an interesting module):
HTMLParser does very little on its own. To parse HTML, you subclass HTMLParser and override specific handler methods, which work much like virtual functions in C++ (my personal understanding). Each handler defines how one part of an HTML element is processed:
handle_starttag(self, tag, attrs): processes the start tag of <tag attrs="...">data</tag>; the attributes (attrs) are passed as a list of (name, value) pairs
handle_endtag(self, tag): processes the end tag of <tag attrs="...">data</tag>
handle_data(self, data): processes the text data inside <tag attrs="...">data</tag>
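To see in what order the three handlers fire, here is a minimal sketch. It assumes Python 3 (where the module is html.parser); the EventLogger class and the one-line HTML snippet are made up for illustration and are not part of the original test code.

```python
from html.parser import HTMLParser  # Python 3 module name (assumption)


class EventLogger(HTMLParser):
    """Hypothetical helper: records every handler call in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) tuples
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Ignore whitespace-only text between tags
        if data.strip():
            self.events.append(("data", data.strip()))


logger = EventLogger()
logger.feed('<p class="note">hello</p>')  # illustrative snippet
logger.close()
for event in logger.events:
    print(event)
```

For this snippet the start-tag handler runs first, then the data handler, then the end-tag handler, which matches the <tag attrs="...">data</tag> layout described above.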
Test.html:
[Html]
<!-- Basic Title parsing -->
<HTML>
<HEAD>
<TITLE>
Document Title
</TITLE>
</HEAD>
<BODY id = "1" name = "this is a body">
hoho </img>
Here is the test
</BODY>
</HTML>
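A page like this could be fed to a parser roughly as follows. This is only a sketch under assumptions: it targets Python 3, the SrcCollector class is hypothetical, and because the attributes of the img tag in the test file are not shown here, the inline page uses a made-up src value (images/logo.png).

```python
from html.parser import HTMLParser  # Python 3 module name (assumption)


class SrcCollector(HTMLParser):
    """Hypothetical helper: collects the src value of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names are lowercased by HTMLParser
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


# Stand-in for the test page; the src value is invented for illustration
page = """<HTML>
<BODY id="1" name="this is a body">
<img src="images/logo.png" alt="logo">hoho</img>
Here is the test
</BODY>
</HTML>"""

collector = SrcCollector()
collector.feed(page)
collector.close()
print(collector.sources)
```

Collecting the URLs in a list instead of printing them inside the handler makes the result easier to reuse, for example to download the images later.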

Of course, a good parser (if that is the right translation of the term) cannot be finished in two or three attempts. Once you understand the parsing mechanism, you still have to keep studying, exchanging ideas with others, and working hard with humility.

 

 


From the column of FishinLab
