Over the past two days I learned how to use one of Microsoft's APIs. I wrote the client code in Python and tested it over HTTP. The final response from the HTTP API is an XML document, and since the ElementTree XML API in Python looked quite appealing, I used ElementTree to parse it.
Normally, the find and findall functions are all you need to get the tags you want.
The Python documentation gives only a brief explanation of these two functions. Here is the entry for find:
- find(match)
  Finds the first subelement matching match. match may be a tag name or path. Returns an element instance or None.
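As a minimal sketch of what that description means in practice, the snippet below contrasts find with findall on a tiny document made up for this illustration. It assumes only the standard library module xml.etree.ElementTree. The point to notice is the difference in the "nothing matched" case: find returns None, while findall returns an empty list.

```python
import xml.etree.ElementTree as ET

# A small document invented for this illustration.
root = ET.fromstring("<a><b>1</b><b>2</b><c/></a>")

first_b = root.find("b")        # first direct child named b
missing = root.find("x")        # no direct child named x
all_b = root.findall("b")       # every direct child named b, in document order
none_found = root.findall("x")  # no match at all

print(first_b.text)                   # 1
print(missing)                        # None
print([item.text for item in all_b])  # ['1', '2']
print(none_found)                     # []
```

Because find can return None, it is worth checking the result before touching .text or calling further methods on it.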
When I got home I also looked it up in the Chinese translation of "Python Essential Reference". Its explanation is just as brief, yet the meaning of the match parameter is actually quite complicated.
It accepts forms such as tag, *, .//tag, tag1/tag2, */tag, and others. The documentation does not describe any of them. I don't know whether that is because ElementTree is a child adopted from another family and so gets little care in its new home, or because the documentation is simply what effbot submitted, but it is very sloppy.
If you are interested, have a look through http://effbot.org/zone/element.htm; it explains things somewhat more clearly than the documentation above.
Enough talk. Let the code explain the two functions.
import xml.etree.ElementTree

xml_str = """
<a>
    <b>1</b>
    <b>2</b>
    <c>
        <d>
            <b>3</b>
            <b>4</b>
        </d>
        <e>
            <b>5</b>
            <b>6</b>
        </e>
    </c>
</a>
"""
tag = xml.etree.ElementTree.fromstring(xml_str)  # tag is the root element a

print("find a-----------------------------------------------------")
find_tag = tag.findall("a")  # finds nothing: a is the current node, not one of its children
print(find_tag)
print()
find_tag = tag.findall("*")  # finds the direct children: the two b (text 1 and 2) and c
print(find_tag)

print("find b-----------------------------------------------------")
find_tag = tag.findall("b")  # finds the b elements whose text is 1 and 2
print(find_tag)
for item in find_tag:
    print(item, item.text)
print()
find_tag = tag.findall(".//b")  # finds b at every level: text 1, 2, 3, 4, 5, 6
print(find_tag)
for item in find_tag:
    print(item, item.text)

print("find d-----------------------------------------------------")
find_tag = tag.findall("d")  # d is not a direct child of a, so nothing is found
print(find_tag)
print()
find_tag = tag.findall("c/d")  # path down to d; the path does not include the current node
print(find_tag)
print()
find_tag = tag.findall(".//d")  # the .// prefix searches every level below the current node
print(find_tag)
print()

print("find path . *-----------------------------------------------------")
tag_c = tag.find(".//c")  # start from c
find_tag = tag_c.findall(".//b")  # b at every level under c: text 3, 4, 5, 6
print(find_tag)
for item in find_tag:
    print(item, item.text)
find_tag = tag_c.findall("*/b")  # b under any direct child of c (d and e): text 3, 4, 5, 6
print(find_tag)
for item in find_tag:
    print(item, item.text)

print("xml namespace -----------------------------------------------------")
xml_str = """<a xmlns="http://www.w3.org/TR/html4" >
<b>1</b>
</a>
"""
tag = xml.etree.ElementTree.fromstring(xml_str)
# With an XML namespace, every tag carries the namespace URI.
# The tag string of the child is {http://www.w3.org/TR/html4}b, not b.
find_tag = tag.findall("*")
print(find_tag)
Unfortunately, I ran onto every one of these reefs myself.
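For the namespace reef in particular, here is a minimal sketch of two ways around it, assuming a Python version whose findall accepts the namespaces argument. The prefix h is an arbitrary name chosen for this example and does not have to appear in the XML; NS_URI is just a helper variable holding the URI from the sample above.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://www.w3.org/TR/html4"
root = ET.fromstring('<a xmlns="%s"><b>1</b></a>' % NS_URI)

# The bare tag name matches nothing, because the real tag is {URI}b.
plain = root.findall("b")

# Option 1: spell out the fully qualified tag name.
qualified = root.findall("{%s}b" % NS_URI)

# Option 2: map a prefix of our own choosing (here "h") to the URI.
prefixed = root.findall("h:b", namespaces={"h": NS_URI})

print(root.tag)                           # {http://www.w3.org/TR/html4}a
print(plain)                              # []
print([item.text for item in qualified])  # ['1']
print([item.text for item in prefixed])   # ['1']
```

The second form tends to read better when a document uses several namespaces, since the mapping can be defined once and passed to every find and findall call.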
[This article may be reproduced in full provided the author and the source are credited. It may not be used for profit or for commercial purposes; otherwise the charge is 1 RMB per word and 100 RMB per figure, with no discounts. For Baidu Library and 360doc the price is doubled.]