How to read XML from a file
This example illustrates how to use the XmlTextReader class to read XML from a file. This class provides direct parsing and tokenizing of XML, and implements the W3C Extensible Markup Language (XML) 1.0 and Namespaces in XML specifications.
The XmlReader class defines the API for parsing XML, and XmlTextReader is the implementation designed to process byte streams.
In general, if you need to access the XML as raw data, you can use XmlTextReader and avoid the overhead of building a DOM. Skipping the DOM can make reading XML faster. For example, an XML document may have a header section that is used to route the document for processing elsewhere. XmlTextReader has different constructors to specify the location of the XML data. This example loads XML from the books.xml file, as shown in the following code.
C#:
    XmlTextReader reader = new XmlTextReader("books.xml");

VB:
    Dim reader As XmlTextReader = New XmlTextReader("books.xml")
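The other constructors mentioned above can be sketched as follows. This is a minimal illustration, not part of the original sample: the ReaderSources class, its FirstElementName helper, and the SampleXml string are hypothetical names, and the XML is held in memory so the sketch runs without a books.xml file on disk.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class ReaderSources
{
    // Hypothetical in-memory document, used only for this sketch.
    public const string SampleXml = "<bookstore><book genre='novel'/></bookstore>";

    // Advances the reader to the first element and returns its name,
    // or an empty string if the document contains no element.
    public static string FirstElementName(XmlTextReader reader)
    {
        using (reader)
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                    return reader.Name;
            }
        }
        return string.Empty;
    }

    public static void Main()
    {
        // Constructor overload that takes a TextReader.
        var fromTextReader = new XmlTextReader(new StringReader(SampleXml));

        // Constructor overload that takes a Stream.
        byte[] bytes = Encoding.UTF8.GetBytes(SampleXml);
        var fromStream = new XmlTextReader(new MemoryStream(bytes));

        Console.WriteLine(FirstElementName(fromTextReader));
        Console.WriteLine(FirstElementName(fromStream));
    }
}
```

Both readers report the same first element, since only the source of the data differs and not the way it is read.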
After loading, XmlTextReader uses the Read method to move through the XML data sequentially, retrieving the next record (node) from the document on each call. When no more records exist, Read returns false.
C#:
    while (reader.Read())
    {
        // Do some work here on the data
        ...
    }

VB:
    Do While (reader.Read())
        ' Do some work here on the data
        ...
    Loop
Each record has a node type, which you can determine from the NodeType property. The property returns a value of the XmlNodeType enumeration, and this example tests that value to see whether the node is an element or a document type. If it is either of the two, the example uses the Name and Value properties to display details about the node. The Name property returns the node name (such as an element or attribute name), and the Value property returns the node value (the node text) of the current node (record).
C#:
    while (reader.Read())
    {
        switch (reader.NodeType)
        {
            case XmlNodeType.Element: // The node is an element
                Console.Write("<" + reader.Name);
                while (reader.MoveToNextAttribute()) // Read attributes
                    Console.Write(" " + reader.Name + "='" + reader.Value + "'");
                Console.Write(">");
                break;
            case XmlNodeType.DocumentType: // The node is a DocumentType
                Console.WriteLine(reader.NodeType + "<" + reader.Name + ">" + reader.Value);
                break;
            ...
        }
    }

VB:
    Do While (reader.Read())
        Select Case reader.NodeType
            Case XmlNodeType.Element ' The node is an element
                Console.Write("<" + reader.Name)
                While (reader.MoveToNextAttribute()) ' Read attributes
                    Console.Write(" " + reader.Name + "='" + reader.Value + "'")
                End While
                Console.Write(">")
            Case XmlNodeType.DocumentType ' The node is a DocumentType
                Console.WriteLine(reader.NodeType.ToString() & "<" & reader.Name & ">" & reader.Value)
            ...
        End Select
    Loop
The XmlNodeType values returned depend on the XmlReader class in use. For example, the XmlTextReader class never returns the following XmlNodeType values: Document, DocumentFragment, Entity, EndEntity, and Notation. See the .NET Framework class library reference for details about the XmlNodeType values returned by each XmlReader class.
The node types defined by XmlNodeType correspond to the W3C DOM node types, with some extension types that are required only for reading.
XmlNodeType enumeration member | Description
Attribute | An attribute. Example XML: id='000000'. Attribute nodes can have the following child node types: Text and EntityReference. An Attribute node does not appear as a child of any other node type; note that it is not considered a child of an Element.
CDATA | A CDATA section. Example XML: <![CDATA[my escaped text]]>. CDATA sections are used to escape blocks of text that would otherwise be recognized as markup. CDATASection nodes cannot have any child nodes. A CDATASection node can appear as a child of DocumentFragment, EntityReference, and Element nodes.
Comment | A comment. Example XML: <!-- my comment -->. Comment nodes cannot have any child nodes. A Comment node can appear as a child of Document, DocumentFragment, Element, and EntityReference nodes.
Document | A document object that, as the root of the document tree, provides access to the entire XML document. Document nodes can have the following child node types: Element (one at most), ProcessingInstruction, Comment, and DocumentType. A Document node cannot be a child of any node type.
DocumentFragment | A document fragment. The DocumentFragment node associates a node or subtree with a document without actually being contained in the document. DocumentFragment nodes can have the following child node types: Element, ProcessingInstruction, Comment, Text, CDATASection, and EntityReference. A DocumentFragment node cannot be a child of any node type.
DocumentType | The document type declaration, indicated by the <!DOCTYPE> tag. Example XML: <!DOCTYPE ...>. DocumentType nodes can have the following child node types: Notation and Entity. A DocumentType node can appear as a child of the Document node.
Element | An element. Example XML: <Name>. Element nodes can have the following child node types: Element, Text, Comment, ProcessingInstruction, CDATA, and EntityReference. An Element node can appear as a child of Document, DocumentFragment, EntityReference, and Element nodes.
EndElement | Returned when XmlReader reaches the end of an element. Example XML: </Foo>.
EndEntity | Returned when XmlReader reaches the end of the entity replacement as a result of a call to ResolveEntity.
Entity | An entity declaration. Example XML: <!ENTITY ...>. Entity nodes can have child nodes that represent the expanded entity (for example, Text and EntityReference nodes). An Entity node can appear as a child of the DocumentType node.
EntityReference | A reference to an entity. Example XML: &Foo;. This applies to all entities, including character entity references. EntityReference nodes can have the following child node types: Element, ProcessingInstruction, Comment, Text, CDATASection, and EntityReference. An EntityReference node can appear as a child of Attribute, DocumentFragment, Element, and EntityReference nodes.
None | Returned by XmlReader if the Read method has not been called.
Notation | A notation in the document type declaration. Example XML: <!NOTATION ...>. Notation nodes cannot have any child nodes. A Notation node can appear as a child of the DocumentType node.
ProcessingInstruction | A processing instruction (PI). Example XML: <?pi test?>. PI nodes cannot have any child nodes. A PI node can appear as a child of Document, DocumentFragment, Element, and EntityReference nodes.
SignificantWhitespace | White space between markup in a mixed content model, or white space within the xml:space="preserve" scope.
Text | The text content of an element. Text nodes cannot have any child nodes. A Text node can appear as a child of Attribute, DocumentFragment, Element, and EntityReference nodes.
Whitespace | White space between markup.
XmlDeclaration | The XML declaration node. Example XML: <?xml version='1.0'?>. It must be the first node in the document. It cannot have children. It is a child of the root node. It can have attributes that provide version and encoding information.
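To see which of these node types a reader actually reports, it can help to record the NodeType of every record that Read returns. The following is a minimal sketch under stated assumptions: the NodeTypeTrace class, its Trace method, and the small in-memory document are hypothetical and are not part of the original sample.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeTypeTrace
{
    // Returns the node type of every record that Read() stops on, in order.
    public static List<XmlNodeType> Trace(string xml)
    {
        var seen = new List<XmlNodeType>();
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
                seen.Add(reader.NodeType);
        }
        return seen;
    }

    public static void Main()
    {
        // Hypothetical document with no white space between tags.
        string xml = "<?xml version='1.0'?><note id='1'>hello<!-- c --></note>";
        foreach (XmlNodeType type in Trace(xml))
            Console.WriteLine(type);
        // Prints, one per line:
        // XmlDeclaration, Element, Text, Comment, EndElement
    }
}
```

Note that the id attribute does not show up as a record of its own. Read moves from node to node, and attributes are reached from their element instead, which matches the table's remark that an Attribute is not a child of an Element.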
The Depth property returns the depth of the current node in the XML document, which is useful when formatting data. Nodes at the root level of the document have a depth of 0. By combining this depth information with the Name and Value properties, you can build an example that processes the XML file by node type and depth, formats the output, and collects statistics while reading. The following sample code shows a Format method that performs the basic formatting. To view the complete sample code, see the sample's source files.
C#:
    private static void Format(XmlReader reader, string nodeType)
    {
        // Format the output
        Console.Write(reader.Depth + " ");
        Console.Write(reader.AttributeCount + " ");
        // Indent: one tab for each level from 0 through Depth
        for (int i = 0; i <= reader.Depth; i++)
        {
            Console.Write('\t');
        }
        Console.Write(nodeType + "<" + reader.Name + ">" + reader.Value);

        // Display the attribute values for the current node
        if (reader.HasAttributes)
        {
            Console.Write(" Attributes:");
            for (int j = 0; j < reader.AttributeCount; j++)
            {
                Console.Write(" [{0}] {1}", j, reader[j]);
            }
        }
        Console.WriteLine();
    }

VB:
    Private Shared Sub Format(ByRef reader As XmlTextReader, nodeType As String)
        ' Format the output
        Console.Write(reader.Depth & " ")
        Console.Write(reader.AttributeCount & " ")
        ' Indent: one tab for each level from 0 through Depth
        Dim i As Integer
        For i = 0 To reader.Depth
            Console.Write(Strings.Chr(9))
        Next
        Console.Write(nodeType & "<" & reader.Name & ">" & reader.Value)

        ' Display the attribute values for the current node
        If (reader.HasAttributes) Then
            Console.Write(" Attributes:")
            Dim j As Integer
            For j = 0 To reader.AttributeCount - 1
                Console.Write(" [{0}] {1}", j, reader(j))
            Next
        End If
        Console.WriteLine()
    End Sub
The preceding code uses the HasAttributes property to test whether an element node has any attribute nodes. It then uses the reader's index operator to retrieve each attribute value, much like indexing into a collection of the node's attributes. The code also uses the AttributeCount property to get the number of attributes on the current node. This approach is suitable when you only care about the attribute values and not other properties of the attribute nodes (such as the attribute names).
Note: For another way to access attributes, by moving to each attribute node and reading its name and value, see How to read XML from a stream.
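The two ways of reading attributes can be compared side by side. This is a sketch rather than the sample's own code: the AttributeAccess class and its methods are hypothetical, and the attribute names genre, publicationdate, and ISBN are assumed for illustration, since the sample output lists only attribute values.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class AttributeAccess
{
    // Opens a reader over the given XML and advances it to the first element.
    public static XmlTextReader OpenAtFirstElement(string xml)
    {
        var reader = new XmlTextReader(new StringReader(xml));
        while (reader.Read() && reader.NodeType != XmlNodeType.Element) { }
        return reader;
    }

    // Values only: the index operator returns each attribute value by position.
    public static List<string> ValuesByIndex(XmlReader reader)
    {
        var values = new List<string>();
        for (int i = 0; i < reader.AttributeCount; i++)
            values.Add(reader[i]);
        return values;
    }

    // Names and values: move onto each attribute node in turn.
    public static List<string> PairsByMoving(XmlReader reader)
    {
        var pairs = new List<string>();
        while (reader.MoveToNextAttribute())
            pairs.Add(reader.Name + "=" + reader.Value);
        reader.MoveToElement(); // Step back to the element that owns the attributes
        return pairs;
    }

    public static void Main()
    {
        // Attribute names here are assumptions made for this sketch.
        string xml = "<book genre='novel' publicationdate='1967' ISBN='0-201-63361-2'/>";
        using (XmlTextReader reader = OpenAtFirstElement(xml))
        {
            Console.WriteLine(string.Join(", ", ValuesByIndex(reader)));
            Console.WriteLine(string.Join(", ", PairsByMoving(reader)));
        }
    }
}
```

Indexing leaves the reader positioned on the element, whereas MoveToNextAttribute changes the current node, so the sketch calls MoveToElement afterwards to restore the position.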
The following output is the result of running this example against the books.xml file. In this output, the first column is the Depth property and the second column is the AttributeCount property.
    0 0 XmlDeclaration<xml>version='1.0'
    0 0 Comment<>This file represents a fragment of a book store inventory database
    0 0 Element<bookstore>
    1 3   Element<book> Attributes: [0] autobiography [1] 1981 [2] 1-861003-11-0
    2 0     Element<title>
    3 0       Text<>The Autobiography of Benjamin Franklin
    2 0     Element<author>
    3 0       Element<first-name>
    4 0         Text<>Benjamin
    3 0       Element<last-name>
    4 0         Text<>Franklin
    2 0     Element<price>
    3 0       Text<>8.99
    1 3   Element<book> Attributes: [0] novel [1] 1967 [2] 0-201-63361-2
    2 0     Element<title>
    3 0       Text<>The Confidence Man
    2 0     Element<author>
    3 0       Element<first-name>
    4 0         Text<>Herman
    3 0       Element<last-name>
    4 0         Text<>Melville
    2 0     Element<price>
    3 0       Text<>11.99
    1 3   Element<book> Attributes: [0] philosophy [1] 1991 [2] 1-861001-57-6
    2 0     Element<title>
    3 0       Text<>The Gorgias
    2 0     Element<author>
    3 0       Element<name>
    4 0         Text<>Plato
    2 0     Element<price>
    3 0       Text<>9.99

    Statistics for books.xml file
    XmlDeclaration: 1
    ProcessingInstruction: 0
    DocumentType: 0
    Comment: 1
    Element: 18
    Attribute: 9
    Text: 11
    Whitespace: 27
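Statistics like those at the end of the output can be gathered by tallying node types while reading. The sketch below is one possible way to do it and is not the sample's actual code: the NodeStatistics class, its Count method, and the in-memory document are hypothetical. Attributes are counted through AttributeCount, because Read does not stop on attribute nodes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeStatistics
{
    // Tallies how many nodes of each type the reader reports.
    public static Dictionary<XmlNodeType, int> Count(TextReader input)
    {
        var counts = new Dictionary<XmlNodeType, int>();
        using (var reader = new XmlTextReader(input))
        {
            while (reader.Read())
            {
                Add(counts, reader.NodeType, 1);
                // Attributes are not separate records, so add them per element.
                if (reader.NodeType == XmlNodeType.Element)
                    Add(counts, XmlNodeType.Attribute, reader.AttributeCount);
            }
        }
        return counts;
    }

    private static void Add(Dictionary<XmlNodeType, int> counts, XmlNodeType type, int amount)
    {
        int current;
        counts.TryGetValue(type, out current); // current stays 0 if the key is missing
        counts[type] = current + amount;
    }

    public static void Main()
    {
        // Hypothetical document; the line breaks produce Whitespace nodes.
        string xml = "<bookstore>\n  <book genre='novel'><price>11.99</price></book>\n</bookstore>";
        foreach (KeyValuePair<XmlNodeType, int> pair in Count(new StringReader(xml)))
            Console.WriteLine(pair.Key + ": " + pair.Value);
    }
}
```

The white space between tags is what produces the Whitespace count, which is why that figure can be large for an indented file even though none of those nodes appear in the formatted listing.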
Summary
- XmlTextReader provides fast, non-cached, forward-only access to XML data.
- XmlTextReader implements the W3C Extensible Markup Language (XML) 1.0 and Namespaces in XML specifications.
- XmlTextReader provides constructors for reading XML from a file, a stream, or a TextReader.
- The Read method moves the reader through the nodes sequentially.
- For element nodes, you can use the index operator to obtain attribute values.
- Attributes can be exposed as a node list of the current node, and the HasAttributes property tells you whether any are present.
- The Depth property reports the depth of the current node, which is useful for formatting. Nodes at the root level have a depth of 0.