Ubiquitous XML
In addition to being able to represent structured and semi-structured data, XML has many other features that make it a widely used data representation format. XML is extensible, platform-independent, and supports internationalization due to its full adoption of Unicode. XML is a text-based format, so users can read and edit XML documents using standard text-editing tools as needed.
The extensibility of XML shows up in several ways. First, unlike HTML, XML does not have a fixed vocabulary; instead, users can use XML to define vocabularies specific to a particular application or industry. Second, applications that process or consume XML are more resistant to changes in the structure of the XML than applications that use other formats, as long as those changes are additive. For example, if an application is primarily handling a
When documents are exchanged, an XML schema can describe the contract between the producer and the consumer of the XML, because it describes what constitutes a valid XML message between the two. Although there are a number of schema languages for XML, from DTDs to XDR, the most authoritative at present is the W3C XML Schema Definition language, commonly known as XSD.
XSD is unique among XML schema languages because it was the first to try to extend the role of an XML schema beyond describing the contract between two entities that exchange documents. XSD introduces the concept of the Post Schema Validation Infoset (PSVI). A conforming XSD processor accepts an XML information set as input and converts it to a PSVI at validation time. The PSVI is the initial input XML information set, with new information items added and new properties added to existing information items. The W3C XML Schema recommendation lists the contributions that make up the PSVI.
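The validation step described above can be sketched with a validating XmlReader. This is a minimal sketch, not a definitive implementation: the class name ValidateSketch, the inlined schema and the sample values are assumptions made for illustration. The reader's SchemaInfo property exposes a small part of the post-validation information, here the schema type attached to each element.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public class ValidateSketch {
    // Assumed schema for an items/compact-disc vocabulary, inlined so the sketch is self-contained.
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='items'>
    <xs:complexType>
      <xs:sequence>
        <xs:element ref='compact-disc' minOccurs='0' maxOccurs='unbounded'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name='compact-disc'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='price' type='xs:decimal'/>
        <xs:element name='artist' type='xs:string'/>
        <xs:element name='title' type='xs:string'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Hypothetical instance document; the values are placeholders.
    const string Xml = @"<items>
  <compact-disc>
    <price>16.95</price>
    <artist>Example Artist</artist>
    <title>Example Title</title>
  </compact-disc>
</items>";

    public static void Main() {
        XmlSchemaSet schemas = new XmlSchemaSet();
        // null means: use the target namespace declared in the schema itself (none here).
        schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = schemas;
        // Report validation problems instead of throwing.
        settings.ValidationEventHandler += (sender, e) =>
            Console.WriteLine("Validation {0}: {1}", e.Severity, e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(Xml), settings)) {
            while (reader.Read()) {
                // After schema assessment, each element carries its schema type.
                if (reader.NodeType == XmlNodeType.Element &&
                    reader.SchemaInfo != null && reader.SchemaInfo.SchemaType != null) {
                    Console.WriteLine("{0}: {1}", reader.Name, reader.SchemaInfo.SchemaType.TypeCode);
                }
            }
        }
    }
}
```

Changing the price text to a non-numeric value in this sketch should cause the validation handler to report an error for that element.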
Type annotations are a very important class of PSVI contributions. Elements and attributes become strongly typed and have data type information associated with them. Strongly typed XML can be mapped to objects using technologies such as the XmlSerializer in the .NET Framework, mapped to relational tables using technologies such as SQLXML and the DataSet in the .NET Framework, or processed with XML query languages that use strong typing, such as XPath 2.0 and XQuery.
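As an illustration of mapping typed XML to objects, the following is a minimal sketch using the XmlSerializer. The CompactDisc class, its field names and the sample values are hypothetical and exist only for this example; the point is that the price arrives as a decimal rather than as a string.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical class; the element names are assumptions modeled on a compact-disc vocabulary.
[XmlRoot("compact-disc")]
public class CompactDisc {
    [XmlElement("price")]  public decimal Price;
    [XmlElement("artist")] public string Artist;
    [XmlElement("title")]  public string Title;
}

public class SerializerSketch {
    public static void Main() {
        // Placeholder document, inlined so the sketch is self-contained.
        string xml = "<compact-disc><price>16.95</price>" +
                     "<artist>Example Artist</artist>" +
                     "<title>Example Title</title></compact-disc>";

        XmlSerializer serializer = new XmlSerializer(typeof(CompactDisc));
        CompactDisc cd = (CompactDisc) serializer.Deserialize(new StringReader(xml));

        // Price is a decimal, so arithmetic works without any manual parsing.
        decimal twoCopies = cd.Price * 2;
        Console.WriteLine("{0} by {1}, two copies cost {2}", cd.Title, cd.Artist, twoCopies);
    }
}
```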
The following example is a schema fragment that describes the items element of the sample document from the section on parsing XML documents.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="items">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="compact-disc" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="compact-disc">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="price" type="xs:decimal"/>
        <xs:element name="artist" type="xs:string"/>
        <xs:element name="title" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Tree model-based APIs
A tree model API presents an XML document as a tree of nodes, which is usually loaded into memory all at once. The most common XML tree model API is the W3C Document Object Model (DOM). The DOM supports programmatically reading, processing, and modifying XML documents.
The following example uses the XmlDocument class in the .NET Framework to get the artist name and title of the first compact-disc in the items element.
using System;
using System.Xml;

public class Test {
    public static void Main(string[] args) {
        // Load the whole document into an in-memory tree
        XmlDocument doc = new XmlDocument();
        doc.Load("test.xml");
        // The first child of the document element (items) is the first compact-disc
        XmlElement firstCd = (XmlElement) doc.DocumentElement.FirstChild;
        XmlElement artist =
            (XmlElement) firstCd.GetElementsByTagName("artist")[0];
        XmlElement title =
            (XmlElement) firstCd.GetElementsByTagName("title")[0];
        Console.WriteLine("Artist={0}, Title={1}", artist.InnerText, title.InnerText);
    }
}
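Because the DOM also supports modification, the same tree can be edited in memory. The following is a minimal sketch under stated assumptions: the document is inlined with LoadXml instead of read from a file, and the helper AddChild and all sample values are hypothetical.

```csharp
using System;
using System.Xml;

public class DomEditSketch {
    // Hypothetical helper: appends <name>text</name> to the given parent element.
    static void AddChild(XmlElement parent, string name, string text) {
        XmlElement child = parent.OwnerDocument.CreateElement(name);
        child.InnerText = text;
        parent.AppendChild(child);
    }

    public static void Main() {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<items><compact-disc><price>16.95</price>" +
                    "<artist>Example Artist</artist><title>Old Title</title>" +
                    "</compact-disc></items>");

        // Change the text of the first <title> element in place.
        XmlNode title = doc.SelectSingleNode("/items/compact-disc/title");
        title.InnerText = "New Title";

        // Build a second <compact-disc> and append it to <items>.
        XmlElement cd = doc.CreateElement("compact-disc");
        AddChild(cd, "price", "9.99");
        AddChild(cd, "artist", "Another Artist");
        AddChild(cd, "title", "Another Title");
        doc.DocumentElement.AppendChild(cd);

        // Serialize the modified tree back to markup.
        Console.WriteLine(doc.OuterXml);
    }
}
```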
Cursor-based APIs
An XML cursor API acts like a lens that moves across an XML document, focusing on one part of the document at a time. The XPathNavigator class in the .NET Framework is an XML cursor API. Compared with a tree model API, a cursor API does not need to load the entire document into memory, which opens up optimizations in which the XML producer generates only the parts of the document that are actually requested.
The following example uses the XPathNavigator class in the .NET Framework to get the artist name and title of the first compact-disc in the items element.
using System;
using System.Xml;
using System.Xml.XPath;

public class Test {
    public static void Main(string[] args) {
        XmlDocument doc = new XmlDocument();
        doc.Load("test.xml");
        XPathNavigator nav = doc.CreateNavigator();
        nav.MoveToFirstChild(); // Move from the root node to the document element (items)
        nav.MoveToFirstChild(); // Move from the items element to the first compact-disc element
        // Move from the compact-disc element to the artist element
        // (first child is price, its next sibling is artist)
        nav.MoveToFirstChild();
        nav.MoveToNext();
        string artist = nav.Value;
        // Move from the artist element to the title element
        nav.MoveToNext();
        string title = nav.Value;
        Console.WriteLine("Artist={0}, Title={1}", artist, title);
    }
}
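Step-by-step navigation depends on the order of the child elements. As a hedged alternative, the cursor can be positioned with an XPath expression instead. The sketch below assumes an inlined placeholder document and uses XPathDocument as the backing store; the class name and sample values are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public class XPathSelectSketch {
    public static void Main() {
        // Placeholder document, inlined so the sketch is self-contained.
        string xml = "<items><compact-disc><price>16.95</price>" +
                     "<artist>Example Artist</artist><title>Example Title</title>" +
                     "</compact-disc></items>";

        XPathDocument doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();

        // Each call returns a new navigator positioned on the selected node.
        XPathNavigator artist = nav.SelectSingleNode("/items/compact-disc[1]/artist");
        XPathNavigator title = nav.SelectSingleNode("/items/compact-disc[1]/title");

        // Prints: Artist=Example Artist, Title=Example Title
        Console.WriteLine("Artist={0}, Title={1}", artist.Value, title.Value);
    }
}
```

Selecting by name keeps working if the order of price, artist and title changes, whereas counting sibling moves does not.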
Streaming APIs
With a streaming API for processing XML, users can work with an XML document while keeping only the context of the node currently being processed in memory. Such APIs can handle large XML files without consuming a large amount of memory. There are two main types of streaming APIs for XML processing: push-based XML parsers and pull-based XML parsers.
A push-based parser, such as SAX, works by moving through an XML data stream and "pushing" an event to a registered event handler (a callback method) whenever it encounters an XML node. A pull-based parser, such as the XmlReader class in the .NET Framework, acts as a forward-only cursor over an XML data stream.
The following example uses the XmlReader class in the .NET Framework to get the artist name and title of the first compact-disc in the items element.
using System;
using System.Xml;

public class Test {
    public static void Main(string[] args) {
        string artist = null, title = null;
        XmlTextReader reader = new XmlTextReader("test.xml");
        reader.MoveToContent(); // Move from the root node to the document element (items)
        /* Keep reading until you get the first <artist> element */
        while (reader.Read()) {
            if ((reader.NodeType == XmlNodeType.Element) && reader.Name.Equals("artist")) {
                // Read the text of <artist>, then of the <title> element that follows it
                artist = reader.ReadElementString();
                title = reader.ReadElementString();
                break;
            }
        }
        Console.WriteLine("Artist={0}, Title={1}", artist, title);
    }
}
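To contrast the two streaming styles, a push-style interface can be layered on top of a pull parser. The sketch below is not SAX and not part of the .NET Framework: the Parse method and its callback parameter are hypothetical, and the document is an inlined placeholder. It only shows the inversion of control, where the parser drives the loop and calls back into application code.

```csharp
using System;
using System.IO;
using System.Xml;

public class PushOverPullSketch {
    // Hypothetical push-style driver: walks the stream and "pushes" each
    // start tag to the registered callback.
    static void Parse(XmlReader reader, Action<string> onStartElement) {
        while (reader.Read()) {
            if (reader.NodeType == XmlNodeType.Element) {
                onStartElement(reader.Name);
            }
        }
    }

    public static void Main() {
        string xml = "<items><compact-disc><price>16.95</price>" +
                     "<artist>Example Artist</artist><title>Example Title</title>" +
                     "</compact-disc></items>";

        using (XmlReader reader = XmlReader.Create(new StringReader(xml))) {
            // The application only supplies a handler; it never calls Read() itself.
            Parse(reader, name => Console.WriteLine("start: " + name));
        }
    }
}
```

With a pull parser the application decides when to advance and can stop early, as the break statement in the XmlReader example does; with a push parser the handler is invoked for every node the parser encounters.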