Best Practices for Representing XML in the .NET Framework
Updated on: 8/25/2004
Dare Obasanjo
Microsoft Corporation
Summary: Dare Obasanjo looks at the options for representing XML-based data shared between components within a single process and AppDomain, and discusses the design tradeoffs of each approach.
Introduction
After a recent design review, a fellow program manager asked whether there were any design guidelines for exposing XML in an API, since he had seen a number of different approaches and was not sure which one to choose. I told him I believed I could find some guidelines on MSDN, but all I found was an episode of MSDN TV entitled Passing XML Data Inside the CLR, which contains this information but not in an easy-to-consume form. Thus was born the idea of providing a printer-friendly version of Don Box's MSDN TV episode, along with some of my own experience working on the XML APIs at Microsoft.
A Quick Look at the Guidelines
There are three main situations in which developers need to consider which APIs to use to represent XML. These situations and the corresponding guidelines are briefly described below:
• Fields or properties of a class contain XML: If a field or property of a class is an XML document or fragment, the class should provide mechanisms for manipulating the property both as a string and as an XmlReader.
• Methods accept XML input or return XML as output: Methods that accept or return XML should favor XmlReader or XPathNavigator, unless the user is expected to be able to edit the XML data, in which case XmlDocument should be used.
• Converting an object to XML: If an object wants to provide an XML representation of itself for serialization purposes, it should use XmlWriter when it needs more control over the XML serialization process than XmlSerializer provides. If the object wants to provide an XML representation of itself so that it can participate fully as a member of the XML world (for example, to allow XPath queries or XSLT transformations over the object), it should implement the IXPathNavigable interface.
In the following sections, I describe each of these situations in detail and explain how I arrived at the guidelines.
The Usual Suspects
When you decide on a type for a method or property that accepts or returns XML, a number of classes in the .NET Framework come to mind as suited to the task. The following list covers the five classes best suited to representing XML input or output in the .NET Framework, with a brief description of their pros and cons.
1. System.Xml.XmlReader: XmlReader is the .NET Framework's pull-model XML parser. In the pull model, the consumer of the XML controls program flow by requesting events from the XML producer as needed. Pull-model XML parsers such as XmlReader operate in a forward-only, streaming fashion and expose information about only a single node at any given time. The fact that XmlReader does not require the entire XML document to be loaded into memory, and that it is read-only, makes it a good choice for building XML facades over non-XML data sources. XmlCsvReader is an example of such an XML facade over a non-XML data source. Some may see the forward-only nature of XmlReader as a limitation, because it prevents a consumer from making multiple passes over parts of the XML document.
2. System.Xml.XPath.XPathNavigator: XPathNavigator is a read-only cursor over XML data sources. An XML cursor acts like a lens that focuses on one XML node at a time but, unlike pull-based APIs such as XmlReader, the cursor can be positioned anywhere in the XML document at any given time. In a sense, a pull-model API is a forward-only version of the cursor model. XPathNavigator is also a good candidate for building XML facades over non-XML data, because it lets you construct an XML view of the data source on the fly instead of converting the entire data source into an XML tree. ObjectXPathNavigator is an example of using XPathNavigator to create an XML view of non-XML data. The fact that XPathNavigator is read-only and lacks some user-friendly properties (such as InnerXml and OuterXml) makes XmlDocument or XmlNode a better choice when you need that functionality.
3. System.Xml.XmlWriter: XmlWriter provides a generic mechanism for pushing an XML document into an underlying store. The store can be anything from a file (if you use XmlTextWriter) to an XmlDocument (if you use XmlNodeWriter). Using an XmlWriter as a parameter of methods that return XML is a robust way of supporting a variety of possible return types, including file streams, strings, and XmlDocument instances. This is why the XmlSerializer.Serialize() method accepts an XmlWriter as a parameter in one of its overloads. As the name implies, XmlWriter is useful only for writing XML and cannot be used to read or process XML.
4. System.Xml.XmlDocument/XmlNode: XmlDocument is an implementation of the W3C Document Object Model (DOM). The DOM is an in-memory representation of an XML document made up of a hierarchical tree of XmlNode objects. These objects represent the logical components of the XML document, such as elements, attributes, and text nodes. The DOM is the most popular API for manipulating XML in the .NET Framework because it provides a straightforward way to load, process, and save XML documents. The main drawback of the DOM is that it requires the entire XML document to be loaded into memory.
5. System.String: XML is a text-based format, and the String class is well suited to representing text. The main advantage of using strings to represent XML is that strings are the lowest common denominator. Strings are easy to write to log files or print to the console. If you want to process the XML with the XML APIs, you can load the string into an XmlDocument or XPathDocument. There are several problems with using strings as the primary input or output of methods or properties that work with XML. First, as with the DOM, the entire XML document must be loaded into memory. Second, representing XML as a string puts the burden of generating the XML string on the producer, which can be cumbersome in some cases, for example when the XML is obtained from an XmlReader or XPathNavigator. Third, representing XML as a string can lead to confusion about character encodings, because strings in the .NET Framework are always UTF-16 encoded regardless of what encoding declaration is placed in the XML. Finally, representing XML as strings makes it difficult to build an XML processing pipeline, because each layer of the pipeline has to parse the document again.
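The encoding point in item 5 can be illustrated with a short sketch. It assumes an XmlTextWriter writing into a StringWriter; the names EncodingDemo and WriteToString are illustrative and do not come from the article.

```csharp
using System;
using System.IO;
using System.Xml;

public class EncodingDemo
{
    // Writes a tiny XML document into an in-memory string.
    public static string WriteToString()
    {
        StringWriter buffer = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(buffer);
        writer.WriteStartDocument();
        writer.WriteElementString("greeting", "hello");
        writer.WriteEndDocument();
        writer.Close();
        return buffer.ToString();
    }

    public static void Main()
    {
        // The XML declaration reports the encoding of the StringWriter,
        // which is UTF-16 because .NET strings are UTF-16 code units.
        Console.WriteLine(WriteToString());
    }
}
```

If this string is later saved to disk in a different encoding without updating the declaration, the declaration and the actual bytes disagree, which is the kind of confusion the item describes.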
Fields or Properties of a Class Contain XML
In some cases, a field or property of an object may be an XML document or XML fragment. The following sample class represents an e-mail message whose content is XHTML. The XML content of the message is exposed as a string through the Body property:
    using System;
    using System.Xml;

    public class Email {

        private string from;
        public string From {
            get { return from; }
            set { from = value; }
        }

        private string to;
        public string To {
            get { return to; }
            set { to = value; }
        }

        private string subject;
        public string Subject {
            get { return subject; }
            set { subject = value; }
        }

        private DateTime sent;
        public DateTime Sent {
            get { return sent; }
            set { sent = value; }
        }

        // The body is stored as a parsed document but exposed as a string.
        private XmlDocument body = new XmlDocument();
        public string Body {
            get { return body.OuterXml; }
            set { body.Load(new System.IO.StringReader(value)); }
        }
    }
A string is the most user-friendly representation for a field or property that holds an XML document, because System.String is the most familiar of the XML "usual suspects". However, it places a burden on users of the class, who may now have to pay the cost of parsing the XML document twice. For example, suppose the property is set from XML obtained from the SqlCommand.ExecuteXmlReader() method or the XslTransform.Transform() method. In this case the document has to be parsed twice, as shown in the following example:
    Email email = new Email();
    email.From = "dareo@example.com";
    email.To = "michealb@example.org";
    email.Subject = "Hello World";

    XslTransform transform = new XslTransform();
    transform.Load("format-body.xsl");

    XmlDocument body = new XmlDocument();
    // 1. The XML is parsed by XmlDocument.Load()
    body.Load(transform.Transform(new XPathDocument("body.xml"), null));
    // 2. The same XML is parsed again by XmlDocument.Load() inside Email.Body
    email.Body = body.OuterXml;
In the preceding example, the same XML is parsed twice: it must first be loaded from the XmlReader into an XmlDocument and converted to a string so that the Body property of the Email class can be set, and the property then parses that string into its internal XmlDocument. A good rule of thumb is to provide access to XML properties both as an XmlReader and as a string. The following methods should therefore be added to the Email class to give users of the class maximum flexibility:
    public void SetBody(XmlReader reader) {
        body.Load(reader);
    }

    public XmlReader GetBody() {
        return new XmlNodeReader(body);
    }
This gives users of the Email class an efficient way to pass in, set, and retrieve the XML data in the Body property when they need to.
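As a minimal sketch of how a caller might use these methods, the following program sets the body straight from a reader so that the XML is parsed only once. The Email class is trimmed to its body members, and the XmlTextReader over a string is only a stand-in for a reader obtained from ExecuteXmlReader() or Transform(); EmailBodyDemo and RoundTrip are illustrative names.

```csharp
using System;
using System.IO;
using System.Xml;

public class Email
{
    private XmlDocument body = new XmlDocument();

    // String access: convenient, but every set parses the text again.
    public string Body
    {
        get { return body.OuterXml; }
        set { body.Load(new StringReader(value)); }
    }

    // XmlReader access: the document is parsed once, straight from the reader.
    public void SetBody(XmlReader reader) { body.Load(reader); }
    public XmlReader GetBody() { return new XmlNodeReader(body); }
}

public class EmailBodyDemo
{
    // Sets the body from a reader and returns the name of its root element.
    public static string RoundTrip(string xhtml)
    {
        Email email = new Email();

        // Stand-in for a reader produced by another component.
        XmlReader source = new XmlTextReader(new StringReader(xhtml));
        email.SetBody(source); // single parse, no intermediate string

        // Read the body back without converting it to a string.
        XmlReader bodyReader = email.GetBody();
        bodyReader.MoveToContent();
        return bodyReader.LocalName;
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip("<p>Hello <b>World</b></p>"));
    }
}
```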
Guideline: If a field or property of a class is an XML document or fragment, the class should provide mechanisms for manipulating the property both as a string and as an XmlReader.
The astute reader may notice that exposing the XmlDocument directly as a property would also satisfy this guideline, and would additionally allow users of the property to make fine-grained modifications to the XML.
Methods That Accept XML Input or Return XML as Output
When designing methods that produce or consume XML, developers have a responsibility to make those methods flexible in the input they accept. Methods that accept XML as input can be divided into those that need to modify the data in place and those that need only read-only access to the XML. The only XML "usual suspect" that supports read/write access is XmlDocument. The following code sample shows such a method:
    public void ApplyDiscount(XmlDocument priceList) {
        // Reduce every price in the document by 15 percent, in place.
        foreach (XmlElement price in priceList.SelectNodes("//price")) {
            price.InnerText = (Double.Parse(price.InnerText) * 0.85).ToString();
        }
    }
For methods that need read-only access to XML, there are two main options:
• XmlReader
• XPathNavigator
XmlReader provides forward-only access to the XML, whereas XPathNavigator provides not only random access to the underlying XML source but also the ability to run XPath queries over the data source. The following code samples print the artist and title of the first compact disc in this XML document:
    <items>
      <compact-disc>
        <price>16.95</price>
        <artist>Nelly</artist>
        <title>Nellyville</title>
      </compact-disc>
      <compact-disc>
        <price>17.55</price>
        <artist>Baby D</artist>
        <title>Lil Chopper Toy</title>
      </compact-disc>
    </items>
XmlReader:
    public static void PrintArtistAndPrice(XmlReader reader) {
        string artist = null;
        string title = null;

        reader.MoveToContent(); // move from the root node to the document element (items)

        /* keep reading until we reach the first <artist> element */
        while (reader.Read()) {
            if ((reader.NodeType == XmlNodeType.Element) && reader.Name.Equals("artist")) {
                artist = reader.ReadElementString();
                title = reader.ReadElementString();
                break;
            }
        }
        Console.WriteLine("Artist={0}, Title={1}", artist, title);
    }
XPathNavigator:
    public static void PrintArtistAndPrice(XPathNavigator nav) {
        // Select the artist and title of the first compact disc, in document order.
        XPathNodeIterator iterator =
            nav.Select("/items/compact-disc[1]/artist | /items/compact-disc[1]/title");

        iterator.MoveNext();
        Console.WriteLine("Artist={0}", iterator.Current);

        iterator.MoveNext();
        Console.WriteLine("Title={0}", iterator.Current);
    }
In general, similar rules apply to methods that return XML. If the recipient is expected to edit the XML, the method should return an XmlDocument. Otherwise, it should return an XmlReader or an XPathNavigator, depending on whether it needs to provide only forward-only, streaming access to the XML data.
Guideline: Methods that accept or return XML should favor XmlReader or XPathNavigator, unless the user is expected to be able to edit the XML data, in which case XmlDocument should be used.
The guideline above implies that methods that return XML should favor returning an XmlReader, because it suits more consumers than any of the other types. In addition, callers who need more functionality can load an XmlDocument or XPathDocument from the XmlReader.
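The following sketch shows what that looks like from the caller's side. GetCatalog is a hypothetical producer that returns an XmlReader; one caller loads it into an XmlDocument to edit the data, and another loads it into an XPathDocument to query it. The names CatalogDemo, EditedPrice, and FirstArtist are illustrative only.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public class CatalogDemo
{
    private const string CatalogXml =
        "<items><compact-disc><price>16.95</price>" +
        "<artist>Nelly</artist><title>Nellyville</title>" +
        "</compact-disc></items>";

    // Hypothetical producer: hands out forward-only, read-only access.
    public static XmlReader GetCatalog()
    {
        return new XmlTextReader(new StringReader(CatalogXml));
    }

    // A caller that needs to edit loads the reader into an XmlDocument.
    public static string EditedPrice()
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(GetCatalog());
        XmlNode price = doc.SelectSingleNode("/items/compact-disc/price");
        price.InnerText = "14.41";
        return price.InnerText;
    }

    // A caller that needs XPath queries loads it into an XPathDocument.
    public static string FirstArtist()
    {
        XPathDocument doc = new XPathDocument(GetCatalog());
        XPathNavigator nav = doc.CreateNavigator();
        XPathNodeIterator iterator = nav.Select("/items/compact-disc[1]/artist");
        return iterator.MoveNext() ? iterator.Current.Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(EditedPrice());
        Console.WriteLine(FirstArtist());
    }
}
```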
Converting an Object to XML
The ubiquity of XML as the lingua franca of information interchange makes it an obvious choice for some objects to provide an XML representation of themselves, either for serialization purposes or to gain access to other XML technologies such as querying with XPath or transforming with XSLT.
When an object is converted to XML for serialization purposes, the obvious choice is to use the XML serialization technology in the .NET Framework. In some cases, however, you may need more control over the generated XML than XmlSerializer provides. XmlWriter is a very useful class in such cases, because it frees you from needing a one-to-one mapping between the structure of the class and the generated XML. The following example uses an XmlWriter to serialize the Email class described in the previous sections.
    public void Save(XmlWriter writer) {

        writer.WriteStartDocument();
        writer.WriteStartElement("email");
        writer.WriteStartElement("headers");

        writer.WriteStartElement("header");
        writer.WriteElementString("name", "to");
        writer.WriteElementString("value", this.To);
        writer.WriteEndElement(); // header

        writer.WriteStartElement("header");
        writer.WriteElementString("name", "from");
        writer.WriteElementString("value", this.From);
        writer.WriteEndElement(); // header

        writer.WriteStartElement("header");
        writer.WriteElementString("name", "subject");
        writer.WriteElementString("value", this.Subject);
        writer.WriteEndElement(); // header

        writer.WriteStartElement("header");
        writer.WriteElementString("name", "sent");
        writer.WriteElementString("value", XmlConvert.ToString(this.Sent));
        writer.WriteEndElement(); // header

        writer.WriteEndElement(); // headers

        writer.WriteStartElement("body");
        writer.WriteRaw(this.Body);
        writer.WriteEndDocument(); // closes all open tags
    }
This code generates the following XML document:
<email>
It is impossible to generate the above XML document using only XmlSerializer. Another advantage of XmlWriter is that it abstracts away the underlying target to which the data is written, so the target can be anything from a file on disk to a string in memory, or even an XmlDocument (by way of XmlNodeWriter).
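A minimal sketch of that target independence follows. The Note class is a hypothetical, cut-down stand-in for Email; its single Save(XmlWriter) method is used unchanged to write into an in-memory string and to the console.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical class used only for this illustration.
public class Note
{
    public string Subject = "Hello World";

    // The method knows nothing about where the XML ends up.
    public void Save(XmlWriter writer)
    {
        writer.WriteStartElement("note");
        writer.WriteElementString("subject", Subject);
        writer.WriteEndElement();
    }
}

public class WriterTargetDemo
{
    // Target 1: a string in memory.
    public static string SaveToString(Note note)
    {
        StringWriter buffer = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(buffer);
        note.Save(writer);
        writer.Close();
        return buffer.ToString();
    }

    public static void Main()
    {
        Note note = new Note();
        Console.WriteLine(SaveToString(note));

        // Target 2: the console, indented for readability.
        XmlTextWriter consoleWriter = new XmlTextWriter(Console.Out);
        consoleWriter.Formatting = Formatting.Indented;
        note.Save(consoleWriter);
        consoleWriter.Flush(); // flush only, so Console.Out stays open
    }
}
```

Writing to a file needs no change to Note either: the caller would simply construct the XmlTextWriter over a file name and encoding instead.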
If the goal is to let a class participate more fully in the XML world, for example by interacting with XML technologies such as XPath or XSLT, the best option is for the class to implement the IXPathNavigable interface and provide an XPathNavigator over itself. ObjectXPathNavigator is an example of this: it provides an XML view over arbitrary objects, which lets you run XPath queries or XSLT transformations over those objects.
Guideline: If an object wants to provide an XML representation of itself for serialization purposes, it should use XmlWriter when it needs more control over the XML serialization process than XmlSerializer provides. If the object wants to provide an XML representation of itself so that it can participate fully as a member of the XML world (for example, to allow XPath queries or XSLT transformations over the object), it should implement the IXPathNavigable interface.
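The second half of the guideline can be sketched as follows. This is a simplified illustration, not the ObjectXPathNavigator approach: for brevity it builds a small XmlDocument inside CreateNavigator() and returns that document's navigator, whereas a custom XPathNavigator subclass would avoid materializing a tree. The cut-down Email class, its field values, and the element names are assumptions made for the example.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

// Cut-down Email class that exposes itself to XPath and XSLT consumers.
public class Email : IXPathNavigable
{
    public string From = "dareo@example.com";
    public string Subject = "Hello World";

    // Builds an XML view of the object on demand (assumed element names).
    public XPathNavigator CreateNavigator()
    {
        XmlDocument doc = new XmlDocument();
        XmlElement root = doc.CreateElement("email");
        doc.AppendChild(root);
        AppendChild(root, "from", From);
        AppendChild(root, "subject", Subject);
        return doc.CreateNavigator();
    }

    private static void AppendChild(XmlElement parent, string name, string text)
    {
        XmlElement child = parent.OwnerDocument.CreateElement(name);
        child.InnerText = text;
        parent.AppendChild(child);
    }
}

public class NavigableDemo
{
    // Works with any IXPathNavigable, not just Email.
    public static string QuerySubject(IXPathNavigable source)
    {
        XPathNavigator nav = source.CreateNavigator();
        XPathNodeIterator iterator = nav.Select("/email/subject");
        return iterator.MoveNext() ? iterator.Current.Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(QuerySubject(new Email()));
    }
}
```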
Conclusion
In future versions of the .NET Framework, more emphasis will be placed on cursor-based XML APIs such as the XPathNavigator exposed by the IXPathNavigable interface. Such cursors will become the primary mechanism for interacting with XML in the .NET Framework.
Dare Obasanjo is a member of Microsoft's WebData team, which among other things develops the components within the System.Xml and System.Data namespaces of the .NET Framework, Microsoft XML Core Services (MSXML), and Microsoft Data Access Components (MDAC).