How do I handle whitespace characters in the XML object model?
Sometimes the XML object model shows TEXT nodes that contain only whitespace characters. Because whitespace is otherwise usually trimmed, this can cause confusion. For example, take a document whose DOCTYPE declares a person element with lastname and firstname children, holding the values Smith and John.
The following tree is generated:

    Processing Instruction: xml
    Doctype: person
    Element: person
        TEXT:
        Element: lastname
        TEXT:
        Element: firstname
        TEXT:
The lastname and firstname elements are surrounded by TEXT nodes that contain only whitespace characters, because the content model of the "person" element is MIXED: it contains the #PCDATA keyword. A MIXED content model specifies that there can be text between the elements. Therefore, the following is also valid:

    <person>
    My last name is <lastname>Smith</lastname> and my first name is
    <firstname>John</firstname>
    </person>
The result is similar to the following tree:
    Element: person
        TEXT: My last name is
        Element: lastname
        TEXT: and my first name is
        Element: firstname
        TEXT:
Without the whitespace after the word "is", and before and after the word "and", the sentence could not be understood. Therefore, for a MIXED content model, the combination of text, whitespace characters, and elements is all significant. This is not the case for a non-MIXED content model.
To make the TEXT nodes that contain only whitespace characters disappear, delete the #PCDATA keyword from the person element declaration. The result is the following clean tree:

    Processing Instruction: xml
    Doctype: person
    Element: person
        Element: lastname
        Element: firstname
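The effect of the content model can be checked from code. The following is a minimal sketch that uses the standard JAXP DOM parser rather than MSXML, so it is an analogue and not the FAQ's own method. The DTD text is a hypothetical reconstruction of the person example, and the class and method names are invented for illustration. Note that JAXP only drops element-content whitespace when validation and `setIgnoringElementContentWhitespace(true)` are both switched on.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class WhitespaceNodes {
    // Assumed reconstruction: person declared with a MIXED content model.
    public static final String MIXED =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE person [\n"
        + "  <!ELEMENT person (#PCDATA|lastname|firstname)*>\n"
        + "  <!ELEMENT lastname (#PCDATA)>\n"
        + "  <!ELEMENT firstname (#PCDATA)>\n"
        + "]>\n"
        + "<person>\n"
        + "  <lastname>Smith</lastname>\n"
        + "  <firstname>John</firstname>\n"
        + "</person>";

    // Same document with #PCDATA removed from the person declaration.
    public static final String ELEMENT_ONLY =
        MIXED.replace("(#PCDATA|lastname|firstname)*", "(lastname,firstname)");

    /** Counts the whitespace-only TEXT children of the document element. */
    public static int countWhitespaceOnlyChildren(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);                        // the parser needs the content model
            factory.setIgnoringElementContentWhitespace(true);  // drop whitespace in element-only content
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler());      // silent on warnings, throws on fatal errors
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            int count = 0;
            for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.TEXT_NODE && n.getNodeValue().trim().isEmpty()) {
                    count++;
                }
            }
            return count;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("MIXED:        " + countWhitespaceOnlyChildren(MIXED));
        System.out.println("element-only: " + countWhitespaceOnlyChildren(ELEMENT_ONLY));
    }
}
```

With the MIXED declaration the whitespace between the child elements survives as TEXT nodes; with the element-only declaration the parser is free to discard it.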
What does an XML declaration do?
The XML declaration must appear at the top of the XML document, for example:

    <?xml version="1.0" encoding="UTF-8"?>

It specifies the following items:

The document is an XML document. A MIME sniffer can use this to detect that a file is of type text/xml when the MIME type is missing or has not been specified.

The document conforms to the XML 1.0 specification. This will matter in the future, when there are other versions of XML.

The character encoding of the document. The encoding attribute is optional and the default is UTF-8.

Note: The XML declaration must be on the first line of the XML document. An XML file that places other content ahead of it generates the following parse error:

    Invalid XML declaration.
    Line 0000002:
    Location 0000007: ------^

Note: The XML declaration is optional. If you need a comment or processing instruction at the very top, leave the XML declaration out. The default encoding will then be UTF-8, however.

How do I print my XML document in a readable format?
When you build a document from scratch with the DOM and save it as an XML file, everything ends up on one line with no whitespace between the elements. This is the default behavior. The default XSL style sheet built into Internet Explorer 5 displays and prints XML documents in a readable format. For example, if you have IE5 installed, try viewing the Nospace.xml file. The browser should display a readable tree, with "-" markers on the two outer elements and the values Xyz and 12.56 inside, even though no whitespace characters have been inserted into the XML.

Printing readable XML gets tricky, especially when you have DTDs that define different types of content models. For example, you cannot insert whitespace inside a mixed content model (#PCDATA), because it may change the meaning of the content. Consider XML in which the word Elephant is split across two adjacent elements, with "E" in the first and "lephant" in the second. It should not be output as "E lephant", with whitespace between the two elements, because the word boundaries would no longer be correct. All of this makes automated printing a problem.
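Whitespace can be added for readability by inserting TEXT nodes through the DOM, as long as mixed content is left alone. The following is a minimal sketch using the standard org.w3c.dom interfaces. It assumes the caller already knows, for example from the DTD, which elements have element-only content; the class name, the `elementOnly` set, and the sample element names (order, item, word, b, i) are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomIndenter {
    /** Parses an XML string into a DOM document. */
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    private static String newlineAndIndent(int depth) {
        StringBuilder sb = new StringBuilder("\n");
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        return sb.toString();
    }

    /**
     * Inserts whitespace TEXT nodes around the children of e, but only when
     * e is listed as having element-only content. Anything else (mixed or
     * text content) is left untouched, including its descendants.
     */
    public static void indent(Element e, Set<String> elementOnly, int depth) {
        if (!elementOnly.contains(e.getTagName()) || !e.hasChildNodes()) {
            return;
        }
        Document doc = e.getOwnerDocument();
        // Snapshot the children first, because the loop below modifies the child list.
        List<Node> children = new ArrayList<>();
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            children.add(n);
        }
        for (Node child : children) {
            e.insertBefore(doc.createTextNode(newlineAndIndent(depth + 1)), child);
            if (child instanceof Element) {
                indent((Element) child, elementOnly, depth + 1);
            }
        }
        e.appendChild(doc.createTextNode(newlineAndIndent(depth)));
    }

    public static void main(String[] args) {
        Document doc = parse("<order><item><name>Xyz</name><price>12.56</price></item>"
                + "<word><b>E</b><i>lephant</i></word></order>");
        Set<String> elementOnly = new HashSet<>(Arrays.asList("order", "item"));
        indent(doc.getDocumentElement(), elementOnly, 0);
        // The text content now contains line breaks around the values,
        // while "Elephant" is still a single unbroken word.
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Passing the element-only names explicitly is a deliberate choice: without content-model information, the two adjacent elements that spell "Elephant" look exactly like element-only content, so a purely structural heuristic would split the word.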
If you do need readable XML output, you can use the DOM to insert whitespace characters as text nodes in the appropriate places.

How do I use namespaces in a DTD?
To use a namespace in a DTD, declare it in the ATTLIST declaration of the element that uses it. The namespace attribute must be declared #FIXED. The same is true for namespaces on attributes.

Namespaces and XML schemas
DTDs and XML schemas cannot be mixed. For example, the following declaration:

    xmlns:x CDATA #FIXED "x-schema:myschema.xml"

does not cause the schema definitions in myschema.xml to be used. The use of DTDs and XML schemas is mutually exclusive.
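To illustrate a namespace declared through an ATTLIST, the following sketch parses a small document with the standard JAXP DOM parser (not MSXML) and reads back the declaration that the DTD supplies. The element name x:customer, the URI urn:example:customers, and the class name are invented for illustration; they do not come from the original example, which was lost.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DtdNamespaceDefault {
    // Hypothetical document: the instance itself carries no xmlns:x attribute;
    // the #FIXED default in the internal DTD subset supplies it.
    public static final String XML =
          "<!DOCTYPE x:customer [\n"
        + "  <!ELEMENT x:customer (#PCDATA)>\n"
        + "  <!ATTLIST x:customer xmlns:x CDATA #FIXED \"urn:example:customers\">\n"
        + "]>\n"
        + "<x:customer>Jane Smith</x:customer>";

    /** Returns the value of the xmlns:x attribute on the document element. */
    public static String declaredNamespace() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return doc.getDocumentElement().getAttribute("xmlns:x");
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(declaredNamespace());
    }
}
```

The sketch leaves the factory in its default, non-namespace-aware mode and only checks that the attribute default is applied; how a namespace-aware parser then binds the prefix is not shown here.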
How do I use XMLDSO in Visual Basic?
Take as an example an XML file that contains two records, each with a name and a phone number:

    Mark Hanson    206 765 4583
    Jane Smith     425 808 1111
You can bind to an ADO recordset as follows:
Create a new VB 6.0 project. Add references to Microsoft ActiveX Data Objects 2.1 or later, Microsoft Data Adapter Library, and Microsoft XML version 2.0.

Load the XML data into an XML DSO control with the following code:

    Dim dso As New XMLDSOControl
    Dim doc As IXMLDOMDocument
    Set doc = dso.XMLDocument
    doc.Load ("D:\test.xml")

Use the following code to map the DSO to a new Recordset object through a DataAdapter:

    Dim da As New DataAdapter
    Set da.Object = dso
    Dim rs As New ADODB.Recordset
    Set rs.DataSource = da

Access the data:

    MsgBox rs.Fields("name").Value
The result shows the string "Mark Hanson".

How do I use the XML DOM in Java?
The IE5 version of the MSXML DLL must already be installed. In Visual J++ 6.0, select Add COM Wrapper from the Project menu and select Microsoft XML 1.0 from the list of COM objects. This builds the required Java wrappers into a new package called "msxml". These prebuilt Java wrappers can also be downloaded. The classes can be used as follows:

    import com.ms.com.*;
    import msxml.*;

    public class Class1
    {
        public static void main(String[] args)
        {
            DOMDocument doc = new DOMDocument();
            doc.load(new Variant("file://d:/samples/ot.xml"));
            System.out.println("Loaded " + doc.getDocumentElement().getNodeName());
        }
    }
The code example loads the 3.8MB test file "ot.xml" from the Sun religion samples. The Variant class is used to wrap the Win32 VARIANT base type. Because a new wrapper is actually obtained each time a node is retrieved, pointer comparisons cannot be used on nodes. Therefore, do not use the following code:

    IXMLDOMNode root1 = doc.getDocumentElement();
    IXMLDOMNode root2 = doc.getDocumentElement();
    if (root1 == root2) ...

Instead, use the following code:

    if (ComLib.isEqualUnknown(root1, root2)) ...

The total size of the class wrappers is approximately 160KB. However, to be fully compliant you should use only the IXMLDOM* wrappers. The following classes are old IE 4.0 XML interfaces that can be removed from the msxml folder:

    IXMLAttribute*, IXMLDocument*, XMLDocument*,
    IXMLElement*, IXMLError*, IXMLElementCollection*,
    tagXMLEMEM_TYPE*, _xml_error*

This reduces the size to 147KB. You can also delete the following items:

    DOMFreeThreadedDocument   Accesses XML documents from multiple threads in a Java application.
    XMLHttpRequest            Communicates with the server using the XML DAV HTTP extensions.
    IXTLRuntime               Defines the XSL style sheet script object.
    XMLDSOControl             Binds to XML data in an HTML page.
    XMLDOMDocumentEvents      Returns callbacks during parsing.
This reduces the size to 116KB. To make it smaller still, consider that the DOM itself has two layers. The core layer includes:

    DOMDocument, IXMLDOMDocument
    IXMLDOMNode*
    IXMLDOMNodeList*
    IXMLDOMNamedNodeMap*
    IXMLDOMDocumentFragment*
    IXMLDOMImplementation
    IXMLDOMParseError

And the DTD information that you may need to keep:

    IXMLDOMDocumentType
    IXMLDOMEntity
    IXMLDOMNotation
Every node in an XML document is an IXMLDOMNode, which provides the full functionality, but there is also a higher-level wrapper for each node type. Therefore, if you modify the DOMDocument wrapper and change these specific types to use IXMLDOMNode, all of the following interfaces can be deleted:

    IXMLDOMAttribute
    IXMLDOMCDATASection
    IXMLDOMCharacterData
    IXMLDOMComment
    IXMLDOMElement
    IXMLDOMProcessingInstruction
    IXMLDOMEntityReference
    IXMLDOMText
Removing these reduces the size to 61KB. However, the getAttribute and setAttribute methods of IXMLDOMElement are useful. Without them you need to use:

    IXMLDOMNode.getAttributes().setNamedItem(...)
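The difference between the two styles can be seen with the standard org.w3c.dom interfaces, which mirror the element-level and node-level calls named above. This is a sketch under that assumption and not code for the J++ msxml wrappers; the class name, the person element, and the id attribute are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class AttributeAccess {
    private static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Element-level convenience methods: one call to set, one to get. */
    public static String viaElement() {
        Document doc = newDocument();
        Element person = doc.createElement("person");
        person.setAttribute("id", "42");
        return person.getAttribute("id");
    }

    /** Node-level access only: build the attribute node and go through the attribute map. */
    public static String viaNode() {
        Document doc = newDocument();
        Node person = doc.createElement("person");
        Attr id = doc.createAttribute("id");
        id.setValue("42");
        person.getAttributes().setNamedItem(id);
        return person.getAttributes().getNamedItem("id").getNodeValue();
    }

    public static void main(String[] args) {
        System.out.println(viaElement());
        System.out.println(viaNode());
    }
}
```

Both routes store the same attribute; the node-level route simply takes more steps, which is the trade-off to weigh against the size saved by dropping the element wrapper.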