Parse XML text using MSXML (1)

The XML DOM (Document Object Model) provides a standard way to work with the information stored in an XML document. Its application programming interface (API) acts as a bridge between application programs and XML documents. The DOM rests on two key abstractions: a tree hierarchy, and a set of nodes that represents the content and structure of the document. The tree hierarchy includes all nodes, and a node can itself contain other nodes. The advantage is that you can walk this hierarchy to find and modify the information in a specific node.
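
---- The tree-of-nodes idea can be sketched in portable C++ without MSXML. The Node type, its append helper, and findFirst below are hypothetical names made up for this illustration; they are not part of the MSXML API. The sketch only shows that a node can contain other nodes and that the hierarchy can be walked to find and change one of them.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a DOM node (hypothetical type, not MSXML's IXMLDOMNode).
struct Node {
    std::string name;                             // element name
    std::string text;                             // text content, if any
    std::vector<std::unique_ptr<Node>> children;  // a node can contain other nodes

    explicit Node(std::string n, std::string t = "")
        : name(std::move(n)), text(std::move(t)) {}

    // Append a child node and return a pointer to it.
    Node* append(std::string childName, std::string childText = "") {
        children.push_back(
            std::make_unique<Node>(std::move(childName), std::move(childText)));
        return children.back().get();
    }
};

// Depth-first search for the first node with the given name.
// Returns nullptr when no node matches.
Node* findFirst(Node& root, const std::string& name) {
    if (root.name == name) return &root;
    for (auto& child : root.children) {
        if (Node* hit = findFirst(*child, name)) return hit;
    }
    return nullptr;
}

// Build a small tree equivalent to:
// <book><title>XML</title><author>Ann</author></book>
Node makeSampleTree() {
    Node book("book");
    book.append("title", "XML");
    book.append("author", "Ann");
    return book;
}
```

Calling findFirst(tree, "author") on the sample tree returns the author node, whose text can then be read or modified in place.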

---- Microsoft's MSXML parser reads an XML document and parses its content into abstract information containers called nodes. These nodes represent the structure and content of the document and allow applications to read and manipulate the information in it without needing to know XML syntax. After a document is parsed, its nodes can be visited at any time and in any order.

---- For developers, the most important programming object is DOMDocument. Through the properties and methods it exposes, the DOMDocument object lets you browse, query, and modify the content and structure of an XML document.

---- This article introduces the structure and use of the DOM and, using VC as the programming language, provides an example of parsing XML through MSXML.

Structure and application of the DOMDocument object
---- Creating the document object
HRESULT hr;
IXMLDOMDocument* pXMLDoc = NULL;
IXMLDOMNode* pXDN = NULL;
// Initialize COM
hr = CoInitialize(NULL);
/* Get the IXMLDOMDocument
interface pointer pXMLDoc */
hr = CoCreateInstance(CLSID_DOMDocument, NULL,
    CLSCTX_INPROC_SERVER, IID_IXMLDOMDocument,
    (void**)&pXMLDoc);
// Obtain the IXMLDOMNode interface pointer pXDN.
hr = pXMLDoc->QueryInterface(IID_IXMLDOMNode,
    (void**)&pXDN);

---- When using the MSXML parser, you can build a document in memory by creating nodes with the document's createElement method, or you can load an existing XML document with the load or loadXML method. load reads the document from a specified URL, while loadXML parses XML text passed in as a string. Each method has two parameters: the first (xmlSource for load) supplies the document to be parsed, and the second, isSuccessful, reports whether the document was loaded successfully.

---- Saving the document object

---- The save method saves the document to a specified location. Its destination parameter indicates the kind of object to save to: a file, an ASP Response object, an XML document object, or a custom object that supports persistence. The following example code uses the save method:

BOOL DOMDocSaveLocation()
{
    BOOL bResult = FALSE;
    IXMLDOMDocument* pIXMLDOMDocument = NULL;
    HRESULT hr;
    try
    {
        _variant_t varString = _T("D:\\sample.xml");
        /* Code that creates a DOMDocument object
           and loads the XML document goes here */
        // Save the document to D:\sample.xml
        hr = pIXMLDOMDocument->save(varString);
        if (SUCCEEDED(hr))
            bResult = TRUE;
    }
    catch (...)
    {
        DisplayErrorToUser();
        /* Code that releases the IXMLDOMDocument
           interface reference goes here */
    }
    return bResult;
}

---- Setting parser flags

---- During parsing, we need to get and set the parser flags. With different flags, an XML document can be parsed in different ways. The XML standard allows a parser to be validating or non-validating, and allows the parser to skip the retrieval of external resources. You can also set a flag that indicates whether whitespace in the document should be preserved or stripped. The DOMDocument object exposes the following properties, which let you change the parser's behavior at run time.

async property: methods get_async and put_async.
validateOnParse property: methods get_validateOnParse and put_validateOnParse.
resolveExternals property: methods get_resolveExternals and put_resolveExternals.
preserveWhiteSpace property: methods get_preserveWhiteSpace and put_preserveWhiteSpace.
---- Each property accepts or returns a Boolean value. By default, async, validateOnParse, and resolveExternals are true. The value of preserveWhiteSpace depends on the settings in the XML document: if the xml:space attribute is not set in the document, the value is false.

---- The following information can be obtained while a document is being parsed:

doctype: the DTD used to define the document's format. If the XML document has no associated DTD, it returns NULL.
implementation: the implementation object for this document, which can be used to check which XML features and versions are supported.
parseError: the most recent error that occurred during parsing.
readyState: the loading status of the XML document. readyState matters for performance when Microsoft's XML parser is used asynchronously: when an XML file is loaded asynchronously, the program may need to check the parsing status. MSXML provides four states: loading, loaded, interactive, and completed.
url: the URL of the XML document being loaded and parsed. If the document was built in memory, this property returns a null value.
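
---- As a small illustration of the four readyState values, the helper below maps the numeric state (1 to 4) to a readable name. The enum and function names are made up for this sketch and are not MSXML declarations; in a real program the number would come from the document's get_readyState method.

```cpp
#include <cassert>
#include <string>

// The four readyState values reported while a document loads.
// Enumerator names are chosen for this sketch only.
enum class ReadyState : long {
    Loading = 1,      // load in progress, data not yet being parsed
    Loaded = 2,       // data is being read and parsed, object model not yet available
    Interactive = 3,  // part of the data is parsed and available read-only
    Completed = 4     // loading has finished, successfully or not
};

// Translate a raw readyState number into a readable name.
std::string readyStateName(long state) {
    switch (static_cast<ReadyState>(state)) {
        case ReadyState::Loading:     return "loading";
        case ReadyState::Loaded:      return "loaded";
        case ReadyState::Interactive: return "interactive";
        case ReadyState::Completed:   return "completed";
    }
    return "unknown";
}

// An asynchronous caller would typically wait until this returns true
// and then inspect parseError to see whether the load succeeded.
bool isLoadFinished(long state) {
    return static_cast<ReadyState>(state) == ReadyState::Completed;
}
```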
---- Node operations

---- Once the document tree has been obtained, we can operate on each node in it. Generally, two methods are used to obtain nodes in the tree: nodeFromID and getElementsByTagName.

---- nodeFromID takes two parameters. The first, idString, is the ID value to look for; the second, node, returns an interface pointer to the node that matches that ID. According to the XML specification, each ID value must be unique within an XML document, and an element can be associated with only one ID.
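
---- The uniqueness rule is what makes a lookup by ID unambiguous. The sketch below models it with a plain hash map; Element, IdIndex, add, and nodeFromId are hypothetical names for this illustration and say nothing about how MSXML implements nodeFromID internally.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Toy element that carries at most one ID (hypothetical type).
struct Element {
    std::string name;
    std::string id;
};

// Index from ID value to element. It refuses duplicate IDs,
// mirroring the rule that an ID must be unique within a document.
class IdIndex {
public:
    // Returns false, and registers nothing, if the element has no ID
    // or the ID is already taken.
    bool add(Element& e) {
        if (e.id.empty()) return false;
        return byId_.emplace(e.id, &e).second;
    }

    // Returns the element registered under the ID, or nullptr if none.
    Element* nodeFromId(const std::string& id) const {
        auto it = byId_.find(id);
        return it == byId_.end() ? nullptr : it->second;
    }

private:
    std::unordered_map<std::string, Element*> byId_;
};
```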

---- The getElementsByTagName method takes two parameters. The first, tagName, is the name of the elements to search for; if tagName is "*", all elements in the document are returned. The second, resultList, is a pointer to the IXMLDOMNodeList interface and returns the collection of all nodes that match tagName (the tag name).
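
---- The matching rule, including the "*" wildcard, can be sketched on a toy tree in portable C++. Elem, elementsByTagName, and makeLibrary are hypothetical names for this illustration; the real method works on COM interfaces and returns an IXMLDOMNodeList rather than a std::vector.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy element (hypothetical type, not an MSXML interface).
struct Elem {
    std::string tag;
    std::string text;
    std::vector<Elem> children;
};

// Collect matching descendants of `node` in document (pre-order) order.
void collectByTagName(const Elem& node, const std::string& tagName,
                      std::vector<const Elem*>& out) {
    for (const Elem& child : node.children) {
        if (tagName == "*" || child.tag == tagName) out.push_back(&child);
        collectByTagName(child, tagName, out);
    }
}

// Return every element whose tag equals tagName; "*" matches all elements.
std::vector<const Elem*> elementsByTagName(const Elem& root,
                                           const std::string& tagName) {
    std::vector<const Elem*> result;
    if (tagName == "*" || root.tag == tagName) result.push_back(&root);
    collectByTagName(root, tagName, result);
    return result;
}

// Sample data: a library with two books and two authors.
Elem makeLibrary() {
    return Elem{"library", "", {
        Elem{"book", "", {Elem{"author", "Ann", {}}, Elem{"title", "XML", {}}}},
        Elem{"book", "", {Elem{"author", "Bob", {}}}}
    }};
}
```

With the sample data, searching for "author" yields the two author elements in document order, and "*" yields all six elements.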

---- The following is part of the code of the relevant example program:

IXMLDOMDocument* pIXMLDOMDocument = NULL;
wstring strFindText(_T("author"));  // assumes a Unicode build
IXMLDOMNodeList* pIDOMNodeList = NULL;
IXMLDOMNode* pIDOMNode = NULL;
long value;
BSTR bstrItemText;
HRESULT hr;
try
{
    /* Code that creates a DOMDocument object
       and loads a specific document goes here */
    /* Obtain the set of all nodes whose tag name is "author" */
    hr = pIXMLDOMDocument->getElementsByTagName(
        (TCHAR*)strFindText.data(), &pIDOMNodeList);
    // Check whether the IXMLDOMNodeList pointer was obtained correctly
    SUCCEEDED(hr) ? 0 : throw hr;
    // Obtain the number of nodes in the list
    hr = pIDOMNodeList->get_length(&value);
    if (SUCCEEDED(hr))
    {
        pIDOMNodeList->reset();
        for (int ii = 0; ii < value; ii++)
        {
            // Obtain a specific node
            pIDOMNodeList->get_item(ii, &pIDOMNode);
            if (pIDOMNode)
            {
                // Obtain the text of the node and display it
                pIDOMNode->get_text(&bstrItemText);
                ::MessageBox(NULL, bstrItemText,
                    strFindText.data(), MB_OK);
                // The caller owns the BSTR returned by get_text
                SysFreeString(bstrItemText);
                pIDOMNode->Release();
                pIDOMNode = NULL;
            }
        }
    }
    pIDOMNodeList->Release();
    pIDOMNodeList = NULL;
}
catch (...)
{
    if (pIDOMNodeList)
        pIDOMNodeList->Release();
    if (pIDOMNode)
        pIDOMNode->Release();
    DisplayErrorToUser();
}

---- You can use createNode to create a new node. createNode takes four parameters: the first, type, is the type of node to create; the second, name, is the nodeName value of the new node; the third, namespaceURI, is the namespace associated with the node; and the fourth, node, returns the newly created node. In other words, a new node is created from the supplied type, name, and namespaceURI.

---- When a node is created, it is created within the scope of a namespace if one is provided. If no namespace is provided, it is created within the namespace scope of the document.

Parsing XML
---- To illustrate how to use the XML DOM in VC, here is a simple console application. The following main program locates a particular node in an XML document and inserts a new child node under it.
#include <atlbase.h>   // CComPtr, CComBSTR, CComVariant, USES_CONVERSION
/* The .h file below is the header installed with the
   latest XML Parser SDK */
#include "c:\Program Files\Microsoft XML Parser SDK\inc\msxml2.h"
#include <iostream>
int main()
{
// Initialize COM
CoInitialize(NULL);
/* The program assumes that the XML file to load is named
   xmldata.xml and is located in the same directory as the
   executable. The content of this file is as follows:

   <xmldata>
   <xmlnode />
   <xmltext>Hello, world!</xmltext>
   </xmldata>

   The program looks for a node named "xmlnode" and inserts a new
   node named "xmlchildnode" under it. It then looks for a node named
   "xmltext", extracts the text contained in that node, and displays
   it. Finally, it saves the modified XML document as
   "updatedxml.xml". */
try {
    // Use a smart pointer to create an instance of the parser
    CComPtr<IXMLDOMDocument> spXMLDOM;
    HRESULT hr = spXMLDOM.CoCreateInstance(__uuidof(DOMDocument));
    if (FAILED(hr)) throw "Cannot create the XML parser object";
    if (spXMLDOM.p == NULL) throw "Cannot create the XML parser object";

    // Load the XML file
    VARIANT_BOOL bSuccess = VARIANT_FALSE;
    hr = spXMLDOM->load(CComVariant(L"xmldata.xml"), &bSuccess);
    if (FAILED(hr)) throw "Cannot load the XML document into the parser";
    if (!bSuccess) throw "Cannot load the XML document into the parser";

    // Search for "xmldata/xmlnode"
    CComBSTR bstrSS(L"xmldata/xmlnode");
    CComPtr<IXMLDOMNode> spXMLNode;
    /* Use the selectSingleNode method of the IXMLDOMDocument
       interface to locate the node. */
    hr = spXMLDOM->selectSingleNode(bstrSS, &spXMLNode);
    if (FAILED(hr)) throw "Cannot locate 'xmlnode' in the XML document";
    if (spXMLNode.p == NULL) throw "Cannot locate 'xmlnode' in the XML document";

    /* The DOM object spXMLNode now refers to the XML node,
       so we can create a child node under it. */
    CComPtr<IXMLDOMNode> spXMLChildNode;
    /* Use the createNode method of the IXMLDOMDocument
       interface to create a new node. */
    hr = spXMLDOM->createNode(
        CComVariant(NODE_ELEMENT),
        CComBSTR("xmlchildnode"),
        NULL, &spXMLChildNode);
    if (FAILED(hr)) throw "Cannot create the 'xmlchildnode' node";
    if (spXMLChildNode.p == NULL) throw "Cannot create the 'xmlchildnode' node";

    // Append the new node to the spXMLNode node
    CComPtr<IXMLDOMNode> spInsertedNode;
    hr = spXMLNode->appendChild(spXMLChildNode, &spInsertedNode);
    if (FAILED(hr)) throw "Cannot append the 'xmlchildnode' node";
    if (spInsertedNode.p == NULL) throw "Cannot append the 'xmlchildnode' node";

    // Set an attribute on the new node
    CComQIPtr<IXMLDOMElement> spXMLChildElement;
    spXMLChildElement = spInsertedNode;
    if (spXMLChildElement.p == NULL)
        throw "Cannot query the IXMLDOMElement interface of 'xmlchildnode'";
    hr = spXMLChildElement->setAttribute(
        CComBSTR(L"xml"), CComVariant(L"fun"));
    if (FAILED(hr)) throw "Cannot insert the new attribute";
    /* The following section finds a node and
       displays the information it contains. */
    // Search for the "xmldata/xmltext" node
    // Release the previous node
    spXMLNode = NULL;
    bstrSS = L"xmldata/xmltext";
    hr = spXMLDOM->selectSingleNode(bstrSS, &spXMLNode);
    if (FAILED(hr)) throw "Cannot locate the 'xmltext' node";
    if (spXMLNode.p == NULL) throw "Cannot locate the 'xmltext' node";

    // Obtain the text contained in the node and display it
    CComVariant varValue;  // starts out as VT_EMPTY
    hr = spXMLNode->get_nodeTypedValue(&varValue);
    if (FAILED(hr)) throw "Cannot extract the 'xmltext' text";
    if (varValue.vt == VT_BSTR) {
        /* Display the result. Note that the string is
           converted from BSTR to ANSI here. */
        USES_CONVERSION;
        LPSTR lpstrMsg = W2A(varValue.bstrVal);
        std::cout << lpstrMsg << std::endl;
    } // if
    else {
        // An error occurred
        throw "Cannot extract the 'xmltext' text";
    } // else

    // Save the modified XML document under the specified name
    hr = spXMLDOM->save(CComVariant(L"updatedxml.xml"));
    if (FAILED(hr)) throw "Cannot save the modified XML document";
    std::cout << "Processing completed..." << std::endl;
} // try
catch (const char* lpstrErr) {
    // Error: string literals are thrown as const char*
    std::cout << lpstrErr << std::endl;
} // catch
catch (...) {
    // Unknown error
    std::cout << "Unknown error..." << std::endl;
} // catch
// End the use of COM
CoUninitialize();
}
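
---- The steps this program performs (locate a node by path, append a child node, set an attribute, read another node's text) can also be sketched on a toy tree in portable C++, which is handy for trying the logic on a machine without MSXML. XNode, selectSingleNode, updateSample, and makeSampleDocument are hypothetical helpers written for this illustration. The path lookup here handles only plain slash-separated names and is not the query language that MSXML's selectSingleNode supports.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Toy node with attributes and children (hypothetical type).
struct XNode {
    std::string name;
    std::string text;
    std::map<std::string, std::string> attributes;
    std::vector<std::unique_ptr<XNode>> children;

    explicit XNode(std::string n, std::string t = "")
        : name(std::move(n)), text(std::move(t)) {}

    // Take ownership of the child and return a pointer to the inserted node.
    XNode* appendChild(std::unique_ptr<XNode> child) {
        children.push_back(std::move(child));
        return children.back().get();
    }
};

// Resolve a simple path such as "xmldata/xmlnode". The first segment must
// name the root; each later segment selects the first child with that name.
XNode* selectSingleNode(XNode& root, const std::string& path) {
    std::istringstream segments(path);
    std::string segment;
    if (!std::getline(segments, segment, '/') || segment != root.name)
        return nullptr;
    XNode* current = &root;
    while (std::getline(segments, segment, '/')) {
        XNode* next = nullptr;
        for (auto& child : current->children) {
            if (child->name == segment) { next = child.get(); break; }
        }
        if (!next) return nullptr;
        current = next;
    }
    return current;
}

// Mirror the program's steps: insert <xmlchildnode xml="fun"/> under
// xmlnode, then return the text of xmltext (empty string on failure).
std::string updateSample(XNode& root) {
    XNode* node = selectSingleNode(root, "xmldata/xmlnode");
    if (!node) return "";
    XNode* inserted = node->appendChild(std::make_unique<XNode>("xmlchildnode"));
    inserted->attributes["xml"] = "fun";
    XNode* textNode = selectSingleNode(root, "xmldata/xmltext");
    return textNode ? textNode->text : "";
}

// In-memory equivalent of the sample input document.
std::unique_ptr<XNode> makeSampleDocument() {
    auto root = std::make_unique<XNode>("xmldata");
    root->appendChild(std::make_unique<XNode>("xmlnode"));
    root->appendChild(std::make_unique<XNode>("xmltext", "Hello, world!"));
    return root;
}
```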

---- Because XML documents have stricter syntax requirements than HTML, writing and using an XML parser is easier than writing an HTML parser. And because XML marks up not just how a document is displayed but, more importantly, the structure and characteristics of the information it contains, we can easily use an XML parser to obtain the information in specific nodes and then display or modify it.

Below is code from a program I recently developed that stores its data in XML format and reads it back.
Storage uses SAX 2.0, and reading uses the DOM.
void CSysDrawDoc::CreateXMLDoc(CFile* pFile)
{
    HRESULT hr;
    hr = CoInitialize(NULL);
    USES_CONVERSION;
    ISAXDTDHandlerPtr dtd = NULL;
    // pWr is the MSXML writer object; its declaration is not shown in the post
    hr = pWr.CreateInstance(__uuidof(MXXMLWriter40));
    dtd = pWr;
    pWr->indent = VARIANT_TRUE;
    pWr->omitXMLDeclaration = VARIANT_TRUE;

    ISAXContentHandlerPtr spContentHandler = NULL;
    spContentHandler = pWr;

    // Start the document by adding the XML declaration.
    hr = spContentHandler->startDocument();
    char szPI[] = "version='1.0' encoding='gb2312' standalone='no'"; // iso-8859-1 encoding='utf-16'

    // Step 1: Create the XML declaration.
    hr = spContentHandler->processingInstruction(L"xml", 3, A2W(szPI), strlen(szPI)); // XML version declaration
    // No DTD validation is added
    // ISAXLexicalHandlerPtr lexh;
    // lexh = pWr;
    // lexh->startDTD(L"svg", 3, L"-//W3C//DTD SVG 20001102//EN", 28, L"http://www.w3.org/TR/2000/CR-SVG-20001102.dtd", 45);
    // lexh->endDTD();
    // Step 2: Create the root element, svg.
    IMXAttributesPtr pMXAttr = NULL;
    ISAXAttributesPtr spAttributes = NULL;
    hr = pMXAttr.CreateInstance(__uuidof(SAXAttributes40));
    pMXAttr->addAttribute(L"", L"width", L"width", "float", "3200");
    pMXAttr->addAttribute(L"", L"height", L"height", "float", "2400");
    pMXAttr->addAttribute(L"", L"xmlns", L"xmlns", "string", "http://www.w3.org/2000/svg");
    spAttributes = pMXAttr;
    hr = spContentHandler->startElement(L"", 0, L"svg", 3, L"svg", 3, spAttributes);
    // Step 3: Add the child elements
    int num = 0;
    for (POSITION pos = m_objects.GetHeadPosition(); pos != NULL;)
    {
        CDrawObj* pObj = m_objects.GetNext(pos);
        // if ((!pObj->IsKindOf(RUNTIME_CLASS(CDrawE_Label))) && (!pObj->IsKindOf(RUNTIME_CLASS(CDrawE_Node))))
        {
            if (!pObj->m_bElemGroup)
            {
                pObj->WriteCont(pWr);
                // Copy the text produced so far into the file
                _variant_t vtTemp = pWr->output;
                CString str = vtTemp.bstrVal;
                LPTSTR st = str.LockBuffer();
                int len = str.GetLength();
                pFile->Write(st, len);
                str.UnlockBuffer();
                // Reset the writer's output buffer with an empty variant
                vtTemp.Clear();
                pWr->output = vtTemp;
            }
        }
    }
    // End the root element and the document
    hr = spContentHandler->endElement(L"", 0, L"svg", 3, L"svg", 3);
    hr = spContentHandler->endDocument();

    // Write out whatever remains in the writer's output buffer
    _variant_t vtTemp = pWr->output;
    CString str = "";
    str = vtTemp.bstrVal;
    LPTSTR st = str.LockBuffer();
    int len = str.GetLength();
    pFile->Write(st, len);
    str.UnlockBuffer();
    vtTemp.Clear();
    pWr->output = vtTemp;
}

void CSysDrawDoc::ReadDOMDoc(char* p)
{
    CSysDrawApp* pApp = (CSysDrawApp*)AfxGetApp();
    HRESULT hr;
    CoInitialize(NULL);
    // pXMLDoc is the DOM document object; its declaration is not shown in the post
    hr = pXMLDoc.CreateInstance(__uuidof(DOMDocument40));
    if (FAILED(hr))
        return;
    CString m_sys = p;
    _variant_t varOut((bool)true);
    _variant_t varXml((LPCTSTR)m_sys);
    varOut = pXMLDoc->load(varXml);
    if ((bool)varOut == false)
    {
        AfxMessageBox("The document format is incorrect and the XML document cannot be loaded");
        return;
    }
    // if ((bool)varOut == false) // error message
    // {
    //     IXMLDOMParseErrorPtr errPtr = pXMLDoc->GetparseError();
    //     _bstr_t bstrErr(errPtr->reason);
    //     _tprintf(_T("error:\n"));
    //     _tprintf(_T("code = 0x%x\n"), errPtr->errorCode);
    //     _tprintf(_T("source = line: %ld; char: %ld\n"), errPtr->line, errPtr->linepos);
    //     _tprintf(_T("error description = %s\n"), (char*)bstrErr);
    // }
    IXMLDOMNodePtr pNode = NULL;
    IXMLDOMNodeListPtr pNodeList = NULL;
    IXMLDOMElementPtr pDOMElement = NULL;
    IXMLDOMNamedNodeMapPtr pAttributeMap = NULL;
    IXMLDOMNodePtr pChildListNode = NULL;

    // IXMLDOMNodeList* pIXMLDOMNodeList = NULL;
    // IXMLDOMNamedNodeMapPtr pAttributeMap = NULL;
    IXMLDOMNodePtr pAttributeNode = NULL;
    hr = pXMLDoc->get_documentElement(&pDOMElement);
    if (pDOMElement == NULL)
    {
        AfxMessageBox("File parsing failed!");
        return;
    }
    pDOMElement->get_childNodes(&pNodeList);
    long num = pNodeList->length;
    for (int i = 0; i < num; i++)
    {
        pNodeList->get_item(i, &pNode);
        if (pNode)
        {
            // If the node is not null, parse the child element
            // (for example a connecting line or a transformer, "bianyaqi")
            _bstr_t str = pNode->nodeName; // name of the root's child element
            _bstr_t name = _T("name");
            BSTR bstrAttributeName = name;
            IXMLDOMNode* pIXMLDOMNode = NULL;
            VARIANT varValue;
            pNode->get_attributes(&pAttributeMap);
            long length = pAttributeMap->length;
            /* char group[10];
            bool bGroup = false;
            hr = pAttributeMap->get_item(1, &pIXMLDOMNode);
            if (pIXMLDOMNode)
            {
                _bstr_t s = pIXMLDOMNode->nodeName;
                pIXMLDOMNode->get_nodeValue(&varValue);
                strcpy(group, _bstr_t(varValue));
                if (pApp->StringToBool(group))
                    bGroup = true;
            } */
            hr = pAttributeMap->get_item(0, &pIXMLDOMNode); // returns the "name" attribute
            if (pIXMLDOMNode)
            {
                _bstr_t s = pIXMLDOMNode->nodeName;
                pIXMLDOMNode->get_nodeValue(&varValue);
                char strin[30];
                strcpy(strin, _bstr_t(varValue));
                cont_type step;
                int i = arrayicmp(strin, objcmd);
                // ...... (remainder omitted in the original post)
            }
        }
    }
}

Hope this helps the original poster.

 
