A Full Walkthrough of the MSXML DOM

Tags: xml, parser

Please credit the source when reprinting.

Author: Pony

 

The title should already tell you two things:
1. This article is not about XML in general, only about MSXML.
2. MSXML itself covers many areas, such as DOM, SOM, and XSLT. This article covers only DOM.

Now, let's begin.

 

1. About MSXML DOM


What is MSXML?

MSXML is Microsoft's XML parser. If you do not know COM, that description is enough. If you do, note that MSXML is actually a COM component, which you can think of as an independent functional module. The client programmer only needs to obtain the component object and call its interfaces; there is no need to care how the component is implemented internally. COM components can be implemented in several forms. MSXML is implemented as a DLL (for example, msxml4.dll), which is a very common form. COM is not the focus of this article, so I will not say much about it (in fact, I could not say much anyway). If you are interested, see Pan's book "COM Principles and Applications".

What is DOM?
The XML Document Object Model (DOM) defines standard methods and properties for accessing and manipulating XML documents. DOM treats an XML document as a tree in which every part of the document, down to the leaves, is a node. It is a W3C standard. For example, the standard defines node types such as Document, DocumentFragment, and ProcessingInstruction.
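To make the tree idea concrete, here is a minimal, portable sketch of a DOM-like tree. It is not the MSXML API: the names Node, NodeKind, count_nodes, find_first, and build_book_document are all hypothetical, and only three node kinds are modeled (attributes are left out). The sample tree mirrors the small book document used later in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node kinds: a small subset of what the DOM standard defines.
enum class NodeKind { Document, Element, Text };

// A toy tree node. Every part of the document is a node that owns its children.
struct Node {
    NodeKind kind = NodeKind::Document;
    std::string name;   // element name; empty for Document and Text nodes
    std::string value;  // text content; used by Text nodes only
    std::vector<std::unique_ptr<Node>> children;

    // Creates a child node, appends it, and returns a pointer to it.
    Node* append(NodeKind k, std::string n, std::string v = "") {
        auto child = std::make_unique<Node>();
        child->kind = k;
        child->name = std::move(n);
        child->value = std::move(v);
        children.push_back(std::move(child));
        return children.back().get();
    }
};

// Counts this node plus all of its descendants.
std::size_t count_nodes(const Node& node) {
    std::size_t total = 1;
    for (const auto& child : node.children) total += count_nodes(*child);
    return total;
}

// Depth-first search for the first element with the given name.
const Node* find_first(const Node& node, const std::string& name) {
    if (node.kind == NodeKind::Element && node.name == name) return &node;
    for (const auto& child : node.children) {
        if (const Node* hit = find_first(*child, name)) return hit;
    }
    return nullptr;
}

// Builds: Document -> book -> (name -> "Fly", price -> "23.5").
std::unique_ptr<Node> build_book_document() {
    auto doc = std::make_unique<Node>();
    Node* book = doc->append(NodeKind::Element, "book");
    book->append(NodeKind::Element, "name")->append(NodeKind::Text, "", "Fly");
    book->append(NodeKind::Element, "price")->append(NodeKind::Text, "", "23.5");
    assert(count_nodes(*doc) == 6u);
    return doc;
}
```

Calling find_first(*doc, "price") on the built tree returns the price element, whose single child is the text node holding "23.5". A real DOM implementation such as MSXML exposes the same shape through interfaces rather than a plain struct.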

What is MSXML DOM?
MSXML DOM is the set of APIs that Microsoft implemented according to the DOM standard. It is one part of MSXML. The figure below, from MSDN, illustrates this relationship:

Figure 1

2. Setting up the environment

All analysis and source code in this article were tested in the following environment: Windows XP, Visual Studio 2005, MSXML 4.0. The language is C/C++.

To set up MSXML 4.0, you can download the installation package and run it. The installer essentially does the following: it copies the three files msxml4.dll, msxml4a.dll, and msxml4r.dll into the System32 directory and registers with regsvr32. You can therefore also perform these steps by hand.

Configure the Project Environment
There are two ways to reference MSXML 4 in VS2005: static linking and dynamic linking.

Static Link
Step 1: #include <msxml2.h>
Step 2: Add msxml2.lib under Properties > Linker > Input.
Note that the VC8 installation directory already contains both msxml2.lib and msxml2.h, so you can include the header directly with angle brackets (<>).

Dynamic Link
Add the following two lines to the code.

#import <msxml4.dll> raw_interfaces_only
using namespace MSXML2;

 

raw_interfaces_only means that the raw interfaces are used rather than the smart-pointer wrappers. By default, #import also generates error-handling wrapper functions that are used through smart pointers. Add this attribute if you do not want them.

3. Basic MSXML DOM operations

This article covers only some basic operations, to help you understand MSXML DOM and get started. For details on other operations and properties, refer to MSDN.

MSXML DOM can be used in two ways: through the raw interfaces or through smart pointers. The latter wraps the former with smart pointers that handle COM object reference counting and dynamic memory management automatically; the two are the same in principle. All sample code in this section uses the raw interfaces and the dynamic link method (#import).

1. CoInitialize and CoUninitialize
These two APIs initialize and close the COM library, respectively. Every COM-based application must call CoInitialize before it starts using COM and CoUninitialize when it is finished. MSXML is no exception, as shown below:

CoInitialize(NULL);
...... // MSXML-related operations go here
CoUninitialize();
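One common way to keep such paired calls balanced in C++ is a small scope guard. The sketch below is portable and uses counters as stand-ins for the COM library; ScopedInit and run_balanced_demo are hypothetical names, not part of COM or MSXML. On Windows, the two callables would wrap CoInitialize(NULL) and CoUninitialize().

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper: runs `init` on construction and `uninit` on
// destruction, but only if `init` reported success.
class ScopedInit {
public:
    ScopedInit(const std::function<bool()>& init, std::function<void()> uninit)
        : uninit_(std::move(uninit)), ok_(init()) {}
    ~ScopedInit() {
        if (ok_) uninit_();
    }
    ScopedInit(const ScopedInit&) = delete;
    ScopedInit& operator=(const ScopedInit&) = delete;

    bool ok() const { return ok_; }

private:
    std::function<void()> uninit_;
    bool ok_;
};

// Demo with counters standing in for the library's init/uninit calls.
// Returns the number of init calls minus the number of uninit calls.
int run_balanced_demo() {
    int inits = 0;
    int uninits = 0;
    {
        ScopedInit guard([&] { ++inits; return true; }, [&] { ++uninits; });
        assert(guard.ok());
        // MSXML-related operations would go here.
    }  // the guard's destructor runs the uninit call here
    return inits - uninits;
}
```

The guard skips the uninit call when initialization failed, so each successful initialization is matched by exactly one uninitialization even on early returns.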

 

2. Create an object

MSXML2::IXMLDOMDocument *pxmldoc = NULL;
HRESULT hr;
hr = CoCreateInstance(__uuidof(DOMDocument40),
                      NULL,
                      CLSCTX_INPROC_SERVER,
                      __uuidof(MSXML2::IXMLDOMDocument),
                      (void**)&pxmldoc);
if (FAILED(hr)) {
    printf("Failed to CoCreate an instance of an XML DOM\n");
}

Description
You may wonder why the MSXML2:: prefix is needed. Without it, compilation fails with an error like "IXMLDOMDocument: ambiguous symbol". This is because VC8 already ships MSXML definitions: the header msxml.h already defines these types, so the definitions conflict. Adding the namespace prefix resolves the conflict.

IXMLDOMDocument is the document object that represents the entire XML document. For details, refer to MSDN.

 

CoCreateInstance is a COM API. All you need to know is that it returns an interface pointer to the document object. With that pointer, you can call the interface's member functions to perform various operations.

HRESULT is a data type used widely in COM, generally as a function return value. It is best not to test it with a plain hr == or hr != comparison. Use SUCCEEDED to test for success and FAILED to test for failure.
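The reason a plain comparison is risky can be sketched portably. The names in the sketch namespace below are hypothetical stand-ins so the code compiles without the Windows SDK; the values mirror S_OK (0), S_FALSE (1), and E_FAIL (0x80004005), and the two helper functions mirror the sign test that SUCCEEDED and FAILED perform.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Portable stand-in for HRESULT: a 32-bit signed integer.
using hresult = std::int32_t;

constexpr hresult kOk    = 0;                                  // mirrors S_OK
constexpr hresult kFalse = 1;                                  // mirrors S_FALSE
constexpr hresult kFail  = static_cast<hresult>(0x80004005u);  // mirrors E_FAIL

// Success codes are non-negative; failure codes have the high bit set.
constexpr bool succeeded(hresult hr) { return hr >= 0; }
constexpr bool failed(hresult hr) { return hr < 0; }

// Returns true when comparing against kOk would disagree with succeeded().
constexpr bool naive_check_disagrees(hresult hr) {
    return (hr == kOk) != succeeded(hr);
}

}  // namespace sketch
```

Here sketch::naive_check_disagrees(sketch::kFalse) is true: the code mirroring S_FALSE is a success code, yet it is not equal to the one mirroring S_OK, so an equality test would misclassify it.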

 

3. Load XML

VARIANT var;
VARIANT_BOOL status = VARIANT_FALSE;
VariantInit(&var);
V_BSTR(&var) = SysAllocString(L"test.xml");
V_VT(&var) = VT_BSTR;
hr = pxmldoc->load(var, &status);
VariantClear(&var);  // frees the BSTR allocated above
if (FAILED(hr) || status != VARIANT_TRUE) {
    printf("Failed to load xml\n");
    if (pxmldoc) { pxmldoc->Release(); pxmldoc = NULL; }
}

Description
The load function is defined as follows:
HRESULT load(
    VARIANT xmlSource,
    VARIANT_BOOL *isSuccessful);
Both VARIANT and VARIANT_BOOL are COM data types, defined so that data can be passed across platforms. Before using a VARIANT variable, initialize it with VariantInit, and release it with VariantClear when you are done with it. You may wonder why anything needs to be released. It is because this line allocates memory dynamically:
V_BSTR(&var) = SysAllocString(L"test.xml");

 

If loading fails, remember to release pxmldoc: CoCreateInstance has already called AddRef on the object, so you must call Release yourself (a little dizzying?).
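The AddRef/Release bookkeeping can be illustrated with a portable mock. This is only a sketch of the reference-counting idea, not real COM: RefCounted and create_object are hypothetical names, and create_object merely imitates the way a factory call hands back a pointer that already holds one reference.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object. Real COM objects implement IUnknown.
class RefCounted {
public:
    // `live` points to a counter of objects currently alive (for the demo).
    explicit RefCounted(int* live) : live_(live) { ++*live_; }

    unsigned long AddRef() { return ++refs_; }

    // Drops one reference and destroys the object when none remain.
    unsigned long Release() {
        assert(refs_ > 0);
        unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }

private:
    ~RefCounted() { --*live_; }  // private: destruction only through Release

    unsigned long refs_ = 0;
    int* live_;
};

// Imitates a factory: the returned pointer already holds one reference,
// so the caller owes exactly one Release.
RefCounted* create_object(int* live) {
    RefCounted* p = new RefCounted(live);
    p->AddRef();
    return p;
}
```

If the caller never calls Release on the pointer returned by create_object, the live counter never drops back to zero, which is the leak the text warns about.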

 

4. Read XML
Assume that the content of the XML document is as follows:

<?xml version="1.0" encoding="utf-8"?>
<book>
  <name>Fly</name>
  <price discount="80%">23.5</price>
</book>

Code:

MSXML2::IXMLDOMNode *pNode = NULL;
MSXML2::IXMLDOMElement *pElement = NULL;
BSTR bstr = NULL;

// Select the root node "book" and print its XML.
bstr = SysAllocString(L"book");
pxmldoc->selectSingleNode(bstr, &pNode);
SysFreeString(bstr);
bstr = NULL;
if (!pNode) {
    printf("Failed to selectSingleNode\n");
    if (pxmldoc) pxmldoc->Release();
    return;
}
pNode->get_xml(&bstr);
char *text = _com_util::ConvertBSTRToString(bstr);
printf("book.xml:\n%s\n", text);
delete[] text;  // ConvertBSTRToString allocates the string with new[]
SysFreeString(bstr);
pNode->Release();
pNode = NULL;

// Select "book/price" and read its "discount" attribute.
bstr = SysAllocString(L"book/price");
pxmldoc->selectSingleNode(bstr, &pNode);
SysFreeString(bstr);
bstr = SysAllocString(L"discount");  // kept for the write step below
pNode->QueryInterface(__uuidof(MSXML2::IXMLDOMElement), (void **)&pElement);
VariantClear(&var);  // release whatever var held before reusing it
pElement->getAttribute(bstr, &var);  // if successful, var holds "80%"

Description
The code above performs the following operations:
1. selectSingleNode gets the root node "book", and get_xml outputs the entire content of that node.
2. selectSingleNode gets the "book/price" node under the root, and getAttribute reads the value of "discount" on that node.

 

Note that getAttribute is a method of IXMLDOMElement and cannot be called directly on an IXMLDOMNode. IXMLDOMElement derives from IXMLDOMNode, so an IXMLDOMElement pointer can be converted to IXMLDOMNode, but casting an IXMLDOMNode pointer to IXMLDOMElement is not recommended. The usual way to obtain an IXMLDOMElement from an IXMLDOMNode is the following:

 

pNode->QueryInterface(__uuidof(MSXML2::IXMLDOMElement), (void **)&pElement);
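Why ask the object instead of casting? The portable mock below sketches the idea: the object itself decides which interfaces it supports and hands back a counted pointer, or refuses. Everything here is hypothetical (InterfaceId, INode, IElement, PriceElement, and the snake_case method names) and only imitates the pattern; real COM identifies interfaces by GUID and returns an HRESULT.

```cpp
#include <cassert>
#include <string>

// Hypothetical interface identifiers; real COM uses GUIDs.
enum class InterfaceId { Node, Element, Comment };

struct INode {
    virtual ~INode() = default;
    // On success, stores an interface pointer in *out and adds a reference.
    virtual bool query_interface(InterfaceId id, void** out) = 0;
    virtual unsigned long add_ref() = 0;
    virtual unsigned long release() = 0;
    virtual std::string node_name() const = 0;
};

struct IElement : INode {
    virtual std::string get_attribute(const std::string& name) const = 0;
};

// A mock of the <price discount="80%"> element from the sample document.
class PriceElement final : public IElement {
public:
    bool query_interface(InterfaceId id, void** out) override {
        assert(out != nullptr);
        *out = nullptr;
        if (id == InterfaceId::Node) {
            *out = static_cast<INode*>(this);
        } else if (id == InterfaceId::Element) {
            *out = static_cast<IElement*>(this);
        } else {
            return false;  // interface not supported by this object
        }
        add_ref();  // the caller now owns one more reference
        return true;
    }
    unsigned long add_ref() override { return ++refs_; }
    unsigned long release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }
    std::string node_name() const override { return "price"; }
    std::string get_attribute(const std::string& name) const override {
        return name == "discount" ? "80%" : "";
    }

private:
    unsigned long refs_ = 1;  // the creator owns the first reference
};
```

A caller holding only an INode pointer asks for InterfaceId::Element, checks the result, uses the returned IElement pointer, and releases it when done. A request for an unsupported interface fails cleanly instead of producing an invalid pointer, which is what a blind cast could do.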

 

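5. Write XML

VariantClear(&var);
V_BSTR(&var) = SysAllocString(L"75%");
V_VT(&var) = VT_BSTR;
hr = pElement->setAttribute(bstr, var);
VariantClear(&var);
V_BSTR(&var) = SysAllocString(L"book.xml");
V_VT(&var) = VT_BSTR;
hr = pxmldoc->save(var);

// Release everything acquired in the previous steps.
VariantClear(&var);
SysFreeString(bstr);
pElement->Release();
pNode->Release();
pxmldoc->Release();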

Description
This changes the value of "discount" to "75%". The key call is save: if you do not call it, your changes to the XML exist only in memory and are not written to the file.

 

 

4. Appendix
Finally, here are two questions that often come up online.

Q: In real-world use of MSXML DOM, is it better to use the raw interfaces or the smart pointer interfaces?

Ans:
I recommend the smart pointer interfaces. First, they spare you from calling AddRef and Release by hand, where a mistake can easily cause a memory leak. Second, smart pointer interfaces are much simpler for some basic operations, such as direct access to attributes.
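The first point of this answer can be sketched with a minimal, portable smart pointer. This is an illustration of the idea only, not the wrapper classes that #import generates: Counted and RefPtr are hypothetical names, and Counted stands in for a COM interface with AddRef and Release.

```cpp
#include <cassert>
#include <utility>

// Hypothetical ref-counted object standing in for a COM interface.
struct Counted {
    unsigned long refs = 1;  // the creator's reference
    int* live;               // counter of objects currently alive (for the demo)

    explicit Counted(int* l) : live(l) { ++*live; }
    unsigned long AddRef() { return ++refs; }
    unsigned long Release() {
        unsigned long left = --refs;
        if (left == 0) {
            --*live;
            delete this;
        }
        return left;
    }
};

// Minimal smart pointer: copies call AddRef, destruction calls Release.
template <typename T>
class RefPtr {
public:
    RefPtr() = default;

    // Takes over a reference the caller already owns (no extra AddRef).
    static RefPtr attach(T* raw) {
        RefPtr p;
        p.ptr_ = raw;
        return p;
    }

    RefPtr(const RefPtr& other) : ptr_(other.ptr_) {
        if (ptr_) ptr_->AddRef();
    }
    RefPtr(RefPtr&& other) noexcept : ptr_(std::exchange(other.ptr_, nullptr)) {}
    RefPtr& operator=(RefPtr other) noexcept {
        std::swap(ptr_, other.ptr_);
        return *this;
    }
    ~RefPtr() {
        if (ptr_) ptr_->Release();
    }

    T* operator->() const {
        assert(ptr_ != nullptr);
        return ptr_;
    }
    explicit operator bool() const { return ptr_ != nullptr; }

private:
    T* ptr_ = nullptr;
};
```

With a wrapper like this, every exit path from a scope releases the object automatically, so the manual Release calls in the raw-interface examples above have no equivalent to forget.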

 

Q: Do you need to be familiar with XML to use MSXML DOM?

Ans:
In my experience, it depends on what you are building. For example, if you are a C/C++ engineer who only needs MSXML for basic reading and writing of XML documents, a rough understanding of XML is enough: you only need to know what it is you want to manipulate in the XML. There is no need to read the W3C XML specification. In fact, there is a simple way to find out how much you should know: MSDN has an introductory XML section before its DOM material, and understanding the basics covered there is enough.
