Analysis of the SAX model based on event parsing

Source: Internet
Author: User

The SAX2 parser reads an XML document and generates events as it encounters markup such as element tags. It does not build a tree structure in memory for the document; it simply processes the document's contents and raises the corresponding events.

This resembles ordinary event-based programming, where you write functions that respond to user-generated events (such as onclick). The difference to keep in mind with SAX is that the parser, rather than the user, generates the events.

Consider, for example, the simple document below.

<Parts>
<part>Turbowidget</part>
</Parts>


When SAX2 is working on this document, it produces a series of events as follows:

startDocument()
startElement("Parts")
startElement("part")
characters("Turbowidget")
endElement("part")
endElement("Parts")
endDocument()

You can think of SAX2 as a push-model parser: SAX2 generates the events, and you handle them yourself. While parsing, SAXXMLReader reads the document and produces a series of events, and you choose which of them to process.

Creating an application framework that uses SAX

The events generated by SAX2 fall into the following categories:

• Events related to the XML document content (ISAXContentHandler)

• Events related to the DTD (ISAXDTDHandler)

• Events raised when an error occurs (ISAXErrorHandler)

To handle these events, you implement handler classes whose methods respond to them. You only need to write real logic for the events you care about; the rest can be ignored. In practice, you derive from these interfaces: in C++, you create a class and, in its methods, tell the application what to do when each event arrives. The basic steps for building a SAX-based application are as follows:

1. Create the header file. SAX2 is provided by the MSXML dynamic-link library (MSXML.DLL). To use the SAX2 interfaces that MSXML contains, include the following in the program's header file (typically stdafx.h). The DLL file name shown is an assumption; substitute the MSXML version (3.0 or later) installed on your system:

#import <msxml3.dll> raw_interfaces_only

using namespace MSXML2;

2. Create the handler classes. SAX2 defines three basic handler interfaces: ISAXContentHandler, ISAXDTDHandler, and ISAXErrorHandler.

ISAXContentHandler handles the events that the SAX2 parser produces while parsing the document content; an instance is registered with ISAXXMLReader through the putContentHandler method. ISAXDTDHandler handles basic DTD-related events and is registered through putDTDHandler. ISAXErrorHandler handles the error events raised when parsing fails and is registered through putErrorHandler.

All three handlers must be registered with ISAXXMLReader before they receive events. Because they are used in much the same way, only ISAXContentHandler is described in detail here.

The ISAXContentHandler interface receives events about the content of the document. It is the most important interface to implement in a SAX application that needs to be notified of basic parsing events. Once an instance is registered through putContentHandler, ISAXXMLReader uses it to report document events such as the start of an element, the end of an element, and the associated character data.

ISAXContentHandler declares many methods, including startDocument, endDocument, startElement, and endElement. They follow a common pattern: each startXXX/endXXX pair brackets one information item. For example, startDocument is called at the beginning of the document, and everything reported after startDocument is considered a child of the document information item until endDocument is called to mark the end of the document.

When the parser reaches a given position in the document, it calls the corresponding method; when a document starts, for instance, it calls startDocument. In our implementation class, derived from ISAXContentHandler, we override these methods to do the processing we want. These methods are, in effect, the hooks that ISAXContentHandler offers us. Note that events are delivered in the same order as the corresponding information appears in the document.

If the application needs to handle these events, it derives a class from the relevant handler interface. For example, if we only want to process the document content and ignore DTD and parse-error events, we only need one new class derived from ISAXContentHandler, where most of the event-handling methods are declared. Within that class, only the events we care about need real logic. Because the interface methods are pure virtual in C++, the events we ignore still need a trivial implementation, for example one that just returns S_OK.

For example, suppose we only care about the startElement and endElement events and name our class CXMLContentDeal. The class can be declared as follows:

class CXMLContentDeal : public ISAXContentHandler
{
public:
    CXMLContentDeal();
    virtual ~CXMLContentDeal();

    // Called when the parser reaches the start tag of an element.
    virtual HRESULT STDMETHODCALLTYPE startElement(
        /* [in] */ wchar_t __RPC_FAR *pwchNamespaceUri,
        /* [in] */ int cchNamespaceUri,
        /* [in] */ wchar_t __RPC_FAR *pwchLocalName,
        /* [in] */ int cchLocalName,
        /* [in] */ wchar_t __RPC_FAR *pwchRawName,
        /* [in] */ int cchRawName,
        /* [in] */ ISAXAttributes __RPC_FAR *pAttributes);

    // Called when the parser reaches the end tag of an element.
    virtual HRESULT STDMETHODCALLTYPE endElement(
        /* [in] */ wchar_t __RPC_FAR *pwchNamespaceUri,
        /* [in] */ int cchNamespaceUri,
        /* [in] */ wchar_t __RPC_FAR *pwchLocalName,
        /* [in] */ int cchLocalName,
        /* [in] */ wchar_t __RPC_FAR *pwchRawName,
        /* [in] */ int cchRawName);

    // The remaining ISAXContentHandler and IUnknown methods must also be
    // declared and implemented; they are left out of this listing.
};


We can then implement startElement and endElement to carry out whatever application-specific processing we need.

3. Create a parser through the ISAXXMLReader interface. XMLReader, which MSXML exposes as ISAXXMLReader, is the main interface of a SAX implementation and plays three roles. First, XML developers use it to register their implementations of the other SAX interfaces (ContentHandler, DTDHandler, ErrorHandler, and so on). Second, its setFeature and setProperty methods (putFeature and putProperty in MSXML) configure the behavior of the SAX parser. Finally, it encapsulates the parsing functionality itself. The sample code is as follows:

// COM must already be initialized on this thread (for example with CoInitialize).
ISAXXMLReader* pRdr = NULL;
HRESULT hr = CoCreateInstance(
    __uuidof(SAXXMLReader),
    NULL,
    CLSCTX_ALL,
    __uuidof(ISAXXMLReader),
    (void **)&pRdr);


4. Create the event handler object. Assume that we only handle events related to the document content. The sample code is as follows:

CXMLContentDeal* pMc = new CXMLContentDeal();

Note that CXMLContentDeal is the class derived from the ISAXContentHandler interface.

5. Register the event handler object with the parser. The sample code is as follows:

hr = pRdr->putContentHandler(pMc);

6. Start parsing the document. The sample code is as follows:

hr = pRdr->parseURL(URL); // URL is the location of the XML document to parse

7. Release the parser object:

pRdr->Release();

The steps above form the skeleton of a SAX-based application. The actual event handling lives in our derived class CXMLContentDeal: in the example code, startElement is called each time a new element begins in the document, and endElement is called each time an element ends. The application-specific code goes into startElement and endElement.


