Use AXIOM to Improve XML Processing


Introduction: Axis Object Model (AXIOM) is the XML object model of Apache Axis 2. Its goal is to provide a powerful combination of features that changes the way XML is processed. AXIOM goes beyond existing XML processing techniques by combining deferred construction with a fast, lightweight, customizable object model. In this article, Eran Chinthaka, a software architect and a founder of AXIOM, introduces this new approach to XML processing.

AXIOM is not just another object model. It has a clear design goal: to significantly improve the performance of Axis 2, Apache's next-generation SOAP stack. The result is AXIOM (also called OM), which stands out because it is lightweight and is built only when needed. Because it is lightweight, it puts as little pressure as possible on system resources, especially CPU and memory. At the same time, deferred construction allows one part of the tree to be used before the other parts have been built. AXIOM's deferred building capability comes from the underlying Streaming API for XML (StAX) parser. AXIOM provides all these features while keeping the complexity behind the scenes hidden from the user.

The results of the XML document model benchmark tests (see references) show that AXIOM's performance is on par with that of existing high-performance object models. However, its memory usage is better than that of most existing object models, which rely on SAX and/or DOM for input and output. AXIOM is therefore an ideal choice for XML processors such as Web service engines and for memory-constrained devices. It can be used for general XML processing, and it also has an optional layer optimized for SOAP.

Using AXIOM

In a typical SOAP engine, data may be represented in three different forms:

    • A serialized format, such as XML or binary XML.
    • An in-memory, tree-based object model, such as DOM.
    • Language-specific objects, such as plain old Java objects (POJOs).

Consider a Web service invocation, for example. The data to be passed to the service provider may be language-specific objects, which in Java are POJOs. The first step of the invocation is to put the information items in these objects into a SOAP envelope to construct a SOAP message. Because SOAP messages are XML documents, the Web service must convert the data items into the required XML format. To do so, an XML infoset object tree has to be built in memory using an object model (AXIOM).

Creating AXIOM from scratch

The first step in creating an in-memory object hierarchy is to create an object factory:

OMFactory factory = OMAbstractFactory.getOMFactory();

 

AXIOM allows many different object factory implementations, but the linked-list implementation is the most commonly used. Once the factory has been created, you can start building the tree.

For example, the following XML snippet:

Listing 1. Line item details

<po:line-item po:quantity="2" xmlns:po="http://openuri.org/easypo">
  <po:description>Burnham's Celestial Handbook, Vol 2</po:description>
  <po:price>19.89</po:price>
</po:line-item>

 

Note that all elements and attributes belong to the "http://openuri.org/easypo" namespace. The first step in building an AXIOM tree for this XML fragment is therefore to create the namespace, as shown below:

OMNamespace poNs = factory.createOMNamespace("http://openuri.org/easypo", "po");

 

Now you can construct the wrapper element, line-item:

OMElement lineItem = factory.createOMElement("line-item", poNs);

 

Next, create the child elements and attributes of the line-item element.

It is best to create element attributes in the following way:

lineItem.addAttribute("quantity", "2", poNs);

 

Create child elements like any other element, then add them to the parent element as follows:

OMElement description = factory.createOMElement("description", poNs);
description.setText("Burnham's Celestial Handbook, Vol 2");
lineItem.addChild(description);

 

Similarly, add the price child element:

OMElement price = factory.createOMElement("price", poNs);
price.setText("19.89");
lineItem.addChild(price);

 

Listing 2 shows the complete code fragment.

Listing 2. Creating the line item programmatically

OMFactory factory = OMAbstractFactory.getOMFactory();
OMNamespace poNs = factory.createOMNamespace("http://openuri.org/easypo", "po");
OMElement lineItem = factory.createOMElement("line-item", poNs);
lineItem.addAttribute("quantity", "2", poNs);
OMElement description = factory.createOMElement("description", poNs);
description.setText("Burnham's Celestial Handbook, Vol 2");
lineItem.addChild(description);
OMElement price = factory.createOMElement("price", poNs);
price.setText("19.89");
lineItem.addChild(price);

 

Output

The constructed element can now be serialized with a StAX writer:

Listing 3. Serializing the line item

XMLOutputFactory xof = XMLOutputFactory.newInstance();
XMLStreamWriter writer = xof.createXMLStreamWriter(System.out);
lineItem.serialize(writer);
writer.flush();

 

Building AXIOM from an existing source

Now let's look at the reverse process: building an in-memory object model from a data stream.

In the simplest case, you only need to deserialize an XML fragment. In SOAP processing, however, you need to deserialize SOAP messages or MTOM-optimized MIME envelopes. Because this is so closely tied to SOAP processing, AXIOM provides built-in support for it, which is covered in detail later. First, though, let's see how to deserialize a simple XML fragment, specifically the fragment that was just serialized.

First, construct a parser. AXIOM supports parsing XML with both SAX and StAX parsers. SAX parsing, however, does not allow deferred construction of the object model, so a StAX-based parser should be used when deferred building matters.

The first step is to obtain an XMLStreamReader:

File file = new File("line-item.xml");
FileInputStream fis = new FileInputStream(file);
XMLInputFactory xif = XMLInputFactory.newInstance();
XMLStreamReader reader = xif.createXMLStreamReader(fis);

 

Then create a builder and pass the XMLStreamReader to it:

StAXOMBuilder builder = new StAXOMBuilder(reader);
lineItem = builder.getDocumentElement();

 

Now you can use the AXIOM API to access attributes, child elements, and other XML infoset items. An attribute can be accessed as follows:

OMAttribute quantity = lineItem.getFirstAttribute(new QName("http://openuri.org/easypo", "quantity"));
System.out.println("quantity = " + quantity.getValue());

 

Access the child element in a similar way:

price = lineItem.getFirstChildWithName(new QName("http://openuri.org/easypo", "price"));
System.out.println("price = " + price.getText());

 

Listing 4 shows the complete code snippet.

Listing 4. Building AXIOM from an XML file

File file = new File("line-item.xml");
FileInputStream fis = new FileInputStream(file);
XMLInputFactory xif = XMLInputFactory.newInstance();
XMLStreamReader reader = xif.createXMLStreamReader(fis);
StAXOMBuilder builder = new StAXOMBuilder(reader);
OMElement lineItem = builder.getDocumentElement();
// writer is the XMLStreamWriter created in Listing 3
lineItem.serializeWithCache(writer);
writer.flush();
OMAttribute quantity = lineItem.getFirstAttribute(new QName("http://openuri.org/easypo", "quantity"));
System.out.println("quantity = " + quantity.getValue());
OMElement price = lineItem.getFirstChildWithName(new QName("http://openuri.org/easypo", "price"));
System.out.println("price = " + price.getText());
OMElement description = lineItem.getFirstChildWithName(new QName("http://openuri.org/easypo", "description"));
System.out.println("description = " + description.getText());

 

The best thing about AXIOM is that it strives to provide a user-friendly API on top of advanced techniques such as deferred construction. To make full use of its potential, however, you must understand the underlying architecture.
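The deferred construction used in this section rests on the pull model of StAX: the parser advances only when the consumer asks for the next event. The following is a minimal sketch of that behavior using only the JDK's javax.xml.stream API, not AXIOM itself. The class name PullParsingSketch and its two helper methods are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class; it uses plain StAX and no AXIOM classes.
public class PullParsingSketch {

    static final String XML =
        "<po:line-item po:quantity=\"2\" xmlns:po=\"http://openuri.org/easypo\">"
      + "<po:description>Burnham's Celestial Handbook, Vol 2</po:description>"
      + "<po:price>19.89</po:price>"
      + "</po:line-item>";

    /** Pulls events only until the first start tag and returns its quantity attribute. */
    static String readQuantity(XMLStreamReader reader) throws XMLStreamException {
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                return reader.getAttributeValue("http://openuri.org/easypo", "quantity");
            }
        }
        return null;
    }

    /** Counts the start tags that had not been pulled from the reader yet. */
    static int countRemainingStartTags(XMLStreamReader reader) throws XMLStreamException {
        int count = 0;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws XMLStreamException {
        XMLStreamReader reader =
            XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(XML));
        // Only the root start tag has been consumed after this call.
        System.out.println("quantity = " + readQuantity(reader));
        // The children are still waiting in the stream.
        System.out.println("start tags not yet read = " + countRemainingStartTags(reader));
    }
}
```

After readQuantity returns, the two child elements have not been consumed. A builder that sits on top of such a reader can therefore postpone creating nodes for them until someone asks.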


A closer look at AXIOM

Caching is one of the core concepts of AXIOM. To understand caching, however, you have to think about it in the context of deferred tree construction and the AXIOM APIs. AXIOM provides multiple APIs for accessing the underlying XML infoset. The examples above use the tree-based API, which all competing object models, such as DOM and JDOM, also provide. AXIOM, however, also allows the information to be accessed through the SAX or StAX API, as shown in Figure 1.

Figure 1. AXIOM, input and output

If you want to use an XML parsing API, why build an object model at all? The answer is so that different parts of the object model can be accessed through different APIs. Consider a SOAP stack, for example: a SOAP message may be processed by several handlers before it is consumed by the target service. These handlers generally use a tree-based API (in particular SOAP with Attachments API for Java, or SAAJ). The service implementation may in turn use a data binding tool to convert the XML document in the SOAP message payload into objects such as POJOs. Because this part of the document is not accessed through the tree-based object model, building a complete tree for it wastes memory by duplicating data. The most direct solution is to expose the underlying raw XML stream to the data binding tool. This is where AXIOM shines.

For optimal performance and memory usage, the data binding tool needs direct access to the underlying XML stream, and AXIOM allows this. Deferred construction simply means that a part of the tree is built only when it is accessed. If you never access the SOAP message body, that part of the message is never built. If you start to access the body through SAX or StAX and it has not yet been built, AXIOM connects you directly to the underlying parser for optimal performance, as shown in Figure 2:

Figure 2. Accessing the underlying parser through AXIOM

However, a problem can arise if you want to come back to a part of the tree that has already been accessed this way. Because the parser was connected directly to the user, AXIOM stepped out of the picture: all the information went from the underlying stream straight to the user. When the user comes back and asks for the same information, AXIOM cannot provide it, whichever API is chosen the second time. Note that both cases, accessing a part once and returning to it later, are roughly equally likely. In most cases, for example, the SOAP body is touched only by the final service implementation, for its payload. The service may use data binding or another XML processing API such as SAX, StAX, or XPath to process the body. In that case the body is rarely accessed twice, and the optimization provided by AXIOM gives the best performance.

Suppose, however, that a logging handler is inserted into the handler chain and uses a StAX writer to record the entire SOAP message. When the service implementation then tries to access the message body, the body is no longer there!

To illustrate this further, here is a simple, if contrived, example.

StAXOMBuilder builder = new StAXOMBuilder(reader);
lineItem = builder.getDocumentElement();
lineItem.serialize(writer);
writer.flush();
price = lineItem.getFirstChildWithName(new QName("http://openuri.org/easypo", "price"));
System.out.println("price = " + price.getText());

 

When the lineItem element is obtained, it has not yet been built. Therefore, when it is serialized with a StAX writer, AXIOM connects the StAX writer (which serializes the lineItem element) directly to the StAX reader (which was originally passed to the builder). In the process, however, AXIOM disconnects itself from the data stream. When the price child element is then requested, it cannot be found, because all the children of lineItem have already gone to the serializer.
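The same one-pass behavior can be reproduced with plain StAX, without AXIOM. The sketch below, which uses only the JDK's javax.xml.stream API, pipes a reader straight into a writer and then checks whether anything is left for a second consumer. The class name DrainedStreamSketch and the method serializeDirectly are hypothetical names used for illustration only.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;

// Hypothetical illustration class; it uses plain StAX and no AXIOM classes.
public class DrainedStreamSketch {

    static final String XML = "<item><price>19.89</price></item>";

    /** Pipes every event from the reader to a writer and returns the serialized text. */
    static String serializeDirectly(XMLEventReader reader) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        writer.add(reader); // consumes the reader up to the end of the document
        writer.flush();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        XMLEventReader reader =
            XMLInputFactory.newInstance().createXMLEventReader(new StringReader(XML));
        System.out.println(serializeDirectly(reader));
        // Nothing was retained between reader and writer, so a second consumer gets no events.
        System.out.println("events left for a second consumer: " + reader.hasNext());
    }
}
```

Once the reader has been drained, the only way to see the price element again is to have kept a copy of the events somewhere along the way.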

The only remedy in this case is to keep AXIOM from dropping out of the data stream completely during serialization. In AXIOM terminology this is called caching: AXIOM allows you to get StAX events or serialize the XML whether or not the object model has been built in memory. AXIOM therefore separates policy (whether the message should be cached) from mechanism (how it is cached). It lets the user decide, when starting to use a raw XML processing API (such as SAX or StAX), whether the unused parts of the tree should be cached for later reference. If you decide to cache, you can come back to those parts once the tree has been built, but you pay for it in memory usage and performance. If, on the other hand, you know what you are doing and are sure that you need to access those parts of the tree only this once, you can turn caching off and take full advantage of AXIOM's efficiency.

The preceding code should therefore be rewritten as follows:

StAXOMBuilder builder = new StAXOMBuilder(reader);
lineItem = builder.getDocumentElement();
lineItem.serializeWithCache(writer);
writer.flush();
price = lineItem.getFirstChildWithName(new QName("http://openuri.org/easypo", "price"));
System.out.println("price = " + price.getText());

 

Unlike serialize, the serializeWithCache method does not connect the StAX reader directly to the StAX writer. Instead, all the data passing from the reader to the writer is retained in AXIOM. How the caching is done is of no concern to the user. Currently, when caching is on, AXIOM builds the tree just as if that part of the tree had been accessed through the document API.
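The idea of retaining data while it flows from reader to writer can be sketched with plain StAX as well. The example below is not how AXIOM implements caching (AXIOM builds its tree); it simply keeps the events in a list, which is an assumption made to keep the sketch short. The class CachingSerializeSketch and its methods newReader, serializeWithBuffer, and textOf are hypothetical names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical illustration class; the list stands in for the tree AXIOM would build.
public class CachingSerializeSketch {

    static final String XML = "<item><price>19.89</price></item>";

    /** Creates an event reader that reports adjacent text as a single event. */
    static XMLEventReader newReader(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        return factory.createXMLEventReader(new StringReader(xml));
    }

    /** Forwards each event to the writer and also keeps it in the returned list. */
    static List<XMLEvent> serializeWithBuffer(XMLEventReader reader, XMLEventWriter writer)
            throws XMLStreamException {
        List<XMLEvent> buffer = new ArrayList<>();
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            buffer.add(event);  // retained for later access
            writer.add(event);  // serialized immediately
        }
        writer.flush();
        return buffer;
    }

    /** Returns the text that directly follows the first start tag with the given local name. */
    static String textOf(List<XMLEvent> buffer, String localName) {
        for (int i = 0; i < buffer.size() - 1; i++) {
            XMLEvent event = buffer.get(i);
            if (event.isStartElement()
                    && event.asStartElement().getName().getLocalPart().equals(localName)
                    && buffer.get(i + 1).isCharacters()) {
                return buffer.get(i + 1).asCharacters().getData();
            }
        }
        return null;
    }

    public static void main(String[] args) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        List<XMLEvent> buffer = serializeWithBuffer(newReader(XML), writer);
        System.out.println(out);
        // The stream is exhausted, but the retained events still answer the query.
        System.out.println("price = " + textOf(buffer, "price"));
    }
}
```

The cost mentioned above is visible here: every event is held in memory in addition to being written out, which is exactly what turning caching off avoids.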


AXIOM and StAX

With that background, let's look at AXIOM's StAX API. Its most important methods are the following:

(OMElement).getXMLStreamReader();
(OMElement).getXMLStreamReaderWithoutCaching();

 

Calling the first method on an element gives you access to the element's XML infoset through the StAX API while caching, where necessary, the unbuilt parts of the tree for later use. As its name suggests, the second method gives access to the same information but optimizes performance by turning caching off. It is the more useful method when writing stubs and skeletons that work with a data binding framework.

Note that if part of the tree has already been built before either method is called, AXIOM emulates a StAX parser for it. Events for some tree nodes are therefore emulated, while for other nodes the user is connected directly to the underlying parser. The beauty of AXIOM is that these internals are transparent to the user. You must, however, state whether to cache the data when you switch to a raw API.

To demonstrate the use of the StAX API, I will show how code generated by XMLBeans connects to AXIOM.

Listing 5. Purchase order code generated for XMLBeans

public class PurchaseOrderSkel {

    public void submitPurchaseOrder(PurchaseOrderDocument doc) throws Exception {
    }

    public void submitPurchaseOrderWrapper(OMElement payload) {
        try {
            XMLStreamReader reader = payload.getXMLStreamReaderWithoutCaching();
            PurchaseOrderDocument doc = PurchaseOrderDocument.Factory.parse(reader);
            submitPurchaseOrder(doc);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

 

The code in Listing 5 (typically produced by a code generation tool) shows a skeleton that uses a class generated by XMLBeans (namely PurchaseOrderDocument). The skeleton contains two methods. The first lets the service implementer work with data-bound objects; the second has direct access to the AXIOM API. Look at these lines:

XMLStreamReader reader = payload.getXMLStreamReaderWithoutCaching();
PurchaseOrderDocument doc = PurchaseOrderDocument.Factory.parse(reader);

 

To create the object, first obtain a reference to the StAX API from the payload that the SOAP stack (such as Apache Axis) pushes into the service implementation. Because you are now at the end of the processing chain, you can safely connect the parser directly to the XMLBeans unmarshaller for optimal performance.

For the skeleton in Listing 5, the stub code is similar to Listing 6.

Listing 6. Stub code

public class PurchaseOrderStub {

    public void submitPurchaseOrder(PurchaseOrderDocument doc) throws Exception {
        SOAPEnvelope envelope = factory.getDefaultEnvelope();
        XMLStreamReader reader = doc.newXMLStreamReader();
        StAXOMBuilder builder = new StAXOMBuilder(reader);
        OMElement payload = builder.getDocumentElement();
        envelope.getBody().addChild(payload);
        //...
    }
}

 

Let's take a look at these lines:

XMLStreamReader reader = doc.newXMLStreamReader();
StAXOMBuilder builder = new StAXOMBuilder(reader);
OMElement payload = builder.getDocumentElement();

 

This code shows that going from an object to AXIOM is no different from going from the StAX API to AXIOM.

What is not so obvious at first is that deferred construction is still at work. Even though you create an OMElement, the information items are not duplicated in memory. This is due to the deferred construction and multiplexing in AXIOM, which forward data coming in through one API directly to the output of another. When the message is finally written to the stream, the XMLStreamReader provided by XMLBeans is connected directly to the transport writer, which writes the message to the socket, assuming no handler looks at the message along the way. This means that until that moment the data still lives only in the XMLBeans objects, which is a very good result.


AXIOM and Data Binding

The SAX API of AXIOM is discussed here because some data binding frameworks, such as JAXB, cannot use any other API. Clearly, SAX does not give optimal performance in the cases above. Going from objects to AXIOM through SAX, however, incurs no extra performance penalty, because that step is required in any case.

If JAXB is used, the stub must use SAXOMBuilder to create an AXIOM instance from the data-bound object. Listing 7 demonstrates this process.

Listing 7. AXIOM and JAXB

public class PurchaseOrderStub {

    public void submitPurchaseOrder(PurchaseOrder doc) throws Exception {
        SOAPEnvelope envelope = factory.getDefaultEnvelope();
        SAXOMBuilder builder = new SAXOMBuilder();
        JAXBContext jaxbContext = JAXBContext.newInstance("po");
        Marshaller marshaller = jaxbContext.createMarshaller();
        marshaller.marshal(doc, builder);
        OMElement payload = builder.getDocumentElement();
        envelope.getBody().addChild(payload);
        //...
    }
}

 

So far, AXIOM does not allow you to register a content handler with an OMElement to receive SAX events. It is easy, however, to write a piece of glue code that takes events from the StAX interface provided and drives a SAX ContentHandler. Interested readers can find such an implementation in the JAXB reference implementation (see references).
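A minimal sketch of such glue code is shown here, using only JDK classes. It is not the JAXB reference implementation's adapter and it is deliberately incomplete: prefix mappings, comments, and processing instructions are skipped. The class StaxToSaxSketch and its methods are hypothetical names. In an AXIOM setting, the XMLStreamReader passed to drive would be the one obtained from an OMElement.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class: pulls StAX events and pushes SAX callbacks.
public class StaxToSaxSketch {

    static final String XML =
        "<po:line-item po:quantity=\"2\" xmlns:po=\"http://openuri.org/easypo\">"
      + "<po:price>19.89</po:price></po:line-item>";

    /** Builds a SAX qualified name from a prefix that may be null or empty. */
    static String qName(String prefix, String localName) {
        return (prefix == null || prefix.isEmpty()) ? localName : prefix + ":" + localName;
    }

    static String orEmpty(String s) {
        return s == null ? "" : s;
    }

    /** Reads the stream to its end and fires the matching ContentHandler callbacks. */
    static void drive(XMLStreamReader reader, ContentHandler handler)
            throws XMLStreamException, SAXException {
        handler.startDocument();
        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT: {
                    AttributesImpl attrs = new AttributesImpl();
                    for (int i = 0; i < reader.getAttributeCount(); i++) {
                        String local = reader.getAttributeLocalName(i);
                        attrs.addAttribute(orEmpty(reader.getAttributeNamespace(i)), local,
                                qName(reader.getAttributePrefix(i), local),
                                "CDATA", reader.getAttributeValue(i));
                    }
                    handler.startElement(orEmpty(reader.getNamespaceURI()), reader.getLocalName(),
                            qName(reader.getPrefix(), reader.getLocalName()), attrs);
                    break;
                }
                case XMLStreamConstants.END_ELEMENT:
                    handler.endElement(orEmpty(reader.getNamespaceURI()), reader.getLocalName(),
                            qName(reader.getPrefix(), reader.getLocalName()));
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                    handler.characters(reader.getTextCharacters(),
                            reader.getTextStart(), reader.getTextLength());
                    break;
                default:
                    break; // other event types are ignored in this sketch
            }
        }
        handler.endDocument();
    }

    /** Runs drive() against a recording handler and returns the callbacks it saw. */
    static List<String> trace(String xml) throws XMLStreamException, SAXException {
        final List<String> calls = new ArrayList<>();
        ContentHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String name, Attributes atts) {
                calls.add("start:" + name);
                for (int i = 0; i < atts.getLength(); i++) {
                    calls.add("attr:" + atts.getQName(i) + "=" + atts.getValue(i));
                }
            }
            @Override
            public void endElement(String uri, String local, String name) {
                calls.add("end:" + name);
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                calls.add("text:" + new String(ch, start, length));
            }
        };
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        drive(factory.createXMLStreamReader(new StringReader(xml)), handler);
        return calls;
    }

    public static void main(String[] args) throws Exception {
        trace(XML).forEach(System.out::println);
    }
}
```

The recording handler in trace exists only to make the callbacks visible. A real consumer, such as a data binding unmarshaller that accepts a ContentHandler, would take its place.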


Conclusion

I have introduced some of the promising features that AXIOM offers compared with typical XML object models. Note that this article covers only some of them; AXIOM has many more powerful features. I recommend downloading the latest source code from the source code repository (see references) and studying AXIOM further.
