JAXP validation: Use the new features of JAXP 1.3 to validate XML

Level: Intermediate

Brett McLaughlin (brett@newInstance.com), Author/Editor, O'Reilly Media, Inc.

November 03, 2005

Java 5.0, the latest version of the Java programming language, includes an improved and extended version of the Java API for XML Processing (JAXP). The main addition is a new validation API that offers better interactivity, supports XML Schema and RELAX NG, and allows changes to be made while validation is in progress. These improvements give Java developers an industrial-strength XML validation solution. This article covers the new API in detail, from its basic features to the more advanced ones.

The Java API for XML Processing (JAXP) has been a stable, somewhat dull API for several years. That is not a bad thing: dull often means reliable, which is always good for software. However, JAXP changed so slowly that developers stopped looking to it for new features. From Java 1.3 to 1.4, apart from supporting the latest versions of the SAX and DOM specifications (see references), JAXP did not change significantly. With Java 5.0 and JAXP 1.3, Sun has greatly expanded JAXP. Besides XPath support, validation is the most notable addition. This article describes in detail the validation features of JAXP 1.3, which live in the javax.xml.validation package.

Brief historical review

Schemas everywhere

In this article (and in general), a schema is any constraint model written in an XML format. A W3C XML Schema is a schema, but a schema is not necessarily an XML Schema (as defined by the W3C specification). For example, a schema can also be written in RELAX NG. Using schema as a general term makes it easier to refer to an approach (XML-based constraint models) without being tied to a specific implementation.

Before getting into the details of the validation API, you need a clear picture of how validation was done before JAXP 1.3. Sun will evidently continue to support the older DTD validation approach, even though the new schema-based validation API is recommended. So even if you plan to use javax.xml.validation, you still need to know how to validate documents against a DTD.

Create a parser factory

Ordinary JAXP processing starts with a factory. SAXParserFactory is used for SAX parsing, and DocumentBuilderFactory is used for DOM parsing. Both factories are created with the static newInstance() method, as shown in Listing 1.

Listing 1. Creating SAXParserFactory and DocumentBuilderFactory

// Create a new SAX parser factory
SAXParserFactory saxFactory = SAXParserFactory.newInstance();

// Create a new DOM document builder factory
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();

Turn on validation

One factory, multiple parsers

Options set on a factory affect every parser that the factory creates. Calling setValidating() with true explicitly tells the factory that all parsers it creates must validate. Keep in mind how easily this goes wrong: you turn on validation on the factory, write another 100 lines of code, and forget that the parsers it produces are validating.

Although SAXParserFactory and DocumentBuilderFactory each have features and properties specific to SAX and DOM, they share a common method for validation: setValidating(). As you would expect, to turn on validation you simply pass true to this method. However, a factory creates parsers rather than parsing documents directly. Once the factory is configured, call newSAXParser() (SAX) or newDocumentBuilder() (DOM). Listing 2 shows both approaches with validation turned on.

Listing 2. Turning on validation (DTD)

// Create a new SAX parser factory
SAXParserFactory saxFactory = SAXParserFactory.newInstance();
// Turn on validation
saxFactory.setValidating(true);
// Create a validating SAX parser instance
SAXParser parser = saxFactory.newSAXParser();

// Create a new DOM document builder factory
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
// Turn on validation
domFactory.setValidating(true);
// Create a validating DOM parser
DocumentBuilder builder = domFactory.newDocumentBuilder();

In either case, you get an object (a SAXParser or a DocumentBuilder) that parses XML and validates it during parsing. But remember that this applies only to DTD validation. Calling setValidating(true) has no effect on validation against XML-based schemas.
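As a rough illustration of DTD validation, the sketch below parses two small in-memory documents with a validating DOM parser and collects the validity errors through an error handler. The class name, the sample documents, and the dtdErrors() helper are all hypothetical and exist only for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdValidationSketch {

    // Hypothetical documents; the internal DTD allows only <title> inside <book>
    static final String VALID =
        "<!DOCTYPE book [<!ELEMENT book (title)><!ELEMENT title (#PCDATA)>]>"
        + "<book><title>Java and XML</title></book>";
    static final String INVALID =
        "<!DOCTYPE book [<!ELEMENT book (title)><!ELEMENT title (#PCDATA)>]>"
        + "<book><author>Brett</author></book>";

    // Parses the XML with DTD validation turned on and returns the reported validity errors
    static List<String> dtdErrors(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // DTD validation only
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> errors = new ArrayList<String>();
        // Validity errors arrive through error(); collect them instead of aborting the parse
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add("line " + e.getLineNumber() + ": " + e.getMessage());
            }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("valid document:   " + dtdErrors(VALID));
        System.out.println("invalid document: " + dtdErrors(INVALID));
    }
}
```

The exact error messages depend on the parser implementation, so the example only distinguishes between an empty and a non-empty error list.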

Introducing javax.xml.validation

Five years ago, turning on DTD validation with a single convenient method was enough. Even two years ago, schema languages such as XML Schema and RELAX NG were still sorting out their own problems. Today, however, validating documents against a schema is as common as validating against a DTD. The two approaches coexist largely because legacy documents still use DTDs. Over the next few years, DTDs will go the way of Lisp and become a historical relic rather than a mainstream technology.

JAXP 1.3 introduces the javax.xml.validation package to support schema validation, and it has drawn a strong response from developers. The package is compact, easy to use, and now a standard part of the Java language. Better still, if you have already used SAX and DOM through JAXP, validation is even easier to pick up: the model is similar, and you will find the API straightforward.

Use SchemaFactory

From the brief historical review, you know that the first step in using SAX is to create a new SAXParserFactory, and that with DOM you create a DocumentBuilderFactory. So it should be no surprise that for schema validation you first create a SchemaFactory, as shown in Listing 3.

Listing 3. Creating SchemaFactory

import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;
...
SchemaFactory schemaFactory =
    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

This resembles the creation of the other factories, except that newInstance() takes an argument. You must pass it a constant defined in another class, javax.xml.XMLConstants. That class defines all the constants used in JAXP applications, but for now you only need to know two:

  • XMLConstants.RELAXNG_NS_URI, for RELAX NG schemas
  • XMLConstants.W3C_XML_SCHEMA_NS_URI, for W3C XML Schema

Because a SchemaFactory is tied to a specific constraint model, this value must be supplied when the factory is constructed.

The SchemaFactory class has several other options, which are covered later in this article in the section that looks at validation in more depth. For ordinary XML validation, the default factory is enough.
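The sketch below probes which schema languages the installed JAXP implementation can handle. It relies on SchemaFactory.newInstance() throwing IllegalArgumentException when no implementation is found for the requested language. Support for W3C XML Schema is required of every implementation; whether RELAX NG is available depends on what is on the classpath, so the example makes no claim about that result. The class and method names are hypothetical.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

class SchemaFactoryLookupSketch {

    // Returns true if a SchemaFactory can be created for the given schema language URI
    static boolean isSupported(String schemaLanguage) {
        try {
            SchemaFactory.newInstance(schemaLanguage);
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown when no implementation of that schema language is available
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("W3C XML Schema: "
            + isSupported(XMLConstants.W3C_XML_SCHEMA_NS_URI));
        System.out.println("RELAX NG:       "
            + isSupported(XMLConstants.RELAXNG_NS_URI));
    }
}
```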

Validate against a schema

Use the source, Luke

Grandiose title and bad pun aside, the Source interface really is important throughout JAXP. The interface originated in XML transformation processing and has become the standard form of input for a variety of JAXP constructs, at least wherever Java I/O classes are not used directly. If you have never used Source, see the references for more on XML transformations.

After the factory is created, you need to load the set of constraints you want to use. You do this with the newSchema() method. However, newSchema() takes a javax.xml.transform.Source, so an intermediate step is needed to turn the schema into a Source. The process is simple, as shown in Listing 4.

Listing 4. From constraints to Schema

import java.io.File;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Schema;
...
SchemaFactory schemaFactory =
    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Source schemaSource =
    new StreamSource(new File("constraints.xml"));
Schema schema = schemaFactory.newSchema(schemaSource);

This code is straightforward if you are familiar with JAXP. Listing 4 loads a file named constraints.xml. You can use any means of getting data into a Source (including SAXSource and DOMSource) to read the constraints, or even use a URL.
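Because newSchema() accepts any Source, the constraints do not have to come from a file. The following sketch compiles a schema held in a string by wrapping a StringReader in a StreamSource. The schema text and the compile() helper are hypothetical and serve only as an illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SchemaFromSourceSketch {

    // Hypothetical schema: a single <title> element that holds a string
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='title' type='xs:string'/>"
        + "</xs:schema>";

    // Compiles schema text into a Schema; throws SAXException if the schema cannot be parsed
    static Schema compile(String schemaText) throws SAXException {
        SchemaFactory schemaFactory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // Any Source works here; a StreamSource over a Reader keeps the example self-contained
        Source schemaSource = new StreamSource(new StringReader(schemaText));
        return schemaFactory.newSchema(schemaSource);
    }

    public static void main(String[] args) throws SAXException {
        Schema schema = compile(XSD);
        System.out.println("Compiled schema: " + (schema != null));
    }
}
```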

Once you have a Source, pass it to the factory's newSchema() method. The result is a Schema. From here, validating a document is easy; see Listing 5.

Listing 5. Validating XML

import java.io.File;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Schema;
import javax.xml.validation.Validator;
...
SchemaFactory schemaFactory =
    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Source schemaSource = new StreamSource(new File("constraints.xml"));
Schema schema = schemaFactory.newSchema(schemaSource);
Validator validator = schema.newValidator();
validator.validate(new StreamSource("my-file.xml"));

Nothing here is dramatically different; once you know which class to use and which method to call, it is easy. You need the Validator class, an instance of which you obtain from the Schema through the newValidator() method. Finally, call validate() and pass it another Source implementation, this time one that represents the XML to be parsed and validated.

Once this method is called, the target XML is parsed and validated. Keep in mind that even if you supply the XML through a DOMSource (a representation of XML that has already been parsed), parsing may take place again. Validation is still closely tied to parsing, so the validation process takes some time.
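To make the DOMSource case concrete, here is a minimal sketch that parses a document into a DOM tree and then hands the tree to a Validator. The schema, the isValid() helper, and the call to setNamespaceAware(true) are additions for this example; the last one reflects the assumption that XML Schema validation needs a namespace-aware DOM.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomSourceValidationSketch {

    // Hypothetical schema: a <count> element that must hold an integer
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='count' type='xs:int'/>"
        + "</xs:schema>";

    // Parses the XML into a DOM tree first, then validates the tree
    static boolean isValid(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // assumption: required for schema validation
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new DOMSource(doc));
            return true;
        } catch (SAXException e) {
            // With no ErrorHandler set, the first validity error is thrown
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<count>42</count>"));
        System.out.println(isValid("<count>forty-two</count>"));
    }
}
```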

If an error occurs, an exception is thrown that describes the problem. Most JAXP implementations include a line number, and sometimes a column number, to help locate where the constraint model was violated. Of course, throwing an exception is not necessarily the best way to handle the problem; a better approach is described in the next section.
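A minimal sketch of this exception-based style follows. It catches the SAXParseException that a Validator throws when no ErrorHandler is set and reads the location from it. The schema, the sample document, and the firstViolation() helper are hypothetical; the location values and message text vary by implementation.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class ValidationExceptionSketch {

    // Hypothetical schema: <order> contains one integer <quantity>
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='order'><xs:complexType><xs:sequence>"
        + "<xs:element name='quantity' type='xs:int'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Returns null if the document is valid, otherwise a description of the first violation
    static String firstViolation(String xml) throws SAXException, IOException {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            // Line and column are -1 when the implementation cannot supply them
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstViolation("<order>\n  <quantity>zero</quantity>\n</order>"));
    }
}
```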

This may seem like a lot of work: get a factory, get a schema, get a validator. JAXP could easily have provided a single convenience method for all of this, something like validate(Source schema, Source xmlDocument). But modularity has its benefits. The next section shows that using the Schema and Validator classes together solves some rather odd situations in XML processing. If you really want such a method, writing it yourself makes a good exercise!
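One possible answer to that exercise is sketched below. It is not part of JAXP; the validate(Source, Source) method, the class name, and the text() helper are hypothetical, and the sketch assumes W3C XML Schema as the schema language.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class ValidateHelperSketch {

    // Convenience wrapper: factory, schema, and validator in one call.
    // Throws SAXException if the schema cannot be compiled or the document is invalid.
    static void validate(Source schema, Source xmlDocument)
            throws SAXException, IOException {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Validator validator = factory.newSchema(schema).newValidator();
        validator.validate(xmlDocument);
    }

    // Small helper that wraps literal text in a Source
    static Source text(String content) {
        return new StreamSource(new StringReader(content));
    }

    public static void main(String[] args) throws Exception {
        String xsd =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='title' type='xs:string'/>"
            + "</xs:schema>";
        validate(text(xsd), text("<title>Java and XML</title>"));
        System.out.println("Document is valid");
    }
}
```

The wrapper recompiles the schema on every call. A Schema object is thread safe and can be reused, so code that validates many documents against the same constraints is usually better off keeping the Schema and creating a new Validator for each document, which is one practical benefit of the modular design.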

Going deeper with validation

For many applications, what has been covered so far is enough. You hand an input document and a schema to a method and let it validate. A simple Exception tells you that something went wrong and even provides basic information for fixing it. For applications that use XML purely as a data format, perhaps only to pass information around, this much JAXP validation may be sufficient.

However, we live in a world full of XML editors, documents, code generators, and Web services. For these applications, XML is not a supporting player; it is the application itself, and basic validation is often not enough. JAXP provides a number of features for such applications, discussed below.

Handling errors

First, an Exception is generally understood to mean that something exceptional has happened. For XML-based applications, though, a document that fails validation may not be exceptional at all, just one possible outcome. Consider an XML editor or an IDE with XML support. In these environments, invalid XML should not crash or shut down the system. An Exception is also too heavyweight a way to report errors.

None of this is new to JAXP veterans, who are probably used to supplying an org.xml.sax.ErrorHandler to a SAXParser or DocumentBuilder. The interface's three methods, warning(), error(), and fatalError(), simplify error handling during parsing. It would be good to have the same facility for XML validation, and better still to use the very same interface. That is exactly the case: the ErrorHandler interface is as useful in validation as it is in parsing. Listing 6 provides a simple example.

Listing 6. Handling validation errors

import java.io.File;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Schema;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
...
SchemaFactory schemaFactory =
    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Source schemaSource = new StreamSource(new File("constraints.xml"));
Schema schema = schemaFactory.newSchema(schemaSource);
Validator validator = schema.newValidator();
// MySchemaErrorHandler is an ErrorHandler implementation supplied by the application
ErrorHandler mySchemaErrorHandler = new MySchemaErrorHandler();
validator.setErrorHandler(mySchemaErrorHandler);
validator.validate(new StreamSource("my-file.xml"));

As with SAX, you can use this interface to customize error handling. It lets an application abort validation, print error messages, or even try to recover from an error and continue validating. If you already know the interface, there is nothing new to learn!
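Listing 6 does not show the handler itself, so the sketch below offers one hypothetical implementation of MySchemaErrorHandler. It records warnings and errors and lets validation continue, and it rethrows fatal errors. The schema, the sample documents, and the problemsIn() helper are also assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical implementation of the handler named in Listing 6
class MySchemaErrorHandler implements ErrorHandler {
    final List<String> problems = new ArrayList<String>();

    public void warning(SAXParseException e) { record("warning", e); }

    // Validity errors are recorded, not rethrown, so validation continues
    public void error(SAXParseException e) { record("error", e); }

    // Fatal errors (for example, malformed XML) end validation
    public void fatalError(SAXParseException e) throws SAXException {
        record("fatal", e);
        throw e;
    }

    private void record(String level, SAXParseException e) {
        problems.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
    }
}

class ErrorHandlerSketch {

    // Hypothetical schema: <order> holds an integer <quantity> and a decimal <price>
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='order'><xs:complexType><xs:sequence>"
        + "<xs:element name='quantity' type='xs:int'/>"
        + "<xs:element name='price' type='xs:decimal'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Validates the document and returns every problem the handler recorded
    static List<String> problemsIn(String xml) throws SAXException, IOException {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        MySchemaErrorHandler handler = new MySchemaErrorHandler();
        validator.setErrorHandler(handler);
        validator.validate(new StreamSource(new StringReader(xml)));
        return handler.problems;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><quantity>many</quantity><price>cheap</price></order>";
        for (String problem : problemsIn(xml)) {
            System.out.println(problem);
        }
    }
}
```

Because error() does not throw, a single pass reports problems in both child elements rather than stopping at the first one, which is the behavior an editor or IDE would typically want.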

Load multiple schemas

One more setErrorHandler()

If you read the Javadoc for the javax.xml.validation package, you may notice that the SchemaFactory class also has a setErrorHandler() method. An error handler set on a SchemaFactory handles errors that occur while the schema itself is being parsed during a newSchema() call. So it is part of the validation API, but it applies to errors in parsing the schema rather than to errors in validating a document against it.

In some rare cases, you may need to build a Schema object from multiple schemas. This can be a little confusing: a Schema does not correspond to a single schema or file. Rather, the object represents a set of constraints, which can come from one file or from several. You can therefore pass an array of Source implementations (representing multiple sets of constraints) to newSchema() by using the newSchema(Source[] sourceList) form of the method. The result is still a single Schema object, which represents the combination of the supplied schemas.

Expect more errors in this situation. It is a good idea to set an ErrorHandler on the SchemaFactory (see the discussion of error handling for details). Problems can arise in many places, so be prepared to deal with them as they appear.
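The following sketch combines two small schemas into one Schema object and sets an ErrorHandler on the SchemaFactory so that schema-parsing problems surface immediately. The two schemas, their urn:example namespaces, and the helper methods are hypothetical. The schemas deliberately use different target namespaces; how an implementation combines several schemas for the same namespace is not something this example tries to show.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class MultiSchemaSketch {

    // Two hypothetical schemas in separate target namespaces
    static final String BOOK_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='urn:example:book' elementFormDefault='qualified'>"
        + "<xs:element name='book' type='xs:string'/>"
        + "</xs:schema>";
    static final String AUTHOR_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='urn:example:author' elementFormDefault='qualified'>"
        + "<xs:element name='author' type='xs:string'/>"
        + "</xs:schema>";

    // Builds one Schema object that represents both sets of constraints
    static Schema combined() throws SAXException {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // This handler sees problems in the schemas themselves, not in instance documents
        factory.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                System.err.println("schema warning: " + e.getMessage());
            }
            public void error(SAXParseException e) throws SAXException { throw e; }
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        Source[] sources = {
            new StreamSource(new StringReader(BOOK_XSD)),
            new StreamSource(new StringReader(AUTHOR_XSD))
        };
        return factory.newSchema(sources);
    }

    static boolean isValid(Schema schema, String xml) throws IOException {
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        Schema schema = combined();
        System.out.println(isValid(schema, "<book xmlns='urn:example:book'>JAXP</book>"));
        System.out.println(isValid(schema, "<author xmlns='urn:example:author'>Brett</author>"));
    }
}
```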

Integrate validation into parsing

So far, validation has been treated as a step separate from parsing. It does not have to be. Once you have a Schema object, you can assign it to a SAXParserFactory or a DocumentBuilderFactory through the setSchema() method (see Listing 7).

Listing 7. Integrating validation into parsing

// Load up the document
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

// Set up an XML Schema validator, using the supplied schema
Source schemaSource = new StreamSource(new File(args[1]));
SchemaFactory schemaFactory = SchemaFactory.newInstance(
    XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema(schemaSource);

// Instead of explicitly validating, assign the Schema to the factory
factory.setSchema(schema);

// Parsers from this factory will automatically validate against the
//   associated schema
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(new File(args[0]));

Note that you do not need to turn on validation explicitly with setValidating(). Any parser created by a factory whose Schema is not null uses that Schema for validation. As you would expect, validation errors are reported to the ErrorHandler set on the parser.
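A self-contained variation on this idea is sketched below. It assigns a Schema to a DocumentBuilderFactory and collects validation errors through the parser's ErrorHandler. The schema, the parseAndValidate() helper, and the call to setNamespaceAware(true) are additions for the example; the last one is based on the assumption that XML Schema validation needs a namespace-aware parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ParseTimeValidationSketch {

    // Hypothetical schema: a <count> element that must hold an integer
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='count' type='xs:int'/>"
        + "</xs:schema>";

    // Parses the XML with schema validation built in and returns the reported errors
    static List<String> parseAndValidate(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // assumption: required for schema validation
        factory.setSchema(schema);       // no setValidating(true) call is needed
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> errors = new ArrayList<String>();
        // Validation errors are delivered to the parser's ErrorHandler
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage());
            }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseAndValidate("<count>42</count>"));
        System.out.println(parseAndValidate("<count>forty-two</count>"));
    }
}
```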

Important warnings

As good as this looks, it is not quite good enough. The new JAXP validation API has some serious problems. First, even in the official releases of Java 5.0 and JAXP 1.3, I found many bugs and odd behaviors. Parsers are still catching up with the new API, which means that some (rarely used) features are only partially implemented, and sometimes not implemented at all. I have often found that documents that validate with a standalone validator such as xmllint (see references) fail JAXP validation.

Using the Validator class and its validate() method directly seems more reliable than assigning a Schema to a SAXParserFactory or DocumentBuilderFactory, so I recommend taking the safer route. I am not suggesting that you avoid the API, only that you test with as many sample documents as possible, double-check the validation results, and handle errors carefully.



Conclusion

Frankly, the JAXP validation API offers nothing strikingly new. You could continue to parse and validate XML with SAX or DOM in combination with an ErrorHandler. With clever programming, validation errors can be handled on the fly. But that requires a thorough understanding of SAX, a lot of time spent testing and debugging, and careful memory management (if you end up building a DOM Document object). This is exactly where the JAXP validation API shines. It provides a well-tested, ready-to-use solution, not just a switch that turns on schema validation. It integrates easily with existing JAXP code, so adding schema validation takes little effort. I believe that Java developers who have worked with XML for a long time will find plenty to like in JAXP validation.



References

  • For more information, see the original article on the developerWorks global site.

  • "All about JAXP, Part 1" and "All about JAXP, Part 2" (developerWorks): Brett McLaughlin's two articles on JAXP describe how to use the API for parsing and validation and how it supports XSLT transformations.
  • "New features of JAXP 1.3, Part 1" and "New features of JAXP 1.3, Part 2" (developerWorks, November 2004 and December 2004): take an in-depth look at the new features of JAXP 1.3.
  • "Tip: Validation and the SAX ErrorHandler interface" (developerWorks, June 2001): learn more about validation and the ErrorHandler interface.
  • "Install and configure the Xerces2 Java parser" (developerWorks, July 2002): this tutorial by Nicholas Chase describes how to use Xerces-J for schema validation.
  • Sun's Java technology and XML headquarters: a good starting point for JAXP.
  • Java 2 Platform, Standard Edition 5.0 API Specification: the JAXP Javadoc is now integrated with the Java 5.0 core API documentation.
  • Simple API for XML (SAX): learn more about the APIs behind JAXP, starting with SAX 2 for Java.
  • W3C Document Object Model (DOM): take a look at DOM, the other view of XML that JAXP supports.
  • Apache Xerces2 Java Parser: Sun uses the Xerces parser in its JDK 5.0 implementation.
  • Getting started with XML on developerWorks: if you need a more basic introduction to XML, you will find many useful references here, including Doug Tidwell's tutorial "Getting started with XML" (developerWorks, August 2002).
  • IBM XML certification: learn how to become an IBM-certified developer in XML and related technologies.

Obtain products and technologies

  • Java 2 Platform, Standard Edition 5.0: if you are new to Java programming, you can download JAXP along with the complete JDK.

  • libxml2: libxml2 is the XML C parser and toolkit developed for the GNOME project. It includes the xmllint validation program.



About the author

Brett McLaughlin has been working with computers since the Logo days. (Remember the little triangle?) In recent years, he has become one of the most popular authors and programmers in the Java and XML communities. He has implemented complex enterprise systems at Nextel Communications, written application servers at Lutris Technologies, and most recently joined O'Reilly Media, Inc., where he continues to write and edit books in this area. His latest book, Java 5.0 Tiger: A Developer's Notebook, is the first book on the latest version of Java technology, and his classic Java and XML remains one of the authoritative works on using XML technology in Java.
