Automated real-time validation of XML files via XML Catalog

Source: Internet
Author: User

Introduction

XML Catalog makes it possible to validate XML files in real time against an XSD. Users do not have to write any code: with a small amount of configuration, they get immediate feedback while editing an XML file (the file must be edited in an XML editor).

In practice, however, manual configuration does not always meet requirements, because environments differ and so do the XML files themselves. For example, the author found that an XML Catalog configured manually in the development environment is not carried over to the runtime environment. In addition, many real-world XML files are not written in the standard format, which also makes XML Catalog inconvenient to use.

This article uses an example to show how to implement automated real-time validation of XML files against an XSD by extending XML Catalog.

XML Catalog Introduction

XML Catalog is an implementation of the OASIS XML Catalogs specification, which provides some control over how XML files reference external resources. The Eclipse Web Tools Platform (WTP) provides XML Catalog support so that XML files can be validated in real time against a schema. An XML catalog is an XML file made up of one or more catalog entries. Each entry maps the XML file to be validated to its corresponding XSD file, so that the two can be associated automatically at run time and the XML file can be validated.

The following example introduces the concepts related to XML Catalog.

Introduction to XML Catalog related concepts
    <?xml version="1.0"?>
    <!DOCTYPE catalog
        PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
        "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">  <!-- 1 -->
    <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">  <!-- 2 -->
      <group prefer="public" xml:base="file:///usr/share/xml/">  <!-- 3 -->
        <!-- 4 -->
        <public publicId="-//OASIS//DTD DocBook XML V4.5//EN"
                uri="docbook45/docbookx.dtd"/>
        <!-- 5 -->
        <system systemId="http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
                uri="docbook45/docbookx.dtd"/>
        <!-- 6 -->
        <system systemId="docbook4.5.dtd"
                uri="docbook45/docbookx.dtd"/>
      </group>
    </catalog>
    1. DOCTYPE declares that this file is an OASIS XML catalog file. If there is no Internet connection, delete or comment out the entire DOCTYPE declaration, because the catalog processor tries to download the catalog.dtd file from the network and fails when the network is unavailable.
    2. The catalog element contains the contents of the catalog and identifies the catalog namespace.
    3. The group element is a wrapper element that sets attributes for all the catalog entries it contains. The attribute prefer="public" tells the catalog resolver to try the public identifier before falling back to the system identifier. The xml:base attribute indicates that all URIs are relative to this path. Here, uri="docbook45/docbookx.dtd" is a relative path; the absolute path is file:///usr/share/xml/docbook45/docbookx.dtd.
    4. The public element maps a publicId to a URI. A publicId is a unique identifier for a resource, usually the namespace of the file.
    5. The system element maps a systemId to a URI. A systemId, like a publicId, is a unique identifier for a resource, usually the full path of a resource file in the file system.
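
The lookup that a catalog resolver performs on these entries can be sketched with the JDK's own javax.xml.catalog API (Java 9 or later). This is not the Eclipse WTP implementation, only a minimal illustration under stated assumptions: the class name CatalogLookupSketch is hypothetical, the catalog is written to a temporary file, and the DOCTYPE is omitted so that no network access is needed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.Catalog;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;

final class CatalogLookupSketch {

    static final String DOCBOOK_PUBLIC_ID = "-//OASIS//DTD DocBook XML V4.5//EN";

    // Same kind of entries as the catalog above; the DOCTYPE is left out
    // so that loading the catalog never needs network access.
    static final String CATALOG_XML =
        "<?xml version=\"1.0\"?>\n"
        + "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
        + "  <group prefer=\"public\" xml:base=\"file:///usr/share/xml/\">\n"
        + "    <public publicId=\"-//OASIS//DTD DocBook XML V4.5//EN\"\n"
        + "            uri=\"docbook45/docbookx.dtd\"/>\n"
        + "    <system systemId=\"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd\"\n"
        + "            uri=\"docbook45/docbookx.dtd\"/>\n"
        + "  </group>\n"
        + "</catalog>\n";

    /** Returns the URI the catalog maps publicId to, or null if there is no match. */
    static String lookupPublic(String publicId) {
        try {
            Path catalogFile = Files.createTempFile("catalog", ".xml");
            try {
                Files.write(catalogFile, CATALOG_XML.getBytes(StandardCharsets.UTF_8));
                Catalog catalog =
                    CatalogManager.catalog(CatalogFeatures.defaults(), catalogFile.toUri());
                // The relative uri is resolved against xml:base before it is returned.
                return catalog.matchPublic(publicId);
            } finally {
                Files.deleteIfExists(catalogFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("known id   -> " + lookupPublic(DOCBOOK_PUBLIC_ID));
        System.out.println("unknown id -> " + lookupPublic("-//EXAMPLE//DTD Unknown//EN"));
    }
}
```

The known public identifier resolves to a file URI that ends in /usr/share/xml/docbook45/docbookx.dtd, which shows the effect of xml:base; an identifier with no entry yields null.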

XML Catalog principle

The XML Catalog provides a mechanism for relocating resources: it can redirect artifacts referenced by an XML file, including URI addresses and namespace names, to another address. Typically, this mechanism is used to redirect remote resources to a local copy or to another web location. The XML catalog itself is a file that describes the mapping between external entity references and identical, locally cached entities.

In real development and production work, XML files often refer to external files. These references are usually expressed as URIs, most commonly URLs. An absolute URL works only while the network can reach it; if there is a network problem, the resource cannot be accessed. A relative URL, such as "../../xml/dtd/docbookx.xml", works only if your file system layout matches the one the reference assumes.

One solution is to use an entity resolver or a URI resolver, which locates a resource by examining its URI. By configuring XML Catalog, the user manually specifies the local address of the XSD file referenced by the XML file. The URI resolver finds the corresponding XSD through the mappings in the XML catalog, and the XML Catalog processor then validates the XML file against the XSD that the resolver found.

In plain terms, XML Catalog links an XML file with its corresponding XSD file through a namespace, locates the XSD file with the resolver, and finally validates the XML with the processor.

Unlike validating XML against an XSD with javax.xml.validation, XML Catalog can validate every XML file that references the XSD by namespace, which gives the effect of batch checking. For example, if a.xml, b.xml, and c.xml are all checked against d.xsd, all three files are validated through the namespace as long as the namespace of d.xsd is configured.
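
For comparison, the explicit javax.xml.validation approach mentioned above can be sketched as follows. Here the program itself chooses the schema and passes each document to it, whereas XML Catalog makes that association through the namespace. The schema, the namespace http://www.sample.com/student, and the class name are assumptions made for illustration; the article does not show the real Student.xsd.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class BatchValidationSketch {

    // Hypothetical namespace; the article does not give the real one.
    static final String NS = "http://www.sample.com/student";

    // Hypothetical schema: a student has a name followed by an integer age.
    static final String STUDENT_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\"\n"
        + "    targetNamespace=\"" + NS + "\" elementFormDefault=\"qualified\">\n"
        + "  <xs:element name=\"student\">\n"
        + "    <xs:complexType>\n"
        + "      <xs:sequence>\n"
        + "        <xs:element name=\"name\" type=\"xs:string\"/>\n"
        + "        <xs:element name=\"age\" type=\"xs:int\"/>\n"
        + "      </xs:sequence>\n"
        + "    </xs:complexType>\n"
        + "  </xs:element>\n"
        + "</xs:schema>\n";

    /** Compiles the schema once so that it can be reused for many documents. */
    static Schema compileSchema() {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(STUDENT_XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema does not compile", e);
        }
    }

    /** Returns true if the document is valid against the schema. */
    static boolean isValid(Schema schema, String xml) {
        try {
            // Without a custom ErrorHandler, validation errors surface as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a small student document; a null age leaves the age element out. */
    static String student(String name, String age) {
        return "<student xmlns=\"" + NS + "\"><name>" + name + "</name>"
            + (age == null ? "" : "<age>" + age + "</age>") + "</student>";
    }

    public static void main(String[] args) {
        Schema schema = compileSchema();
        String[] docs = { student("AA", "20"), student("BB", null), student("CC", "22") };
        for (String doc : docs) {
            System.out.println(isValid(schema, doc) + "  " + doc);
        }
    }
}
```

The second document omits the age element, so it is the one that fails validation.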

Configuring XML Catalog Manually

Manually configuring XML Catalog is simple and easy to understand, and it can be done directly in the Eclipse development environment. You configure it by selecting Editing and Validating XML files, Examples, New, and so on, as shown in the following figures:

Figure 1. Open the Configuration interface for XML Catalog

You can see from Figure 1 that there are two types of XML Catalog entries: user-specified entries and plug-in-specified entries. A manually created configuration is a user-specified entry.

Figure 2. Create a new XML Catalog Entry

The Location field of the new XML Catalog entry indicates where the XSD file is located. Here we choose a file from the workspace.

Figure 3. Select Student.xsd file from workspace

After you click OK, the XML Catalog entry is created. XML Catalog automatically fills in the other two properties, as shown in Figure 4.

Figure 4. The complete XML Catalog Entry

In Figure 4, Location specifies the position of the XSD as a path relative to the workspace. Key Type indicates that the namespace name is used to associate the XML with the XSD. Key holds the value of the XML namespace.

Figure 5. The complete XML Catalog Entry

As you can see from Figure 5, the new XML Catalog entry now appears under User Specified Entries. Figures 1 through 4 show the manual configuration steps for XML Catalog. Next, we open an XML file in the XML editor and look at XML Catalog in action.

Figure 6. The correct XML file

First, open the Sample.xml file with an XML editor.

Figure 7. Real-time validation results for XML Catalog

Next, change the age element of the student named BB. A red marker appears on the right side of the editor and a red wavy line appears under the student element, indicating that the student element has an error. When you move the mouse over the student element, you can see the specific message: the student element is incomplete because the age element is missing.

As this shows, using XML Catalog to validate XML files in real time takes little configuration. You only provide the XSD file and its location, and XML Catalog associates the two and validates automatically.

Extending XML Catalog Contributions

In practice, XML catalog entries configured manually in the Eclipse development environment, as described in the previous section, do not take effect in the runtime environment. Usually, however, the runtime environment is where the configuration is needed. Suppose we are developing a tool that lets users edit XML files, and we have built in some XSD files to check that the edited XML is correct. Users should not have to configure anything when they use the product: they do not need to know that these XSD files exist, and configuration would only be an extra burden. We therefore need to configure the XSD information in the development environment and have it take effect at run time. To use the development-time configuration directly in the runtime environment, you extend the XML Catalog contributions extension point.

The following steps extend the XML Catalog contributions extension point in an Eclipse plug-in.

1. Open the Extensions tab of the plugin.xml file and click the Add button to add the extension point.

Figure 8. Add extension point

2. In the Extension Point filter text box, enter the name of the extension point, org.eclipse.wst.xml.core.catalogContributions, select the extension point, and click OK.

Figure 9. Select an extension point

Figure 10 shows the Extension page after the extension point was successfully added.

Figure 10. Extension point added successfully

3. Create a new catalogContribution.

Figure 11. New catalogContribution

4. Create a new public

Figure 12. New public

5. Fill in the attributes of public. The public element has two attributes, publicId and uri.

publicId is the namespace, and uri is the physical location of the XSD file, which you can select with the Browse button.

Figure 13. Fill in the public details

These five steps complete the extension of XML Catalog contributions. Figure 14 shows the contents of the plugin.xml file after this configuration, where the extension's settings can be seen clearly.

Figure 14. plugin.xml file
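
A plugin.xml fragment for this configuration might look like the following sketch. The extension point ID and the public element with its publicId and uri attributes come from the steps above. The id value of catalogContribution, the namespace http://www.sample.com/student, and the path schemas/student.xsd are placeholders assumed for illustration.

```xml
<extension point="org.eclipse.wst.xml.core.catalogContributions">
   <catalogContribution id="default">
      <!-- publicId: assumed namespace of the XML files to validate -->
      <!-- uri: assumed plug-in-relative location of the bundled XSD -->
      <public
            publicId="http://www.sample.com/student"
            uri="schemas/student.xsd"/>
   </catalogContribution>
</extension>
```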

As Figure 15 shows, unlike with manual configuration, User Specified Entries is empty in the runtime environment.

Figure 15. XML Catalog configuration for the runtime environment

If you introduce an error into Sample.xml, you can see that XML Catalog reports it.

Figure 16. Sample.xml file

With the above steps, we demonstrate how to extend the XML Catalog. In the next section, we will show how to extend the URI Resolver to implement validation of special XML files.

Extended URI Resolver

Although XML Catalog is powerful, some XML files cannot be validated because real production environments are complex. Some XML files are written in a non-standard way and have no namespace. Others, because of the environment they are used in, carry an xsi:schemaLocation attribute that points the referenced XSD file at a virtual path, which defeats XML Catalog's ability to locate the XSD. The first case is not supported, because XML Catalog is designed to associate XML and XSD through namespaces. The second case can be handled by extending XML Catalog.

Before extending the URI Resolver, the following table categorizes the combinations of XML and XSD that XML Catalog can handle, so that you can see clearly which problems it solves.

XSD \ XML                                                           Has a namespace   No namespace
Has a namespace, no xsi:schemaLocation                              OK                No
Has a namespace, xsi:schemaLocation is the actual XSD location      OK                No
Has a namespace, xsi:schemaLocation is not the actual XSD location  OK                No
No namespace                                                        No                No

Introduction to URI Resolver

URIResolver is responsible for locating resources. The XML Catalog processor finds the appropriate XSD at the location that URIResolver returns and then validates against it.

Special XML File

As shown in Figure 18, the Sample.xml file has an xsi:schemaLocation attribute that sets the XSD location to http://www.sample.com/sample/schemas/student.xsd. Normally, XML Catalog goes to that location to find the XSD. In this article, however, the address is invalid, so XML Catalog does not work because the XSD cannot be found.
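
To make the role of xsi:schemaLocation concrete, the sketch below parses a small document and lists the namespace-to-location pairs that the attribute declares. The schema URL is the one quoted above; the namespace http://www.sample.com/student, the element names, and the class name are assumptions made for illustration, since the original Sample.xml is not reproduced here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class SchemaLocationSketch {

    // Hypothetical sample document; only the schema URL comes from the article.
    static final String SAMPLE_XML =
        "<student xmlns=\"http://www.sample.com/student\"\n"
        + "    xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"\n"
        + "    xsi:schemaLocation=\"http://www.sample.com/student "
        + "http://www.sample.com/sample/schemas/student.xsd\">\n"
        + "  <name>BB</name>\n"
        + "  <age>20</age>\n"
        + "</student>\n";

    /** Returns the namespace -> schema location pairs declared on the root element. */
    static Map<String, String> schemaLocations(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Non-validating parse: the schema URL is read as text, never fetched.
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            String value = doc.getDocumentElement()
                .getAttributeNS(XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI, "schemaLocation")
                .trim();
            Map<String, String> pairs = new LinkedHashMap<>();
            if (value.isEmpty()) {
                return pairs;
            }
            // The attribute is a whitespace-separated list: namespace, location, ...
            String[] tokens = value.split("\\s+");
            for (int i = 0; i + 1 < tokens.length; i += 2) {
                pairs.put(tokens[i], tokens[i + 1]);
            }
            return pairs;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    public static void main(String[] args) {
        schemaLocations(SAMPLE_XML)
            .forEach((ns, location) -> System.out.println(ns + " -> " + location));
    }
}
```

When the declared location cannot be reached, as in the case described above, a validator that follows it has nothing to load, which is the gap a custom URI resolver fills.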

Figure 18. XML file with a schemaLocation attribute

Extending org.eclipse.wst.common.uriresolver.resolverExtensions

The resolverExtensions extension point allows users to register their own URI Resolver to extend the functionality of the default resolver. Like the default resolver, a user-provided resolver can be called by editors, validators, and wizards.

Figure 17. Add extension point resolverExtensions

As shown, the extension point has three attributes: class, stage, and priority. class specifies the class that implements the org.eclipse.wst.common.uriresolver.internal.provisional.URIResolverExtension interface. stage specifies the stage at which the resolver runs. It has three possible values, prenormalization, postnormalization, and physical, and the default is physical. prenormalization means the resolver runs before the format of the input parameters is normalized, postnormalization means it runs after the input parameters are normalized, and physical means it runs after all prenormalization and postnormalization resolvers. priority specifies the execution order of resolvers within the same stage. It has three levels, high, medium, and low, and the default is medium.

Figure 18. Fill in the properties of the extension point

Figure 19 shows the contents of the plugin.xml file after the resolverExtensions extension point is added.
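
A plugin.xml fragment for this extension might look like the following sketch. The extension point ID and the class, stage, and priority attributes come from the description above, and the stage and priority values shown are the defaults it names. The resolverExtension element name reflects the author's understanding of the WTP schema, and the class name com.example.students.MyURIResolverExtension is a placeholder.

```xml
<extension point="org.eclipse.wst.common.uriresolver.resolverExtensions">
   <!-- class: hypothetical resolver implementation in this plug-in -->
   <resolverExtension
         class="com.example.students.MyURIResolverExtension"
         stage="physical"
         priority="medium"/>
</extension>
```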

Figure 19. plugin.xml file content

Implementing URIResolverExtension

1. First, create a new class in the plug-in that implements the URIResolverExtension interface. The interface provides a resolve method for relocating an XSD resource, that is, for finding a valid XSD. The method takes four parameters. The first, of type IFile, is the file in the workspace. The second, String baseLocation, is the absolute path of the file in the file system. The third, publicId, is the namespace of the file. The last, String systemId, is the actual path of the file; for the XML file itself, systemId is empty. However, because we extended XML Catalog in the earlier section on XML Catalog contributions, for a publicId configured in the XML catalog, systemId is the value of the uri in the catalog, which is the actual path of the XSD file. The return value is the location of the XSD file referenced by the XML file.

Listing 1. Implementing the URIResolverExtension interface

    import org.eclipse.core.resources.IFile;
    import org.eclipse.wst.common.uriresolver.internal.provisional.URIResolverExtension;

    public class MyURIResolverExtension implements URIResolverExtension {

        public MyURIResolverExtension() {
        }

        @Override
        public String resolve(IFile file, String baseLocation, String publicId,
                String systemId) {
            return null;
        }
    }

Listing 2. Add Catalog to Catalog Manager

    // Obtain the default XML catalog; give up if it is not available.
    ICatalog catalog = XMLCorePlugin.getDefault().getDefaultXMLCatalog();
    if (catalog == null) {
        return null;
    }

Listing 3. Relocate Resources

    if (myResolved == null) {
        if (publicId != null) {
            if (systemId != null && systemId.endsWith("Student.xsd")) { //$NON-NLS-1$
                try {
                    // Keep only the trailing part of the path, from the last "/" on.
                    int index = systemId.lastIndexOf("/");
                    if (index > -1)
                        systemId = systemId.substring(index);
                    // Look up the XSD location registered for this publicId.
                    myResolved = catalog.resolvePublic(publicId, systemId);
                } catch (MalformedURLException me) {
                    myResolved = null;
                } catch (IOException ie) {
                    myResolved = null;
                }
            }
        }
    }

As Listing 3 shows, we use the resolvePublic method of the catalog to look up, by publicId, the uri value registered for that publicId in the catalog, which is the path to the XSD file.
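
The decision made in Listing 3 can be tried outside Eclipse with a plain-Java stand-in. In this sketch a Map plays the role of the catalog, so it does not call ICatalog.resolvePublic; the namespace, the plug-in ID in the platform:/plugin URI, and the class name are assumptions made for illustration, and the file-name check ignores case.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;

final class RelocationSketch {

    // Hypothetical stand-in for the XML Catalog contribution:
    // namespace (publicId) -> location of the XSD bundled with the plug-in.
    static final Map<String, String> CATALOG = Collections.singletonMap(
        "http://www.sample.com/student",
        "platform:/plugin/com.example.students/schemas/student.xsd");

    /**
     * Returns the bundled XSD location when the request is for student.xsd
     * and its namespace is registered; returns null otherwise, meaning this
     * resolver does not handle the request.
     */
    static String resolve(String publicId, String systemId) {
        if (publicId == null || systemId == null) {
            return null;
        }
        if (!systemId.toLowerCase(Locale.ROOT).endsWith("student.xsd")) {
            return null;
        }
        // The unreachable systemId is ignored; the namespace alone selects the XSD.
        return CATALOG.get(publicId);
    }

    public static void main(String[] args) {
        String unreachable = "http://www.sample.com/sample/schemas/student.xsd";
        System.out.println(resolve("http://www.sample.com/student", unreachable));
        System.out.println(resolve("http://www.sample.com/other", unreachable));
    }
}
```

The first call maps the invalid address to the bundled schema, and the second finds no entry for its namespace and yields null.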

Summary

This article first introduced the concepts and basic principles of XML Catalog to give readers an initial understanding. It then used a simple example to show how to configure XML Catalog manually. Finally, it extended two extension points, XML Catalog contributions and URI Resolver, to achieve more advanced behavior. XML Catalog is a general way to solve real-time XML validation, and combining the XML Catalog and URI Resolver extension points handles the real-time validation of some special XML files with comparatively little effort.
