Efficient exchange of XML documents

XML documents tend to be verbose because of their inherently descriptive nature. As a result, a document grows very long as the amount of data it describes increases, and such large documents cause problems when they need to be exchanged with other entities. XML documents are particularly verbose compared to other formats, such as plain text files (flat files) or Electronic Data Interchange (EDI). To illustrate, take a look at the following plain text file:

john,doe,1587,4/18/2000,1234 Anywhere st.,somecity,az,85222

Now look at this XML document:

<customers>
<customer customerid="1587">
<street>1234 Anywhere St.

If you've worked with many XML documents, you won't be surprised that even though the XML document and the comma-delimited text file contain the same raw data, the XML document is much larger. After all, XML is a metadata language (which brings benefits such as parsing, validation, and transformation), so its documents are inherently larger than other formats holding the same data. As XML becomes more widely used for data interchange, the size of the documents being exchanged can reduce the performance and scalability of applications.
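
The size difference can be measured directly. The following is a minimal sketch in Python (used for illustration only; it is not part of the article's .NET sample). Because the XML sample above is only partly shown, the element names used here (firstname, lastname, date, and so on) are assumptions.

```python
import xml.etree.ElementTree as ET

csv_line = "john,doe,1587,4/18/2000,1234 Anywhere st.,somecity,az,85222"
fields = csv_line.split(",")

# Hypothetical element names; the article's XML sample does not show them all.
names = ["firstname", "lastname", "customerid", "date",
         "street", "city", "state", "zip"]

# Build an XML document that carries exactly the same raw data.
customers = ET.Element("customers")
customer = ET.SubElement(customers, "customer", customerid=fields[2])
for name, value in zip(names, fields):
    if name != "customerid":  # already stored as an attribute
        ET.SubElement(customer, name).text = value

csv_bytes = csv_line.encode("utf-8")
xml_bytes = ET.tostring(customers, encoding="utf-8")

print("plain text:", len(csv_bytes), "bytes")
print("XML:       ", len(xml_bytes), "bytes")
```

The extra bytes are the element names, angle brackets, and closing tags that make the XML self-describing.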

There are a number of ways to minimize the size of XML documents, such as converting elements to attributes (where appropriate), abbreviating element and attribute names, removing insignificant whitespace, and including only the content that is needed. However, no matter what changes you make, a large amount of raw data will eventually produce a large XML document. If your XML documents run to many megabytes, how can you pass them around efficiently within your organization or to other businesses?
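
Two of these techniques, converting elements to attributes and removing insignificant whitespace, can be sketched as follows. This is an illustrative Python example with a hypothetical document, and it assumes the child elements hold simple text values that are safe to store as attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical, indented, element-only document.
verbose = """<customers>
    <customer>
        <customerid>1587</customerid>
        <city>somecity</city>
        <state>az</state>
    </customer>
</customers>"""

root = ET.fromstring(verbose)

for customer in root.findall("customer"):
    # Turn each simple child element into an attribute of <customer>.
    for child in list(customer):
        customer.set(child.tag, child.text.strip())
        customer.remove(child)
    # Drop the indentation that surrounded the removed children.
    customer.text = None
    customer.tail = None
root.text = None

compact = ET.tostring(root, encoding="unicode")
print(len(verbose), "->", len(compact), "characters")
print(compact)
```

The data is unchanged, but the document is shorter because each value no longer needs a closing tag or indentation.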

One approach is to split a large XML document into multiple documents. This works well if the document can be segmented, but it adds complexity, because you must ensure that all of the documents are sent and received correctly. Even the smaller documents that result can still be several megabytes each because of the amount of data being passed. Given these potential problems, how can we, as XML developers, exchange XML data more efficiently? (I vote for playing golf.)

You can use compression to speed up the exchange of documents between endpoints. Because XML is simply text, large documents can be compressed into a much smaller form. The sample program shown here demonstrates how to do this by using a managed-code .NET component to compress an XML document programmatically into a ZIP archive file. Doing so reduces the size of the file and improves the efficiency of the data exchange.
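
To see why text-based XML compresses so well, consider this minimal sketch. It uses Python's standard zlib module (DEFLATE, the algorithm commonly used inside zip archives) rather than the .NET component discussed in this article, and the repeated customer records are hypothetical.

```python
import zlib

# Build a hypothetical XML document with many similar records.
records = "".join(
    '<customer customerid="%d"><city>somecity</city><state>az</state></customer>' % i
    for i in range(1000)
)
raw = ("<customers>" + records + "</customers>").encode("utf-8")

# Level 9 asks for the smallest output; level 0 would store the data uncompressed.
compressed = zlib.compress(raw, 9)
restored = zlib.decompress(compressed)

print("original:  ", len(raw), "bytes")
print("compressed:", len(compressed), "bytes")
print("round trip intact:", restored == raw)
```

The repeated tag names that make XML verbose are exactly what a compressor removes most effectively.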

Although .NET's J# language supports compression natively, the base class library built into the .NET Framework does not. However, there is a component named SharpZipLib, written entirely in managed code, that can be used to compress various types of documents (available for download at www.icsharpcode.net/OpenSource/SharpZipLib/default.asp). SharpZipLib is a class library written in C# for .NET that supports Zip, GZip, Tar, and BZip2. It is implemented as an assembly, so it can be used in projects written in any .NET language.

I've used early beta releases of SharpZipLib in several applications, and I've found it very effective at compressing and decompressing documents. Let's look at how to use the SharpZipLib component to compress an XML document programmatically.

Compressing an XML document
Although SharpZipLib can perform several types of compression, I chose the ZIP format for the sample program because it is the most widely used and best known. To make the code reusable, I wrote a custom class called Zipper. Zipper has a static method called GenerateZipFile() that accepts the path where the zip file is to be saved and an ArrayList containing the paths of all the files to compress (see Listing 1).

The Zipper class wraps a SharpZipLib class called ZipOutputStream. You can use Zipper to compress multiple files into a single zip archive file (a file with a .zip extension) with very little code or effort. The GenerateZipFile() method performs the compression by creating an instance of the ZipOutputStream class and setting the compression level through its SetLevel() method. The highest compression level is 9 and the lowest is 0.

After the compression level is set, the files listed in the ArrayList passed to GenerateZipFile() are processed. An enumerator walks through the files in the list one at a time. Each file is represented by a ZipEntry object that takes the file name and a date/time stamp. The ZipEntry object is then added to the ZipOutputStream object through the PutNextEntry() method.

Figure 1. Testing the Zipper class

After the file name is added to the zip archive, the file's contents are read through a FileStream object. The FileStream (in the System.IO namespace) reads the file's bytes into a buffer when you call its Read() method. The bytes in the buffer are then written to the ZipOutputStream object through the Write() method. Note that Write() accepts the buffer, the starting position in the buffer, and the number of bytes to write to the stream. This procedure is repeated for every file in the ArrayList passed to the GenerateZipFile() method. When all entries have been added, the archive is saved to disk as a file with a .zip extension.
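
The procedure described above can be sketched as follows. This is not the article's Listing 1: it is an illustrative Python equivalent built on the standard zipfile module instead of SharpZipLib, and generate_zip_file, the sample file, and its contents are hypothetical. Here archive.write() creates the entry, records the file's timestamp, and performs the read-and-write loop internally.

```python
import os
import tempfile
import zipfile


def generate_zip_file(zip_path, file_paths, level=9):
    """Compress the given files into one zip archive.

    level 0 stores the data without compression; level 9 gives the smallest output.
    """
    with zipfile.ZipFile(zip_path, "w",
                         compression=zipfile.ZIP_DEFLATED,
                         compresslevel=level) as archive:
        for path in file_paths:
            # Store each file under its base name, one entry per file.
            archive.write(path, arcname=os.path.basename(path))


# Create a hypothetical XML document to compress.
work_dir = tempfile.mkdtemp()
xml_path = os.path.join(work_dir, "customers.xml")
with open(xml_path, "w", encoding="utf-8") as xml_file:
    xml_file.write("<customers>"
                   + '<customer customerid="1587"/>' * 500
                   + "</customers>")

zip_path = os.path.join(work_dir, "customers.zip")
generate_zip_file(zip_path, [xml_path])

print("XML file:", os.path.getsize(xml_path), "bytes")
print("zip file:", os.path.getsize(zip_path), "bytes")
```

Additional paths can be appended to the list passed as the second argument to place several documents in the same archive.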

Listing 2 shows the code for a simple ASP.NET application that tests the Zipper class (see Figure 1). It starts by defining the path to the XML document to be compressed and the path where the zip file will be stored. Although only one XML document is compressed in this example, the paths of other documents can be added to the ArrayList object for compression. After all the file paths have been defined, the static GenerateZipFile() method is invoked. Once the zip file has been built, it is sent to the end user in an e-mail message using classes in the System.Web.Mail namespace.

Extracting an XML document
The ability to compress XML documents is useful in many situations, but sooner or later the reverse happens: someone sends you a compressed document that must be extracted before it can be parsed. This problem can be solved directly with a SharpZipLib class named ZipFile. In Listing 3 you can see that the Zipper class has a static method named ExtractZipFile() that expands a compressed file into a specified directory. The code first creates a ZipFile instance by passing a FileStream object (obtained by calling the File.Open() method) to the constructor of the ZipFile class. After the object is created, each ZipEntry in the zip file is enumerated. For each one, the GetInputStream() method of the ZipFile object is invoked with the ZipEntry to be expanded as its argument. The stream returned from GetInputStream() is read into a buffer, which is then written to a file through a FileStream. When GetInputStream() is invoked, the ZipFile class automatically decompresses the ZipEntry.

After the ExtractZipFile() method is invoked, all of the compressed files in the zip file are expanded and stored on disk. The uncompressed byte stream can also be written to a MemoryStream object, which is useful when the file is to be parsed without first being saved to disk.
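
Both extraction paths, to disk and to memory, can be sketched as follows. This is not the article's Listing 3: it is an illustrative Python equivalent using the standard zipfile module, with io.BytesIO standing in for MemoryStream. The function names and the sample archive are hypothetical.

```python
import io
import os
import tempfile
import zipfile
import xml.etree.ElementTree as ET


def extract_zip_file(zip_path, target_dir):
    """Expand every entry of the archive into target_dir; return the paths written."""
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for entry in archive.infolist():
            # extract() decompresses the entry and returns the path it created.
            written.append(archive.extract(entry, target_dir))
    return written


def read_entry_to_memory(zip_path, entry_name):
    """Decompress one entry into an in-memory stream without touching the disk."""
    with zipfile.ZipFile(zip_path) as archive:
        return io.BytesIO(archive.read(entry_name))


# Build a small hypothetical archive to work with.
work_dir = tempfile.mkdtemp()
zip_path = os.path.join(work_dir, "customers.zip")
with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("customers.xml",
                     '<customers><customer customerid="1587"/></customers>')

# Path 1: expand the archive into a directory.
extracted = extract_zip_file(zip_path, os.path.join(work_dir, "out"))
print("extracted:", extracted)

# Path 2: parse the XML straight from memory.
buffer = read_entry_to_memory(zip_path, "customers.xml")
root = ET.parse(buffer).getroot()
print("customerid:", root.find("customer").get("customerid"))
```

The in-memory path avoids temporary files when the document only needs to be parsed once.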

Although XML is a verbose metadata language, large documents can be compressed into much smaller ones using .NET components such as SharpZipLib. By compressing these documents, you can shorten the time it takes to exchange them between different entities, which results in faster processing of the data. To try a working example of the compression/decompression code, visit www.xmlforasp.net/codeSection.aspx?csID=95.

About the Author:
Dan Wahlin (Microsoft MVP for ASP.NET) is president of Wahlin Consulting LLC and created the XML for ASP.NET Developers Web site (www.XMLforASP.NET), which focuses on using XML and Web services on Microsoft's .NET platform. He is also a corporate trainer and speaker, and teaches public and on-site XML and .NET training courses throughout the United States. Dan is co-author of Professional Windows DNA (Wrox) and ASP.NET: Tips, Tutorials and Code (Sams), and author of XML for ASP.NET Developers (Sams). He can be reached at dwahlin@xmlforasp.net.
