Compressing Files and Data from C# with the Zip Classes in the J# Class Libraries

This article assumes that you are familiar with C# and Windows Forms.

Download the code for this article: ZipCompression.exe (150KB)

Summary

Zip compression saves space and network bandwidth when you store files or send them over a network. In addition, the directory structure of a zipped folder is preserved, which makes Zip a very useful compression scheme. The C# language does not have any classes that let you manipulate Zip files, but because .NET-targeted languages can share class implementations, and J# exposes the classes in the java.util.zip namespace, you can use those classes in C# code. This article explains how to use the Microsoft J# class library to create a C# application that compresses and decompresses Zip files. It also introduces some other unique parts of the J# runtime that can be used from any .NET-compliant language to save some coding effort.



Zip is a popular data transfer and storage standard because it can save disk space and network bandwidth. Typical text and database files can be compressed to 10% of their original size. Even if binary files cannot be compressed in the same way, you can usually get a 50% compression ratio.

An additional advantage of the Zip format is that a single archive can contain multiple files while preserving the directory structure. This lets you attach a full tree of content to an e-mail message and lets the recipient restore the original file structure.

The Zip data format is open and is not encumbered by patents or other legal issues. Developers are free to create applications that manipulate Zip files and to use the lower-level Zip compression algorithms to reduce the size of their own custom data. The authors of the Zip data specification provide compression and decompression algorithms for developers in a library named zlib (http://www.gzip.org/zlib). The Java platform has used this library since version 1.1 of the Java Development Kit (JDK), where it forms the basis of the Java Archive (JAR) file format. As a result, starting with JDK 1.1, the standard Java API contains the classes needed to manipulate Zip files. These classes can be found under the java.util.zip namespace.

Zip Files and C#


I wanted to use Zip compression in an application written in C#. Unfortunately, the Microsoft .NET Framework does not currently contain any classes for manipulating Zip files. I did find several products related to Zip compression, however. For example, #ziplib (formerly known as NZipLib, http://www.icsharpcode.net/opensource/sharpziplib/default.asp) is a port of the zlib library to C#. Its license allows developers to include the library in commercial, closed-source applications. However, at the time this issue of MSDN Magazine went to press, #ziplib was still in a pre-release state (version 0.31).

Another solution is to use the unmanaged zlib as a Windows DLL and write the necessary interop wrappers for it. But because compression involves passing large amounts of data across each function call, writing an interop wrapper that performs well is a difficult task. Other libraries are available, but they are not free.

Solution


The .NET Framework was designed with language interoperability in mind. Any managed component that follows certain rules can be used correctly from any .NET-compliant programming language that implements the necessary functionality. The set of rules and language features required for interoperability is called the Common Language Specification (CLS).

All of the .NET language compilers that Microsoft ships are CLS-compliant, including Microsoft Visual J# .NET, a development tool for Java-language developers who want to build applications and services on the Microsoft .NET Framework. (Visual J# .NET was developed independently by Microsoft. It is not endorsed or approved by Sun Microsystems, Inc.) This is why .NET Framework classes can be used in Windows Forms and ASP.NET applications written in J#.

As you will see later in this article, some classes exposed by the J# runtime are not actually CLS-compliant, but you can still access most J# classes from other languages to use specific features that the .NET Framework does not implement. Since J# implements JDK version 1.1.4, it is not surprising that developers can access the java.util.zip namespace through the J# runtime. In the following sections I'll introduce an application written in C# that uses the java.util.zip classes to compress and decompress Zip files, saving local storage space and network bandwidth.

All the sample code in this article was developed with Microsoft Visual Studio .NET 2002 and version 1.0 of the J# runtime (see the link at the top of this article).

SharpZip


SharpZip, one of the sample applications included with this article, is written in C#. It is a simple utility for working with Zip files: you can create Zip files, or open existing ones to extract, add, and delete files (see Figure 1).



Figure 1 SharpZip Application


Before you look at the code, make sure that the J# runtime is properly installed on your system. You do not need to install the complete Visual J# .NET product. You can download and install just the J# 1.0 redistributable package, which is available from http://msdn.microsoft.com/vjsharp/downloads/howtoget.asp.

The java.util.zip namespace is implemented in the vjslib.dll assembly. The assembly is located in the C:\WINNT\Microsoft Visual JSharp .NET\Framework\v1.0.4205\ directory (replace WINNT with your actual Windows directory).

Once you add a reference to vjslib.dll to your project, you can start using the J# namespaces from your code and browse the JDK namespaces with the Object Browser (see Figure 2). The important classes include java.util.zip.ZipFile, java.util.zip.ZipEntry, and java.util.zip.ZipOutputStream. These classes, shown in Figure 3, let you manipulate Zip files at the file level.



Figure 2 Namespaces in the Object Browser


The method names used in this article may look strange to you, because the Java naming convention for identifiers (other than classes and interfaces) differs from the convention used in C#. In Java, package and method names are written in lower mixed case, where the first letter is lowercase and each subsequent word is capitalized, as in nextElement. I am sure you will get used to it quickly.

Enumerate ZIP Entries


The entries method of the java.util.zip.ZipFile class returns an object that implements the java.util.Enumeration interface. The application walks this enumeration to retrieve the ZipEntry instance that represents each entry in the Zip file. The ZipEntry class exposes all the information you need, such as the file name, compression method, timestamp, original size, and compressed size (see Figure 4).
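As a rough illustration of that loop (this is not the code from Figure 4), the sketch below lists the entries of an archive. It is written in plain Java against the standard java.util.zip API, which is the API that J# mirrors; from C# the calls have the same names. The class name ZipListing and the helper writeSample are invented for the example.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

final class ZipListing {
    /** Returns one "name method size" line per entry in the archive. */
    static List<String> describeEntries(File zip) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipFile zf = new ZipFile(zip)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                // nextElement returns the current entry and advances in one step.
                ZipEntry e = entries.nextElement();
                String method = e.getMethod() == ZipEntry.STORED ? "stored" : "deflated";
                lines.add(e.getName() + " " + method + " " + e.getSize());
            }
        }
        return lines;
    }

    /** Hypothetical helper: writes a one-entry archive to use as sample input. */
    static File writeSample() throws IOException {
        File f = File.createTempFile("sample", ".zip");
        f.deleteOnExit();
        try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(f))) {
            out.putNextEntry(new ZipEntry("docs/readme.txt"));
            out.write("hello hello hello".getBytes(StandardCharsets.US_ASCII));
            out.closeEntry();
        }
        return f;
    }
}
```

ZipEntry also offers getCompressedSize and getTime if you want to show the remaining columns.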

Note that although the java.util.Enumeration interface resembles the System.Collections.IEnumerator interface, the Java enumerator advances to the next element when you retrieve the current object by calling nextElement, whereas the .NET enumerator advances when you check whether more elements are available by calling MoveNext. Another important difference is that the Enumeration interface does not provide a way to restart the traversal.

One advantage of the .NET enumerator is that you can access the current element multiple times. The Java enumerator, on the other hand, lets you check for completion multiple times, but this is not very useful in most cases. Both the Java and .NET enumerators are designed so that you cannot forget to advance to the next element inside the enumeration loop.

I decided to write a class that wraps Java enumerators so that I could use them with the C# foreach statement. I named the class EnumerationAdapter. I simulate the Reset method by calling again the method that returns the Java enumerator. To make this possible, the constructor of the wrapper class takes as its argument a delegate that returns a java.util.Enumeration, rather than the java.util.Enumeration itself.
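The same idea can be sketched in Java terms, where the for-each loop plays the role of C# foreach and a Supplier plays the role of the delegate. This is only an analogy under those assumptions, not the article's C# EnumerationAdapter; the demo method and its file names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;
import java.util.Vector;
import java.util.function.Supplier;

final class EnumerationAdapter<T> implements Iterable<T> {
    // A factory rather than an Enumeration, so traversal can be restarted.
    private final Supplier<Enumeration<T>> source;

    EnumerationAdapter(Supplier<Enumeration<T>> source) {
        this.source = source;
    }

    @Override
    public Iterator<T> iterator() {
        // Asking the factory again is the equivalent of Reset.
        final Enumeration<T> e = source.get();
        return new Iterator<T>() {
            @Override public boolean hasNext() { return e.hasMoreElements(); }
            @Override public T next() { return e.nextElement(); }
        };
    }

    /** Walks the same adapter twice to show that a second pass starts over. */
    static List<String> demo() {
        Vector<String> names = new Vector<>(Arrays.asList("a.txt", "b.txt"));
        EnumerationAdapter<String> adapter = new EnumerationAdapter<>(names::elements);
        List<String> seen = new ArrayList<>();
        for (String n : adapter) seen.add(n);
        for (String n : adapter) seen.add(n);
        return seen;
    }
}
```

Passing the factory instead of the enumeration itself is what makes the second loop work: an Enumeration that has been exhausted cannot be rewound.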

Extract zip file


The first thing SharpZip does when extracting files is prompt the user for the directory in which the files should be created. You may have noticed that the application displays the Browse for Folder dialog box. I would have preferred to use the System.Windows.Forms.Design.FolderNameEditor.FolderBrowser class, but the documentation states that this type supports the .NET Framework infrastructure and is not intended to be used directly. So I use the Shell32 object through COM interop by importing the Microsoft Shell Controls and Automation type library.

Extracting (decompressing) the original file from a Zip file is simple: call getInputStream on the ZipFile object and pass it the entry whose data you want. The getInputStream method returns an InputStream from which you can read the contents of the archived entry.

The ExtractZipFile helper function does this work for you. Directories are stored in a Zip file as separate entries, but the file name in each entry also contains the directory information, so ExtractZipFile ignores directory entries and takes the necessary path information from the file name.

To save a single file to disk, simply write the contents of the InputStream for the entry you are interested in to a file. This time I decided not to write a custom System.IO.Stream class to wrap the Java streams, because the java.io namespace has fairly good support for streaming. Specifically, java.io.FileOutputStream lets you create the file to which the entry is copied.

The CopyStream helper function in Figure 5 copies the contents of a java.io.InputStream object to a java.io.OutputStream object. This helper function is also used by other parts of the SharpZip application. Be aware, however, that the sample does not check whether an output file already exists before overwriting it. You may want to prompt the user and ask whether the file should be overwritten.
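The extraction steps described above can be sketched roughly as follows. This is an illustrative Java version against the standard java.util.zip API, not the ExtractZipFile and CopyStream code from Figure 5; the class name, the demo method, and the check that rejects entries pointing outside the target directory are additions made for this example.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

final class ZipExtract {
    /** Copies everything from in to out; the caller closes both streams. */
    static void copyStream(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) out.write(buf, 0, n);
    }

    /** Extracts all file entries below destDir and returns how many were written. */
    static int extractAll(File zip, File destDir) throws IOException {
        String root = destDir.getCanonicalPath() + File.separator;
        int count = 0;
        try (ZipFile zf = new ZipFile(zip)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) continue; // the path comes from the file names
                File target = new File(destDir, entry.getName());
                // Added safety check (not in the article): reject names such as "../x".
                if (!target.getCanonicalPath().startsWith(root)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                File parent = target.getParentFile();
                if (!parent.isDirectory() && !parent.mkdirs()) {
                    throw new IOException("Cannot create " + parent);
                }
                // Like the article's sample, this overwrites existing files silently.
                try (InputStream in = zf.getInputStream(entry);
                     OutputStream out = new FileOutputStream(target)) {
                    copyStream(in, out);
                }
                count++;
            }
        }
        return count;
    }

    /** Hypothetical demo: zips one file under a directory entry, then extracts it. */
    static String demo() throws IOException {
        File dir = Files.createTempDirectory("unzip").toFile();
        File zip = new File(dir, "in.zip");
        try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(zip))) {
            out.putNextEntry(new ZipEntry("a/")); // explicit directory entry
            out.closeEntry();
            out.putNextEntry(new ZipEntry("a/b.txt"));
            out.write("payload".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        File dest = new File(dir, "out");
        int n = extractAll(zip, dest);
        byte[] data = Files.readAllBytes(new File(dest, "a/b.txt").toPath());
        return n + ":" + new String(data, StandardCharsets.UTF_8);
    }
}
```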

Also note that password-protected files are not supported. You can use the classes in the System.Security.Cryptography namespace to create your own encryption mechanism. If you do, keep in mind that the resulting files will not be compatible with standard Zip utilities (for example, WinZip).

Creating and modifying Zip files


The java.util.zip.ZipOutputStream class lets you compress data and write the result to an underlying java.io.OutputStream object. SharpZip works with files, so it writes the compressed data to a new java.io.FileOutputStream object, but you can easily derive your own class from java.io.OutputStream, or use one of the standard classes, to write compressed data directly to a network or to another storage medium.

The CreateEmptyZipFile helper function creates a Zip file and closes it immediately. The result is an empty Zip file that does not contain any entries. Adding or deleting entries is not as simple, because the java.util.zip package does not provide random access to Zip files. To delete files, you copy the entries you want to keep to a new Zip file. To add files, you copy all the existing entries to a new Zip file and then append the new entries. Copying an entry involves extracting it from the source file in the way I have already described, and then compressing it into the destination file.

For each file you want to add, create a new ZipEntry instance and call setMethod on the entry to set the compression method. The supported methods are ZipEntry.DEFLATED (which compresses the data with the deflate algorithm) and ZipEntry.STORED (which stores the data without applying any compression). Then call ZipOutputStream.putNextEntry, passing in the new entry, and write the entry's data by calling the write method on the ZipOutputStream object. When you have finished with the current entry, call ZipOutputStream.closeEntry and continue with the next entry.
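A minimal Java sketch of that sequence is shown below, writing one DEFLATED and one STORED entry to an in-memory archive. The class and method names are invented for the example. One detail worth knowing: with the standard JDK classes, a STORED entry must have its size and CRC-32 set before putNextEntry is called, otherwise a ZipException is thrown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipCreate {
    /** Writes the same data twice: once deflated, once stored uncompressed. */
    static byte[] buildArchive(byte[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            ZipEntry deflated = new ZipEntry("deflated.txt");
            deflated.setMethod(ZipEntry.DEFLATED);
            zip.putNextEntry(deflated);
            zip.write(data, 0, data.length);
            zip.closeEntry();

            ZipEntry stored = new ZipEntry("stored.txt");
            stored.setMethod(ZipEntry.STORED);
            // STORED entries need size and CRC-32 up front.
            CRC32 crc = new CRC32();
            crc.update(data, 0, data.length);
            stored.setSize(data.length);
            stored.setCompressedSize(data.length);
            stored.setCrc(crc.getValue());
            zip.putNextEntry(stored);
            zip.write(data, 0, data.length);
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    /** Reads the archive back and reports "name:method" for each entry. */
    static List<String> describe(byte[] archive) throws IOException {
        List<String> result = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry e; (e = in.getNextEntry()) != null; ) {
                result.add(e.getName() + ":" + (e.getMethod() == ZipEntry.STORED ? "stored" : "deflated"));
            }
        }
        return result;
    }
}
```

Swapping the ByteArrayOutputStream for a FileOutputStream gives the file-based behavior described for SharpZip.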

The UpdateZipFile function in Figure 5 implements both updates and deletions by invoking a delegate for each entry, so that you can choose which entries are copied to the temporary file. Finally, the new entries are added to the Zip file.
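The copy-and-filter approach can be sketched as follows. This is a simplified Java illustration, not the UpdateZipFile code from Figure 5: it works on in-memory byte arrays instead of a temporary file, and a Predicate stands in for the C# delegate. All names are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipUpdate {
    /** Copies the entries accepted by keep into a new archive, then appends additions. */
    static byte[] rewrite(byte[] archive, Predicate<ZipEntry> keep,
                          Map<String, byte[]> additions) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive));
             ZipOutputStream out = new ZipOutputStream(result)) {
            byte[] buf = new byte[8192];
            for (ZipEntry e; (e = in.getNextEntry()) != null; ) {
                if (!keep.test(e)) continue; // dropped entries are simply not copied
                // A fresh ZipEntry means the data is decompressed and recompressed.
                out.putNextEntry(new ZipEntry(e.getName()));
                for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
                out.closeEntry();
            }
            // A name that was also kept above would raise a duplicate-entry ZipException.
            for (Map.Entry<String, byte[]> a : additions.entrySet()) {
                out.putNextEntry(new ZipEntry(a.getKey()));
                out.write(a.getValue());
                out.closeEntry();
            }
        }
        return result.toByteArray();
    }

    /** Lists entry names in archive order. */
    static List<String> names(byte[] archive) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry e; (e = in.getNextEntry()) != null; ) names.add(e.getName());
        }
        return names;
    }

    /** Hypothetical demo: start with a.txt and b.txt, delete a.txt, add c.txt. */
    static List<String> demo() throws IOException {
        ByteArrayOutputStream initial = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(initial)) {
            for (String name : new String[] {"a.txt", "b.txt"}) {
                out.putNextEntry(new ZipEntry(name));
                out.write(name.getBytes("US-ASCII"));
                out.closeEntry();
            }
        }
        byte[] updated = rewrite(initial.toByteArray(),
                e -> !e.getName().equals("a.txt"),
                Collections.singletonMap("c.txt", new byte[] {42}));
        return names(updated);
    }
}
```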

Low-level ZIP compression


With the java.util.zip classes you can compress not only files but also application data. To illustrate this, I created a pair of functions that compress and decompress strings using the java.util.zip.Deflater and java.util.zip.Inflater classes.

The compression function creates an instance of the java.util.zip.Deflater class. A constructor parameter defines the desired compression level. Next, I call the Deflater.setInput method, passing the data to be compressed as a signed byte (sbyte) array, and then call Deflater.finish.

Note that, in contrast to C#, the byte data type in Java is signed; Java has no unsigned byte type. This is why all the J# runtime methods that work with buffers take sbyte arrays as parameters.
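To make the signedness difference concrete, here is a small Java sketch; the class and method names are made up. A Java byte holds values from -128 to 127, so a value that C# would show as 200 in a byte appears as -56.

```java
final class SignedBytes {
    /** Reinterprets a signed Java byte as the 0..255 value a C# byte would hold. */
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    /** Narrows a 0..255 value to the signed byte with the same bit pattern. */
    static byte fromUnsigned(int value) {
        return (byte) value;
    }
}
```

For example, fromUnsigned(200) yields -56, and toUnsigned((byte) -56) yields 200 again. The bit pattern is unchanged; only the interpretation differs.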

Fortunately, the com.ms.vjsharp.struct namespace contains the JavaStructMarshalHelper class which, among other things, can help you perform array conversions. The CompressString function calls the ConvertToByteArray method to convert the string to a signed byte array. To get the actual compressed bits, I keep calling Deflater.deflate until Deflater.finished returns true, which indicates that all the input data has been consumed. Inside the compression loop I use a java.io.ByteArrayOutputStream instance to collect the resulting data. As a general rule, it is best to use JDK classes when you work with Java types in C#. That is the best way to avoid repeatedly converting arrays between sbyte and byte.

The code that decompresses the string looks very similar to the code used for compression. This time you create an instance of the java.util.zip.Inflater class and call its setInput method, passing in the compressed data. The decompression loop keeps calling Inflater.inflate until Inflater.finished returns true, which indicates that all the input data has been decompressed. Finally, a call to JavaStructMarshalHelper.ConvertToString converts the decompressed byte array to the string that the function returns.
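A compact Java sketch of the two loops follows. It uses the standard Deflater and Inflater calls named in the text, but it is not the article's sample: the class name is invented, and UTF-8 encoding through String.getBytes is an assumption that replaces the JavaStructMarshalHelper conversions.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class StringCompression {
    static byte[] compress(String text) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
            deflater.finish(); // no more input will follow
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    static String decompress(byte[] data) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(data);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // Guard against looping forever on truncated input.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("Incomplete compressed data");
                }
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            inflater.end();
        }
    }
}
```

The guard inside the decompression loop is an addition; without it, a truncated buffer would make inflate return 0 indefinitely.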

The CSZIPLL sample application (LL stands for low level) creates a long string and compresses it to about half its size. You could use these functions to, for example, write a SOAP extension that reduces the network bandwidth required by a Web service.

Other attractive features of J #


Although this article focuses on working with Zip files, the same principle applies to other areas where the J# runtime provides functionality that is not available in the standard .NET Framework assemblies.

Because J# gives developers a way to migrate their Visual J++ projects to the .NET Framework, J# also implements a number of features that are specific to Visual J++, such as J/Direct. J/Direct technology lets Java-language programs invoke native Windows code. As in Visual J++, the com.ms.win32 namespace in J# provides access to most Windows API functions, data types, and constants.

The User32, Kernel32, and Gdi32 classes contain the core Win32 API functions. Constants are defined as static fields in a set of interfaces named winx (where x is the first letter of the constant). For example, the SW_SHOW flag for the ShowWindow API can be found in the com.ms.win32.wins interface.

To be CLS-compliant, an interface must not contain fields, so the com.ms.win32.winx interfaces fail this test. Because C# does not allow fields in interfaces, neither IntelliSense nor the C# compiler sees these constants, but you can still access the fields through reflection, as follows:

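// Looks up a Win32 constant by name through reflection (requires "using System;").
private int GetWin32IntConstant(string name)
{
    // Any winx interface will do to locate the J# assembly that contains them.
    System.Reflection.Assembly asm =
        System.Reflection.Assembly.GetAssembly(typeof(com.ms.win32.wina));
    // Constants live in the interface named after their first letter, e.g. wins for SW_SHOW.
    Type t = asm.GetType("com.ms.win32.win" + char.ToLower(name[0]), true);
    System.Reflection.FieldInfo info = t.GetField(name);
    return int.Parse(info.GetValue(null).ToString());
}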

Retrieving Windows API constants with this technique can be slow, so use it with care. Another problem is that, because the constants are not resolved at compile time, you get a run-time error whenever you misspell one. In any case, having most of the Windows API already declared in a .NET assembly can save you a lot of work. For example, the SharpZip sample program displays the system icon associated with the extension of each file. To do this, the code calls the SHGetFileInfo API defined in the com.ms.win32.Shell32 interface to get a handle to the icon (see Figure 6).

Note that when you create a System.Drawing.Icon object from a handle, the new Icon does not own the handle. This means that you must release the associated resources yourself by calling the DestroyIcon API. Since I did not want to keep the icon handle around for the whole lifetime of the Icon object, I chose to build a temporary Icon on the handle and create a copy of it by using the copy constructor.

Although the com.ms.win32 namespace is very large, you should know that it does not contain every Windows API function and data structure. For example, a notable omission from the com.ms.win32.Shell32 interface is the SHBrowseForFolder API, which would have let me display the Browse for Folder dialog box without using the Microsoft Shell Controls and Automation COM library.

Also note that handling callbacks is a bit complicated, because the Java language does not support delegates. For each callback type, an abstract class is provided that defines the prototype of the function. You must derive from that class to implement the code that handles the callback, and then pass an instance of your class to the API call (see Figure 7). Another minor difficulty that stems from the Java language is that parameters passed by reference are declared as arrays, but this only affects the code that invokes the functions, not the underlying functionality.
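The two patterns just described can be illustrated with a small Java sketch. Everything in it is hypothetical: ItemCallback, enumItems, and getExtent are stand-ins invented for this example and are not part of com.ms.win32 or of Figure 7.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback prototype, standing in for a J/Direct callback class. */
abstract class ItemCallback {
    /** Return false to stop the enumeration. */
    abstract boolean onItem(int handle);
}

final class CallbackDemo {
    /** Hypothetical stand-in for a native API that enumerates handles. */
    static void enumItems(int[] handles, ItemCallback callback) {
        for (int h : handles) {
            if (!callback.onItem(h)) break;
        }
    }

    /** Hypothetical stand-in for an API with two by-reference outputs. */
    static boolean getExtent(int handle, int[] width, int[] height) {
        width[0] = handle * 10;  // one-element arrays carry the "out" values
        height[0] = handle * 20;
        return true;
    }

    /** Derives from the callback class to collect handles up to stopAt. */
    static List<Integer> collectUntil(int[] handles, final int stopAt) {
        final List<Integer> seen = new ArrayList<>();
        enumItems(handles, new ItemCallback() {
            @Override
            boolean onItem(int handle) {
                seen.add(handle);
                return handle != stopAt;
            }
        });
        return seen;
    }
}
```

A caller that needs the by-reference results allocates one-element arrays, passes them in, and reads element 0 afterward.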

Finally, some API calls translate very poorly. One example is waveOutOpen (defined in the Winmm class). In C++, the dwCallback parameter is used to pass an event handle, a window handle, a thread ID, or a callback function, depending on the value of the fdwOpen parameter. Because the J/Direct wrapper declares the dwCallback parameter as Int32, and a callback (delegate) cannot be cast to Int32, you must use one of the other notification mechanisms: an event handle, a window handle, or a thread ID.

There are some other interesting things in the core J# packages. For example, the java.math.BigDecimal and java.math.BigInteger classes let you manipulate arbitrarily large numbers, which can be useful when you write applications that deal with cryptographic algorithms or scientific calculations.

The CsMath sample project shows how to use java.math.BigDecimal to calculate pi to any number of decimal places using Machin's formula. To make the code easier to read, I wrapped java.math.BigDecimal in my own BigDecimal class and defined the most commonly used operators.
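For reference, Machin's formula is pi = 16 arctan(1/5) - 4 arctan(1/239), where arctan(1/x) = 1/x - 1/(3x^3) + 1/(5x^5) - ... The Java sketch below evaluates it with java.math.BigDecimal. It is an independent illustration rather than the CsMath code: the class name and the choice of 10 guard digits are assumptions, and it uses RoundingMode, which is newer than the JDK 1.1 API that J# implements (the older equivalent is the BigDecimal.ROUND_DOWN constant).

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class MachinPi {
    /** Computes arctan(1/x) by summing the alternating series at the given scale. */
    static BigDecimal arctanInverse(int x, int scale) {
        BigDecimal xSquared = BigDecimal.valueOf((long) x * x);
        // power holds 1/x^n for the current odd n, starting with n = 1.
        BigDecimal power = BigDecimal.ONE.divide(BigDecimal.valueOf(x), scale, RoundingMode.DOWN);
        BigDecimal sum = power;
        boolean subtract = true;
        for (int n = 3; power.signum() != 0; n += 2) {
            power = power.divide(xSquared, scale, RoundingMode.DOWN);
            BigDecimal term = power.divide(BigDecimal.valueOf(n), scale, RoundingMode.DOWN);
            sum = subtract ? sum.subtract(term) : sum.add(term);
            subtract = !subtract;
        }
        return sum;
    }

    /** Returns pi truncated to the requested number of decimal places. */
    static BigDecimal pi(int digits) {
        int scale = digits + 10; // guard digits absorb the truncation error
        BigDecimal sixteenArctan5 = arctanInverse(5, scale).multiply(BigDecimal.valueOf(16));
        BigDecimal fourArctan239 = arctanInverse(239, scale).multiply(BigDecimal.valueOf(4));
        return sixteenArctan5.subtract(fourArctan239).setScale(digits, RoundingMode.DOWN);
    }
}
```

Calling MachinPi.pi(30) yields 3.141592653589793238462643383279. The loop stops once 1/x^n underflows to zero at the working scale, so the running time grows with the number of digits requested.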

Application Deployment


Applications that use this technique require both the J# runtime and the .NET Framework to be installed on the target computer. As with the .NET Framework, Microsoft provides a redistributable package for the J# runtime that can be deployed with your application's setup.

Microsoft has indicated that it will continue to support J# on desktop operating systems. However, J# currently has no support for the .NET Compact Framework, so you cannot apply the techniques described in this article to applications that target smart devices. Copying the assembly to the local project directory does not work, because the J# runtime assemblies depend heavily on native calls. You can, however, take full advantage of the J# runtime in Web applications that use mobile Web controls.

Conclusion


The J# runtime contains a number of useful classes that can be used from other languages that target the .NET Framework. Some of these classes let you work with Zip files, perform high-precision math calculations, or invoke the Windows API. Although most of this functionality can also be obtained from third-party libraries, the J# runtime is fully supported by Microsoft and is free.

For related articles, see:
Java 911: Parlez-Vous J/Direct?
For background information, see:
http://msdn.microsoft.com/vjsharp/
What is the Common Language Specification?

Ianier Munoz is a software architect and analyst at Dokumenta, a consulting firm based in Luxembourg. He is also the creator of Chronotron and other popular software. You can contact him through http://www.chronotron.com.
