In practical applications, there are generally three options for handling ZIP compression and decompression:
- Call an external program such as rar.exe or unzip.exe
- Use a ready-made library
- Write everything by hand
The first option gets the job done, but you cannot reliably know the result. Some people say you can capture the command-line output and judge from that, but relying on interface text for an accurate judgment is quite unreliable. As for the third option, since I am a "wheel maker" myself, I would naturally say yes, but right now I understand neither the ZIP format nor the compression algorithm, so that will have to wait. Today we are going to put an existing wheel to practical use.
zlib and Info-ZIP (unzip60) are among the best-known ZIP-related libraries. I don't know much about Info-ZIP, and its external interface does not look very good to me either: it is a pile of callbacks, and callbacks are annoying and tend to disrupt the structure of the code. In addition, that library has not been updated for many years, and using something that has gone so long without maintenance always feels uncomfortable. The latest version of zlib is 1.2.5, released on April 19 this year. Strictly speaking, zlib is not a library for ZIP files but a library for the gzip and deflate algorithms. It ships with an example called minizip (contrib/minizip) that shows how to work with ZIP files. Starting from zlib, we will wrap things up into two foolproof functions:
bool ZipCompress(LPCTSTR lpszSourceFiles, LPCTSTR lpszDestFile);
bool ZipExtract(LPCTSTR lpszSourceFile, LPCTSTR lpszDestFolder);
Source files to include
- The code in the zlib main directory, except minigzip.c and example.c;
- The code under contrib/minizip, except minizip.c and miniunz.c.
Related APIs
Although minizip is presented as an example, everything in it apart from the main programs minizip.c and miniunz.c can be regarded as an upper-layer library on top of zlib that encapsulates operations on the ZIP file format. minizip.c and miniunz.c are what we want to rewrite: turn them from command-line programs into the foolproof interface described above. The following APIs are used in minizip.c and miniunz.c:
Compression:
- zipOpen64
- zipClose
- zipOpenNewFileInZip
- zipCloseFileInZip
- zipWriteInFileInZip
Decompression:
- unzOpen64
- unzClose
- unzGetGlobalInfo64
- unzGoToNextFile
- unzGetCurrentFileInfo64
- unzOpenCurrentFile
- unzCloseCurrentFile
- unzReadCurrentFile
You can guess how to use these functions from their names; a good interface is a pleasure to work with. Some of the functions in minizip carry a "64" suffix, some have no suffix, and some end in "2", "3", or "4". Use the "64" version whenever one exists; the "2", "3", and "4" variants can be ignored.
Operations
The code for all the operations described below can be found at http://zlibwrap.codeplex.com/ (change set 2450). Long code listings are not reproduced here. There is also a DLL version and a LIB version for anyone who just wants to use it as is.
First, the compression operation. Use zipOpen64 to open or create a ZIP file, then traverse the files that are to go into the archive. For each file, call zipOpenNewFileInZip once, read the original file data, write it into the ZIP file with zipWriteInFileInZip, and finish the entry with zipCloseFileInZip. The third parameter of zipOpenNewFileInZip is a zip_fileinfo structure. All fields of this structure can be set to zero; the dosDate field can be used to record a time (the last modification time). The second parameter is the file name inside the ZIP file. To preserve the directory structure, this name can include a path, such as Foo/bar.txt. When all files have been written, call zipClose.
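The two values that have to be prepared for each entry, the dosDate field and the internal file name, can be sketched with the standard library alone. This is only an illustration under assumptions: ToDosDate and ToEntryName are hypothetical helper names that are not part of minizip or of the ZlibWrap code, and the bit layout used is the conventional MS-DOS date/time packing.

```cpp
#include <cstdint>
#include <ctime>
#include <string>

// Hypothetical helper: pack a broken-down time into the 32-bit MS-DOS layout.
// Bits 25-31: year - 1980, 21-24: month, 16-20: day,
// bits 11-15: hour, 5-10: minute, 0-4: second / 2.
std::uint32_t ToDosDate(const std::tm& t)
{
    const int year = t.tm_year + 1900;
    if (year < 1980 || year > 2107)
        return 0;  // outside the range the format can represent
    return (static_cast<std::uint32_t>(year - 1980) << 25) |
           (static_cast<std::uint32_t>(t.tm_mon + 1) << 21) |
           (static_cast<std::uint32_t>(t.tm_mday) << 16) |
           (static_cast<std::uint32_t>(t.tm_hour) << 11) |
           (static_cast<std::uint32_t>(t.tm_min) << 5) |
           static_cast<std::uint32_t>(t.tm_sec / 2);
}

// Hypothetical helper: turn a path relative to the compression root into an
// internal name such as "Foo/bar.txt" (forward slashes, no leading separator).
std::string ToEntryName(std::string relativePath)
{
    for (char& c : relativePath)
        if (c == '\\')
            c = '/';
    const std::size_t start = relativePath.find_first_not_of('/');
    return start == std::string::npos ? std::string()
                                      : relativePath.substr(start);
}
```

The result of ToDosDate would be stored in the dosDate field of zip_fileinfo, and the result of ToEntryName passed as the second parameter of zipOpenNewFileInZip. Note that seconds are stored with a two-second resolution, so an odd second is rounded down.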
Decompression is a little more complicated. After opening a ZIP file with unzOpen64, first call unzGetGlobalInfo64 to obtain some information about the archive; for now, all we need from it is the number of entries. Then traverse the ZIP file. The current position starts at the first entry automatically; after processing one entry, use unzGoToNextFile to move to the next. For each entry, use unzGetCurrentFileInfo64 to query its internal file name. This is the same name that was passed as the second parameter of zipOpenNewFileInZip, so it may contain a path. It may also end with a path separator (/), which means the entry is a directory (during compression you can also write such an entry for each directory, which was not done above). Create (multi-level) directories as needed. The second parameter of unzGetCurrentFileInfo64 is an unz_file_info64 structure, which also contains the dosDate information needed to restore the file time. For entries that are not directories, open the entry with unzOpenCurrentFile, then read its content with unzReadCurrentFile and write it to the real file, and finally call unzCloseCurrentFile. unzReadCurrentFile returns 0 when the end of the entry has been reached and a negative value on error. When all entries have been processed, call unzClose.
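The directory handling described above can be sketched independently of minizip. The helper names IsDirectoryEntry and DirectoriesToCreate are hypothetical and are not taken from the ZlibWrap code; the sketch only assumes that internal names use / as the separator.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: an internal name ending in '/' denotes a directory.
bool IsDirectoryEntry(const std::string& entryName)
{
    return !entryName.empty() && entryName.back() == '/';
}

// Hypothetical helper: list every directory level that must exist before the
// entry can be extracted, outermost first.
// "Foo/Sub/bar.txt" yields "Foo" and "Foo/Sub".
std::vector<std::string> DirectoriesToCreate(const std::string& entryName)
{
    std::vector<std::string> dirs;
    for (std::size_t pos = entryName.find('/');
         pos != std::string::npos;
         pos = entryName.find('/', pos + 1))
    {
        // Skip a leading separator and empty components such as "a//b".
        if (pos > 0 && entryName[pos - 1] != '/')
            dirs.push_back(entryName.substr(0, pos));
    }
    return dirs;
}
```

An extraction loop could create each returned directory under the destination folder in order, and then open and write the real file only when IsDirectoryEntry returns false.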
Limitations
- Only ZIP files that use the deflate algorithm can be compressed and decompressed. (Such files should account for the vast majority, though.)
- Because of restrictions in the relevant minizip APIs and in the ZIP file format, the names of compressed and decompressed files must be representable in the system's current code page. (A recent revision of the ZIP format added an option for encoding file names in UTF-8, but there is no guarantee that every ZIP file you encounter uses the new format, and minizip does not seem to act on this option.)
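Since minizip does not appear to act on the UTF-8 option, a caller who wants to honor it would have to check it manually. A minimal sketch, assuming the flag in question is bit 11 of the entry's general-purpose bit flag (which minizip exposes as the flag field of unz_file_info64); kZipFlagUtf8 and EntryNameIsUtf8 are hypothetical names:

```cpp
#include <cstdint>

// Assumption: bit 11 of the general-purpose bit flag marks the entry's
// file name and comment as UTF-8 encoded.
const std::uint32_t kZipFlagUtf8 = 1u << 11;  // 0x0800

// Hypothetical helper: returns true when the entry declares a UTF-8 name.
// Otherwise the name should be treated as being in a legacy code page.
bool EntryNameIsUtf8(std::uint32_t generalPurposeFlag)
{
    return (generalPurposeFlag & kZipFlagUtf8) != 0;
}
```

When the check returns true, the name could be converted from UTF-8 rather than from the current code page before the real file is created.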
Conclusion
This is a mundane article with no deep ideas; it is just a note. If you spot any mistakes, corrections are welcome.
http://www.cppblog.com/Streamlet/archive/2010/09/22/127368.aspx