Detailed description of zip file format (1) -- file data format
----------------------------------------------------------------------------------
Document Description
ZIP is one of the most common compression formats. It has many users around the world thanks to its versatility and high compression ratio. This article briefly introduces the ZIP file format and algorithm. It is mainly based on the appnote.txt file provided by PKWARE (http://www.pkware.com); you can download appnote.zip from http://www.pkware.com/download.html to get this file.
Sleep Day ([email protected])
2002-10-28 16:32:25
This document serves only as a technical reference. I have tried my best to keep its content consistent with the original technical document in structure and description. Please excuse any errors that remain.
The author of this document is not responsible for any losses caused by using the information in it.
----------------------------------------------------------------------------------
General format of a ZIP file
----------------------
A ZIP file consists of three parts:
Compressed source file data area + compressed source file directory area + end mark of the compressed source file directory
1. Compressed source file data area
In this data area, each compressed source file or directory is one record. The record format is as follows:
[File header + file data + data descriptor]
A. File header structure
Composition                     Length
File header mark                4 bytes (0x04034b50)
Version needed to extract       2 bytes
General purpose bit flag        2 bytes
Compression method              2 bytes
Last modified file time         2 bytes
Last modified file date         2 bytes
CRC-32 checksum                 4 bytes
Compressed size                 4 bytes
Uncompressed size               4 bytes
File name length                2 bytes
Extended field length           2 bytes
File name                       (variable length)
Extended field                  (variable length)
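As an illustration of the layout above, here is a minimal Python sketch that reads one file header from a byte buffer. The function name parse_local_header and the dictionary keys are hypothetical names chosen for this example; the sketch assumes the whole archive is already in memory, that all multi-byte fields are little-endian, and that no ZIP64 extensions are involved.

```python
import struct

# Fixed 30-byte part of the file header: one 4-byte mark, five 2-byte
# fields, three 4-byte fields, two 2-byte lengths (all little-endian).
LOCAL_HEADER = struct.Struct("<IHHHHHIIIHH")
LOCAL_SIGNATURE = 0x04034B50


def parse_local_header(buf, offset=0):
    """Return (fields, data_offset) for the file header at `offset`.

    `fields` is a dict with hypothetical key names; `data_offset` is the
    position of the first byte of file data after the header.
    """
    (sig, version_needed, flags, method, mod_time, mod_date,
     crc32, comp_size, uncomp_size,
     name_len, extra_len) = LOCAL_HEADER.unpack_from(buf, offset)
    if sig != LOCAL_SIGNATURE:
        raise ValueError("no file header mark at offset %d" % offset)

    # The variable-length file name and extended field follow directly.
    name_start = offset + LOCAL_HEADER.size
    extra_start = name_start + name_len
    data_offset = extra_start + extra_len

    fields = {
        "version_needed": version_needed,
        "flags": flags,
        "method": method,
        "mod_time": mod_time,
        "mod_date": mod_date,
        "crc32": crc32,
        "compressed_size": comp_size,
        "uncompressed_size": uncomp_size,
        "name": bytes(buf[name_start:extra_start]),
        "extra": bytes(buf[extra_start:data_offset]),
    }
    return fields, data_offset
```

For a typical archive written to disk, calling parse_local_header(data, 0) on the raw bytes of the file should describe the first record, and the file data then occupies compressed_size bytes starting at data_offset.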
B. File data
C. Data descriptor
Composition                     Length
CRC-32 checksum                 4 bytes
Compressed size                 4 bytes
Uncompressed size               4 bytes
This data descriptor exists only when bit 3 of the general purpose bit flag is set to 1 (see the description below), and it immediately follows the last byte of the compressed data. It is used only when the output ZIP file is not seekable, for example when the ZIP file is written to a device that cannot seek, such as a tape drive. ZIP files on disk generally do not contain this data descriptor.
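The flag test and the 12-byte layout can be sketched in Python as follows. The names has_data_descriptor and parse_data_descriptor are hypothetical. One assumption goes beyond the table above: many writers place the 4-byte value 0x08074b50 in front of the three fields, so the sketch skips it when present. That check is only a heuristic, because a CRC-32 value could in principle equal the same number.

```python
import struct

FLAG_DATA_DESCRIPTOR = 0x0008      # bit 3 of the 2-byte flag field
DESCRIPTOR_SIGNATURE = 0x08074B50  # optional marker, an assumption here


def has_data_descriptor(flags):
    """True when bit 3 of the flag field is set."""
    return bool(flags & FLAG_DATA_DESCRIPTOR)


def parse_data_descriptor(buf, offset):
    """Return (crc32, compressed_size, uncompressed_size, next_offset).

    `offset` must point just past the last byte of compressed data.
    """
    (first_word,) = struct.unpack_from("<I", buf, offset)
    if first_word == DESCRIPTOR_SIGNATURE:
        offset += 4  # skip the optional marker (heuristic)
    crc32, comp_size, uncomp_size = struct.unpack_from("<III", buf, offset)
    return crc32, comp_size, uncomp_size, offset + 12
```

A reader would normally call has_data_descriptor on the flag field of the file header first, and only then look for the descriptor after the compressed data.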
2. Compressed source file directory area
Each record in this area corresponds to one record in the compressed source file data area.
Composition                     Length
Directory file header mark      4 bytes (0x02014b50)
Version made by                 2 bytes
Version needed to extract       2 bytes
General purpose bit flag        2 bytes
Compression method              2 bytes
Last modified file time         2 bytes
Last modified file date         2 bytes
CRC-32 checksum                 4 bytes
Compressed size                 4 bytes
Uncompressed size               4 bytes
File name length                2 bytes
Extended field length           2 bytes
File comment length             2 bytes
Disk number start               2 bytes
Internal file attributes        2 bytes
External file attributes        4 bytes
Relative offset of file header  4 bytes
File name                       (variable length)
Extended field                  (variable length)
File comment                    (variable length)
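A minimal Python sketch of walking these directory records is shown below. The name iter_central_directory and the dictionary keys are hypothetical; the sketch assumes the archive is in memory on a single disk, that fields are little-endian, and that the caller already knows where the directory starts and how many records it holds (both values are stored in the directory end record described in the next section).

```python
import struct

# Fixed 46-byte part of a directory record (little-endian).
CENTRAL_HEADER = struct.Struct("<IHHHHHHIIIHHHHHII")
CENTRAL_SIGNATURE = 0x02014B50


def iter_central_directory(buf, offset, count):
    """Yield one dict per directory record, starting at `offset`."""
    for _ in range(count):
        (sig, version_made_by, version_needed, flags, method,
         mod_time, mod_date, crc32, comp_size, uncomp_size,
         name_len, extra_len, comment_len, disk_start,
         internal_attr, external_attr,
         local_header_offset) = CENTRAL_HEADER.unpack_from(buf, offset)
        if sig != CENTRAL_SIGNATURE:
            raise ValueError("no directory record at offset %d" % offset)

        name_start = offset + CENTRAL_HEADER.size
        yield {
            "name": bytes(buf[name_start:name_start + name_len]),
            "flags": flags,
            "method": method,
            "crc32": crc32,
            "compressed_size": comp_size,
            "uncompressed_size": uncomp_size,
            "local_header_offset": local_header_offset,
        }
        # Skip file name, extended field and file comment to reach
        # the next record.
        offset = name_start + name_len + extra_len + comment_len
```

The local_header_offset value of each record points back at the matching file header in the data area, which is how a reader can jump straight to one file without scanning the whole archive.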
3. End mark of the compressed source file directory
Composition                                     Length
Directory end mark                              4 bytes (0x06054b50)
Current disk number                             2 bytes
Disk number where the directory starts          2 bytes
Number of directory records on this disk        2 bytes
Total number of directory records               2 bytes
Directory size                                  4 bytes
Offset of the directory start on its first disk 4 bytes
ZIP file comment length                         2 bytes
ZIP file comment                                (variable length)
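Because this record sits at the end of the file and is followed only by a comment of at most 65535 bytes, a reader usually finds it by searching backwards. The Python sketch below shows one simple way to do that, using the end mark 0x06054b50 defined in PKWARE's appnote.txt. The name find_end_of_central_directory and the dictionary keys are hypothetical, and the sketch assumes a single-disk archive held in memory without ZIP64 extensions. It also takes the last occurrence of the mark at face value, which can be fooled if the comment itself contains the same four bytes.

```python
import struct

# Fixed 22-byte part of the directory end record (little-endian).
EOCD = struct.Struct("<IHHHHIIH")
EOCD_MARK = b"PK\x05\x06"  # 0x06054b50 stored in little-endian order


def find_end_of_central_directory(buf):
    """Locate and decode the directory end record of an in-memory archive."""
    # The record can start no earlier than 22 + 65535 bytes from the end.
    tail_start = max(0, len(buf) - (EOCD.size + 0xFFFF))
    pos = buf.rfind(EOCD_MARK, tail_start)
    if pos < 0 or pos + EOCD.size > len(buf):
        raise ValueError("directory end mark not found")

    (_, disk_number, directory_disk, records_on_disk, total_records,
     directory_size, directory_offset,
     comment_len) = EOCD.unpack_from(buf, pos)
    comment_start = pos + EOCD.size
    return {
        "disk_number": disk_number,
        "directory_disk": directory_disk,
        "records_on_disk": records_on_disk,
        "total_records": total_records,
        "directory_size": directory_size,
        "directory_offset": directory_offset,
        "comment": bytes(buf[comment_start:comment_start + comment_len]),
    }
```

With directory_offset and total_records in hand, a reader can move to the start of the directory area and step through its records one by one.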