This is a detailed description of the ZIP archive format, based on a document downloaded from the Internet. The content follows. In practice, extracting a ZIP archive means reading each file's local header offset from the central directory structure, seeking to that local header, and then reading the file data that follows it. That data is passed to the decompression routine.
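The extraction flow described above can be sketched in Python roughly as follows. This is a minimal illustration, not a complete reader: the function name `extract_all` is hypothetical, and the sketch assumes a single-volume archive that is neither ZIP64 nor encrypted, uses only methods 0 (stored) and 8 (deflate), and has no bytes after the end of central directory record that happen to repeat its signature.

```python
import struct
import zlib


def extract_all(data):
    """Return {name: content} for a simple single-volume archive held in `data`."""
    # 1. Find the end of central directory record by searching from the end.
    eocd = data.rfind(b"PK\x05\x06")          # 0x06054b50, little-endian
    if eocd == -1:
        raise ValueError("end of central directory record not found")
    entry_count, cd_size, cd_offset = struct.unpack_from("<HII", data, eocd + 10)

    files = {}
    pos = cd_offset
    for _ in range(entry_count):
        # 2. Read one central directory file header (46 fixed bytes).
        if data[pos:pos + 4] != b"PK\x01\x02":
            raise ValueError("bad central file header signature")
        (method,) = struct.unpack_from("<H", data, pos + 10)
        crc32, csize, usize, name_len, extra_len, comment_len = \
            struct.unpack_from("<IIIHHH", data, pos + 16)
        (local_offset,) = struct.unpack_from("<I", data, pos + 42)
        name = data[pos + 46:pos + 46 + name_len]
        pos += 46 + name_len + extra_len + comment_len

        # 3. Jump to the local file header. Its name/extra lengths are read
        #    again because they may differ from the central directory copy.
        if data[local_offset:local_offset + 4] != b"PK\x03\x04":
            raise ValueError("bad local file header signature")
        l_name_len, l_extra_len = struct.unpack_from("<HH", data, local_offset + 26)
        start = local_offset + 30 + l_name_len + l_extra_len
        # Sizes come from the central directory, so entries written with a
        # data descriptor (bit 3 set) are still handled.
        raw = data[start:start + csize]

        # 4. Hand the data to the decompressor.
        if method == 0:
            content = raw
        elif method == 8:
            content = zlib.decompress(raw, -15)   # -15 selects raw deflate
        else:
            raise NotImplementedError("compression method %d" % method)
        if zlib.crc32(content) & 0xFFFFFFFF != crc32:
            raise ValueError("CRC-32 mismatch for %r" % name)
        files[name] = content
    return files
```

Names are returned as raw bytes because the sketch makes no assumption about the file name encoding.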
Files may be stored in arbitrary order. Large .ZIP files can span multiple
volumes or be split into user-defined segment sizes. All values
are stored in little-endian byte order unless otherwise specified.
Overall .ZIP file format:
[Local file header 1]
[File data 1]
[Data descriptor record 1]
.
.
.
[Local file header N]
[File data N]
[Data descriptor record N]
[Archive decryption header]
[Archive extra data record]
[Central directory structure]
[Zip64 end of central directory record]
[Zip64 end of central directory locator]
[End of central directory record]
-A local file header --------------------
Local file header signature 4 bytes [start 0] (0x04034b50)
Version needed to extract 2 bytes [start 4]
General purpose bit flag 2 bytes [start 6]
Compression method 2 bytes [start 8] (8 = deflate; 0 = stored, no compression)
Last mod file time 2 bytes [start 10]
Last mod file date 2 bytes [start 12]
CRC-32 4 bytes [start 14]
Compressed size 4 bytes [start 18]
Uncompressed size 4 bytes [start 22]
File name length 2 bytes [start 26]
Extra field length 2 bytes [start 28]
File name (variable size)
Extra field (variable size)
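The fixed part of the local file header is 30 bytes, so it maps directly onto a little-endian `struct` format. The following is a minimal Python sketch of reading it; the function name `parse_local_file_header` and the dictionary keys are illustrative choices, not part of the format.

```python
import struct

LOCAL_HEADER_FORMAT = "<IHHHHHIIIHH"            # "<" = little-endian, no padding
LOCAL_HEADER_SIZE = struct.calcsize(LOCAL_HEADER_FORMAT)   # 30 bytes
LOCAL_HEADER_SIGNATURE = 0x04034B50


def parse_local_file_header(buf, offset=0):
    """Parse the local file header that starts at `offset` in `buf`."""
    (signature, version_needed, flags, method, mod_time, mod_date,
     crc32, compressed_size, uncompressed_size,
     name_len, extra_len) = struct.unpack_from(LOCAL_HEADER_FORMAT, buf, offset)
    if signature != LOCAL_HEADER_SIGNATURE:
        raise ValueError("not a local file header")
    name_start = offset + LOCAL_HEADER_SIZE
    extra_start = name_start + name_len
    return {
        "version_needed": version_needed,
        "flags": flags,
        "method": method,
        "mod_time": mod_time,
        "mod_date": mod_date,
        "crc32": crc32,
        "compressed_size": compressed_size,
        "uncompressed_size": uncompressed_size,
        "name": buf[name_start:extra_start],
        "extra": buf[extra_start:extra_start + extra_len],
        # The file data begins right after the two variable-size fields.
        "data_offset": extra_start + extra_len,
    }
```

The returned `data_offset` is where the compressed or stored file data begins.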
-B file data --------------------
Immediately following the local header for a file is the
compressed or stored data for that file. The series of
[local file header] [file data] [data descriptor record]
repeats for each file in the .ZIP archive.
-C data descriptor record --------------------
CRC-32 4 bytes [start 0]
Compressed size 4 bytes [start 4]
Uncompressed size 4 bytes [start 8]
This descriptor exists only if bit 3 of the general
purpose bit flag is set (see below). It is byte aligned
and immediately follows the last byte of compressed data.
This descriptor is used only when it was not possible to
seek in the output .ZIP file, e.g., when the output .ZIP file
was standard output or a non-seekable device. For ZIP64(tm) format
archives, the compressed and uncompressed sizes are 8 bytes each.
When compressing files, compressed and uncompressed sizes
should be stored in ZIP64 format (as 8 byte values) when a
file's size exceeds 0xFFFFFFFF. However, ZIP64 format may be
used regardless of the size of a file. When extracting, if
the zip64 extended information extra field is present for
the file, the compressed and uncompressed sizes will be 8
byte values.
Although not originally assigned a signature, the value
0x08074b50 has commonly been adopted as a signature value
for the data descriptor record. Implementers should be
aware that ZIP files may be encountered with or without this
signature marking data descriptors and should account for
either case when reading ZIP files to ensure compatibility.
When writing ZIP files, it is recommended to include the
signature value marking the data descriptor record. When
the signature is used, the fields currently defined for
the data descriptor record will immediately follow the
signature.
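A reader that accepts both forms of the data descriptor could look something like the Python sketch below. The function name `parse_data_descriptor` and its `zip64` parameter are hypothetical. The sketch treats a leading 0x08074b50 as the optional signature, which is ambiguous in the rare case that the CRC-32 itself has that value; a more careful reader would cross-check the result against the central directory.

```python
import struct

DATA_DESCRIPTOR_SIGNATURE = 0x08074B50


def parse_data_descriptor(buf, offset=0, zip64=False):
    """Return (crc32, compressed_size, uncompressed_size, bytes_consumed)."""
    # Sizes are 8 bytes each in ZIP64 archives, 4 bytes otherwise.
    fields = "<IQQ" if zip64 else "<III"
    consumed = 0
    (first,) = struct.unpack_from("<I", buf, offset)
    if first == DATA_DESCRIPTOR_SIGNATURE:
        # Optional signature present: the defined fields follow it directly.
        offset += 4
        consumed = 4
    crc32, compressed_size, uncompressed_size = struct.unpack_from(fields, buf, offset)
    return crc32, compressed_size, uncompressed_size, consumed + struct.calcsize(fields)
```

The last element of the result tells the caller how far to advance to reach the next local file header.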
An extensible data descriptor will be released in a future
version of this APPNOTE. This new record is intended to
resolve conflicts with the use of this record going forward,
and to provide better support for streamed file processing.
When the central directory encryption method is used, the data
descriptor record is not required, but may be used. If present,
and bit 3 of the general purpose bit field is set to indicate
its presence, the values in fields of the data descriptor
record should be set to binary zeros.
-D archive decryption header --------------------
The archive decryption header is introduced in version 6.2
of the ZIP format specification. This record exists in support
of the central directory encryption feature implemented as part of
the strong encryption specification as described in this document.
When the central directory structure is encrypted, this decryption
header will precede the encrypted data segment. The encrypted
data segment will consist of the archive extra data record (if
present) and the encrypted central directory structure data.
The format of this data record is identical to the decryption
header record preceding compressed file data. If the central
directory structure is encrypted, the location of the start of
this data record is determined using the start of central directory
field in the zip64 end of central directory record. Refer to the
section on the strong encryption specification for information
on the fields used in the archive decryption header record.
-E archive extra data record --------------------
Archive extra data signature 4 bytes [start 0] (0x08064b50)
Extra field length 4 bytes [start 4]
Extra field data (variable size) [start 8]
The archive extra data record is introduced in version 6.2
of the ZIP format specification. This record exists in support
of the central directory encryption feature implemented as part of
the strong encryption specification as described in this document.
When present, this record immediately precedes the central
directory data structure. The size of this data record will be
included in the size of the central directory field in the
end of central directory record. If the central directory structure
is compressed, but not encrypted, the location of the start of
this data record is determined using the start of central directory
field in the zip64 end of central directory record.
-F central directory structure --------------------
[File Header 1]
.
.
.
[File Header N]
[Digital Signature]
File header:
Central file header signature 4 bytes [start 0] (0x02014b50)
Version made by 2 bytes [start 4]
Version needed to extract 2 bytes [start 6]
General purpose bit flag 2 bytes [start 8]
Compression method 2 bytes [start 10] (8 = deflate; 0 = stored, no compression)
Last mod file time 2 bytes [start 12]
Last mod file date 2 bytes [start 14]
CRC-32 4 bytes [start 16]
Compressed size 4 bytes [start 20]
Uncompressed size 4 bytes [start 24]
File name length 2 bytes [start 28]
Extra field length 2 bytes [start 30]
File comment length 2 bytes [start 32]
Disk number start 2 bytes [start 34]
Internal file attributes 2 bytes [start 36]
External file attributes 4 bytes [start 38]
Relative offset of local header 4 bytes [start 42]
File name (variable size)
Extra field (variable size)
File comment (variable size)
Digital signature:
Header signature 4 bytes [start 0] (0x05054b50)
Size of data 2 bytes [start 4]
Signature data (variable size)
With the introduction of the central directory encryption
feature in version 6.2 of this specification, the central
directory structure may be stored both compressed and encrypted.
Although not required, it is assumed when encrypting the
central directory structure, that it will be compressed
for greater storage efficiency. Information on the
central directory encryption feature can be found in the section
describing the strong encryption specification. The digital
signature record will be neither compressed nor encrypted.
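For the common case where the central directory is neither compressed nor encrypted, its file headers can be walked one after another, since each is 46 fixed bytes followed by the three variable-size fields. The Python sketch below illustrates this; the name `parse_central_directory` and the dictionary keys are illustrative, and the caller is assumed to already know the directory offset and entry count from the end of central directory record.

```python
import struct

CENTRAL_HEADER_FORMAT = "<I6H3I5H2I"             # little-endian, no padding
CENTRAL_HEADER_SIZE = struct.calcsize(CENTRAL_HEADER_FORMAT)   # 46 bytes
CENTRAL_HEADER_SIGNATURE = 0x02014B50


def parse_central_directory(buf, offset, entry_count):
    """Parse `entry_count` central file headers starting at `offset`."""
    entries = []
    for _ in range(entry_count):
        (signature, made_by, version_needed, flags, method, mod_time, mod_date,
         crc32, compressed_size, uncompressed_size,
         name_len, extra_len, comment_len, disk_start, internal_attrs,
         external_attrs, local_header_offset) = struct.unpack_from(
            CENTRAL_HEADER_FORMAT, buf, offset)
        if signature != CENTRAL_HEADER_SIGNATURE:
            raise ValueError("bad central file header at offset %d" % offset)
        name_start = offset + CENTRAL_HEADER_SIZE
        entries.append({
            "name": buf[name_start:name_start + name_len],
            "flags": flags,
            "method": method,
            "crc32": crc32,
            "compressed_size": compressed_size,
            "uncompressed_size": uncompressed_size,
            "disk_start": disk_start,
            "local_header_offset": local_header_offset,
        })
        # Skip the file name, extra field and file comment to reach the next header.
        offset = name_start + name_len + extra_len + comment_len
    return entries
```

Each entry's `local_header_offset` is the position to seek to when reading that file's local header and data.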
-G zip64 end of central directory record --------------------
Zip64 end of central directory signature 4 bytes [start 0] (0x06064b50)
Size of zip64 end of central directory record 8 bytes [start 4]
Version made by 2 bytes [start 12]
Version needed to extract 2 bytes [start 14]
Number of this disk 4 bytes [start 16]
Number of the disk with the start of the central directory 4 bytes [start 20]
Total number of entries in the central directory on this disk 8 bytes [start 24]
Total number of entries in the central directory 8 bytes [start 32]
Size of the central directory 8 bytes [start 40]
Offset of start of central directory with respect to the starting disk number 8 bytes [start 48]
Zip64 extensible data sector (variable size) [start 56]
The value stored into the "size of zip64 end of central
directory record" should be the size of the remaining
record and should not include the leading 12 bytes.
Size = SizeOfFixedFields + SizeOfVariableData - 12.
The above record structure defines version 1 of the
zip64 end of central directory record. Version 1 was
implemented in versions of this specification preceding
6.2 in support of the ZIP64 large file feature. The
introduction of the central directory encryption feature
implemented in version 6.2 as part of the strong encryption
specification defines version 2 of this record structure.
Refer to the section describing the strong encryption
specification for details on the version 2 format for
this record.
Special purpose data may reside in the zip64 extensible data
sector field following either a V1 or V2 version of this
record. To ensure identification of this special purpose data
it must include an identifying header block consisting of the
following:
Header ID - 2 bytes
Data size - 4 bytes
The header ID field indicates the type of data that is in the
data block that follows.
Data size identifies the number of bytes that follow for this
data block type.
Multiple special purpose data blocks may be present, but each
must be preceded by a header ID and data size field. Current
mappings of header ID values supported in this field are
defined in Appendix C.
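The size rule and the extensible data sector layout described above can be illustrated with a short Python sketch of a version 1 record reader. The name `parse_zip64_eocd` and the dictionary keys are hypothetical. The sketch uses the stored size field to find where the record ends, then walks the header ID / data size blocks without interpreting them.

```python
import struct

ZIP64_EOCD_FORMAT = "<IQHHIIQQQQ"                # little-endian, no padding
ZIP64_EOCD_FIXED_SIZE = struct.calcsize(ZIP64_EOCD_FORMAT)   # 56 bytes
ZIP64_EOCD_SIGNATURE = 0x06064B50


def parse_zip64_eocd(buf, offset=0):
    """Parse a version 1 zip64 end of central directory record."""
    (signature, record_size, made_by, version_needed, disk_number,
     cd_start_disk, entries_on_disk, entries_total,
     cd_size, cd_offset) = struct.unpack_from(ZIP64_EOCD_FORMAT, buf, offset)
    if signature != ZIP64_EOCD_SIGNATURE:
        raise ValueError("not a zip64 end of central directory record")
    # record_size excludes the leading 12 bytes (signature + size field).
    end = offset + 12 + record_size
    blocks = []
    pos = offset + ZIP64_EOCD_FIXED_SIZE
    while pos + 6 <= end:
        # Each special purpose block: 2-byte header ID, 4-byte data size, data.
        header_id, data_size = struct.unpack_from("<HI", buf, pos)
        blocks.append((header_id, buf[pos + 6:pos + 6 + data_size]))
        pos += 6 + data_size
    return {
        "disk_number": disk_number,
        "cd_start_disk": cd_start_disk,
        "entries_on_disk": entries_on_disk,
        "entries_total": entries_total,
        "cd_size": cd_size,
        "cd_offset": cd_offset,
        "extensible_blocks": blocks,
    }
```

For a record with no extensible data, the stored size is 56 - 12 = 44 and the block list is empty.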
-H zip64 end of central directory locator --------------------
Zip64 end of central directory locator signature 4 bytes [start 0] (0x07064b50)
Number of the disk with the start of the zip64 end of central directory 4 bytes [start 4]
Relative offset of the zip64 end of central directory record 8 bytes [start 8]
Total number of disks 4 bytes [start 16]
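Because the locator has a fixed size of 20 bytes and, per the overall layout, sits directly before the end of central directory record, a reader can check for it by stepping back from that record. A minimal Python sketch follows, assuming the caller already knows the offset of the end of central directory record; the name `parse_zip64_locator` is hypothetical.

```python
import struct

ZIP64_LOCATOR_FORMAT = "<IIQI"                   # little-endian, no padding
ZIP64_LOCATOR_SIZE = struct.calcsize(ZIP64_LOCATOR_FORMAT)   # 20 bytes
ZIP64_LOCATOR_SIGNATURE = 0x07064B50


def parse_zip64_locator(data, eocd_offset):
    """Return the locator just before `eocd_offset`, or None if absent."""
    start = eocd_offset - ZIP64_LOCATOR_SIZE
    if start < 0:
        return None
    signature, start_disk, record_offset, total_disks = struct.unpack_from(
        ZIP64_LOCATOR_FORMAT, data, start)
    if signature != ZIP64_LOCATOR_SIGNATURE:
        return None          # no locator here: not a ZIP64 archive
    return {
        "start_disk": start_disk,
        "zip64_eocd_offset": record_offset,
        "total_disks": total_disks,
    }
```

A `None` result simply means the archive should be read using the regular end of central directory record alone.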
-I end of central directory record --------------------
End of central directory signature 4 bytes [start 0] (0x06054b50) Note: locate this signature by scanning backward from the end of the file.
Number of this disk 2 bytes [start 4]
Number of the disk with the start of the central directory 2 bytes [start 6]
Total number of entries in the central directory on this disk 2 bytes [start 8]
Total number of entries in the central directory 2 bytes [start 10] Note: this is the total number of files; a folder also counts as a file.
Size of the central directory 4 bytes [start 12]
Offset of start of central directory with respect to the starting disk number 4 bytes [start 16]
.ZIP file comment length 2 bytes [start 20]
.ZIP file comment (variable size) [start 22] Note: when reading in ActionScript, use ByteArray.readMultiByte() with "gb2312" as the second parameter so that Chinese characters in the comment are decoded correctly.
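The backward search for this record can be sketched in Python as follows. The function name `find_eocd` is hypothetical. Since the comment length is a 2-byte field, the record must start within the last 22 + 65535 bytes of the file. The sketch accepts a candidate only when its comment runs exactly to the end of the data, which is a strict assumption that rejects archives with trailing bytes.

```python
import struct

EOCD_FORMAT = "<IHHHHIIH"                  # fixed part, little-endian
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)   # 22 bytes
EOCD_SIGNATURE = b"PK\x05\x06"             # 0x06054b50 as stored on disk
MAX_COMMENT = 0xFFFF                       # comment length is a 2-byte field


def find_eocd(data):
    """Locate and parse the end of central directory record in `data`."""
    tail_start = max(0, len(data) - (EOCD_SIZE + MAX_COMMENT))
    pos = data.rfind(EOCD_SIGNATURE, tail_start)
    while pos != -1:
        if pos + EOCD_SIZE <= len(data):
            (_, disk_number, cd_start_disk, entries_on_disk, entries_total,
             cd_size, cd_offset, comment_len) = struct.unpack_from(
                EOCD_FORMAT, data, pos)
            # Accept the candidate only if its comment ends exactly at EOF.
            if pos + EOCD_SIZE + comment_len == len(data):
                return {
                    "offset": pos,
                    "disk_number": disk_number,
                    "cd_start_disk": cd_start_disk,
                    "entries_on_disk": entries_on_disk,
                    "entries_total": entries_total,
                    "cd_size": cd_size,
                    "cd_offset": cd_offset,
                    "comment": data[pos + EOCD_SIZE:],
                }
        # The bytes matched inside a comment or file data: keep scanning back.
        pos = data.rfind(EOCD_SIGNATURE, tail_start, pos)
    raise ValueError("end of central directory record not found")
```

The comment is returned as raw bytes; for an archive whose comment is known to be GB2312-encoded, `result["comment"].decode("gb2312")` plays the same role as the multi-byte read mentioned in the note above.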