Python zip file compression

Source: Internet
Author: User
Tags: zipinfo, pkware
For everyday compression the ZIP format is a good choice, and Python's zipfile module makes it simple to use.
1) Simple application
If you just want to compress and decompress files with Python, you don't have to read through the documentation. Here is a simple usage example.
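import zipfile
# Create the archive and add three files, compressed with Deflate
f = zipfile.ZipFile('filename', 'w', zipfile.ZIP_DEFLATED)
f.write('file1.txt')
f.write('file2.doc')
f.write('file3.rar')
f.close()
# Reopen the archive (read mode is the default) and extract everything
f = zipfile.ZipFile('filename')
f.extractall()
f.close()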
Is that example simple enough?
1.1 zipfile.ZipFile(filename[, mode[, compression[, allowZip64]]])
filename needs no explanation: it is the path of the ZIP file.
mode works like the mode of an ordinary file: 'r' opens an existing ZIP file read-only; 'w' creates a new ZIP file for writing, truncating any existing one; 'a' opens an existing ZIP file and appends to it.
compression selects the compression method. The two classic choices are ZIP_STORED and ZIP_DEFLATED. ZIP_STORED is the default and means no compression; ZIP_DEFLATED applies the Deflate algorithm and requires the zlib module. If Deflate is unfamiliar, it is worth reading up on. Python 3.3 and later also accept ZIP_BZIP2 and ZIP_LZMA.
allowZip64 set to True enables the ZIP64 extensions, which are needed once the archive grows beyond 2 GB. In Python 2 it defaults to False because the default zip and unzip commands on Unix did not support ZIP64; since Python 3.4 it defaults to True.
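As a rough illustration of the three modes and the two classic compression constants, here is a minimal sketch. The file names demo.zip, notes.txt and notes_copy.txt are made up for the example, and everything happens in a temporary directory.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as workdir:
    archive = os.path.join(workdir, "demo.zip")   # hypothetical archive name
    source = os.path.join(workdir, "notes.txt")   # hypothetical input file
    with open(source, "w") as fh:
        fh.write("hello zip\n" * 100)

    # 'w' creates (or truncates) the archive; ZIP_DEFLATED compresses entries.
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(source, "notes.txt")

    # 'a' appends to the existing archive; ZIP_STORED stores entries as-is.
    with zipfile.ZipFile(archive, "a", zipfile.ZIP_STORED) as zf:
        zf.write(source, "notes_copy.txt")

    # 'r' opens the archive read-only.
    with zipfile.ZipFile(archive, "r") as zf:
        names = zf.namelist()
        sizes = {info.filename: info.compress_size for info in zf.infolist()}

print(names)
print(sizes)  # the deflated entry should be smaller than the stored copy
```

The with statement closes the archive automatically, which is usually safer than calling close() by hand.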
1.2 ZipFile.close()
There is not much to say here, except for one important point: the archive's essential records (the central directory) are not written until close() is called, so an archive that is never closed is incomplete.
1.3 ZipFile.write(filename[, arcname[, compress_type]])
arcname is the name the file will have inside the archive; by default it is the same as filename.
compress_type overrides the archive's compression method for this one file. It exists because each entry in a ZIP file may use a different compression method.
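To make arcname and compress_type concrete, here is a small sketch that stores the same file twice under different archive names, once deflated and once uncompressed. The paths report.txt, docs/report.txt and raw/report.txt are invented for the example.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "report.txt")     # hypothetical input file
    with open(src, "wb") as fh:
        fh.write(b"a" * 2000)

    archive = os.path.join(workdir, "mixed.zip")  # hypothetical archive name
    # No compression argument, so the archive default is ZIP_STORED.
    with zipfile.ZipFile(archive, "w") as zf:
        # Per-file override: this entry is deflated and renamed.
        zf.write(src, arcname="docs/report.txt",
                 compress_type=zipfile.ZIP_DEFLATED)
        # This entry falls back to the archive default (stored).
        zf.write(src, arcname="raw/report.txt")

    with zipfile.ZipFile(archive) as zf:
        types = {info.filename: info.compress_type for info in zf.infolist()}

print(types)
```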
1.4 ZipFile.extractall([path[, members[, pwd]]])
path is the directory to extract into; it defaults to the current working directory.
members is the list of file names to extract; it defaults to every file in the archive.
pwd is the password, needed only when the ZIP file is encrypted.
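A minimal sketch of extracting only some members into a chosen directory follows. The archive bundle.zip, its three entries and the target directory out are assumptions made for the example.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as workdir:
    archive = os.path.join(workdir, "bundle.zip")  # hypothetical archive
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("a.txt", "first")
        zf.writestr("b.txt", "second")
        zf.writestr("c.txt", "third")

    target = os.path.join(workdir, "out")          # hypothetical target dir
    with zipfile.ZipFile(archive) as zf:
        # Extract two of the three members; the target directory is created.
        zf.extractall(path=target, members=["a.txt", "c.txt"])

    extracted = sorted(os.listdir(target))

print(extracted)
```

For an encrypted archive the password would be passed as pwd; in Python 3 it must be a bytes object.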
For simple applications, this much is enough.
2) Advanced Application
2.1 zipfile.is_zipfile(filename)
Returns True if the file is a valid ZIP file (judged by its magic number), otherwise False.
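A quick sketch of the check, using one real archive and one plain text file (both file names are made up for the example):

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as workdir:
    good = os.path.join(workdir, "good.zip")       # hypothetical names
    bad = os.path.join(workdir, "not_a_zip.txt")

    with zipfile.ZipFile(good, "w") as zf:
        zf.writestr("x.txt", "data")
    with open(bad, "w") as fh:
        fh.write("plain text")

    good_result = zipfile.is_zipfile(good)
    bad_result = zipfile.is_zipfile(bad)

print(good_result, bad_result)
```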
2.2 ZipFile.namelist()
Returns a list of the names of the files in the archive.
2.3 ZipFile.open(name[, mode[, pwd]])
Opens a file inside the archive and returns a file-like object.
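Opening a single member without extracting it to disk might look like this sketch, which uses the ZipFile.open method of the standard library. The archive is built in memory and the member name log.txt is invented for the example.

```python
import io
import zipfile

buf = io.BytesIO()  # in-memory archive, so nothing touches the disk
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("log.txt", "line one\nline two\n")

with zipfile.ZipFile(buf) as zf:
    # open() returns a binary file-like object for one member.
    with zf.open("log.txt") as member:
        first_line = member.readline()
        rest = member.read()

print(first_line)
print(rest)
```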
2.4 ZipFile.infolist()
2.5 ZipFile.getinfo(name)
Both return ZipInfo objects: infolist() returns a list with one ZipInfo per archive member, and getinfo() returns the single ZipInfo for the named member.
ZipInfo class
2.6 ZipInfo.filename
2.7 ZipInfo.date_time
The value is a tuple of the form (year, month, day, hour, minute, second).
2.8 ZipInfo.compress_type
2.9 ZipInfo.comment
2.13 ZipInfo.reserved: always 0
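The following sketch writes one entry with explicit metadata and reads it back through infolist() and getinfo(). The entry name, timestamp and comment are all made up. An even number of seconds is used because ZIP timestamps have a two-second resolution.

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    # Build the metadata by hand (all values are hypothetical).
    info = zipfile.ZipInfo("hello.txt", date_time=(2020, 1, 2, 3, 4, 6))
    info.compress_type = zipfile.ZIP_DEFLATED
    info.comment = b"greeting"
    zf.writestr(info, "hello " * 50)

with zipfile.ZipFile(buf) as zf:
    entries = zf.infolist()          # list of ZipInfo, one per member
    one = zf.getinfo("hello.txt")    # a single ZipInfo

print(one.filename, one.date_time, one.compress_type, one.comment)
```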
2.22 ZipFile.testzip()
Reads every file in the archive and checks its CRC. Returns the name of the first bad file, or None if there are no errors.
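To see testzip() react to damage, the sketch below builds a small uncompressed archive in memory and then flips one byte of a member's data. The entry names and contents are invented, and the byte replacement only works because the entries are stored uncompressed.

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("a.txt", b"alpha")
    zf.writestr("b.txt", b"bravo")
good_bytes = buf.getvalue()

with zipfile.ZipFile(io.BytesIO(good_bytes)) as zf:
    clean_result = zf.testzip()      # no errors expected

# Corrupt the stored data of b.txt; its recorded CRC no longer matches.
bad_bytes = good_bytes.replace(b"bravo", b"brav0", 1)
with zipfile.ZipFile(io.BytesIO(bad_bytes)) as zf:
    damaged_result = zf.testzip()    # name of the first bad member

print(clean_result, damaged_result)
```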
2.23 ZipFile.setpassword(pwd)
Sets the default password used to extract encrypted files.
2.24 ZipFile.read(name[, pwd])
Returns the bytes of the named file.
2.25 ZipFile.printdir()
Prints a table of contents for the archive.
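2.26 ZipFile.writestr(zinfo_or_arcname, bytes)
Writes the given bytes into the archive, either under the name arcname or using the metadata of the given ZipInfo object.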
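writestr is handy when the data already lives in memory and there is no file on disk to add. A small sketch, with made-up entry names:

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    # A str is encoded as UTF-8 in Python 3; bytes are written unchanged.
    zf.writestr("config/settings.json", '{"debug": true}')
    zf.writestr("blob.bin", b"\x00\x01\x02")

with zipfile.ZipFile(buf) as zf:
    settings = zf.read("config/settings.json")
    blob = zf.read("blob.bin")

print(settings, blob)
```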
PyZipFile class
zipfile.PyZipFile has all the methods and attributes above, plus one special method:
2.27 PyZipFile.writepy(pathname, basename)
It adds only the compiled .pyc or .pyo files to the archive, not the .py source files.
ZIP file format information
A ZIP file consists of three parts: the file data area, the central directory, and the end of central directory record.
1) File data area
In this area each compressed file or directory is one record, laid out as follows: [file header + file data + data descriptor]
a) File header structure
Field                           Length
Local file header signature     4 bytes (0x04034B50)
Version needed to extract       2 bytes
General purpose bit flag        2 bytes
Compression method              2 bytes
Last modified file time         2 bytes
Last modified file date         2 bytes
CRC-32                          4 bytes
Compressed size                 4 bytes
Uncompressed size               4 bytes
File name length                2 bytes
Extra field length              2 bytes
File name                       variable
Extra field                     variable
b) File data
c) Data descriptor
Field                           Length
CRC-32                          4 bytes
Compressed size                 4 bytes
Uncompressed size               4 bytes
The data descriptor is present only when bit 3 of the general purpose bit flag is set, and it immediately follows the last byte of the compressed data. In that case the CRC-32 and size fields in the file header are set to zero. The descriptor is used only when the ZIP file is written to an output that cannot seek, such as standard output or a tape drive. ZIP files on disk normally do not have a data descriptor.
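The fixed part of the file header is 30 bytes, all little-endian, so it can be unpacked with the struct module. The sketch below builds a tiny uncompressed archive in memory (the entry name hi.txt is invented) and decodes the first file header. The variable names are my own labels for the fields listed above.

```python
import io
import struct
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("hi.txt", b"hello")
raw = buf.getvalue()

# signature, version, flags, method, time, date, CRC, sizes, lengths
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")   # 30 bytes in total
(signature, version_needed, flags, method, mod_time, mod_date,
 crc, compressed_size, uncompressed_size,
 name_len, extra_len) = LOCAL_HEADER.unpack_from(raw, 0)

# The file name and extra field follow the fixed part, then the data.
name_start = LOCAL_HEADER.size
name = raw[name_start:name_start + name_len]
data_start = name_start + name_len + extra_len
payload = raw[data_start:data_start + compressed_size]

# Bit 3 of the flag says whether a data descriptor follows the data.
has_data_descriptor = bool(flags & 0x08)

print(signature, name, method, payload, has_data_descriptor)
```

The signature 0x04034B50 appears on disk as the bytes PK\x03\x04 because the value is stored little-endian.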
2) Central directory
In this area each record corresponds to one record in the file data area.
Field                            Length
Central file header signature    4 bytes (0x02014B50)
Version made by                  2 bytes
Version needed to extract        2 bytes
General purpose bit flag         2 bytes
Compression method               2 bytes
Last modified file time          2 bytes
Last modified file date          2 bytes
CRC-32                           4 bytes
Compressed size                  4 bytes
Uncompressed size                4 bytes
File name length                 2 bytes
Extra field length               2 bytes
File comment length              2 bytes
Disk number start                2 bytes
Internal file attributes         2 bytes
External file attributes         4 bytes
Relative offset of local header  4 bytes
File name                        variable
Extra field                      variable
File comment                     variable
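A directory record has a 46-byte fixed part that can be decoded the same way. In this sketch the record is located by searching for its signature bytes, which is an assumption that holds for this tiny demo archive but is not a robust method in general, since file data may contain the same bytes. The entry name hi.txt is invented.

```python
import io
import struct
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("hi.txt", b"hello")
raw = buf.getvalue()

# signature, 6 shorts, 3 longs, 5 shorts, 2 longs = 46 bytes
CENTRAL_HEADER = struct.Struct("<4s6H3I5H2I")

# Demo-only shortcut: find the first directory record by its signature.
start = raw.find(b"PK\x01\x02")
fields = CENTRAL_HEADER.unpack_from(raw, start)

signature = fields[0]
name_len, extra_len, comment_len = fields[10], fields[11], fields[12]
local_header_offset = fields[16]

name_start = start + CENTRAL_HEADER.size
name = raw[name_start:name_start + name_len]

print(signature, name, local_header_offset)
```

The last fixed field points back at the matching file header in the data area, which is how a reader jumps from the directory to the actual data.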
3) End of central directory record
Field                                          Length
End of central directory signature             4 bytes (0x06054B50)
Number of this disk                            2 bytes
Disk where the central directory starts        2 bytes
Number of directory records on this disk       2 bytes
Total number of directory records              2 bytes
Size of the central directory                  4 bytes
Offset of the central directory from the start of the first disk  4 bytes
ZIP file comment length                        2 bytes
ZIP file comment                               variable
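Readers usually start from this final record, because it says where the directory begins. The sketch below assumes the archive has no ZIP file comment, so the record is exactly the last 22 bytes. With a comment, a reader would have to search backwards for the signature instead. The entry names are invented.

```python
import io
import struct
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("a.txt", b"one")
    zf.writestr("b.txt", b"two")
raw = buf.getvalue()

# signature, 4 shorts, 2 longs, comment length = 22 bytes
END_RECORD = struct.Struct("<4s4H2IH")

# Assumption: no archive comment, so the record sits at the very end.
(signature, disk_no, cd_start_disk, records_on_disk, total_records,
 cd_size, cd_offset, comment_len) = END_RECORD.unpack_from(
    raw, len(raw) - END_RECORD.size)

# cd_offset should point at the first directory record (PK\x01\x02).
first_directory_signature = raw[cd_offset:cd_offset + 4]

print(signature, total_records, cd_size, cd_offset)
```

On disk the end record signature is the bytes PK\x05\x06, which is 0x06054B50 stored little-endian.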