This article introduces how to use the zlib module in Python for data compression. It covers basic knowledge for readers getting started with Python; those who need it can use it as a reference.
The Python standard library contains several modules for data compression and decompression, such as zipfile, gzip, and bz2. The previous article introduced the zipfile module; this one covers the zlib module.
zlib.compress(string[, level])
zlib.decompress(string[, wbits[, bufsize]])
zlib.compress is used to compress a stream of data. The parameter string specifies the data to compress, and the parameter level specifies the compression level, which ranges from 1 to 9. Compression speed is inversely proportional to the compression ratio: 1 is the fastest but gives the lowest compression ratio, while 9 is the slowest but gives the highest compression ratio. zlib.decompress is used to decompress data. The parameter string specifies the data to be decompressed, while wbits sets the size of the history window (window buffer) and bufsize sets the initial size of the output buffer. Here is an example illustrating how to use both functions:
#coding=gbk
import zlib, urllib
fp = urllib.urlopen('http://localhost/default.html')
str = fp.read()
fp.close()
#---- compress the data stream
str1 = zlib.compress(str, zlib.Z_BEST_COMPRESSION)
str2 = zlib.decompress(str1)
print len(str)
print len(str1)
print len(str2)
#---- results
#5783
#1531
#5783
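As a side note on the level parameter, here is a minimal, self-contained sketch that compares level 1 with level 9 on the same input. It uses an in-memory sample string instead of data fetched from a web server, so it does not depend on http://localhost/default.html being available; the exact sizes printed depend entirely on the data being compressed.
import zlib

# repetitive sample data, so the effect of the compression level is visible
sample = 'hello zlib ' * 1000

fast = zlib.compress(sample, 1)                        # level 1: fastest, lowest ratio
best = zlib.compress(sample, zlib.Z_BEST_COMPRESSION)  # level 9: slowest, highest ratio

print 'original length:', len(sample)
print 'level 1 length:', len(fast)
print 'level 9 length:', len(best)
print 'round trip ok:', zlib.decompress(best) == sample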
We can also use Compress/Decompress objects to compress and decompress data. zlib.compressobj([level]) and zlib.decompressobj([wbits]) create a compression object and a decompression object, respectively. Compressing and decompressing data through these objects works very much like zlib.compress and zlib.decompress described above. The difference shows up mainly when handling large amounts of data. Suppose you want to compress a very large data file (hundreds of MB): with zlib.compress you must first read all of the file's data into memory and then compress it, which is bound to consume too much memory. With a compression object, there is no need to read the whole file at once; you can read a portion of the data into memory, compress it, write the compressed result to the output file, then read and compress the next portion, and repeat this cycle until the entire file has been compressed. Here is an example illustrating the difference:
#coding=gbk
import zlib, urllib
fp = urllib.urlopen('http://localhost/default.html')   # fetch the URL
data = fp.read()
fp.close()
#---- compress the data stream
str1 = zlib.compress(data, zlib.Z_BEST_COMPRESSION)
str2 = zlib.decompress(str1)
print 'raw data length:', len(data)
print '-' * 30
print 'after zlib.compress:', len(str1)
print 'after zlib.decompress:', len(str2)
print '-' * 30
#---- use Compress/Decompress objects to compress/decompress the data stream
com_obj = zlib.compressobj(zlib.Z_BEST_COMPRESSION)
decom_obj = zlib.decompressobj()
str_obj = com_obj.compress(data)
str_obj += com_obj.flush()
print 'after Compress.compress:', len(str_obj)
str_obj1 = decom_obj.decompress(str_obj)
str_obj1 += decom_obj.flush()
print 'after Decompress.decompress:', len(str_obj1)
print '-' * 30
#---- use Compress/Decompress objects to compress/decompress the data in chunks
com_obj1 = zlib.compressobj(zlib.Z_BEST_COMPRESSION)
decom_obj1 = zlib.decompressobj()
chunk_size = 30
# split the original data into chunks
str_chunks = [data[i * chunk_size: (i + 1) * chunk_size] \
              for i in range((len(data) + chunk_size) / chunk_size)]
str_obj2 = ''
for chunk in str_chunks:
    str_obj2 += com_obj1.compress(chunk)
str_obj2 += com_obj1.flush()
print 'after chunked compression:', len(str_obj2)
# split the compressed data into chunks and decompress them
str_chunks = [str_obj2[i * chunk_size: (i + 1) * chunk_size] \
              for i in range((len(str_obj2) + chunk_size) / chunk_size)]
str_obj2 = ''
for chunk in str_chunks:
    str_obj2 += decom_obj1.decompress(chunk)
str_obj2 += decom_obj1.flush()
print 'after chunked decompression:', len(str_obj2)
#---- results ------------------------
raw data length: 5783
------------------------------
after zlib.compress: 1531
after zlib.decompress: 5783
------------------------------
after Compress.compress: 1531
after Decompress.decompress: 5783
------------------------------
after chunked compression: 1531
after chunked decompression: 5783
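The example above still keeps all of the data in memory and only simulates chunking. As a rough sketch of the large-file scenario described earlier, the following code (using the hypothetical file names big_input.dat and big_input.dat.zlib and an arbitrary 16 KB chunk size) reads a file piece by piece, compresses each piece with a Compress object, and writes the compressed output as it goes, so the whole file never has to be loaded into memory at once.
import zlib

CHUNK = 16 * 1024          # read 16 KB at a time (illustrative value)
com_obj = zlib.compressobj(zlib.Z_BEST_COMPRESSION)

fin = open('big_input.dat', 'rb')        # hypothetical large input file
fout = open('big_input.dat.zlib', 'wb')  # compressed output file
while True:
    chunk = fin.read(CHUNK)
    if not chunk:
        break
    fout.write(com_obj.compress(chunk))  # compress only this chunk
fout.write(com_obj.flush())              # flush any data still buffered in the object
fin.close()
fout.close()
Decompressing such a file works the same way in reverse, feeding chunks of the compressed file to a Decompress object created with zlib.decompressobj().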
The Python manual provides a more detailed and specific introduction to the zlib module; refer to it for further information.