A tutorial on data compression using the Zlib module in Python

Source: Internet
Author: User
Tags: manual flush in python

This article introduces how to use the zlib module in Python for data compression. It covers basic knowledge suitable for Python beginners; those who need it can refer to it.

The Python standard library contains several modules for data compression and decompression, such as zipfile, gzip, and bz2. The previous article introduced the zipfile module; today we talk about the zlib module.

zlib.compress(string[, level])
zlib.decompress(string[, wbits[, bufsize]])

zlib.compress is used to compress data in one shot. The parameter string specifies the data to compress, and the parameter level specifies the compression level, which ranges from 1 to 9. Compression speed and compression ratio are inversely proportional: 1 means the fastest compression with the lowest compression ratio, while 9 means the slowest compression with the highest compression ratio. zlib.decompress is used to decompress data. The parameter string specifies the data to decompress, while wbits and bufsize set the size of the history window (window buffer) and the initial size of the output buffer, respectively. Here's an example to illustrate how to use both functions:
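To see the speed/ratio trade-off concretely, the sketch below compresses the same payload at level 1 and level 9. It uses a made-up repetitive byte string (the exact byte counts depend on the data, so treat the printed numbers as illustrative only); it is written with bytes literals so it runs on Python 3 as well:

```python
import zlib

# A highly compressible, hypothetical payload.
data = b"hello zlib " * 1000

fast = zlib.compress(data, 1)                        # fastest, largest output
best = zlib.compress(data, zlib.Z_BEST_COMPRESSION)  # slowest, smallest output (level 9)

print(len(data), len(fast), len(best))

# Both levels decompress back to the identical original bytes.
assert zlib.decompress(fast) == data
assert zlib.decompress(best) == data
```

For repetitive data like this, the level 9 output is typically no larger than the level 1 output, but both round-trip losslessly.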

#coding=gbk
import zlib, urllib

fp = urllib.urlopen('http://localhost/default.html')
data = fp.read()
fp.close()

#---- compress the data stream
str1 = zlib.compress(data, zlib.Z_BEST_COMPRESSION)
str2 = zlib.decompress(str1)

print len(data)
print len(str1)
print len(str2)

#---- results
#5783
#1531
#5783

We can also use Compress/Decompress objects to compress or decompress data. zlib.compressobj([level]) and zlib.decompressobj([wbits]) create a compression object and a decompression object, respectively. Compressing and decompressing data through these objects is very similar to using zlib.compress and zlib.decompress described above. The difference shows up mainly when operating on large amounts of data. If you want to compress a very large file (hundreds of MB) with zlib.compress, you must first read the entire file into memory and then compress the data, which is bound to consume too much memory. If you use the objects instead, there is no need to read all of the file's data at once: you can read one part into memory, compress it, write the compressed result out, then read and compress the next part, and repeat this cycle until the entire file has been compressed. Here's an example to illustrate the difference:

#coding=gbk
import zlib, urllib

fp = urllib.urlopen('http://localhost/default.html') # fetch the URL
data = fp.read()
fp.close()

#---- compress the data stream
str1 = zlib.compress(data, zlib.Z_BEST_COMPRESSION)
str2 = zlib.decompress(str1)
print 'raw data length:', len(data)
print '-' * 30
print 'zlib.compress after compression:', len(str1)
print 'zlib.decompress after decompression:', len(str2)
print '-' * 30

#---- use Compress/Decompress objects to compress/decompress the data stream
com_obj = zlib.compressobj(zlib.Z_BEST_COMPRESSION)
decom_obj = zlib.decompressobj()

str_obj = com_obj.compress(data)
str_obj += com_obj.flush()
print 'compress.compress after compression:', len(str_obj)

str_obj1 = decom_obj.decompress(str_obj)
str_obj1 += decom_obj.flush()
print 'decompress.decompress after decompression:', len(str_obj1)
print '-' * 30

#---- use Compress/Decompress objects to compress/decompress the data in chunks
com_obj1 = zlib.compressobj(zlib.Z_BEST_COMPRESSION)
decom_obj1 = zlib.decompressobj()
chunk_size = 30

# split the raw data into chunks
str_chunks = [data[i * chunk_size:(i + 1) * chunk_size]
              for i in range((len(data) + chunk_size) / chunk_size)]
str_obj2 = ''
for chunk in str_chunks:
    str_obj2 += com_obj1.compress(chunk)
str_obj2 += com_obj1.flush()
print 'after chunked compression:', len(str_obj2)

# split the compressed data into chunks and decompress
str_chunks = [str_obj2[i * chunk_size:(i + 1) * chunk_size]
              for i in range((len(str_obj2) + chunk_size) / chunk_size)]
str_obj2 = ''
for chunk in str_chunks:
    str_obj2 += decom_obj1.decompress(chunk)
str_obj2 += decom_obj1.flush()
print 'after chunked decompression:', len(str_obj2)

#---- results ------------------------
# raw data length: 5783
# ------------------------------
# zlib.compress after compression: 1531
# zlib.decompress after decompression: 5783
# ------------------------------
# compress.compress after compression: 1531
# decompress.decompress after decompression: 5783
# ------------------------------
# after chunked compression: 1531
# after chunked decompression: 5783
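The chunked pattern is what makes Compress/Decompress objects practical for files too large to hold in memory. A minimal sketch of that file-to-file workflow follows; the file paths and chunk size are hypothetical, and it is written for Python 3, where file data is bytes:

```python
import zlib

def compress_file(src_path, dst_path, chunk_size=64 * 1024):
    """Compress src_path into dst_path without loading the whole file."""
    com = zlib.compressobj(zlib.Z_BEST_COMPRESSION)
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(com.compress(chunk))
        # flush() emits whatever is still buffered and finalizes the stream
        dst.write(com.flush())

def decompress_file(src_path, dst_path, chunk_size=64 * 1024):
    """The reverse of compress_file, also chunk by chunk."""
    decom = zlib.decompressobj()
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(decom.decompress(chunk))
        dst.write(decom.flush())
```

At any moment only one chunk (plus the objects' internal buffers) lives in memory, regardless of how large the file is.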

The Python manual provides a more detailed and specific introduction to the zlib module; refer to it for the full API.
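One more detail worth knowing, since the examples above always end a stream with flush(): a Compress object can also be flushed manually mid-stream with zlib.Z_SYNC_FLUSH, which pushes out everything buffered so far while keeping the stream open. This is useful for long-lived streams such as network protocols. A small sketch (the message contents are made up):

```python
import zlib

com = zlib.compressobj()
decom = zlib.decompressobj()

# Send two messages over one compressed stream; Z_SYNC_FLUSH forces each
# message's bytes out immediately without terminating the stream.
part1 = com.compress(b'first message ') + com.flush(zlib.Z_SYNC_FLUSH)
part2 = com.compress(b'second message') + com.flush(zlib.Z_SYNC_FLUSH)

# The receiver can decode each part as soon as it arrives.
print(decom.decompress(part1))  # b'first message '
print(decom.decompress(part2))  # b'second message'
```

The default mode, zlib.Z_FINISH, ends the stream, after which the object can no longer be used.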
