Mathematical path: Python computing practice (4): Lempel-Ziv compression (2)


Format characters have the following meaning; the conversion between C and Python values should be obvious given their types. The 'Standard size' column refers to the size of the packed value in bytes when using standard size; that is, when the format string starts with one of '<', '>', '!' or '='. When using native size, the size of the packed value is platform-dependent.

All content of this blog is original; if you reproduce it, please indicate the source: http://blog.csdn.net/myhaspl/

Format  C Type              Python type         Standard size  Notes
x       pad byte            no value
c       char                string of length 1  1
b       signed char         integer             1              (3)
B       unsigned char       integer             1              (3)
?       _Bool               bool                1              (1)
h       short               integer             2              (3)
H       unsigned short      integer             2              (3)
i       int                 integer             4              (3)
I       unsigned int        integer             4              (3)
l       long                integer             4              (3)
L       unsigned long       integer             4              (3)
q       long long           integer             8              (2), (3)
Q       unsigned long long  integer             8              (2), (3)
f       float               float               4              (4)
d       double              float               8              (4)
s       char[]              string
p       char[]              string
P       void *              integer                            (5), (3)
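
To make the difference between standard and native size concrete, the following minimal sketch queries struct.calcsize for a few format characters. It is written for Python 3, and the variable name standard_sizes is only illustrative.

```python
import struct

# Standard sizes: a leading '<', '>', '!' or '=' fixes the size of each code.
standard_sizes = {code: struct.calcsize("<" + code) for code in "bhilqfd"}
for code, size in standard_sizes.items():
    print(code, size)

# Without a prefix, native size and alignment apply, so this value is
# platform-dependent (for 'l' it is commonly 4 or 8 bytes).
print("native l:", struct.calcsize("l"))
```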

struct.pack(fmt, v1, v2, ...)

Return a string containing the values v1, v2, ... packed according to the given format. The arguments must match the values required by the format exactly.

struct.unpack(fmt, string)

Unpack the string (presumably packed by pack(fmt, ...)) according to the given format. The result is a tuple even if it contains exactly one item. The string must contain exactly the amount of data required by the format (len(string) must equal calcsize(fmt)).
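
A short round trip may help illustrate the two calls described above. This is a sketch for Python 3, where pack returns a bytes object rather than a string; the format "<hI" and the sample values are chosen only for illustration.

```python
import struct

# '<' selects little-endian byte order with standard sizes and no padding.
packed = struct.pack("<hI", -2, 70000)
print(len(packed) == struct.calcsize("<hI"))  # True: 2 + 4 bytes

values = struct.unpack("<hI", packed)
print(values)  # (-2, 70000)

# A single-item format still yields a tuple, so unpack it explicitly.
(single,) = struct.unpack("<H", struct.pack("<H", 9))

# A buffer whose length differs from calcsize(fmt) is rejected.
try:
    struct.unpack("<hI", packed[:-1])
    length_checked = False
except struct.error:
    length_checked = True
```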

The program reads, compresses, and decompresses a text file. Part of the code is as follows:

# -*- coding: utf-8 -*-
# lempel-ziv algorithm
# code: myhaspl@myhaspl.com
import struct

mystr = ""
print "\n Read Source File".decode("utf8")
mytextfile = open('test2.txt', 'r')
try:
    mystr = mytextfile.read()
finally:
    mytextfile.close()
my_str = mystr

# code table
codeword_dictionary = {}
# length of the text to be compressed
str_len = len(my_str)
# maximum codeword length
dict_maxlen = 1
# current parsing position (the starting point of the next text segment)
now_index = 0
# maximum index of the code table
max_index = 0

# compressed data
print "\n generated compressed data".decode("utf8")
compresseddata = []
while (now_index < str_len):
    # number of characters to step forward
    mystep = 0
    # the current matching length
    now_len = dict_maxlen
    if now_len > str_len - now_index:
        now_len = str_len - now_index
    # index found in the code table; 0 indicates that nothing was found
    cw_addr = 0
    while (now_len > 0):
        cw_index = codeword_dictionary.get(my_str[now_index:now_index + now_len])
        if cw_index is not None:
            # found the code word
            cw_addr = cw_index
            mystep = now_len
            break
        now_len -= 1
    if cw_addr == 0:
        # no code word is found, add a new code word
        max_index += 1
        mystep = 1
        codeword_dictionary[my_str[now_index:now_index + mystep]] = max_index
        print "don't find the Code word, add Code word: %s index: %d" % (my_str[now_index:now_index + mystep], max_index)
    else:
        # found the code word, add a new code word
        max_index += 1
        if now_index + mystep + 1 <= str_len:
            codeword_dictionary[my_str[now_index:now_index + mystep + 1]] = max_index
            if mystep + 1 > dict_maxlen:
                dict_maxlen = mystep + 1
        print "find the Code word: %s add Code word: %s index: %d" % (my_str[now_index:now_index + now_len], my_str[now_index:now_index + mystep + 1], max_index)
.............
    my_codeword_dictionary[my_maxindex] = my_codeword_dictionary[cwkey] + cwlaster
    uncompressdata.append(my_codeword_dictionary[cwkey])
    uncompressdata.append(cwlaster)
    print ".",
uncompress_str = uncompress_str.join(uncompressdata)
uncompressstr = uncompress_str
print "\n write the extracted results to the file ..\n".decode("utf8")
uncompress_file = open('uncompress.txt', 'w')
try:
    uncompress_file.write(uncompressstr)
    print "\nthe uncompress.txt is successfully decompressed!\n".decode("utf8")
finally:
    uncompress_file.close()

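Because the listing above omits its middle section, the following is only a minimal sketch of the same general idea, written as a textbook LZ78-style coder for Python 3 rather than a copy of the author's program. The function names lz78_compress and lz78_decompress are hypothetical. Each output pair holds the index of the longest known phrase (0 meaning "no phrase") and the character that follows it; the decoder rebuilds the same code table by appending that character to the referenced phrase, as the last lines of the listing do.

```python
def lz78_compress(text):
    """Return a list of (index, char) pairs; index 0 means 'no prefix'."""
    dictionary = {}          # phrase -> index, indices start at 1
    pairs = []
    phrase = ""
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            # Keep extending the match while the phrase is known.
            phrase = candidate
        else:
            # Emit (longest known prefix, next char) and learn the new phrase.
            pairs.append((dictionary.get(phrase, 0), ch))
            dictionary[candidate] = len(dictionary) + 1
            phrase = ""
    if phrase:
        # Input ended inside a known phrase: emit it with no extra character.
        pairs.append((dictionary[phrase], ""))
    return pairs


def lz78_decompress(pairs):
    """Rebuild the text by replaying the dictionary construction."""
    dictionary = {0: ""}     # index -> phrase
    out = []
    for index, ch in pairs:
        phrase = dictionary[index] + ch
        out.append(phrase)
        dictionary[len(dictionary)] = phrase
    return "".join(out)


sample = "abababa"
pairs = lz78_compress(sample)
print(pairs)                              # [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'a')]
print(lz78_decompress(pairs) == sample)   # True
```

Unlike the listing, which searches from the longest possible code word downwards, this sketch grows the match one character at a time; both approaches look for the longest phrase already in the code table.
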
The following compresses the explanatory text about Python from the Chinese wiki:

Run the program: it first compresses the input file, then opens the compressed file and decompresses it.
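
The layout of the compressed file is not shown in the source, so the following sketch only illustrates one way that (index, symbol) pairs could be written to and read back from a byte string with struct. The record layout, the count header, the use of -1 for "no symbol", and the names pack_pairs and unpack_pairs are all assumptions made for this example (Python 3).

```python
import struct

HEADER = struct.Struct("<I")    # number of records that follow
RECORD = struct.Struct("<Ih")   # 4-byte phrase index + 2-byte symbol


def pack_pairs(pairs):
    """Serialize (index, symbol) pairs; symbol is 0..255, or -1 for 'none'."""
    chunks = [HEADER.pack(len(pairs))]
    for index, symbol in pairs:
        chunks.append(RECORD.pack(index, symbol))
    return b"".join(chunks)


def unpack_pairs(data):
    """Read the count header, then that many fixed-size records."""
    (count,) = HEADER.unpack_from(data, 0)
    pairs = []
    offset = HEADER.size
    for _ in range(count):
        pairs.append(RECORD.unpack_from(data, offset))
        offset += RECORD.size
    return pairs


pairs = [(0, 97), (0, 98), (1, 98), (3, -1)]
blob = pack_pairs(pairs)
print(len(blob))                    # 28: 4-byte header + 4 records of 6 bytes
print(unpack_pairs(blob) == pairs)  # True
```

Fixed 6-byte records keep the sketch simple but are wasteful for short phrases; a real file format would likely use narrower or variable-width indices.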

$ pypy lempel-ziv-compress.py python.txt python.lzv

....................

Find the Code word: C add Code word: CP index: 9938

Index: 9939de word: ython add Code word: ython

Find the Code word:

^ Add Code word:

^ H index: 9940

Find the Code word: ttp add Code word: ttp: index: 9941

Find the Code word: // add Code word: // e index: 9942

Find the Code word: dit add Code word: ditr index: 9943

Find the Code word: a. add Code word: a. o index: 9944

Generate compressed data Headers

Write compressed data to a compressed file

................

 

........................................

Write the extracted results to the file ..

The uncompress.txt file has been decompressed successfully!

View the compression effect:

$ ls -l -h

................

-rw-r-- 1 deep 5.0K Jul 1 lempel-ziv-compress.py

-rw-r-- 1 deep 30K Jul 1 20:55 python.lzv

-rw-r-- 1 deep 36K Jul 1 20:57 python.txt

-rw-r-- 1 deep 36K Jul 1 uncompress.txt

As the listing above shows, the file is 36K before compression and 30K after.

Compress the entire source code of SQLite 3.8.5:

$ pypy lempel-ziv-compress.py sqlitesrc.txt sqlitesrc.lzv

View the compression effect:

$ ls -l -h

................

-rw-r-- 1 deep 3.2M Jul 1 21:18 sqlitesrc.lzv

-rw-r-- 1 deep 5.2M Jul 1 21:16 sqlitesrc.txt

-rw-r-- 1 deep 5.2M Jul 1 uncompress.txt

The file is 5.2M before compression and 3.2M after.
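
For comparison, the two results can be expressed as compression ratios. The sketch below uses the sizes printed by ls above; since those figures are rounded, the percentages are approximate. The helper name ratio_percent is only illustrative.

```python
def ratio_percent(compressed, original):
    """Compressed size as a percentage of the original size."""
    return 100.0 * compressed / original


wiki = ratio_percent(30, 36)       # python.lzv vs python.txt, in K
sqlite = ratio_percent(3.2, 5.2)   # sqlitesrc.lzv vs sqlitesrc.txt, in M

print("python.txt:    %.0f%%" % wiki)     # python.txt:    83%
print("sqlitesrc.txt: %.0f%%" % sqlite)   # sqlitesrc.txt: 62%
```

The larger, more repetitive source-code file compresses noticeably better than the short wiki text.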

 

