Detailed design of binary Huffman encoding and decoding based on Python

Last Update:2017-07-23 Source: Internet

Author: User

First, the design topic

Binary Huffman encoding and decoding of a grayscale image in BMP format (a personal ID photo).

Second, the algorithm design

(1) Binary Huffman encoding:

①: Image grayscale processing:

Using the grayscale conversion function of Python's PIL library, the color image is first converted to a grayscale BMP image, at which point each pixel can be represented by a single gray value.
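As a rough illustration of what such a conversion computes: Pillow documents its "L" mode conversion as the ITU-R 601-2 luma transform, L = R * 299/1000 + G * 587/1000 + B * 114/1000. The sketch below applies that formula to plain RGB tuples. The function name rgb_to_gray, the sample pixels, and the integer rounding are assumptions made for illustration; the article itself relies on PIL's built-in conversion rather than on code like this.

```python
def rgb_to_gray(r, g, b):
    """Return one 0..255 gray level for an RGB pixel (ITU-R 601-2 luma weights)."""
    # Integer arithmetic for simplicity; a real library may round slightly differently.
    return (299 * r + 587 * g + 114 * b) // 1000


# Hypothetical sample pixels: white, black, pure red.
pixels = [(255, 255, 255), (0, 0, 0), (255, 0, 0)]
gray = [rgb_to_gray(*p) for p in pixels]
```

After this step every pixel is a single integer, which is what the frequency statistics operate on.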

②: Binary Huffman encoding:

Program Flowchart:

Detailed design:

Pixel frequency statistics: first obtain all pixels of the grayscale image through the pixel-reading function read() of Python's PIL library, then loop over the pixels and store each pixel value that occurs, together with its number of occurrences, as a key-value pair in a Python dictionary.
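The counting loop can be sketched as follows. The function name count_pixel_frequencies and the sample data are hypothetical; the input is assumed to be a flat list of gray levels that has already been read from the image.

```python
def count_pixel_frequencies(pixels):
    """Map each gray level to the number of times it occurs."""
    freq = {}
    for value in pixels:
        # First occurrence starts at 0, then every occurrence adds 1.
        freq[value] = freq.get(value, 0) + 1
    return freq


pixels = [12, 200, 12, 12, 45, 200]  # made-up gray levels
freq = count_pixel_frequencies(pixels)
```

The standard library's collections.Counter would give the same counts in one call; the explicit loop is shown because it mirrors the description above.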

①: First, construct the class that represents a node, where each node has the following member attributes:

self.left = left

self.right = right

self.parent = parent

self.weight = weight

self.code = code

②: Iterate over the saved pixel frequency dictionary and define every pixel value that appears in the image as a leaf node, using the node's code and weight attributes to hold the pixel value and the number of times it occurs.

③: At this point the leaf nodes are in no particular order, so sort all leaf nodes by weight from smallest to largest.

④: Take the two nodes with the smallest weights, add their weights to create a new node, remove the two nodes from the node list, put the newly created node into the list, and sort the list again.

⑤: Repeat step ④ until only one node is left in the list; this node is the root node.
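A minimal sketch of these construction steps is shown below. The Node attributes follow the list above; the helper names make_leaves and build_tree, and the sample frequencies, are assumptions for illustration and are not taken from the original program.

```python
class Node:
    def __init__(self, right=None, left=None, parent=None, weight=0, code=None):
        self.left = left
        self.right = right
        self.parent = parent
        self.weight = weight
        self.code = code  # for a leaf: the pixel value it stands for


def make_leaves(freq):
    """Create one leaf per pixel value; weight is the number of occurrences."""
    return [Node(weight=count, code=value) for value, count in freq.items()]


def build_tree(leaves):
    """Merge the two lightest nodes repeatedly until a single root remains."""
    nodes = sorted(leaves, key=lambda n: n.weight)
    while len(nodes) > 1:
        first, second = nodes[0], nodes[1]
        parent = Node(left=first, right=second,
                      weight=first.weight + second.weight)
        first.parent = parent
        second.parent = parent
        # Drop the two merged nodes, add the new one, and re-sort by weight.
        nodes = nodes[2:] + [parent]
        nodes.sort(key=lambda n: n.weight)
    return nodes[0]


freq = {12: 3, 200: 2, 45: 1}  # hypothetical pixel frequencies
leaves = make_leaves(freq)
root = build_tree(leaves)
```

Re-sorting the whole list after every merge is simple and matches the description; for images with many distinct gray levels, a priority queue such as heapq would avoid the repeated sorts.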

①: Using the constructed binary Huffman tree, start from a leaf node and walk up the tree toward the root. A left branch contributes the code symbol 1 and a right branch contributes the code symbol 0. At each step, if the current node is not the root, the code symbol is prepended to the left of the code word; when the root is reached, the pixel value represented by the leaf node and its code word are put into the code word dictionary.

②: Repeat step ① until every leaf node has its binary Huffman code word.

③: After step ②, the pixel code word dictionary is complete. Return to the original image, traverse its pixels in order, look up the code word of each pixel in the code word dictionary, and concatenate the code words of all pixels.

④: Because the code is binary, the encoding result is a string of 0s and 1s. To actually compress the data, every 8 characters of the string are stored as one byte. First pad the encoding result to a multiple of 8 bits, and add extra bits that record the trailing bits of the encoded result and their length. Once the padding is complete, every eight bits of the encoded result are converted to one byte and written to the txt file.
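The leaf-to-root code word assignment and the pixel encoding described above can be sketched as follows. The tree is built by hand to keep the example short; the simplified Node class, the helper names link and code_word, and the sample pixel values are all hypothetical. Left branches yield 1 and right branches yield 0, as in the description.

```python
class Node:
    def __init__(self, left=None, right=None, parent=None, code=None):
        self.left = left
        self.right = right
        self.parent = parent
        self.code = code  # pixel value for a leaf, None for an internal node


def link(left, right):
    """Create a parent for two nodes and set their parent pointers."""
    parent = Node(left=left, right=right)
    left.parent = parent
    right.parent = parent
    return parent


def code_word(leaf):
    """Walk from a leaf up to the root, prepending 1 for left and 0 for right."""
    word = ""
    node = leaf
    while node.parent is not None:
        bit = "1" if node.parent.left is node else "0"
        word = bit + word  # prepend: bits are collected bottom-up
        node = node.parent
    return word


# Hand-built tree for three made-up pixel values.
leaf_12, leaf_45, leaf_200 = Node(code=12), Node(code=45), Node(code=200)
root = link(leaf_12, link(leaf_45, leaf_200))

code_table = {leaf.code: code_word(leaf) for leaf in (leaf_12, leaf_45, leaf_200)}
pixels = [12, 200, 12, 45]
encoded = "".join(code_table[p] for p in pixels)
```

Here the most frequent value would sit closest to the root and so receive the shortest code word; the resulting string still has to be packed into bytes before it saves any space.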

(2) Binary Huffman decoding:

Detailed design:

1. Read the txt file one byte at a time and convert each byte back into a string of bits, until all bytes in the txt file have been read; the resulting string is the Huffman encoding result.

①: From the Huffman encoding result, the original pixels are restored according to the code word dictionary generated during encoding, and the original BMP image is rebuilt from the restored pixels.

②: Traverse the encoded string and look up the characters read so far in the code word dictionary; if no match is found, append the next character and look again.

③: Repeat step ② until a matching code word is found, then append the pixel value corresponding to that code word to the pixel list. Repeat these steps to restore pixel values until the whole string has been traversed.
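The lookup loop can be sketched as below. The function name decode_bits and the sample code table are assumptions for illustration; the table is assumed to map pixel values to code words, so it is inverted before decoding.

```python
def decode_bits(bit_string, code_table):
    """Recover pixel values by growing a candidate word until it matches."""
    word_to_pixel = {word: pixel for pixel, word in code_table.items()}
    pixels = []
    candidate = ""
    for bit in bit_string:
        candidate += bit
        if candidate in word_to_pixel:
            pixels.append(word_to_pixel[candidate])
            candidate = ""  # start collecting the next code word
    if candidate:
        raise ValueError("trailing bits do not form a complete code word")
    return pixels


code_table = {12: "1", 45: "01", 200: "00"}  # hypothetical code table
pixels = decode_bits("100101", code_table)
```

Matching on the first hit is safe because a Huffman code is prefix-free: no code word is the beginning of another. The flat pixel list then has to be arranged into rows using the image width and height before it can be saved as a BMP.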

Third, module division

(1) Binary Huffman encoding section:

①: Class:

class Node:
    def __init__(self, right=None, left=None, parent=None, weight=0, code=None):
        self.left = left
        self.right = right
        self.parent = parent
        self.weight = weight
        self.code = code

Purpose: a class that represents a node (leaf or internal) of the Huffman tree.

②: Function:

def picture_convert():

Purpose: converts the color image to a grayscale image.

③: Function:

def pin_lv_tong_ji(list):

Purpose: counts the number of occurrences of each pixel value.

④: Function:

def gou_zao_ye_zi(xiang_su_zhi):

Purpose: generates the leaf nodes, assigning each node a weight and a pixel value.

⑤: Function:

def sort_by_weight(list_node):

Purpose: sorts the list of leaf nodes by the weight of each node.

⑥: Function:

def huo_fu_man_shu(ListNode):

Purpose: builds the Huffman coding tree from the list of leaf nodes.

⑦: Function:

def er_yuan_huo_fu_man_bian_ma(picture):

Purpose: the main function of binary Huffman encoding; it encodes the pixels by calling the other functions.

⑧: Function:

def zi_jie_xie_ru():

Purpose: the Huffman encoding result is a string, so it has to be saved as bytes; this function writes the encoded result to the file as bytes.
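One possible way to do the byte conversion, together with its inverse, is sketched below. The article does not spell out its exact padding layout, so the layout here is an assumption: one leading byte stores the number of padding bits, followed by the zero-padded data. The names pack_bits and unpack_bits are hypothetical.

```python
def pack_bits(bit_string):
    """Pad a '0'/'1' string to a multiple of 8 and pack it into bytes.

    Assumed layout: byte 0 holds the number of padding bits, the rest is data.
    """
    pad = (8 - len(bit_string) % 8) % 8
    padded = bit_string + "0" * pad
    body = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return bytes([pad]) + body


def unpack_bits(data):
    """Inverse of pack_bits: rebuild the original '0'/'1' string."""
    pad = data[0]
    bits = "".join(format(byte, "08b") for byte in data[1:])
    return bits[:len(bits) - pad]  # drop the padding bits at the end


packed = pack_bits("100101")
restored = unpack_bits(packed)
```

The packed bytes would be written to and read from the file in binary mode (open(path, "wb") and open(path, "rb")), which is why the file looks garbled when opened as text.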

(2) Binary Huffman decoding section:

①: Function:

def zi_jie_du_qu(QQQQ):

Purpose: reads the bytes of the txt file generated by Huffman encoding and restores the encoded result as a string.

②: Function:

def er_yuan_huo_fu_man_yi_ma(kuan, gao):

Purpose: the main function of binary Huffman decoding; it restores the original BMP image by calling the other functions.

Fourth, test data

The test uses two BMP images with different numbers of pixels:

Picture 1 Information:

Name: new.bmp

Size: 255 KB (261,366 bytes)

Number of pixel points: 260288

Grayscale image width: 448 pixels

Grayscale image height: 581 pixels

Picture 2 Information:

Name: test1.bmp

Size: 12.7 KB (13,078 bytes)

Number of pixel points: 12000

Grayscale image width: 96 pixels

Grayscale image height: 125 pixels

Sixth, test results and analysis:

(1) Binary Huffman encoding process:

Image 1 Test Results:

After the program runs, the Huffman code table is generated and the final encoding result is written to huo_fu_man_compress.txt in the directory the program runs from.

Because the result is stored as bytes, opening it as text shows garbled characters, which is normal.

The original image is 255 KB and the Huffman-encoded file is 218 KB, a reduction of nearly 15%.

Image 2 Test Results:

After the program runs, the binary Huffman encoding result is saved to er_yuan_huo_fu_man_youcheng_compress.txt.

Because the result is stored as bytes, opening it as text shows garbled characters, which is normal.

The original image is 12.7 KB and the encoded file is 9.5 KB, a reduction of about 25%.

(2) Binary Huffman decoding process:

Image 1 test results:

The original BMP image is restored from the huo_fu_man_compress.txt file generated by encoding and saved as er_yuan_huo_fu_man_huan_yuan.bmp.

The restored image matches the original image, so decoding is successful.

Image 2 Test Results:

The original BMP image is restored from the huo_fu_man_compress.txt file generated by encoding and saved as er_yuan_huo_fu_man_huan_yuan.bmp.

The restored image matches the original image, so decoding is successful.

Results analysis:

After binary Huffman encoding, the BMP grayscale image file became smaller in both tests, that is, it takes up less storage space, so the encoding achieves compression. The compression efficiency is affected by the frequency distribution of the pixel values and by the number of bits that an equal-length code would use per pixel.
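To make the dependence on pixel frequencies concrete, the size of the encoded data can be estimated from a frequency dictionary and a code table: each pixel value contributes its count times its code word length, compared with a fixed number of bits per pixel for the equal-length code. The sketch below assumes 8 bits per pixel for the equal-length code and ignores the code table and padding overhead; the function name compression_estimate and the sample data are hypothetical.

```python
def compression_estimate(freq, code_table, fixed_bits=8):
    """Return (average code word length, encoded size / equal-length size)."""
    total_pixels = sum(freq.values())
    huffman_bits = sum(count * len(code_table[value])
                       for value, count in freq.items())
    fixed_total = total_pixels * fixed_bits
    return huffman_bits / total_pixels, huffman_bits / fixed_total


freq = {12: 3, 200: 2, 45: 1}                # made-up pixel frequencies
code_table = {12: "1", 45: "01", 200: "00"}  # made-up code words
avg_len, ratio = compression_estimate(freq, code_table)
```

The more uneven the frequencies, the shorter the average code word and the smaller the ratio; when all 256 gray levels are about equally frequent, the average length approaches 8 bits and little is gained.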

This article is an English version of an article originally published in Chinese on aliyun.com.