I. Design topic
Binary Huffman encoding and decoding of a grayscale image in BMP format (personal ID photo)
II. Algorithm design
(1) Binary Huffman encoding:
①: Image grayscale processing:
Using the grayscale conversion function of Python's PIL library, the color image is first converted to a grayscale BMP image, at which point each pixel can be represented by a single gray value.
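As a rough illustration of what the grayscale conversion computes for each pixel, the sketch below applies the ITU-R 601-2 luma weights that Pillow documents for its "L" mode. The helper name rgb_to_gray is hypothetical and the integer truncation is an assumption, so values may differ by one from the library's own rounding. With Pillow itself, the whole step would normally be a call such as Image.open(path).convert("L").

```python
def rgb_to_gray(rgb):
    """Map one (R, G, B) pixel to a single gray value in 0..255."""
    r, g, b = rgb
    # Luma weights documented for Pillow's "L" mode:
    # L = R * 299/1000 + G * 587/1000 + B * 114/1000
    # Integer truncation here is an assumption of this sketch.
    return (r * 299 + g * 587 + b * 114) // 1000

# Three sample pixels: white, black, pure red.
pixels = [(255, 255, 255), (0, 0, 0), (255, 0, 0)]
gray = [rgb_to_gray(p) for p in pixels]
print(gray)
```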
②: Binary Huffman encoding:
Program Flowchart:
Detailed design:
1. Count the pixel frequencies. All pixels of the grayscale image are first obtained through the PIL library's pixel-reading function read (). The program then loops over every pixel and stores each pixel value together with its number of occurrences as a key-value pair in a Python dictionary.
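A minimal sketch of the frequency count, assuming the pixels are already available as a flat list of gray values. The function name count_pixel_frequencies and the sample data are hypothetical.

```python
from collections import Counter

def count_pixel_frequencies(pixels):
    """Return {pixel value: number of occurrences}."""
    return dict(Counter(pixels))

# Hypothetical sample: six gray values from a tiny image.
sample = [12, 12, 200, 12, 37, 200]
freq = count_pixel_frequencies(sample)
print(freq)
```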
2. Construct the Huffman tree:
①: First, define a class that represents a node. Each node has the following member attributes:
self.left = left
self.right = right
self.parent = parent
self.weight = weight
self.code = code
②: Iterate through the saved pixel frequency dictionary and define every pixel value that appears in the image as a leaf node, using the node's code and weight attributes to hold the pixel value and the number of times it appears.
③: At this point the leaf nodes are in no particular order, so sort all leaf nodes by weight from smallest to largest.
④: Take the two nodes with the smallest weights and add their weights to create a new node, which becomes their parent. Remove the two nodes from the node list, put the newly created node into the list, and sort the list again.
⑤: Repeat step ④ until only one node is left in the list; this node is the root node.
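The tree construction in steps ① to ⑤ can be sketched roughly as follows. The Node attributes follow the report. The function name build_huffman_tree, the choice of which merged node becomes the left child, and the assumption of a non-empty frequency dictionary are illustrative only.

```python
class Node:
    def __init__(self, right=None, left=None, parent=None, weight=0, code=None):
        self.left = left
        self.right = right
        self.parent = parent
        self.weight = weight
        self.code = code  # pixel value for leaves, None for internal nodes

def build_huffman_tree(freq):
    """Build the tree from {pixel value: count}; return (root, leaves).

    Assumes freq is non-empty.
    """
    leaves = [Node(weight=w, code=v) for v, w in freq.items()]
    nodes = sorted(leaves, key=lambda n: n.weight)      # step 3
    while len(nodes) > 1:                               # step 5
        a, b = nodes[0], nodes[1]                       # two smallest weights
        merged = Node(left=a, right=b, weight=a.weight + b.weight)
        a.parent = b.parent = merged
        nodes = nodes[2:] + [merged]                    # step 4
        nodes.sort(key=lambda n: n.weight)
    return nodes[0], leaves

root, leaves = build_huffman_tree({12: 3, 200: 2, 37: 1})
print(root.weight)
```

Re-sorting the list after every merge mirrors the description above; a priority queue would do the same job with less work on large inputs.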
3. Generate the code words and encode the pixels:
①: Using the constructed binary Huffman tree, start from a leaf node and walk up toward the root. At each step, a node that is a left child contributes the code symbol 1 and a node that is a right child contributes the code symbol 0, and the symbol is placed on the left side of the code word built so far. When the root node is reached, the pixel value represented by the leaf node and its code word are put into the code word dictionary.
②: Repeat step ① until every leaf node has its binary Huffman code word.
③: After step ② the pixel code word dictionary is complete. Return to the original image, traverse its pixels in order, look up the code word of each pixel in the dictionary, and concatenate the code words of all pixels.
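The leaf-to-root walk and the pixel lookup described in steps ① to ③ might look like the sketch below. The Node attributes follow the report. The helper names (link, code_word_for, build_code_table, encode_pixels) and the small hand-built tree are hypothetical and stand in for a tree produced by the construction step.

```python
class Node:
    def __init__(self, right=None, left=None, parent=None, weight=0, code=None):
        self.left = left
        self.right = right
        self.parent = parent
        self.weight = weight
        self.code = code

def link(left, right):
    """Join two nodes under a new parent (helper for the demo tree)."""
    parent = Node(left=left, right=right, weight=left.weight + right.weight)
    left.parent = right.parent = parent
    return parent

def code_word_for(leaf):
    """Walk from a leaf up to the root, prepending one symbol per step."""
    word = ""
    node = leaf
    while node.parent is not None:
        # Left child contributes 1, right child contributes 0.
        symbol = "1" if node.parent.left is node else "0"
        word = symbol + word
        node = node.parent
    return word

def build_code_table(leaves):
    """Return {pixel value: code word} for all leaves."""
    return {leaf.code: code_word_for(leaf) for leaf in leaves}

def encode_pixels(pixels, table):
    """Concatenate the code words of all pixels, in image order."""
    return "".join(table[p] for p in pixels)

# Demo tree for the frequencies {12: 3, 200: 2, 37: 1}.
a = Node(weight=1, code=37)
b = Node(weight=2, code=200)
c = Node(weight=3, code=12)
root = link(link(a, b), c)

table = build_code_table([a, b, c])
bits = encode_pixels([12, 12, 200, 12, 37, 200], table)
print(table, bits)
```

Note that an image with a single gray value would give an empty code word under this scheme, so that case would need separate handling.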
4. Because binary Huffman coding is used, the encoding result is a string of 0s and 1s. To actually reduce the amount of data, every 8 characters of this string are stored as one byte. The encoding result is first padded so that its length is a multiple of 8, and redundant bits are added to record the value of the last few bits of the encoded result and their length. Once the padding is complete, every eight bits of the encoded result are converted to one byte and written to the txt file.
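The report does not spell out the exact padding layout, so the sketch below assumes one simple scheme: zeros are appended until the length is a multiple of 8, and a single leading byte records how many padding bits were added. The function name pack_bits is hypothetical.

```python
def pack_bits(bits):
    """Pack a '0'/'1' string into bytes.

    Assumed layout (not taken from the report): byte 0 holds the number
    of padding bits, followed by the padded bit string, 8 bits per byte.
    """
    pad = (8 - len(bits) % 8) % 8
    padded = bits + "0" * pad
    body = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return bytes([pad]) + body

packed = pack_bits("001001110")
print(list(packed))
```

The result would be written with the file opened in binary mode, for example open(path, "wb"), even if the file name ends in .txt.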
(2) Binary Huffman decoding:
Detailed design:
1. Read the txt file one byte at a time and convert each byte back to a string of 0s and 1s until all bytes have been read. The resulting string is the Huffman encoding result.
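A sketch of the byte-to-string step, assuming a file layout in which the first byte stores the number of padding bits and the remaining bytes hold the packed code words. This layout and the function name unpack_bits are assumptions, since the report does not specify the format.

```python
def unpack_bits(data):
    """Turn packed bytes back into a '0'/'1' string without padding.

    Assumed layout: data[0] is the padding count, data[1:] the packed bits.
    """
    pad = data[0]
    bits = "".join(format(byte, "08b") for byte in data[1:])
    return bits[:len(bits) - pad] if pad else bits

restored = unpack_bits(bytes([7, 39, 0]))
print(restored)
```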
2. Restore the pixels from the encoding result:
①: From the Huffman encoding result, the original pixels are restored according to the code word table produced during Huffman encoding, and the original BMP image is generated from the restored pixels.
②: Traverse the encoding result character by character and look up the characters read so far in the code word table. If no match is found, append the next character and look again.
③: Repeat step ② until a match is found in the code word table, then place the pixel value corresponding to that code word in the pixel list. Repeat these steps to restore pixel values until the whole string has been traversed.
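The lookup loop described above can be sketched as follows. It relies on the prefix property of Huffman codes: no code word is the beginning of another, so the first match is always the right one. The function name decode_bits and the sample table are hypothetical.

```python
def decode_bits(bits, table):
    """Recover pixel values from a bit string using {pixel: code word}."""
    reverse = {word: pixel for pixel, word in table.items()}
    pixels = []
    current = ""
    for ch in bits:
        current += ch                 # append the next character
        if current in reverse:        # found a complete code word
            pixels.append(reverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a code word")
    return pixels

table = {12: "0", 200: "10", 37: "11"}
pixels = decode_bits("001001110", table)
print(pixels)
```

The flat pixel list would then be arranged into rows using the image width and height, which is presumably why the decoding function takes the two size parameters.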
III. Module division
(1) Binary Huffman encoding section:
①: Class:
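class Node:
    def __init__(self, right=None, left=None, parent=None, weight=0, code=None):
        self.left = left        # left child
        self.right = right      # right child
        self.parent = parent    # parent node
        self.weight = weight    # number of occurrences
        self.code = code        # pixel value held by a leaf
Purpose: a class that represents a node; leaf nodes hold a pixel value and its weight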
②: Function:
def picture_convert():
Purpose: converts the color image to a grayscale image.
③: Function:
def pin_lv_tong_ji(list):
Purpose: counts the number of occurrences of each pixel value.
④: Function:
def gou_zao_ye_zi(Xiang_su_zhi):
Purpose: generates the leaf nodes, assigning each node a weight and a pixel value.
⑤: Function:
def sort_by_weight(List_node):
Purpose: sorts the leaf node list by the weight of each leaf node.
⑥: Function:
def Huo_fu_man_shu(ListNode):
Purpose: builds the Huffman coding tree from the list of leaf nodes.
⑦: Function:
def er_yuan_huo_fu_man_bian_ma(picture):
Purpose: the main function for binary Huffman encoding; it encodes the pixels by calling the other functions.
⑧: Function:
def zi_jie_xie_ru():
Purpose: because the Huffman encoding result is a string, it should be saved as bytes; this function writes the encoded result to the file as bytes.
(2) Binary Huffman decoding section:
①: Function:
def zi_jie_du_qu(QQQQ):
Purpose: reads the bytes in the txt file generated by Huffman encoding and restores the encoded result as a string.
②: Function:
def er_yuan_huo_fu_man_yi_ma(Kuan, gao):
Purpose: the main function for binary Huffman decoding; it restores the original BMP image by calling the other functions.
IV. Test data
Two BMP images with different pixel counts were used for testing:
Picture 1 Information:
Name: new.bmp
Size: 255 KB (261,366 bytes)
Number of pixel points: 260288
Grayscale image width: 448 pixels
Grayscale image height: 581 pixels
Picture 2 Information:
Name: test1.bmp
Size: 12.7 KB (13,078 bytes)
Number of pixel points: 12000
Grayscale image width: 96 pixels
Grayscale image height: 125 pixels
VI. Test procedure and results analysis:
(a) Binary Huffman encoding process:
Image 1 Test Results:
After the program runs, the Huffman code table is generated and the final encoding result is saved to huo_fu_man_compress.txt in the directory the program is run from.
Because the result is stored as bytes, opening the file as text shows garbled characters, which is expected.
The original image is 255 KB and the Huffman-encoded file is 218 KB, a reduction of nearly 15%.
Image 2 Test Results:
After the program runs, the binary Huffman run-length encoding result is saved to Er_yuan_huo_fu_man_youcheng_compress.txt.
Because the result is stored as bytes, opening the file as text shows garbled characters, which is expected.
The original image is 12.7 KB and the encoded result file is 9.5 KB, a reduction of nearly 26%.
(b) Binary Huffman decoding process:
Image 1 Test Results:
The original BMP image is restored from the huo_fu_man_compress.txt file generated by the encoding and saved as Er_yuan_huo_fu_man_huan_yuan.bmp.
The restored picture matches the original picture and the decoding is successful.
Image 2 Test Results:
The original BMP image is restored from the huo_fu_man_compress.txt file generated by the encoding and saved as Er_yuan_huo_fu_man_huan_yuan.bmp.
The restored picture matches the original picture and the decoding is successful.
Results Analysis:
After binary Huffman encoding, the file size of both BMP grayscale test images was reduced, that is, less storage space is needed and compression is achieved. Compression efficiency is affected by the frequency distribution of the pixel values and by the number of bits used by the equal-length code.
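To illustrate how the pixel frequency distribution drives compression, the sketch below compares the average Huffman code length for a skewed and a uniform distribution over four symbols, where an equal-length code would need 2 bits per symbol. It uses a heap instead of the repeated list sort described in the design, purely for brevity. The function names and the sample frequencies are hypothetical and are not measurements from the test images.

```python
import heapq
from itertools import count

def huffman_code_lengths(freq):
    """Return {symbol: code length in bits} for a frequency dictionary."""
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    tie = count()  # tie-breaker so the heap never compares dictionaries
    heap = [(w, next(tie), {s: 0}) for s, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        merged = {s: n + 1 for s, n in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]

def average_bits(freq):
    """Average number of bits per symbol under a Huffman code."""
    lengths = huffman_code_lengths(freq)
    total = sum(freq.values())
    return sum(freq[s] * lengths[s] for s in freq) / total

skewed = {0: 90, 1: 5, 2: 3, 3: 2}
uniform = {0: 25, 1: 25, 2: 25, 3: 25}
print(average_bits(skewed), average_bits(uniform))
```

With the skewed distribution the average falls well below 2 bits, while the uniform distribution gains nothing over the equal-length code. The same reasoning applies to an 8-bit grayscale image, where the equal-length code uses 8 bits per pixel.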
Detailed design of binary Huffman encoding and decoding based on Python