In the general data structure of the book, the tree behind the chapter, the author will generally introduce Huffman (HUFFMAN)
Tree and Huffman coded. Huffman coding is a Huffman tree application. Huffman coding widely used, such as
The Huffman code is applied in JPEG. First introduced what is Huffman tree. Huffman Tree is also called the best binary tree,
It is a two-fork tree with the shortest length of weighted path. The length of the tree's weighted path is all the leaf nodes in the tree.
The weight is multiplied by its path length to the root node (the Jogen node is 0 layers and the path length of the leaf node to the root node)
Is the number of layers of the leaf node). The weighted path length of the tree is recorded as wpl= (W1*L1+W2*L2+W3*L3+...+WN*LN)
, N weights wi (i=1,2,... N) constitute a two-forked tree with n-leaf nodes, and the corresponding path of the leaf node
Length is Li (i=1,2,... N). Can prove Huffman Tree WPL is the smallest.
Huffman Coding steps:
First, the given n weights {w1,w2,w3,..., Wi,..., Wn} constitute the initial set of n binary trees f= {t1,t2,t3,..., Ti,..., Tn}, each of the two fork-tree Ti in only one weight is the root node of the Wi, its left and right subtree are empty. (in order to facilitate the implementation of the algorithm on the computer, the general also requires the right value of TI, WI in ascending order.) )
In F, the tree with the least weight of two root nodes is selected as the left and right subtree of the newly constructed two-fork tree, and the weight of the root node of the new binary tree is the sum of the weights of the root node of its left and right subtree.
Three, remove the two trees from F, and add the new two-fork tree in ascending order to the set F.
Four, repeat two and 32 steps, until only a binary tree in the set F.
The simple understanding is that if I have a,b,c,d,e five characters, the frequency (that is, the weight) is 5,4,3,2,1, then we first take two minimum weights as the left and right subtree to construct a new tree, that is 1, 2 constitute a new tree, its node is 1+2=3, as shown:
The dotted line is the newly generated node, and the second step puts the newly generated 3 node into the remaining set, so the set becomes {5,4,3,3}, and then, according to the second step, takes the smallest two weights to form a new tree, as shown in the figure:
Then set up a Huffman tree, the following figure:
where each weight is replaced by the corresponding character is the following figure:
So the corresponding encoding of each character is: a->11,b->10,c->00,d->011,e->010
Huffman coding is an unprecedented prefix encoding. Decoding is not confusing. Its main applications in data compression, encryption and decryption and other occasions.