First, let us understand the Huffman tree.
Weighted path length (WPL): Suppose a binary tree has n leaf nodes, each leaf node k has a weight wk, and the path from the root to that leaf has length lk. The sum of the weighted path lengths of all the leaves is then: WPL = w1*l1 + w2*l2 + ... + wn*ln
Optimal binary tree, or Huffman tree: the binary tree with the smallest WPL.
For example, five leaf nodes with weights {1,2,3,4,5} can be arranged into many binary trees of different shapes.
Among these trees, the minimum WPL is: 33 = (1+2)*3 + 3*2 + (4+5)*2.
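The WPL definition can be checked with a few lines of C. This is only a minimal sketch: the function name weighted_path_length and the depth arrays are illustrative assumptions, with each depth counted as the number of edges from the root to the leaf.

```c
#include <assert.h>
#include <stddef.h>

/* Weighted path length: the sum of weight[k] * depth[k] over all n leaves.
   depth[k] is the number of edges from the root to leaf k. */
int weighted_path_length(const int *weight, const int *depth, size_t n)
{
    int wpl = 0;
    assert(weight != NULL && depth != NULL);
    for (size_t k = 0; k < n; k++)
        wpl += weight[k] * depth[k];
    return wpl;
}
```

For the weights {1,2,3,4,5} placed at depths {3,3,2,2,2} the function returns 33, the minimum quoted above. A lopsided tree with depths {1,2,3,4,4} gives 50 for the same weights, which shows how much the shape matters.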
Construction of a Huffman tree:
At each step, merge the two binary trees with the smallest weights into one, and repeat until a single tree remains.
Code:
#include <stdlib.h>

typedef struct TreeNode *HuffmanTree;
struct TreeNode {
    int Weight;
    HuffmanTree Left, Right;
};

HuffmanTree Huffman(MinHeap H)
{   /* Assumes that H->Size weights already exist in H->Elements[]->Weight */
    int i, N;
    HuffmanTree T;

    BuildMinHeap(H);              /* Adjust H->Elements[] by weight into a min-heap */
    N = H->Size;                  /* H->Size shrinks during merging, so save the count first */
    for (i = 1; i < N; i++) {     /* Do N-1 merges */
        T = malloc(sizeof(struct TreeNode));          /* Create a new node */
        T->Left = DeleteMin(H);   /* Delete a node from the min-heap as the left child of the new T */
        T->Right = DeleteMin(H);  /* Delete a node from the min-heap as the right child of the new T */
        T->Weight = T->Left->Weight + T->Right->Weight;   /* Calculate the new weight */
        Insert(H, T);             /* Insert the new T into the min-heap */
    }
    T = DeleteMin(H);
    return T;
}

The overall complexity is O(N log N).
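The routine above depends on a min-heap type and its operations, which are not shown here. As a rough, self-contained sketch of the same merge idea, the version below replaces the heap with a linear scan over an array of trees. That makes it O(N^2) rather than O(N log N), so it is a simplification for illustration, not the heap-based method. All names in it (Node, new_node, take_min, build_huffman, tree_wpl, free_tree) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int weight;
    struct Node *left, *right;
} Node;

/* Allocate a node; leaves pass NULL for both children. */
static Node *new_node(int weight, Node *left, Node *right)
{
    Node *t = malloc(sizeof *t);
    assert(t != NULL);
    t->weight = weight;
    t->left = left;
    t->right = right;
    return t;
}

/* Remove and return the lightest tree from forest[0..*count-1]. */
static Node *take_min(Node **forest, int *count)
{
    int best = 0;
    for (int i = 1; i < *count; i++)
        if (forest[i]->weight < forest[best]->weight)
            best = i;
    Node *t = forest[best];
    forest[best] = forest[--*count];   /* fill the hole with the last tree */
    return t;
}

/* Build a Huffman tree from n >= 1 weights using n-1 merges. */
Node *build_huffman(const int *weights, int n)
{
    assert(n >= 1);
    Node **forest = malloc((size_t)n * sizeof *forest);
    assert(forest != NULL);
    for (int i = 0; i < n; i++)
        forest[i] = new_node(weights[i], NULL, NULL);

    int count = n;
    while (count > 1) {
        Node *a = take_min(forest, &count);
        Node *b = take_min(forest, &count);
        forest[count++] = new_node(a->weight + b->weight, a, b);
    }
    Node *root = forest[0];
    free(forest);
    return root;
}

/* WPL of the tree: sum over leaves of weight times depth. */
int tree_wpl(const Node *t, int depth)
{
    if (t->left == NULL && t->right == NULL)
        return t->weight * depth;
    return tree_wpl(t->left, depth + 1) + tree_wpl(t->right, depth + 1);
}

/* Release every node of the tree. */
void free_tree(Node *t)
{
    if (t == NULL)
        return;
    free_tree(t->left);
    free_tree(t->right);
    free(t);
}
```

Calling build_huffman on the weights {1,2,3,4,5} and then tree_wpl(root, 0) yields 33, matching the minimum WPL of the earlier example.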
Properties of a Huffman tree:
(1) There is no node of degree 1;
(2) Swapping the left and right subtrees of any non-leaf node of a Huffman tree still gives a Huffman tree;
(3) A Huffman tree with n leaf nodes has 2n-1 nodes in total;
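(4) For the same set of weights {w1, w2, ..., wn}, can there be two Huffman trees with different structures?
Yes. For the weights {1, 2, 3, 3} there are two Huffman trees with different structures: one places all four leaves at depth 2, and the other places 1 and 2 at depth 3, one 3 at depth 2, and the other 3 at depth 1. Both have WPL 18.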
Next, a brief look at Huffman coding:
Huffman coding
Given a string, how should each character be encoded so that the encoded string takes the least storage space?
[Example] Suppose a text contains 58 characters and is made up of the following 7 characters: a, e, i, s, t, space (sp), and newline (nl), and the 7 characters occur with different frequencies. How should these 7 characters be encoded so that the total encoded size is minimal?
Analysis:
(1) Fixed-length ASCII encoding: 58 x 8 = 464 bits;
(2) Fixed-length 3-bit code: 58 x 3 = 174 bits;
(3) Variable-length coding: characters with high frequency get shorter codes, and characters with low frequency can get longer codes.
How can variable-length coding be done? How can ambiguity be avoided?
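The two fixed-length figures above are simple arithmetic: 7 distinct characters need 3 bits each because 2^3 = 8 >= 7. A small sketch of that calculation follows; the function names bits_per_symbol and fixed_length_bits are made up for illustration.

```c
#include <assert.h>

/* Smallest number of bits b such that 2^b >= symbols (symbols >= 1). */
int bits_per_symbol(int symbols)
{
    int bits = 0;
    assert(symbols >= 1);
    while ((1 << bits) < symbols)
        bits++;
    return bits;
}

/* Total size in bits of a text of `length` characters at a fixed width. */
int fixed_length_bits(int length, int width)
{
    return length * width;
}
```

Here bits_per_symbol(7) is 3, fixed_length_bits(58, 8) is 464, and fixed_length_bits(58, 3) is 174.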
Prefix code: no character's code is a prefix of any other character's code.
A prefix code can be decoded without ambiguity.
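Whether a set of codewords forms a prefix code can be tested directly from the definition. The sketch below compares every pair of codewords, written as strings of '0' and '1'. The function name is_prefix_free and the sample code sets are illustrative assumptions, not taken from the example text.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if no codeword in codes[0..n-1] is a prefix of another.
   Two identical codewords also count as a prefix violation. */
bool is_prefix_free(const char *const *codes, int n)
{
    assert(codes != NULL);
    for (int i = 0; i < n; i++) {
        size_t len = strlen(codes[i]);
        for (int j = 0; j < n; j++) {
            if (i != j && strncmp(codes[i], codes[j], len) == 0)
                return false;   /* codes[i] is a prefix of codes[j] */
        }
    }
    return true;
}
```

With the hypothetical code set {"1", "0", "10", "11"} the check fails, and the ambiguity is easy to see: the bit string 1011 can be split as 1|0|1|1, as 10|11, or as 10|1|1. The set {"0", "10", "110", "111"} passes, so any bit string built from it has only one reading.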
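When the codes are read off a binary tree (for instance 0 for a left branch and 1 for a right branch), all the characters to be encoded sit at leaf nodes of the constructed tree, so no code can be a prefix of another.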
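Since every character is a leaf, the length of its code equals the depth of that leaf in the Huffman tree. The following sketch computes those depths with a simple parent array and a linear scan for the two lightest roots. Note that the text does not list the actual character counts, so the frequencies used in the usage note are assumed values chosen only so that they sum to 58. The function name huffman_code_lengths and the MAX_SYMBOLS limit are also hypothetical.

```c
#include <assert.h>

#define MAX_SYMBOLS 64

/* Compute Huffman code lengths for n symbols (1 <= n <= MAX_SYMBOLS) with
   the given frequencies. length[k] receives the depth of leaf k, which is
   the number of bits in its code. Returns the total encoded size in bits. */
int huffman_code_lengths(const int *freq, int n, int *length)
{
    int weight[2 * MAX_SYMBOLS];   /* leaves 0..n-1, internal nodes after */
    int parent[2 * MAX_SYMBOLS];
    int nodes = n;

    assert(n >= 1 && n <= MAX_SYMBOLS);
    for (int i = 0; i < n; i++) {
        weight[i] = freq[i];
        parent[i] = -1;            /* -1 marks a root of the forest */
    }

    for (int merges = 0; merges < n - 1; merges++) {
        int a = -1, b = -1;        /* the two lightest roots, a no heavier than b */
        for (int i = 0; i < nodes; i++) {
            if (parent[i] != -1)
                continue;          /* already merged into a larger tree */
            if (a == -1 || weight[i] < weight[a]) {
                b = a;
                a = i;
            } else if (b == -1 || weight[i] < weight[b]) {
                b = i;
            }
        }
        weight[nodes] = weight[a] + weight[b];
        parent[nodes] = -1;
        parent[a] = parent[b] = nodes;
        nodes++;
    }

    int total = 0;
    for (int k = 0; k < n; k++) {
        length[k] = 0;
        for (int i = k; parent[i] != -1; i = parent[i])
            length[k]++;           /* one bit per edge up to the root */
        total += freq[k] * length[k];
    }
    return total;
}
```

With assumed counts a=10, e=15, i=12, s=3, t=4, sp=13, nl=1, the function returns 146 bits in total, compared with 174 bits for the 3-bit fixed-length code. Under these counts the frequent character e gets a 2-bit code and the rare nl gets a 5-bit code.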
Data structure review notes: Huffman tree and Huffman coding