Huffman Tree and its application

Source: Internet
Author: User

A Huffman tree, also known as an optimal binary tree, has Huffman coding as one of its most important applications. This article uses a simple problem to show where Huffman coding comes from, explains how to construct a Huffman tree and derive Huffman codes from it, and finally solves the problem to better illustrate the application of the Huffman tree: Huffman coding.

I. Introduction

When processing exam results, we often need to convert numeric scores into five grades: excellent, good, medium, pass and fail. Following the grading rule, the code can be written as:

if (a < 60) b = "Fail";
else if (a < 70) b = "Pass";
else if (a < 80) b = "Medium";
else if (a < 90) b = "Good";
else b = "Excellent";

This chain of comparisons corresponds to the following binary tree (binary tree a):

In practice, the scores are distributed over the 5 grades as shown in the table:

Score range   0~59   60~69   70~79   80~89   90~100
Percentage    5%     15%     40%     30%     10%

Scores of 70 and above account for 80% of all cases, yet each of them needs at least three comparisons before the grade is determined, which is clearly inefficient.

Applying Huffman's idea, the comparisons can be rearranged into the following tree (binary tree b):
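The difference between the two orderings can be sketched in code. The shape used for binary tree b below is an assumption reconstructed from the path lengths the article quotes for it (the figure is not available), and the names `lessThan`, `gradeSequential`, `gradeRearranged` and the `comparisons` counter are illustrative only.

```cpp
#include <string>

// Returns value < limit and counts the comparison (hypothetical helper).
static bool lessThan(int value, int limit, int& comparisons) {
    ++comparisons;
    return value < limit;
}

// Binary tree a: thresholds are tested from lowest to highest.
std::string gradeSequential(int score, int& comparisons) {
    comparisons = 0;
    if (lessThan(score, 60, comparisons)) return "Fail";
    if (lessThan(score, 70, comparisons)) return "Pass";
    if (lessThan(score, 80, comparisons)) return "Medium";
    if (lessThan(score, 90, comparisons)) return "Good";
    return "Excellent";
}

// Assumed shape of binary tree b: split at 80 first, so the three most
// frequent grades are reached after two comparisons.
std::string gradeRearranged(int score, int& comparisons) {
    comparisons = 0;
    if (lessThan(score, 80, comparisons)) {
        if (lessThan(score, 70, comparisons)) {
            return lessThan(score, 60, comparisons) ? "Fail" : "Pass";
        }
        return "Medium";
    }
    return lessThan(score, 90, comparisons) ? "Good" : "Excellent";
}
```

With this arrangement a score of 75 takes three comparisons in the sequential version and two in the rearranged one, while a failing score goes from one comparison to three. The trade is worthwhile because failing scores are rare.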

II. Definition and principle of the Huffman tree

In the example above, we treat the proportion of each score range as a weight attached to the corresponding leaf.

A Huffman tree is defined as follows: given n weights as n leaf nodes, construct a binary tree whose weighted path length is minimal.

1. Path and path length

In a tree, the sequence of branches leading from a node down to one of its children or descendants is called a path. The number of branches on the path is called the path length. If the root is defined to be on level 1, the path length from the root to a node on level L is L-1. In binary tree a, the path length from the root to node D is 4; in binary tree b, the path length from the root to node D is 2.

Path length of a tree: the sum of the path lengths from the root to every node. Binary tree a = 1+1+2+2+3+3+4+4 = 20. Binary tree b = 1+2+3+3+2+1+2+2 = 16.

2. Node weight and weighted path length

If a node in a tree is assigned a value that carries some meaning, that value is called the node's weight. The weighted path length of a node is the product of the path length from the root to that node and the node's weight.

3. Weighted path length of a tree

The weighted path length of a tree is defined as the sum of the weighted path lengths of all its leaf nodes.
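Written as a formula, for a tree with n leaves where leaf i has weight w_i and path length l_i from the root (the symbols WPL, w_i and l_i are notation introduced here for illustration, not taken from the original):

```latex
\mathrm{WPL} = \sum_{i=1}^{n} w_i \, l_i
```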

Calculating the weighted path length of the two trees:

Binary tree a = 5*1+15*2+40*3+30*4+10*4 = 315

Binary tree b = 5*3+15*3+40*2+30*2+10*2 = 220

If the scores of 10,000 students have to be converted, binary tree a needs 31,500 comparisons (the weights are percentages, so 315/100*10000), while binary tree b needs 22,000, about 30% fewer.
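The arithmetic above can be checked with a few lines of code. This is a minimal sketch; the function name `weightedPathLength` and the representation of a tree as two parallel lists (leaf weights and leaf depths) are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Weighted path length: the sum over all leaves of weight * depth.
// Both vectors must list the leaves in the same order.
int weightedPathLength(const std::vector<int>& weights,
                       const std::vector<int>& depths) {
    int total = 0;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        total += weights[i] * depths[i];
    }
    return total;
}
```

Calling it with the weights {5, 15, 40, 30, 10} and the depths {1, 2, 3, 4, 4} gives 315 for binary tree a; the depths {3, 3, 2, 2, 2} give 220 for binary tree b. Dividing by 100 and multiplying by 10,000 students yields 31,500 and 22,000 comparisons.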

The definition shows the most important property of a Huffman tree: it has the minimum weighted path length.

Huffman Tree Construction

Steps for constructing a Huffman tree:

1. From the given n weights {w1, w2, w3, ..., wi, ..., wn}, build an initial set of n binary trees F = {T1, T2, T3, ..., Ti, ..., Tn}, where each binary tree Ti consists of a single root node with weight wi and has empty left and right subtrees. (To make the algorithm easier to implement on a computer, the trees Ti are usually kept in ascending order of their weights wi.)
2. In F, select the two trees whose root nodes have the smallest weights and use them as the left and right subtrees of a new binary tree; the weight of the new root is the sum of the weights of the roots of its left and right subtrees.
3. Remove those two trees from F and add the new binary tree to F, keeping the ascending order.
4. Repeat steps 2 and 3 until only one binary tree remains in F.
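The merging loop in these steps can be sketched as follows. This is not the article's implementation: it swaps the sorted set F for a min-heap (`std::priority_queue`) and tracks only root weights, not the trees themselves. The function name `huffmanMerges` is hypothetical.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Repeatedly merges the two smallest weights and returns the weight of
// every newly created root, in the order the merges happen.
std::vector<int> huffmanMerges(const std::vector<int>& weights) {
    // The min-heap plays the role of the ascending-ordered set F.
    std::priority_queue<int, std::vector<int>, std::greater<int>> forest(
        weights.begin(), weights.end());
    std::vector<int> merged;
    while (forest.size() > 1) {
        int first = forest.top();
        forest.pop();
        int second = forest.top();
        forest.pop();
        merged.push_back(first + second);  // weight of the new root
        forest.push(first + second);       // put the new tree back into F
    }
    return merged;
}
```

For the weights 5, 4, 3, 2, 1 the function returns 3, 6, 9, 15. The sum of these root weights, 33, equals the weighted path length of the resulting Huffman tree.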

As a simple illustration, suppose we have five characters a, b, c, d, e with frequencies (that is, weights) 5, 4, 3, 2, 1. We first take the two smallest weights as the left and right subtrees of a new tree: 1 and 2 form a new tree whose root has weight 1+2=3.

The dashed node is the newly generated one. Next, the new node with weight 3 is put back into the remaining set, so the set becomes {5,4,3,3}; then, as in step 2, the two smallest weights are again taken to form a new tree.

Repeating this process yields the Huffman tree:

Replacing each weight with its corresponding character gives:

So the encoding of each character is: a->11, b->10, c->00, d->011, e->010
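Because every character sits at a leaf, no code is a prefix of another, which is what later allows a bit stream to be decoded unambiguously. The check below is a small sketch of that property; the function name `isPrefixFree` is an assumption introduced for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true when no code in the list is a prefix of another code.
bool isPrefixFree(const std::vector<std::string>& codes) {
    for (std::size_t i = 0; i < codes.size(); ++i) {
        for (std::size_t j = 0; j < codes.size(); ++j) {
            // Compare the first codes[i].size() characters of codes[j]
            // with codes[i]; equality means codes[i] is a prefix.
            if (i != j &&
                codes[j].compare(0, codes[i].size(), codes[i]) == 0) {
                return false;
            }
        }
    }
    return true;
}
```

The five codes 11, 10, 00, 011 and 010 pass this check, whereas a set such as 0, 01, 11 does not, because 0 is a prefix of 01.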

Huffman encoding

Huffman's main purpose in studying this optimal binary tree was to optimize data transmission in the long-distance communication of his day (mainly the telegraph). For example, to transmit the string "BADCADFEED" as binary data, we could use the following table:

Letter        A     B     C     D     E     F
Binary code   000   001   010   011   100   101

After encoding, the binary data stream is "001000011010000011101100100011"; the receiver decodes it by splitting it into groups of 3 bits. Now assume the 6 letters occur with different frequencies: A 27%, B 8%, C 15%, D 15%, E 30%, F 5%. Using 27, 8, 15, 15, 30, 5 as the weights of A, B, C, D, E, F, we construct the Huffman tree:

Label each left branch of the Huffman tree with 0 and each right branch with 1.

Each of the 6 letters is now encoded by the sequence of 0s and 1s along the path from the root to its leaf, which gives the following code table:

Letter   A    B      C     D    E    F
Code     01   1001   101   00   11   1000

Encoding "BADCADFEED" again gives "1001010010101001000111100", a total of 25 bits, saving approximately 17% of the storage and transmission cost compared with the 30 bits of the previous encoding.
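Both encodings can be reproduced with a table lookup. The sketch below assumes the code table is held in a `std::map` from letter to bit string; the function name `encode` is illustrative and not part of the article's program.

```cpp
#include <map>
#include <string>

// Encodes `text` by concatenating the code of each character.
// Characters missing from `table` are skipped in this sketch.
std::string encode(const std::string& text,
                   const std::map<char, std::string>& table) {
    std::string bits;
    for (char c : text) {
        auto it = table.find(c);
        if (it != table.end()) {
            bits += it->second;
        }
    }
    return bits;
}
```

Encoding "BADCADFEED" with the fixed 3-bit table produces the 30-bit stream, and with the Huffman table (A 01, B 1001, C 101, D 00, E 11, F 1000) it produces the 25-bit stream quoted above.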

Decoding uses the same Huffman tree; that is, the sender and receiver agree on the same Huffman coding rules. When the receiver receives "1001010010101001000111100", it walks the Huffman tree from the root, taking the left branch on 0 and the right branch on 1, and outputs a letter each time it reaches a leaf.
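Decoding can be sketched without building the tree explicitly. The version below replaces the tree walk with a lookup in a map from code to letter, which gives the same result because no code is a prefix of another. The function name `decode` and the map layout are assumptions made for illustration.

```cpp
#include <map>
#include <string>

// Decodes a bit string with a prefix-free code table: grow the current
// candidate bit by bit and emit a letter as soon as it matches a code.
std::string decode(const std::string& bits,
                   const std::map<std::string, char>& codeToLetter) {
    std::string text;
    std::string current;
    for (char bit : bits) {
        current += bit;
        auto it = codeToLetter.find(current);
        if (it != codeToLetter.end()) {
            text += it->second;  // reached a leaf: output its letter
            current.clear();     // start again from the root
        }
    }
    return text;
}
```

With the table 01 A, 1001 B, 101 C, 00 D, 11 E, 1000 F, decoding "1001010010101001000111100" gives back "BADCADFEED".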

Code
#include <iostream>
#include <string>
using namespace std;

// Node type
struct element
{
    double weight;              // weight: the frequency of the character, stored as a real number
    char ch;                    // the character stored in a leaf
    int lchild, rchild, parent; // array indices of the children and the parent; -1 means none
};

// Find the two parentless nodes with the lowest weights among the first n
// entries of huffTree and return their indices through a and b.
// A weight of 0 marks "nothing selected yet", so all weights must be positive.
void Select(element huffTree[], int *a, int *b, int n)
{
    int i;
    double weight = 0;
    for (i = 0; i < n; i++)
    {
        if (huffTree[i].parent != -1) // already has a parent: skip it
            continue;
        if (weight == 0 || huffTree[i].weight < weight)
        {
            weight = huffTree[i].weight;
            *a = i;
        }
    }
    weight = 0;
    for (i = 0; i < n; i++)
    {
        if (huffTree[i].parent != -1 || i == *a) // skip merged nodes and the first pick
            continue;
        if (weight == 0 || huffTree[i].weight < weight)
        {
            weight = huffTree[i].weight;
            *b = i;
        }
    }
    // Order the pair by left-child index so that the choice of left and
    // right subtree stays consistent.
    if (huffTree[*a].lchild < huffTree[*b].lchild)
    {
        int temp = *a;
        *a = *b;
        *b = temp;
    }
}

// Build the Huffman tree
void HuffmanTree(element huffTree[], int w[], char ch[], int n)
{
    for (int i = 0; i < 2 * n - 1; i++) // a Huffman tree has 2*n-1 nodes in total
    {
        huffTree[i].parent = -1; // parent
        huffTree[i].lchild = -1; // left child
        huffTree[i].rchild = -1; // right child
    }
    for (int i = 0; i < n; i++) // construct n binary trees that consist of a root only
    {
        huffTree[i].weight = w[i]; // assign the weight
        huffTree[i].ch = ch[i];    // the character to be encoded
    }
    for (int k = n; k < 2 * n - 1; k++) // n-1 merges
    {
        int i1 = 0;
        int i2 = 0;
        Select(huffTree, &i1, &i2, k); // find the two nodes i1 and i2 with the lowest weights
        huffTree[i1].parent = k;       // merge i1 and i2; node k becomes their parent
        huffTree[i2].parent = k;
        huffTree[k].weight = huffTree[i1].weight + huffTree[i2].weight;
        huffTree[k].lchild = i1;
        huffTree[k].rchild = i2;
    }
}

// Huffman coding
void HuffmanCode(element huffTree[], int n)
{
    int i, j, k;
    string s = "";
    for (i = 0; i < n; i++) // the first n elements of huffTree are the leaves to be encoded
    {
        s = ""; // reset the code to the empty string
        j = i;  // work on a copy so the loop variable is not disturbed
        while (huffTree[j].parent != -1) // node j has a parent
        {
            k = huffTree[j].parent;
            if (j == huffTree[k].lchild) // node j is the left child of its parent
                s = s + "0";
            else                         // node j is the right child of its parent
                s = s + "1";
            j = huffTree[j].parent; // move up to the parent
        }
        cout << "Character " << huffTree[i].ch << " is encoded as:" << endl;
        // s was collected from leaf to root, so print it in reverse
        for (int pos = (int)s.size() - 1; pos >= 0; pos--)
            cout << s.at(pos) << " ";
        cout << endl;
    }
}

int main()
{
    const int n = 6;
    element huffTree[2 * n];
    char ch[] = {'a', 'b', 'c', 'd', 'e', 'f'};
    int w[] = {27, 8, 15, 15, 30, 5}; // weights: here the frequencies from the example above
    // Build the Huffman tree
    HuffmanTree(huffTree, w, ch, n);
    // Print the codes
    HuffmanCode(huffTree, n);
    return 0;
}
