(1) Topic content and Requirements:
Huffman Tree and its application
Design Purpose:
1. Become familiar with the various storage structures of trees and their characteristics.
2. Master the method of building Huffman codes and calculating the weighted path length.
Design content:
A telegraph message such as aabbcab ... is to be sent (100 characters in total, in which A, B, C, D, E, F occur 7, 9, 12, 22, 23 and 27 times respectively). Implement Huffman coding for this message.
Design Requirements:
1. Analyze system requirements.
2. Build the Huffman tree.
3. Perform Huffman coding and find the average code length.
4. Write a program that implements steps 2 and 3.
(2) Schematic design:
1. Construct the Huffman tree
2. Implement Huffman coding
(3) Detailed design:
The problem is to compute Huffman codes. Huffman coding makes the transmitted message as short as possible by giving the characters that occur most often the shortest codes. To produce the codes, the Huffman tree has to be constructed first.
Assume there are n weights w1, w2, ..., wn; the Huffman tree built from them has n leaf nodes. The construction rules are:
1. Treat w1, w2, ..., wn as a forest of n trees (each tree has only one node);
2. In the forest, pick the two trees whose roots have the smallest weights and merge them as the left and right subtrees of a new tree; the root weight of the new tree is the sum of the root weights of its left and right subtrees;
3. Remove the two selected trees from the forest and add the new tree to the forest;
4. Repeat steps 2 and 3 until only one tree is left in the forest; that tree is the Huffman tree.
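The four rules can be tried out on a plain array of root weights. The sketch below is only an illustration and not the program from section (4); the function name huffman_wpl and the capacity MAX_TREES are assumptions made for this example. It keeps just the root weight of every tree in the forest, merges the two smallest until one tree is left, and returns the sum of all merged weights, which equals the weighted path length of the finished tree.

```c
#include <assert.h>

#define MAX_TREES 64  /* assumed capacity for this sketch */

/* Applies the construction rules to the root weights only.
 * Returns the sum of the weights of all merged (internal) nodes,
 * which equals the weighted path length of the Huffman tree. */
int huffman_wpl(const int *w, int n)
{
    int forest[MAX_TREES];
    int count = n, total = 0;
    int i;

    assert(n >= 1 && n <= MAX_TREES);
    for (i = 0; i < n; i++)              /* rule 1: n single-node trees */
        forest[i] = w[i];

    while (count > 1) {                  /* rule 4: repeat until one tree is left */
        int a = 0, b = 1;                /* a: smallest root, b: second smallest */
        int lo, hi, merged;

        if (forest[b] < forest[a]) { a = 1; b = 0; }
        for (i = 2; i < count; i++) {
            if (forest[i] < forest[a]) { b = a; a = i; }
            else if (forest[i] < forest[b]) { b = i; }
        }
        merged = forest[a] + forest[b];  /* rule 2: new root weight */
        total += merged;

        /* rule 3: drop the two trees, add the merged one */
        lo = a < b ? a : b;
        hi = a < b ? b : a;
        forest[lo] = merged;
        forest[hi] = forest[count - 1];
        count--;
    }
    return total;
}
```

For the weights 7, 9, 12, 22, 23, 27 from the design content, the merges produce 16, 28, 45, 55 and 100, so the function returns 244, which is 2.44 bits per character over the 100-character message.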
Storing the Huffman tree: a Huffman tree has no nodes of degree 1, so a Huffman tree with n leaf nodes has 2n-1 nodes in total and can be stored in a one-dimensional array of size 2n-1. The node structure is defined as follows:
typedef struct // tree node definition
{
    int weight; // weight value
    int parent; // index of the parent node
    int lchild; // index of the left child
    int rchild; // index of the right child
} HTNode, *HuffmanTree;
First the leaf nodes and the non-leaf nodes are initialized. Then the Huffman tree is constructed: in each round the Select function finds the nodes with the smallest and second-smallest weights, node i becomes their parent, and its weight is the sum of the two.
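A minimal sketch of this array layout follows. It is not the program from section (4): the names Node, pick_min and build_table are hypothetical and chosen only for this example. Slot 0 is left unused so that the value 0 can mean "no parent" or "no child"; the leaves occupy slots 1..n and the merged nodes slots n+1..2n-1.

```c
#include <assert.h>

typedef struct {
    int weight, parent, lchild, rchild;
} Node;

/* Index of the lightest node in ht[1..last] that has no parent yet,
 * ignoring index `skip`. Returns 0 if there is none. */
static int pick_min(const Node *ht, int last, int skip)
{
    int i, best = 0;
    for (i = 1; i <= last; i++) {
        if (i != skip && ht[i].parent == 0 &&
            (best == 0 || ht[i].weight < ht[best].weight))
            best = i;
    }
    return best;
}

/* Fills ht[1..2n-1] from the n weights w[0..n-1].
 * ht must have room for 2n entries because slot 0 is unused. */
void build_table(Node *ht, const int *w, int n)
{
    int i, m = 2 * n - 1;

    assert(n >= 1);
    for (i = 1; i <= m; i++) {           /* leaves get weights, the rest start at 0 */
        ht[i].weight = (i <= n) ? w[i - 1] : 0;
        ht[i].parent = ht[i].lchild = ht[i].rchild = 0;
    }
    for (i = n + 1; i <= m; i++) {       /* one merged node per round */
        int s1 = pick_min(ht, i - 1, 0);   /* smallest */
        int s2 = pick_min(ht, i - 1, s1);  /* second smallest */
        ht[s1].parent = i;
        ht[s2].parent = i;
        ht[i].lchild = s1;
        ht[i].rchild = s2;
        ht[i].weight = ht[s1].weight + ht[s2].weight;
    }
}
```

With the six weights from the design content, 2*6-1 = 11 slots are used, and the root ends up in slot 11 with weight 100 and parent 0.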
Huffman Code:
Storage: the codes are stored through a two-level char pointer, that is, an array of strings.
typedef char **HuffmanCode; // Huffman code table, a two-level char pointer
Finding the Huffman code:
The Huffman code of each character is read off the path between its leaf node and the root. Every left branch is labelled 0 and every right branch is labelled 1. The code is built in reverse: starting from each leaf node, follow the parent links back to the root, and each step yields one code bit, 0 or 1. A one-dimensional array stores the code, and the variable start marks the position where the code begins.
The statement cd[n-1] = '\0' sets the last element of the array to the string terminator to mark the end of the code. During the walk, one counter is incremented for every left branch and another for every right branch; the sum of the two is the total length of all codes.
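The leaf-to-root walk can be sketched on a fixed table. The parent and lchild arrays below are written out by hand for the weights 7, 9, 12, 22, 23, 27 and are an assumption of this example: they are what the construction rules give when the lighter tree of each pair becomes the left child. The name encode_leaf is hypothetical.

```c
#include <assert.h>
#include <string.h>

#define LEAVES 6
#define NODES  (2 * LEAVES - 1)

/* parent[i] and lchild[i] for nodes 1..11; slot 0 is unused, 0 means "none". */
static const int parent[NODES + 1] = {0, 7, 7, 8, 9, 9, 10, 8, 10, 11, 11, 0};
static const int lchild[NODES + 1] = {0, 0, 0, 0, 0, 0, 0, 1, 3, 4, 6, 9};

/* Writes the code of leaf `leaf` into out (at least LEAVES bytes)
 * and returns the number of bits in the code. */
int encode_leaf(int leaf, char *out)
{
    char cd[LEAVES];             /* a code has at most LEAVES - 1 bits */
    int start = LEAVES - 1;      /* the code is filled in from the back */
    int c, f;

    assert(leaf >= 1 && leaf <= LEAVES);
    cd[LEAVES - 1] = '\0';       /* terminator sits in the last slot */
    for (c = leaf, f = parent[c]; f != 0; c = f, f = parent[f])
        cd[--start] = (lchild[f] == c) ? '0' : '1';  /* left = 0, right = 1 */
    strcpy(out, &cd[start]);
    return LEAVES - 1 - start;
}
```

For this table the codes of A to F are 1110, 1111, 110, 00, 01 and 10. Weighting each length by its character count gives 7*4 + 9*4 + 12*3 + 22*2 + 23*2 + 27*2 = 244 bits for the 100-character message, an average of 2.44 bits per character, whereas the plain mean of the six code lengths is 17/6, about 2.83.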
(4) Source code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <limits.h>
typedef struct // tree node definition
{
    int weight;
    int parent;
    int lchild;
    int rchild;
} HTNode, *HuffmanTree;
// Huffman code table, a two-level char pointer
typedef char **HuffmanCode;
static char N[100]; // stores the characters to be encoded
typedef struct // packs the indices of the smallest and second-smallest nodes
{
    int s1;
    int s2;
} MinCode;
int count1 = 0, count2 = 0, sum = 0; // number of 0 bits, number of 1 bits, and their total
float avg = 0; // average code length
MinCode Select(HuffmanTree ht, int n); // defined after HuffmanCoding
// Constructs the Huffman tree ht and stores the codes in hc; w holds the weights, n is the number of leaf nodes.
HuffmanCode HuffmanCoding(HuffmanTree ht, HuffmanCode hc, int *w, int n)
{
    int i, s1 = 0, s2 = 0;
    HuffmanTree p;
    char *cd;
    int f, c, start, m;
    MinCode min;
    m = 2 * n - 1; // a Huffman tree with n leaves has 2n-1 nodes
    ht = (HuffmanTree)malloc((m + 1) * sizeof(HTNode)); // m+1 nodes; slot 0 is unused so that indices match node numbers
    // Initialize slot 0 and the n leaf nodes; w[0] = 0 is assigned in main
    for (p = ht, i = 0; i <= n; i++, p++, w++)
    {
        p->weight = *w;
        p->parent = 0;
        p->lchild = 0;
        p->rchild = 0;
    }
    // Initialize the n-1 non-leaf nodes
    for (; i <= m; i++, p++)
    {
        p->weight = 0;
        p->parent = 0;
        p->lchild = 0;
        p->rchild = 0;
    }
    // Construct the Huffman tree
    for (i = n + 1; i <= m; i++)
    {
        min = Select(ht, i - 1); // find the smallest and second-smallest nodes
        s1 = min.s1; // index of the smallest node
        s2 = min.s2; // index of the second-smallest node
        ht[s1].parent = i;
        ht[s2].parent = i;
        ht[i].lchild = s1;
        ht[i].rchild = s2;
        ht[i].weight = ht[s1].weight + ht[s2].weight; // sum of the two weights
    }
    // Print the Huffman tree
    printf("Huffman tree below:\n");
    printf("number\tweight\tparent\tlchild\trchild\n");
    for (i = 1; i <= m; i++)
    {
        printf("%d\t%d\t%d\t%d\t%d\n", i, ht[i].weight, ht[i].parent, ht[i].lchild, ht[i].rchild);
    }
    // Find the Huffman code of each character by walking from its leaf node to the root
    hc = (HuffmanCode)malloc((n + 1) * sizeof(char *));
    cd = (char *)malloc(n * sizeof(char)); // work buffer for one code
    cd[n - 1] = '\0'; // e.g. with 3 leaves a code has at most 2 bits, so cd[3-1] = '\0'
    for (i = 1; i <= n; i++) // find the Huffman code of every leaf node
    {
        start = n - 1; // the code is written backwards, starting just before the terminator
        // Walk from leaf i up to the root (reverse order), then do the same for leaf i+1, and so on
        for (c = i, f = ht[i].parent; f != 0; c = f, f = ht[f].parent)
        {
            if (ht[f].lchild == c)
            {
                cd[--start] = '0'; // a left branch is 0, a right branch is 1
                count1++;
            }
            else
            {
                cd[--start] = '1';
                count2++;
            }
        }
        hc[i] = (char *)malloc((n - start) * sizeof(char)); // allocate space for the code of the i-th character
        strcpy(hc[i], &cd[start]); // copy the code of the current leaf into hc
    }
    free(cd);
    free(ht); // ht is a local copy of the pointer, so the tree is released here
    return hc;
}
// Returns the indices of the two parentless nodes with the smallest weights in ht[1..n].
MinCode Select(HuffmanTree ht, int n)
{
    int min, secmin;
    int i, s1, s2;
    MinCode code;
    s1 = 1;
    s2 = 1;
    min = INT_MAX; // larger than any weight
    for (i = 1; i <= n; i++) // find the node with the smallest weight; its index is saved in s1
    {
        if (ht[i].weight < min && ht[i].parent == 0)
        {
            min = ht[i].weight;
            s1 = i;
        }
    }
    secmin = INT_MAX; // larger than any weight
    for (i = 1; i <= n; i++) // find the node with the second-smallest weight; its index is saved in s2
    {
        if ((ht[i].weight < secmin) && (i != s1) && ht[i].parent == 0)
        {
            secmin = ht[i].weight;
            s2 = i;
        }
    }
    code.s1 = s1; // pack the result
    code.s2 = s2;
    return code;
}
int main(void)
{
    HuffmanTree HT = NULL;
    HuffmanCode HC = NULL;
    int *w = NULL;
    int i, n, len;
    printf("Input node:");
    if (fgets(N, sizeof(N), stdin) == NULL) // fgets replaces gets, which has no bounds check
        return 1;
    N[strcspn(N, "\n")] = '\0'; // drop the trailing newline
    len = (int)strlen(N);
    n = len; // one leaf node per input character
    if (n < 1)
        return 1;
    w = (int *)malloc((n + 1) * sizeof(int)); // n+1 ints; w[0] is unused
    w[0] = 0;
    printf("Enter the weight of each node:\n");
    // Read the node weights
    for (i = 1; i <= n; i++)
    {
        printf("w[%d]=", i);
        if (scanf("%d", &w[i]) != 1)
            return 1;
    }
    // Construct the Huffman tree HT and store the codes in HC; w holds the weights, n is the number of leaf nodes.
    HC = HuffmanCoding(HT, HC, w, n);
    // Output the Huffman codes
    printf("Huffman Code is:\n");
    printf("char\tweight\tcode\n");
    for (i = 1; i <= n; i++)
    {
        printf("%c\t%d\t%s\n", N[i - 1], w[i], HC[i]);
    }
    sum = count1 + count2;
    printf("Node count: %d", len);
    printf("\nTotal code length: %d", sum);
    avg = 1.0f * sum / len; // mean of the n code lengths, not weighted by the character counts
    printf("\nAverage code length: %f\n", avg);
    for (i = 1; i <= n; i++)
        free(HC[i]);
    free(HC);
    free(w);
    return 0;
}
(5) Operation result and analysis:
(6) Takeaways and experience:
1. This exercise gave me a deeper understanding of the Huffman tree and its applications. I also ran into a number of problems along the way. The Huffman codes have to be stored through a two-level char pointer. The end of each code has to be marked: the position after the last code character is set to '\0', that is, cd[n-1] = '\0'. The code of each character is found by starting at its leaf node and following the parent links back to the root, with each step yielding a code bit of 0 or 1, and a one-dimensional array can hold the code. The Select function took many rounds of debugging. s1 and s2 store the indices of the smallest and second-smallest nodes. A node whose weight is below the current minimum and which has no parent becomes the smallest node. The second-smallest node must satisfy three conditions: its weight is below the current second minimum, it has no parent, and its index differs from that of the smallest node. The function then returns both indices. Many variables are defined in this program and they are easy to mix up, so the meaning of each one has to be clear. Details matter in an exercise like this: a single small mistake, such as an extra * in the definition of the Huffman tree type, can keep the whole program from running, so care is needed to avoid unnecessary errors.