Huffman Trees and Their Application: A C Data Structures Exercise


(1) Topic and requirements:

Huffman tree and its application

Design purpose:

1. Become familiar with the various storage structures of trees and their characteristics.

2. Master how to build Huffman codes and how to calculate the weighted path length.

Design content:

A telegraph message such as aabbcab ... is to be sent (100 characters in total, in which A, B, C, D, E and F occur 7, 9, 12, 22, 23 and 27 times respectively). Implement Huffman coding for it.

Design requirements:

1. Analyze the system requirements.

2. Build the Huffman tree.

3. Perform Huffman coding and find the average code length.

4. Write a program that implements steps 2 and 3.

(2) Outline design:

1. Construct the Huffman tree

2. Implement Huffman coding

(3) Detailed design:

The problem is to compute Huffman codes. The purpose of Huffman coding is to make the transmitted message as short as possible: characters that occur more often get shorter codes. To implement Huffman coding, first construct the Huffman tree, as follows.

Assume there are n weights w1, w2, ..., wn; the Huffman tree built from them has n leaf nodes. The construction rules are:

1. Treat w1, w2, ..., wn as a forest of n trees (each tree has only one node);

2. Select the two trees in the forest whose root weights are smallest and merge them as the left and right subtrees of a new tree; the root weight of the new tree is the sum of the root weights of its left and right subtrees;

3. Remove the two selected trees from the forest and add the new tree to the forest;

4. Repeat steps 2 and 3 until only one tree is left in the forest; that tree is the Huffman tree.

Storing the Huffman tree: a Huffman tree has no nodes of degree 1, so a Huffman tree with n leaf nodes has 2n-1 nodes in total and can be stored in a one-dimensional array of size 2n-1. Define a structure for the nodes as follows:

typedef struct // tree node definition
{
    int weight; // weight
    int parent; // parent node
    int lchild; // left child
    int rchild; // right child
} HTNode, *HuffmanTree;
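The node count of 2n-1 used for the array size can be checked with a short derivation. Here n_0, n_1 and n_2 (symbols introduced for this note) are the numbers of nodes with 0, 1 and 2 children, and N is the total number of nodes; a tree with N nodes has N-1 edges, and every edge leads to exactly one child.

```latex
\begin{aligned}
N &= n_0 + n_1 + n_2, \qquad n_1 = 0 \\
N - 1 &= 1 \cdot n_1 + 2 \cdot n_2 = 2 n_2 \\
n_0 + n_2 - 1 &= 2 n_2 \;\Longrightarrow\; n_2 = n_0 - 1 \\
N &= n_0 + n_2 = 2 n_0 - 1 = 2n - 1 \qquad (n_0 = n)
\end{aligned}
```

The program in the source-code section allocates one element more than this, leaving index 0 unused so that 0 can mean "no parent" or "no child".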

First initialize the leaf nodes and the non-leaf nodes. Then build the Huffman tree: in each round, the Select function finds the two parentless nodes with the smallest and second-smallest weights, node i becomes their parent, and its weight is the sum of their two weights.

Huffman code:

Storage: the codes are stored through a pointer to pointer to char

typedef char **HuffmanCode; // Huffman codes, a char ** (one string per character)

Finding the Huffman codes:

Each character's Huffman code is obtained from its leaf node. On every path from the root, a right branch stands for 1 and a left branch for 0. The code is built in reverse: starting from each leaf node, follow the parent links back to the root, obtaining one code value, 0 or 1, at each step. A one-dimensional array stores the code, and the variable start marks the position where the code begins.

The assignment cd[n-1]='\0' sets the last element to '\0' to mark the end of the string. One counter is incremented for each left branch traversed and another for each right branch; the two counts added together give the total length of the codes.

(4) Source code:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <limits.h>

typedef struct // tree node definition
{
    int weight;
    int parent;
    int lchild;
    int rchild;
} HTNode, *HuffmanTree;

// Huffman codes, a char ** (one string per character)
typedef char **HuffmanCode;

static char N[100]; // holds the characters to be encoded

typedef struct // bundles the indices of the smallest and second-smallest nodes
{
    int s1;
    int s2;
} MinCode;

int count1 = 0, count2 = 0, sum = 0; // counts of '0' and '1' code bits, and the total code length
float avg = 0; // average code length

MinCode Select(HuffmanTree HT, int n); // defined after HuffmanCoding

// Builds the Huffman tree HT and returns the codes in HC; w holds the weights, n is the number of leaf nodes.
HuffmanCode HuffmanCoding(HuffmanTree HT, HuffmanCode HC, int *w, int n)
{
    int i, s1 = 0, s2 = 0;
    HuffmanTree p;
    char *cd;
    int f, c, start, m;
    MinCode min;

    m = 2 * n - 1; // a Huffman tree with n leaves needs 2n-1 nodes
    HT = (HuffmanTree)malloc((m + 1) * sizeof(HTNode)); // m+1 slots; slot 0 is left unused so that indices match node numbers

    // Initialize slot 0 and the n leaf nodes; w[0] = 0 is assigned in main
    for (p = HT, i = 0; i <= n; i++, p++, w++)
    {
        p->weight = *w;
        p->parent = 0;
        p->lchild = 0;
        p->rchild = 0;
    }

    // Initialize the n-1 internal nodes
    for (; i <= m; i++, p++)
    {
        p->weight = 0;
        p->parent = 0;
        p->lchild = 0;
        p->rchild = 0;
    }

    // Build the Huffman tree
    for (i = n + 1; i <= m; i++)
    {
        min = Select(HT, i - 1); // find the smallest and second-smallest parentless nodes
        s1 = min.s1; // index of the smallest node
        s2 = min.s2; // index of the second-smallest node
        HT[s1].parent = i;
        HT[s2].parent = i;
        HT[i].lchild = s1;
        HT[i].rchild = s2;
        HT[i].weight = HT[s1].weight + HT[s2].weight; // sum of the two weights
    }

    // Print the Huffman tree
    printf("Huffman tree:\n");
    printf("number\tweight\tparent\tlchild\trchild\n");
    for (i = 1; i <= m; i++)
    {
        printf("%d\t%d\t%d\t%d\t%d\t\n", i, HT[i].weight, HT[i].parent, HT[i].lchild, HT[i].rchild);
    }

    // Find the Huffman code of each character, walking from leaf to root
    HC = (HuffmanCode)malloc((n + 1) * sizeof(char *));
    cd = (char *)malloc(n * sizeof(char)); // scratch buffer for one code
    cd[n - 1] = '\0'; // e.g. with 3 leaves a code has at most 2 bits, so cd[3-1] = '\0'

    for (i = 1; i <= n; i++) // find the Huffman code of each leaf node
    {
        start = n - 1; // the code is filled in backwards from the terminator
        // Walk from leaf i up to the root (reverse order), then do leaf i+1, and so on
        for (c = i, f = HT[i].parent; f != 0; c = f, f = HT[f].parent)
        {
            if (HT[f].lchild == c)
            {
                cd[--start] = '0'; // left branch is 0, right branch is 1
                count1++;
            }
            else
            {
                cd[--start] = '1';
                count2++;
            }
        }
        HC[i] = (char *)malloc((n - start) * sizeof(char)); // space for the i-th code and its terminator
        strcpy(HC[i], &cd[start]); // copy the code of the current leaf into HC
    }

    free(cd);
    free(HT); // the tree is not needed once the codes have been copied
    return HC;
}

MinCode Select(HuffmanTree HT, int n)
{
    int min, secmin;
    int i, s1, s2;
    MinCode code;

    s1 = 1;
    s2 = 1;

    min = INT_MAX; // larger than any weight
    for (i = 1; i <= n; i++) // find the parentless node with the smallest weight; its index goes in s1
    {
        if (HT[i].weight < min && HT[i].parent == 0)
        {
            min = HT[i].weight;
            s1 = i;
        }
    }

    secmin = INT_MAX; // larger than any weight
    for (i = 1; i <= n; i++) // find the second-smallest parentless node; its index goes in s2
    {
        if ((HT[i].weight < secmin) && (i != s1) && HT[i].parent == 0)
        {
            secmin = HT[i].weight;
            s2 = i;
        }
    }

    code.s1 = s1; // pack both indices into the result
    code.s2 = s2;
    return code;
}

int main(void)
{
    HuffmanTree HT = NULL;
    HuffmanCode HC = NULL;
    int *w = NULL;
    int i, n, len;

    printf("Input the characters: ");
    if (fgets(N, sizeof N, stdin) == NULL) // fgets replaces the unsafe gets
        return 1;
    N[strcspn(N, "\n")] = '\0'; // strip the newline kept by fgets
    len = (int)strlen(N);
    n = len; // one leaf node per input character
    if (n == 0)
        return 1;

    w = (int *)malloc((n + 1) * sizeof(int)); // n+1 ints; w[0] is unused
    w[0] = 0;

    // Read the weight of each character
    printf("Enter the weight of each character:\n");
    for (i = 1; i <= n; i++)
    {
        printf("w[%d]=", i);
        if (scanf("%d", &w[i]) != 1)
            return 1;
    }

    // Build the Huffman tree and get the codes in HC; w holds the weights, n is the number of leaf nodes
    HC = HuffmanCoding(HT, HC, w, n);

    // Print the Huffman codes
    printf("Huffman codes:\n");
    printf("char\tweight\tcode\n");
    for (i = 1; i <= n; i++)
    {
        printf("%c\t%d\t%s\n", N[i - 1], w[i], HC[i]);
    }

    sum = count1 + count2;
    printf("Number of characters: %d", len);
    printf("\nTotal code length: %d", sum);
    avg = 1.0f * sum / len; // unweighted mean: total code length divided by the number of characters
    printf("\nAverage code length: %f\n", avg);

    for (i = 1; i <= n; i++)
        free(HC[i]);
    free(HC);
    free(w);
    return 0;
}

(5) Results and analysis:
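No program output is reproduced here. As a cross-check worked out by hand (not taken from a run), applying the construction rules to the weights 7, 9, 12, 22, 23, 27 merges 7+9=16, 12+16=28, 22+23=45, 27+28=55 and 45+55=100, which gives code lengths of 4, 4, 3, 2, 2, 2 bits for A to F. The symbols w_i, l_i and \bar{L} are notation introduced for this note.

```latex
\begin{aligned}
\mathrm{WPL} &= \sum_{i} w_i\, l_i
  = 7\cdot 4 + 9\cdot 4 + 12\cdot 3 + 22\cdot 2 + 23\cdot 2 + 27\cdot 2 = 244 \\
\bar{L} &= \frac{\mathrm{WPL}}{\sum_{i} w_i} = \frac{244}{100} = 2.44 \ \text{bits per character}
\end{aligned}
```

A fixed-length code for six symbols needs 3 bits per character, or 300 bits for the whole message. Note that the listing in section (4) divides the sum of the code lengths by the number of input characters, so for the six characters A to F it should report 17/6 ≈ 2.83, which is the unweighted mean rather than the weighted value above.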

(6) Lessons learned:

1. This exercise gave me a deeper understanding of Huffman trees and their application. I also ran into a number of problems. The Huffman codes have to be stored through a char ** (pointer to pointer to char). Each code string must be terminated: the element after the last code character is set to '\0', that is, cd[n-1]='\0'. The codes are found by starting from each leaf node and following the parent links back to the root, obtaining a code value of 0 or 1 at each step; a one-dimensional array can store the code. The Select function took many rounds of debugging. s1 and s2 hold the indices of the smallest and second-smallest nodes. A node whose weight is smaller than the current minimum and which has no parent becomes the smallest node. The search for the second-smallest node must satisfy three conditions: the weight is smaller than the current second minimum, the node has no parent, and its index is not that of the smallest node already found. The function then returns both indices. Many variables are defined in this program and they are easy to mix up, so the meaning of each one must be understood. Attention to detail is essential: a small mistake, such as a stray * in the definition of the Huffman tree, can keep the whole program from running, so care is needed to avoid unnecessary errors.
