The principle and construction of the Huffman tree (reprinted)
This is the process of constructing a Huffman tree.
1. Create the initial set
Given n weights {W1, W2, W3, ..., Wi, ..., Wn}, build the initial set of n binary trees F = {T1, T2, T3, ..., Ti, ..., Tn}, where each binary tree Ti consists of a single root node with weight Wi and has empty left and right subtrees. (To make the algorithm easier to implement on a computer, the trees Ti are usually kept sorted in ascending order of their weights Wi.)
2. Select the left and right subtrees
From F, select the two trees whose root nodes have the smallest weights and use them as the left and right subtrees of a new binary tree. The weight of the new root node is the sum of the weights of the root nodes of its left and right subtrees.
3. Delete the left and right subtrees
Delete these two trees from F and add the new binary tree to F, keeping F in ascending order.
4. Repeat steps 2 and 3
Repeat steps 2 and 3 until only one binary tree is left in F.
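The four steps can be sketched on the root weights alone. This is a minimal illustration, not the implementation given later in this article: the function name huffman_merge_sums and its parameters are hypothetical, and it tracks only the weights in F, not the tree shape.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison callback for qsort: ascending order of int weights. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/*
 * Hypothetical helper: runs steps 1-4 on the root weights only.
 * f[] holds the n root weights of the forest F and is modified in place;
 * sums[] receives the n-1 merged root weights in the order they are created.
 * Returns the weight of the final root.
 */
int huffman_merge_sums(int f[], int n, int sums[])
{
    int count = n, k = 0;

    assert(n >= 1);
    qsort(f, (size_t)n, sizeof f[0], cmp_int);  /* step 1: ascending order */
    while (count > 1) {                         /* step 4: repeat 2 and 3 */
        int merged = f[0] + f[1];               /* step 2: two smallest roots */
        int i = 2, j = 0;

        /* step 3: drop f[0] and f[1], re-insert merged in ascending position */
        while (i < count && f[i] < merged)
            f[j++] = f[i++];
        f[j++] = merged;
        while (i < count)
            f[j++] = f[i++];
        count--;
        sums[k++] = merged;
    }
    return f[0];
}
```

For the weights 4, 1, 3, 2 the forest goes from {1, 2, 3, 4} to {3, 3, 4}, then {4, 6}, then {10}, so sums[] receives 3, 6, 10.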
For example
Suppose we are given a sequence of weights (the original figure is missing; the merges below are consistent with the weights 2, 3, 6, 7, 9, 10, 21, 32)
and are asked to build its Huffman tree.
Step 1: treat these values as a set F of trees that each have only a root node.
Step 2: select the two trees with the smallest weights, 2 and 3.
Step 3: delete these two trees from the set F
and construct a binary tree from them.
Its root has weight 5 (5 = 2 + 3).
Then add this tree to the set F.
5 is the weight of the new tree.
Then repeat the steps above.
Select 5 and 6.
Construct a binary tree from these two trees.
In F, delete 5 and 6 and add the new tree with weight 11.
Repeat the steps above.
Select 7 and 9.
Delete 7 and 9 from F.
Add the new tree with weight 16 to F.
Repeat the steps above.
Select 10 and 11.
Delete 10 and 11 from F and add the new tree with weight 21 to F.
Repeat the steps above.
Select 16 and 21 (F now holds two trees with weight 21; either one may be chosen).
I chose the 21 that has only a root node.
16 and 21 form a binary tree.
Delete the trees 16 and 21 from F.
Add the new tree with weight 37 to F.
Repeat the steps above.
Select 21 and 32.
They form a binary tree.
Delete the two trees 21 and 32 from F.
Add the new tree with weight 53 to F.
Repeat the steps above.
Merge the two trees left in F, 37 and 53, into one tree with weight 90.
Finished!
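The walk-through notes that either tree with weight 21 may be chosen. The sketch below checks this for one assumed input: the starting weights 2, 3, 6, 7, 9, 10, 21, 32, which is what the merges above imply (the original figure is not available). The function name wpl_with_tie_rule and the MAX_TREES limit are hypothetical. It relies on the fact that the weighted path length of the finished tree equals the sum of all merged root weights, because each leaf weight is counted once for every ancestor it has.

```c
#include <assert.h>

#define MAX_TREES 64   /* assumed upper bound for this sketch */

/*
 * Hypothetical helper: builds the tree by repeated min-selection and returns
 * the weighted path length, computed as the sum of all merged root weights.
 * When two roots tie, prefer_leaf != 0 picks a single-node tree first,
 * prefer_leaf == 0 picks an already merged tree first.
 */
int wpl_with_tie_rule(const int weights[], int n, int prefer_leaf)
{
    int w[MAX_TREES], leaf[MAX_TREES];
    int count = n, total = 0, i, k;

    assert(n >= 1 && n <= MAX_TREES);
    for (i = 0; i < n; i++) {
        w[i] = weights[i];
        leaf[i] = 1;
    }
    while (count > 1) {
        int merged = 0;
        for (k = 0; k < 2; k++) {           /* remove the two smallest roots */
            int best = 0;
            for (i = 1; i < count; i++) {
                int tie_wins = (w[i] == w[best]) &&
                               (prefer_leaf ? (leaf[i] && !leaf[best])
                                            : (!leaf[i] && leaf[best]));
                if (w[i] < w[best] || tie_wins)
                    best = i;
            }
            merged += w[best];
            w[best] = w[count - 1];         /* delete by moving the last entry in */
            leaf[best] = leaf[count - 1];
            count--;
        }
        w[count] = merged;                  /* add the new tree to F */
        leaf[count] = 0;
        count++;
        total += merged;
    }
    return total;
}
```

With the assumed weights, both tie rules give 5 + 11 + 16 + 21 + 37 + 53 + 90 = 233. The two choices produce differently shaped trees, but both have the same weighted path length.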
C language implementation:
/*-------------------------------------------------------------------------
* Name: Huffman coding source code.
* Date: 2011.04.16
* Author: Jeffrey Hill + Jezze (decoding part)
* Tested under Win-TC
* Implementation: the user first builds the Huffman tree with the HuffmanTree() function, and then
* the codes are generated from the bottom up (that is, starting from the node whose array index is zero).
* A node on the left side of its parent gets code 0; a node on the right side gets code 1. Finally, the generated codes are output.
*------------------------------------------------------------------------*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* strlen(), used by the decoding function */
#define MAXBIT 100
#define MAXVALUE 10000
#define MAXLEAF 30
#define MAXNODE (MAXLEAF*2 - 1)
typedef struct
{
int bit[MAXBIT];
int start;
} HCodeType;/* encoding struct */
typedef struct
{
int weight;
int parent;
int lchild;
int rchild;
int value;
} HNodeType;/* node struct */
/* Build a Huffman tree */
void HuffmanTree (HNodeType HuffNode[MAXNODE], int n)
{
/* i, j: loop variables; m1, m2: weights of the two smallest-weight nodes at each step of building the Huffman tree;
x1, x2: array indices of those two smallest-weight nodes at each step. */
int i, j, m1, m2, x1, x2;
/* Initialize the nodes stored in the HuffNode[] array of the Huffman tree */
for (i=0; i<2*n-1; i++)
{
HuffNode[i].weight = 0; /* weight */
HuffNode[i].parent =-1;
HuffNode[i].lchild =-1;
HuffNode[i].rchild =-1;
HuffNode[i].value = i; /* actual value, which can be replaced with letters as needed */
} /* end for */
/* Enter the weight of n leaf nodes */
for (i=0; i<n; i++)
{
printf ("Please input weight of leaf node %d: \n", i);
scanf ("%d", &HuffNode[i].weight);
} /* end for */
/* Construct the Huffman tree cyclically */
for (i=0; i<n-1; i++)
{
m1 = m2 = MAXVALUE; /* m1 and m2 hold the two smallest weights among nodes with no parent; weights must be below MAXVALUE */
x1=x2=0;
/* Among all nodes, find the two parentless nodes with the smallest weights and combine them into a binary tree */
for (j=0; j<n+i; j++)
{
if (HuffNode[j].weight < m1 && HuffNode[j].parent==-1)
{
m2=m1;
x2=x1;
m1=HuffNode[j].weight;
x1=j;
}
else if (HuffNode[j].weight < m2 && HuffNode[j].parent==-1)
{
m2=HuffNode[j].weight;
x2=j;
}
} /* end for */
/* Set the parent node information of the two child nodes x1 and x2 found */
HuffNode[x1].parent = n+i;
HuffNode[x2].parent = n+i;
HuffNode[n+i].weight = HuffNode[x1].weight + HuffNode[x2].weight;
HuffNode[n+i].lchild = x1;
HuffNode[n+i].rchild = x2;
printf ("x1.weight and x2.weight in round %d: %d, %d\n", i+1, HuffNode[x1].weight, HuffNode[x2].weight); /* for testing */
printf ("\n");
} /* end for */
/* for(i=0;i<n+2;i++)
{
printf(" Parents:%d,lchild:%d,rchild:%d,value:%d,weight:%d\n",HuffNode[i].parent,HuffNode[i].lchild,HuffNode[i].rchild,HuffNode[i].value,HuffNode[i].weight);
} */ /* test output, disabled */
} /* end HuffmanTree */
/* Decoding: walk from the root, taking the left child on '0' and the right child on '1' */
void decodeing(char string[],HNodeType Buf[],int Num)
{
int i=0,tmp;
int len=(int)strlen(string);
int root=2*Num-2; /* the root is the last node that was created */
if(Num<2) return; /* a single leaf has an empty code, so there is nothing to decode */
while(i<len)
{
tmp=root;
/* descend until a leaf is reached or the input runs out */
while((Buf[tmp].lchild!=-1)&&(Buf[tmp].rchild!=-1)&&(i<len))
{
if(string[i]=='0')
tmp=Buf[tmp].lchild;
else
tmp=Buf[tmp].rchild;
i++;
}
if(Buf[tmp].lchild==-1) /* print only complete codes */
printf("%d",Buf[tmp].value);
}
}
int main(void)
{
HNodeType HuffNode [MAXNODE];/* defines a node struct array */
HCodeType HuffCode [MAXLEAF], cd;/* defines an array of encoding structures and a temporary variable to store information for encoding */
int i, j, c, p, n;
char pp[100];
printf ("Please input n:\n");
scanf ("%d", &n);
HuffmanTree (HuffNode, n);
for (i=0; i < n; i++)
{
cd.start = n-1;
c = i;
p = HuffNode[c].parent;
while (p != -1) /* parent node exists */
{
if (HuffNode[p].lchild == c)
cd.bit[cd.start] = 0;
else
cd.bit[cd.start] = 1;
cd.start--; /* move to the next lower bit of the code */
c=p;
p = HuffNode[c].parent; /* set up the next loop iteration */
} /* end while */
/* Save the Huffman code and its start position for each leaf node */
for (j=cd.start+1; j<n; j++)
{ HuffCode[i].bit[j] = cd.bit[j];}
HuffCode[i].start = cd.start;
} /* end for */
/* Output all the Huffman codes that have been saved */
for (i=0; i<n; i++)
{
printf ("%d 's Huffman code is: ", i);
for (j=HuffCode[i].start+1; j < n; j++)
{
printf ("%d", HuffCode[i].bit[j]);
}
printf(" start:%d",HuffCode[i].start);
printf ("\n");
}
/* for(i=0;i<n;i++){
for(j=0;j<n;j++)
{
printf ("%d", HuffCode[i].bit[j]);
}
printf("\n");
}*/
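printf("Decoding? Please enter code:\n");
scanf("%99s", pp); /* pp has room for 99 characters plus the terminator */
decodeing(pp,HuffNode,n);
/* getch(); */ /* Win-TC only (needs conio.h): keeps the console window open */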
return 0;
}
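The program above reads its weights and bit string interactively, which makes it awkward to check automatically. The following is a condensed, non-interactive sketch of the same idea: it uses the same convention (0 for the left child, 1 for the right child), encodes a short message and decodes it again. All names here (Node, build, append_code, decode, round_trip_ok) and the fixed LEAVES count are assumptions made for this sketch, not part of the original program.

```c
#include <assert.h>
#include <string.h>

#define LEAVES 4                    /* assumed leaf count for this sketch */
#define NODES  (2 * LEAVES - 1)

typedef struct { int weight, parent, lchild, rchild; } Node;

/* Build the tree in t[0..NODES-1]; the leaves are t[0..LEAVES-1]. */
static void build(Node t[], const int w[])
{
    int i, j;
    for (i = 0; i < NODES; i++) {
        t[i].weight = (i < LEAVES) ? w[i] : 0;
        t[i].parent = t[i].lchild = t[i].rchild = -1;
    }
    for (i = LEAVES; i < NODES; i++) {
        int x1 = -1, x2 = -1;       /* smallest and second smallest roots */
        for (j = 0; j < i; j++) {
            if (t[j].parent != -1) continue;
            if (x1 < 0 || t[j].weight < t[x1].weight) { x2 = x1; x1 = j; }
            else if (x2 < 0 || t[j].weight < t[x2].weight) x2 = j;
        }
        t[x1].parent = t[x2].parent = i;
        t[i].lchild = x1;
        t[i].rchild = x2;
        t[i].weight = t[x1].weight + t[x2].weight;
    }
}

/* Append the code of one leaf to out (0 = left branch, 1 = right branch). */
static void append_code(const Node t[], int leaf, char *out)
{
    char rev[NODES];
    int len = 0, c = leaf, p;
    for (p = t[c].parent; p != -1; c = p, p = t[c].parent)
        rev[len++] = (t[p].lchild == c) ? '0' : '1';
    out += strlen(out);
    while (len > 0) *out++ = rev[--len];    /* bits were collected leaf-to-root */
    *out = '\0';
}

/* Decode bits into leaf indices; returns how many symbols were written. */
static int decode(const Node t[], const char *bits, int symbols[])
{
    int count = 0, cur = NODES - 1;         /* start at the root */
    for (; *bits; bits++) {
        cur = (*bits == '0') ? t[cur].lchild : t[cur].rchild;
        if (t[cur].lchild == -1) {          /* reached a leaf */
            symbols[count++] = cur;
            cur = NODES - 1;
        }
    }
    return count;
}

/* Round trip: encode msg[0..len-1] (leaf indices), decode, compare. */
int round_trip_ok(const int w[], const int msg[], int len)
{
    Node t[NODES];
    char bits[256] = "";
    int back[64], i, n;

    assert(len <= 64);              /* each code has at most LEAVES-1 bits */
    build(t, w);
    for (i = 0; i < len; i++) append_code(t, msg[i], bits);
    n = decode(t, bits, back);
    if (n != len) return 0;
    for (i = 0; i < len; i++)
        if (back[i] != msg[i]) return 0;
    return 1;
}
```

Because the tree is built once and both directions walk the same parent and child links, any sequence of leaf indices should survive the round trip.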
Proof of Working Principles
Data structure textbooks generally introduce the Huffman tree and, after it, Huffman coding. Huffman coding is an application of the Huffman tree, and it is widely used, for example in JPEG.
First, what is a Huffman tree? A Huffman tree, also called the optimal binary tree, is the binary tree with the smallest weighted path length. The weighted path length of a tree is the sum, over all leaf nodes, of each leaf's weight multiplied by the length of the path from that leaf to the root (if the root is at layer 0, the path length from a leaf to the root is the layer number of the leaf). It is written WPL = W1 * L1 + W2 * L2 + W3 * L3 + ... + Wn * Ln, where the n weights Wi (i = 1, 2, ..., n) belong to the n leaf nodes of the binary tree and Li (i = 1, 2, ..., n) is the path length of the corresponding leaf. It can be proved that the Huffman tree has the smallest WPL.
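The WPL formula can be evaluated directly once the tree exists. The sketch below builds a Huffman tree in a parent array and then sums Wi * Li, counting Li as the number of edges from leaf i up to the root. The function name huffman_wpl and the MAX_LEAVES limit are assumptions made for this example.

```c
#include <assert.h>

#define MAX_LEAVES 32   /* assumed limit for this sketch */

/* Weighted path length: sum of W_i * L_i over the leaves of a Huffman tree. */
int huffman_wpl(const int w[], int n)
{
    int weight[2 * MAX_LEAVES - 1], parent[2 * MAX_LEAVES - 1];
    int i, j, wpl = 0;

    assert(n >= 1 && n <= MAX_LEAVES);
    for (i = 0; i < 2 * n - 1; i++) {
        weight[i] = (i < n) ? w[i] : 0;
        parent[i] = -1;
    }
    for (i = n; i < 2 * n - 1; i++) {       /* create internal node i */
        int x1 = -1, x2 = -1;               /* two smallest parentless nodes */
        for (j = 0; j < i; j++) {
            if (parent[j] != -1) continue;
            if (x1 < 0 || weight[j] < weight[x1]) { x2 = x1; x1 = j; }
            else if (x2 < 0 || weight[j] < weight[x2]) x2 = j;
        }
        parent[x1] = parent[x2] = i;
        weight[i] = weight[x1] + weight[x2];
    }
    for (i = 0; i < n; i++) {               /* L_i = number of edges to the root */
        int depth = 0;
        for (j = i; parent[j] != -1; j = parent[j])
            depth++;
        wpl += w[i] * depth;
    }
    return wpl;
}
```

For the weights 1, 2, 3, 4 the leaves end up at depths 3, 3, 2, 1, so the result is 1*3 + 2*3 + 3*2 + 4*1 = 19.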
When Huffman proposed this coding in the early 1950s, he constructed the code with the minimum average length from the probabilities with which the characters occur. It is a variable-length code. In a code, if the codeword lengths are strictly in the reverse order of the probabilities of the corresponding symbols, the average code length is the smallest. (Note: a codeword is the code obtained after a symbol is Huffman-encoded. Its length varies with the probability of the symbol, which is why Huffman coding is a variable-length code.)
So how is a Huffman tree constructed? The most common method is the Huffman algorithm, whose description can be found in most data structure textbooks:
1. From the given n weights {W1, W2, W3, ..., Wi, ..., Wn}, build the initial set of n binary trees F = {T1, T2, T3, ..., Ti, ..., Tn}, where each binary tree Ti has only a root node with weight Wi and empty left and right subtrees. (To make the algorithm easier to implement on a computer, the trees Ti are usually kept in ascending order of their weights Wi.)
2. From F, select the two trees whose root nodes have the smallest weights as the left and right subtrees of a new binary tree. The weight of the new root node is the sum of the weights of the root nodes of its left and right subtrees.
3. Delete these two trees from F and add the new binary tree to F, keeping F in ascending order.
4. Repeat steps 2 and 3 until only one binary tree is left in F.
The algorithm above can be implemented in C with either a static binary tree or a dynamic binary tree. For a dynamic binary tree, the following data structures can be used:
struct tree {
float weight; /* weight */
union {
char leaf; /* character stored in a leaf node */
struct tree *left; /* left child of the tree */
};
struct tree *right; /* right child of the tree */
};
struct forest { /* the set F, represented as a linked list */
struct tree *ti; /* a tree in F */
struct forest *next; /* next list node */
};
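One way these two structures might be used is sketched below. It keeps F as a linked list sorted by weight, so the two smallest trees are always the first two list nodes. This is an illustration under assumptions: the helper names insert_sorted, build_linked and free_tree are hypothetical, the union is given the name u so that the code also compiles with pre-C11 compilers, and a node is treated as a leaf when its right pointer is NULL.

```c
#include <assert.h>
#include <stdlib.h>

struct tree {
    float weight;
    union {
        char leaf;              /* valid when right == NULL */
        struct tree *left;      /* valid when right != NULL */
    } u;
    struct tree *right;
};

struct forest {
    struct tree *ti;
    struct forest *next;
};

/* Insert t into the list so that weights stay in ascending order. */
static struct forest *insert_sorted(struct forest *f, struct tree *t)
{
    struct forest **pp = &f;
    struct forest *node = malloc(sizeof *node);

    assert(node != NULL);
    while (*pp != NULL && (*pp)->ti->weight <= t->weight)
        pp = &(*pp)->next;
    node->ti = t;
    node->next = *pp;
    *pp = node;
    return f;
}

/* Build a Huffman tree from n (character, weight) pairs; NULL if n == 0. */
struct tree *build_linked(const char ch[], const float w[], int n)
{
    struct forest *f = NULL;
    struct tree *root;
    int i;

    for (i = 0; i < n; i++) {               /* step 1: n single-node trees */
        struct tree *t = malloc(sizeof *t);
        assert(t != NULL);
        t->weight = w[i];
        t->u.leaf = ch[i];
        t->right = NULL;
        f = insert_sorted(f, t);
    }
    while (f != NULL && f->next != NULL) {  /* steps 2 to 4 */
        struct forest *a = f, *b = f->next;
        struct tree *t = malloc(sizeof *t);
        assert(t != NULL);
        t->weight = a->ti->weight + b->ti->weight;
        t->u.left = a->ti;
        t->right = b->ti;
        f = b->next;                        /* delete the two smallest trees */
        free(a);
        free(b);
        f = insert_sorted(f, t);            /* add the new tree in order */
    }
    root = f ? f->ti : NULL;
    free(f);
    return root;
}

/* Free a tree built by build_linked. */
void free_tree(struct tree *t)
{
    if (t == NULL) return;
    if (t->right != NULL) {                 /* internal node: two children */
        free_tree(t->u.left);
        free_tree(t->right);
    }
    free(t);
}
```

Keeping the list sorted means each round only needs to unlink the first two nodes and do one ordered insertion, which matches the "add the new binary tree in ascending order" wording of step 3.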
For example, if the probabilities of occurrence of the letters A, B, Z, and C are 0.75, 0.54, 0.28, and 0.43, then the corresponding weights are 75, 54, 28, and 43.
After the Huffman tree has been built, it can be used for encoding. For example, for the characters above, the tree built with their probabilities of occurrence as weights yields the corresponding Huffman code values. As long as the same Huffman tree is used, the code can be decoded back into the original characters. A Huffman code is a prefix code: the code of any character is not a prefix of the code of another character. Otherwise the encoded string could not be decoded unambiguously. For example, if a, b, c, d were encoded as 0, 10, 101, 11, the string 1010 could be decoded as either bb or ca, because the code of b is a prefix of the code of c. The coding rule used here is to follow the path from the root node to the leaf node (which holds the original information), writing 0 for a left child and 1 for a right child. The two values can of course be swapped.
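The prefix property is easy to test mechanically. The sketch below checks whether any code in a table is a prefix of another one. The function name is_prefix_free and the sample code tables mentioned afterwards are assumptions chosen for illustration.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if no code in codes[0..n-1] is a prefix of another, else 0. */
int is_prefix_free(const char *codes[], int n)
{
    int i, j;

    assert(codes != NULL);
    for (i = 0; i < n; i++) {
        size_t len = strlen(codes[i]);
        for (j = 0; j < n; j++) {
            /* equal first len characters: codes[i] is a prefix of codes[j] */
            if (i != j && strncmp(codes[i], codes[j], len) == 0)
                return 0;
        }
    }
    return 1;
}
```

A table such as 0, 10, 101, 11 is rejected because 10 is a prefix of 101, while 0, 10, 110, 111 is accepted. Codes read off the leaves of a binary tree always pass this check, since no leaf lies on the path to another leaf.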
The construction process of the Huffman tree
Huffman tree:
Given n weights as n leaf nodes, construct a binary tree. If its weighted path length is the minimum possible, the binary tree is called the optimal binary tree, also known as the Huffman tree. In the tree with the shortest weighted path length, nodes with larger weights are closer to the root.
Constructing the Huffman tree:
Assume the given weights are 3, 5, 7, 8, 10, 15.
First, take the two smallest numbers in the set, 3 + 5 = 8, then delete 3 and 5 from the set and put 8 into it.
The set is now 7, 8, 8, 10, 15:
    8
   / \
  3   5
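Then, from 7, 8, 8, 10, 15, take the two smallest numbers, 7 and 8, to form a tree (here the 8 that was just built from 3 and 5 is used). The set is now 8, 10, 15, 15:
      15
     /  \
    8    7
   / \
  3   5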
Then, from 8, 10, 15, 15, take the two smallest numbers, 8 and 10, to form a tree. The set is now 15, 15, 18:
   18
  /  \
 8    10
From 15, 15, 18, take the two smallest numbers, 15 and 15, to form a tree. The set is now 18, 30:
        30
       /  \
     15    15
    /  \
   8    7
  / \
 3   5
Finally, the remaining 30 and 18 form one tree (at this point no other elements are left in the set, and the Huffman tree is complete):
             48
           /    \
         30      18
        /  \    /  \
      15    15 8    10
     /  \
    8    7
   / \
  3   5
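As a check on the finished tree, its weighted path length can be worked out by hand. In the calculation below the leaf depths are read off the final drawing: 3 and 5 at depth 4, 7 at depth 3, and 15, 8, 10 at depth 2. The second line uses the fact that the WPL also equals the sum of the weights of all internal nodes.

```latex
\begin{aligned}
\mathrm{WPL} &= \sum_{i=1}^{6} W_i L_i
  = 3\cdot 4 + 5\cdot 4 + 7\cdot 3 + 15\cdot 2 + 8\cdot 2 + 10\cdot 2 = 119 \\
  &= 8 + 15 + 18 + 30 + 48 = 119
\end{aligned}
```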
Hope this helps you understand it!