Java implementation of the Huffman coding algorithm

Last Update:2018-07-26 Source: Internet

Author: User

Introduction to Huffman coding

Huffman coding assigns a binary code to each character, with the aim of shortening the binary representation of text; it consists of an encoding step and a decoding step. Characters are stored and transmitted in binary (a computer only understands 0 and 1), so there must be a mapping between characters and binary values. Characters belong to a character set (charset); they are encoded into binary for storage and transmission and decoded back into characters for display. One character set can have several encodings (Unicode, for example, can be encoded as UTF-8, UTF-16, and so on). Once character sets, encoding, and decoding are understood, garbled-text problems become much easier to resolve.

For example, in ASCII the character a has the decimal value 97, which is 01100001 in binary. Each ASCII character is stored in 8 bits (1 byte), so transmitting 1000 characters means transmitting 8,000 bits. However, in English the letter E occurs with a frequency of 12.702% and Z with 0.074%; E is more than 100 times as frequent as Z, yet both take the same number of bits. Variable-length coding does better: the guiding principle is to give high-frequency characters shorter codes and low-frequency characters longer ones. The Huffman coding algorithm addresses exactly this problem.

Java implementation of Huffman coding
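Before the implementation details, the saving promised by variable-length coding can be made concrete. The sketch below is illustrative only: the class `CodeLengthDemo`, its method names, the sample string, and the three-entry code table are assumptions and not part of the original article, and the table is picked by hand rather than produced by the Huffman algorithm.

```java
import java.util.Map;

final class CodeLengthDemo {
    // Bits needed when every character takes one 8-bit byte, as in the ASCII example.
    static int fixedLengthBits(String text) {
        return text.length() * 8;
    }

    // Bits needed when each character uses its own codeword from the given table.
    static int variableLengthBits(String text, Map<Character, String> codes) {
        int bits = 0;
        for (char c : text.toCharArray()) {
            bits += codes.get(c).length();
        }
        return bits;
    }

    public static void main(String[] args) {
        // Hypothetical prefix-free table: the most frequent character gets the shortest code.
        Map<Character, String> codes = Map.of('a', "0", 'b', "10", 'c', "11");
        String text = "aaaabbc";
        System.out.println(fixedLengthBits(text));           // 7 characters * 8 bits = 56
        System.out.println(variableLengthBits(text, codes)); // 4*1 + 2*2 + 1*2 = 10
    }
}
```

With this table the seven-character string takes 10 bits instead of 56, because the character that occurs most often has the shortest code.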

The main data structures used by the Huffman coding algorithm are a full binary tree (every non-leaf node has exactly two children) and a priority queue. For the latter, java.util.PriorityQueue is used; the former is implemented by hand as nested classes:

```java
static class Tree {
    private Node root;

    public Node getRoot() { return root; }

    public void setRoot(Node root) { this.root = root; }
}

static class Node implements Comparable<Node> {
    private String chars = "";     // characters covered by this node's subtree
    private int frequence = 0;     // combined frequency of those characters
    private Node parent;
    private Node leftNode;
    private Node rightNode;

    // Orders nodes by ascending frequency; Integer.compare avoids the
    // overflow that plain subtraction can cause for very large values.
    @Override
    public int compareTo(Node n) { return Integer.compare(frequence, n.frequence); }

    // A leaf holds exactly one character.
    public boolean isLeaf() { return chars.length() == 1; }

    public boolean isRoot() { return parent == null; }

    public boolean isLeftChild() { return parent != null && this == parent.leftNode; }

    public int getFrequence() { return frequence; }

    public void setFrequence(int frequence) { this.frequence = frequence; }

    public String getChars() { return chars; }

    public void setChars(String chars) { this.chars = chars; }

    public Node getParent() { return parent; }

    public void setParent(Node parent) { this.parent = parent; }

    public Node getLeftNode() { return leftNode; }

    public void setLeftNode(Node leftNode) { this.leftNode = leftNode; }

    public Node getRightNode() { return rightNode; }

    public void setRightNode(Node rightNode) { this.rightNode = rightNode; }
}
```

Statistical Data
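Frequency counts matter because the priority queue hands nodes back in ascending order of frequency. As a rough illustration of that ordering, the sketch below uses a simplified stand-in class `Entry` (hypothetical, not the article's `Node`) and three sample weights; the helper name `pollOrder` and the weight for `t` are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class QueueOrderDemo {
    // Simplified stand-in for a tree node: just a label and a frequency.
    static final class Entry implements Comparable<Entry> {
        final String chars;
        final int frequence;

        Entry(String chars, int frequence) {
            this.chars = chars;
            this.frequence = frequence;
        }

        @Override
        public int compareTo(Entry other) {
            return Integer.compare(frequence, other.frequence);
        }
    }

    // Drains the queue and returns the labels in the order poll() hands them out.
    static List<String> pollOrder(List<Entry> entries) {
        PriorityQueue<Entry> queue = new PriorityQueue<>(entries);
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().chars);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Entry> entries = List.of(
                new Entry("e", 12702), new Entry("z", 74), new Entry("t", 9056));
        System.out.println(pollOrder(entries)); // lowest frequency first: [z, t, e]
    }
}
```

Note that `PriorityQueue` makes no promise about the order of entries with equal frequency, so two runs over tied data may build differently shaped trees with the same total encoded length.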

Because the code table is arranged by frequency, frequency statistics are needed first. The method below collects them from a character array. If you already have statistics, convert them to a Map<Character, Integer>. If they are percentages, multiply them by 100, 1000, or 10000 so that they become integers; for example, 12.702% multiplied by 1000 gives 12702. Huffman coding only cares about the relative sizes of the frequencies. The statistics method is implemented as follows:

```java
// Imports needed at the top of the enclosing file.
import java.util.HashMap;
import java.util.Map;

// Counts how many times each character occurs in the array.
public static Map<Character, Integer> statistics(char[] charArray) {
    Map<Character, Integer> map = new HashMap<Character, Integer>();
    for (char c : charArray) {
        Character character = c; // autoboxing replaces the deprecated new Character(c)
        if (map.containsKey(character)) {
            map.put(character, map.get(character) + 1);
        } else {
            map.put(character, 1);
        }
    }
    return map;
}
```

Build Tree
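The tree is built from a frequency map, so it may help to see one being produced first. This is a minimal sketch under stated assumptions: the class `StatisticsDemo` and the helpers `countChars` and `scalePercentages` are hypothetical names introduced here, and rounding after scaling is one possible choice rather than something the article prescribes.

```java
import java.util.HashMap;
import java.util.Map;

final class StatisticsDemo {
    // Counts occurrences per character in a string.
    static Map<Character, Integer> countChars(String text) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : text.toCharArray()) {
            counts.merge(c, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        return counts;
    }

    // Turns percentages into integer weights by scaling and rounding.
    static Map<Character, Integer> scalePercentages(Map<Character, Double> percentages, int scale) {
        Map<Character, Integer> weights = new HashMap<>();
        for (Map.Entry<Character, Double> e : percentages.entrySet()) {
            weights.put(e.getKey(), (int) Math.round(e.getValue() * scale));
        }
        return weights;
    }

    public static void main(String[] args) {
        // "abracadabra" contains a five times, b and r twice, c and d once.
        System.out.println(countChars("abracadabra"));
        // 12.702 * 1000 and 0.074 * 1000 become the integer weights 12702 and 74.
        System.out.println(scalePercentages(Map.of('e', 12.702, 'z', 0.074), 1000));
    }
}
```

Either map can then be passed on as the statistics argument, since only the relative sizes of the integers matter.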

Building the tree is the core step of the Huffman coding algorithm. The idea is to hang every character on a leaf node of a binary tree in which the left child of any non-leaf node occurs no more frequently than the right child. The algorithm wraps each statistic in a Node and stores it in a priority queue. On each iteration, the two nodes with the smallest frequencies are polled from the queue and a new parent (non-leaf) node is created: its characters are the concatenation of the two polled nodes' characters, and its frequency is the sum of theirs. The first node polled becomes the left child, the second becomes the right child, and the new parent is put back into the queue. This is repeated N-1 times, where N is the number of distinct characters (each iteration reduces the queue size by 1). After that, one node remains in the queue; it is polled as the root of the tree. The code is as follows:

```java
// Imports needed at the top of the enclosing file.
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Builds the Huffman tree and collects its leaf nodes into the given list.
private static Tree buildTree(Map<Character, Integer> statistics, List<Node> leafs) {
    Character[] keys = statistics.keySet().toArray(new Character[0]);

    // One leaf node per distinct character, ordered by frequency.
    PriorityQueue<Node> priorityQueue = new PriorityQueue<Node>();
    for (Character character : keys) {
        Node node = new Node();
        node.chars = character.toString();
        node.frequence = statistics.get(character);
        priorityQueue.add(node);
        leafs.add(node);
    }

    // Merge the two least frequent nodes, N-1 times.
    int size = priorityQueue.size();
    for (int i = 1; i <= size - 1; i++) {
        Node node1 = priorityQueue.poll();
        Node node2 = priorityQueue.poll();

        Node sumNode = new Node();
        sumNode.chars = node1.chars + node2.chars;
        sumNode.frequence = node1.frequence + node2.frequence;

        sumNode.leftNode = node1;
        sumNode.rightNode = node2;

        node1.parent = sumNode;
        node2.parent = sumNode;

        priorityQueue.add(sumNode);
    }

    // The single remaining node is the root.
    Tree tree = new Tree();
    tree.root = priorityQueue.poll();
    return tree;
}
```

Coding
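The codes read off in this step follow directly from the order in which nodes were merged, so a small trace of the merge loop may be useful. The sketch below is an assumption-laden illustration: `MergeTraceDemo`, `Item`, and `traceMerges` are hypothetical names, the nodes carry no child or parent links, and the sample frequencies are chosen so that no two queue entries ever tie.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class MergeTraceDemo {
    // Minimal node: the merged characters plus their combined frequency.
    static final class Item implements Comparable<Item> {
        final String chars;
        final int frequence;

        Item(String chars, int frequence) {
            this.chars = chars;
            this.frequence = frequence;
        }

        @Override
        public int compareTo(Item other) {
            return Integer.compare(frequence, other.frequence);
        }
    }

    // Runs the N-1 merges and records each one as "left+right=merged(frequency)".
    static List<String> traceMerges(Map<Character, Integer> statistics) {
        PriorityQueue<Item> queue = new PriorityQueue<>();
        for (Map.Entry<Character, Integer> e : statistics.entrySet()) {
            queue.add(new Item(e.getKey().toString(), e.getValue()));
        }
        List<String> steps = new ArrayList<>();
        while (queue.size() > 1) {
            Item left = queue.poll();  // smallest frequency becomes the left child
            Item right = queue.poll(); // next smallest becomes the right child
            Item parent = new Item(left.chars + right.chars, left.frequence + right.frequence);
            steps.add(left.chars + "+" + right.chars + "=" + parent.chars + "(" + parent.frequence + ")");
            queue.add(parent);
        }
        return steps;
    }

    public static void main(String[] args) {
        Map<Character, Integer> statistics = Map.of('a', 9, 'b', 4, 'c', 2, 'd', 1);
        traceMerges(statistics).forEach(System.out::println);
        // d+c=dc(3)
        // dc+b=dcb(7)
        // dcb+a=dcba(16)
    }
}
```

Four distinct characters give three merges, matching the N-1 count, and the rarest character d ends up deepest in the tree and therefore receives the longest code.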

To find the code for a character, start from the leaf node that holds the character and walk up towards the root: if the current node is the left child of its parent, prepend 0 to the code; if it is the right child, prepend 1. Once the mapping from characters to binary codes is available, encoding itself is straightforward. The code is as follows:

```java
// Imports needed at the top of the enclosing file.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Encodes the string as a sequence of '0' and '1' characters.
public static String encode(String originalStr, Map<Character, Integer> statistics) {
    if (originalStr == null || originalStr.equals("")) {
        return "";
    }

    char[] charArray = originalStr.toCharArray();
    List<Node> leafNodes = new ArrayList<Node>();
    buildTree(statistics, leafNodes);
    Map<Character, String> encodInfo = buildEncodingInfo(leafNodes);

    StringBuilder buffer = new StringBuilder();
    for (char c : charArray) {
        Character character = c;
        buffer.append(encodInfo.get(character));
    }
    return buffer.toString();
}
```

```java
// Maps each leaf's character to its codeword by walking from the leaf to the root.
private static Map<Character, String> buildEncodingInfo(List<Node> leafNodes) {
    Map<Character, String> codewords = new HashMap<Character, String>();
    for (Node leafNode : leafNodes) {
        Character character = leafNode.getChars().charAt(0);
        String codeword = "";
        Node currentNode = leafNode;

        do {
            if (currentNode.isLeftChild()) {
                codeword = "0" + codeword;
            } else {
                codeword = "1" + codeword;
            }
            currentNode = currentNode.parent;
            // The source listing breaks off after the line above. The loop
            // condition and the remaining lines are a reconstruction based on
            // the description in the text: stop once the root is reached.
        } while (currentNode != null && currentNode.parent != null);

        codewords.put(character, codeword);
    }
    return codewords;
}
```
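To check that the pieces fit together, the following self-contained sketch runs the whole pipeline on a short string. It is not the article's code: the class `HuffmanRoundTrip` and its helpers are hypothetical, the nodes have no parent links, and codewords are collected by walking down from the root instead of up from the leaves. The `decode` method is not shown in the text above and is included only to verify the round trip.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

final class HuffmanRoundTrip {
    // Compact node without parent links; internal nodes carry a placeholder symbol.
    static final class Node implements Comparable<Node> {
        final char symbol;
        final int frequence;
        final Node left;
        final Node right;

        Node(char symbol, int frequence, Node left, Node right) {
            this.symbol = symbol;
            this.frequence = frequence;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }

        @Override
        public int compareTo(Node other) {
            return Integer.compare(frequence, other.frequence);
        }
    }

    // Counts characters, then merges the two least frequent nodes until one root remains.
    // Returns null for an empty string.
    static Node buildTree(String text) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : text.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        PriorityQueue<Node> queue = new PriorityQueue<>();
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            queue.add(new Node(e.getKey(), e.getValue(), null, null));
        }
        while (queue.size() > 1) {
            Node left = queue.poll();
            Node right = queue.poll();
            queue.add(new Node('\0', left.frequence + right.frequence, left, right));
        }
        return queue.poll();
    }

    // Walks down from the root: append 0 for a left edge, 1 for a right edge.
    static void collectCodes(Node node, String prefix, Map<Character, String> codes) {
        if (node.isLeaf()) {
            // A tree with a single symbol has no edges, so give that symbol the code "1".
            codes.put(node.symbol, prefix.isEmpty() ? "1" : prefix);
            return;
        }
        collectCodes(node.left, prefix + "0", codes);
        collectCodes(node.right, prefix + "1", codes);
    }

    static String encode(String text, Map<Character, String> codes) {
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) {
            bits.append(codes.get(c));
        }
        return bits.toString();
    }

    // Follows the bits from the root; every leaf reached emits one character.
    static String decode(String bits, Node root) {
        StringBuilder text = new StringBuilder();
        if (root.isLeaf()) { // single-symbol tree: each bit stands for one symbol
            for (int i = 0; i < bits.length(); i++) {
                text.append(root.symbol);
            }
            return text.toString();
        }
        Node current = root;
        for (int i = 0; i < bits.length(); i++) {
            current = bits.charAt(i) == '0' ? current.left : current.right;
            if (current.isLeaf()) {
                text.append(current.symbol);
                current = root;
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String text = "abracadabra";
        Node root = buildTree(text);
        Map<Character, String> codes = new HashMap<>();
        collectCodes(root, "", codes);
        String bits = encode(text, codes);
        System.out.println(bits.length() + " bits instead of " + text.length() * 8); // 23 bits instead of 88
        System.out.println(decode(bits, root).equals(text)); // true
    }
}
```

The individual codewords depend on how the queue breaks ties, but the total length does not: any Huffman tree for these frequencies encodes the string in 23 bits.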

This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
