Deflate Tree and Deflate coding

Source: Internet
Author: User

Very little material is available about the deflate tree. The concept comes from the gzip compression algorithm, and the tree is obtained by transforming a Huffman tree. This is a short note on how a deflate tree is generated and how deflate encoding works.

Suppose you build a Huffman tree from the weights 5, 8, 9, 10, 14, 15. It can look like this:

                 61
             /        \
          27            34
         /  \          /  \
       14    13      15    19
            /  \          /  \
           5    8        9    10
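As a rough illustration of where this tree comes from, the sketch below repeatedly merges the two smallest weights and tracks how deep each leaf ends up. The function name `huffman_code_lengths` and the tie-breaking counter are assumptions made for this example, and the weights are assumed to be distinct so they can double as dictionary keys.

```python
import heapq
from itertools import count

def huffman_code_lengths(weights):
    """Return {weight: depth of its leaf} for a Huffman tree over distinct weights."""
    tie = count()  # tie-breaker so the heap never has to compare the leaf lists
    heap = [(w, next(tie), [w]) for w in weights]
    heapq.heapify(heap)
    depth = {w: 0 for w in weights}
    while len(heap) > 1:
        w1, _, leaves1 = heapq.heappop(heap)
        w2, _, leaves2 = heapq.heappop(heap)
        for leaf in leaves1 + leaves2:
            depth[leaf] += 1  # every leaf under the new parent moves one layer down
        heapq.heappush(heap, (w1 + w2, next(tie), leaves1 + leaves2))
    return depth

lengths = huffman_code_lengths([5, 8, 9, 10, 14, 15])
print(lengths)  # {5: 3, 8: 3, 9: 3, 10: 3, 14: 2, 15: 2}
```

The depths agree with the tree above: 14 and 15 sit on layer 2, and the other four leaves sit on layer 3.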

You can also swap the two subtrees of any node:

                 61
             /        \
          34            27
         /  \          /  \
       15    19      14    13
            /  \          /  \
           9    10       5    8
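One way to check what a swap does and does not change is to compute the codes of both trees. The sketch below is only an illustration: it assumes a tree is written as nested `(left, right)` tuples with integers as leaves, and the helper name `codes` is made up for this example.

```python
def codes(tree, prefix=""):
    """Map each leaf to its code: '0' for a step left, '1' for a step right."""
    if not isinstance(tree, tuple):
        return {tree: prefix}
    left, right = tree
    return {**codes(left, prefix + "0"), **codes(right, prefix + "1")}

# Nested (left, right) tuples; integers are leaves.
original = ((14, (5, 8)), (15, (9, 10)))
swapped = ((15, (9, 10)), (14, (5, 8)))  # the two children of the root exchanged

before, after = codes(original), codes(swapped)
lengths_before = {leaf: len(c) for leaf, c in before.items()}
lengths_after = {leaf: len(c) for leaf, c in after.items()}
weighted_path_length = sum(leaf * len(c) for leaf, c in after.items())

print(before[14], after[14])            # 00 10
print(lengths_before == lengths_after)  # True
print(weighted_path_length)             # 154
```

The code of 14 changes from 00 to 10, but every code length and the weighted path length stay the same.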

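Swapping changes the Huffman codes of the leaf nodes, but it changes neither the weighted path length of the Huffman tree nor the code length of any leaf. Based on this, we can apply a more specific transformation: on every layer, place the leaf nodes on the left and the non-leaf nodes on the right. After this transformation, the tree above looks like this: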

                 61
             /        \
          34            27
         /  \          /  \
       15    14      19    13
                    /  \  /  \
                   9   10 5   8

The tree obtained by this transformation is called the deflate tree. As with a simple swap, the deflate tree changes the Huffman codes of the leaves, but it does not change the code length of any element. In gzip terms, the compression ratio is unchanged. (The internal labels 34 and 27 are carried over from the previous tree and no longer equal the sum of their children; only the shape of the tree matters here.)

The transformation is hard to follow from words alone, so here is another example. Suppose the following is a Huffman tree:

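                 A
             /        \
           B            C
         /  \          /  \
        D    E        F    G
            / \      / \
           G   H    I   J
          / \
         K   L
            / \
           M   N

(The label G appears twice in this example: once as a leaf on layer 2 and once as an internal node on layer 3.)

After conversion to a deflate tree, it looks like this: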

                 A
             /        \
           B            C
         /  \          /  \
        D    G        F    E
                    / \    / \
                   I   J  H   G
                             / \
                            K   L
                               / \
                              M   N

So what is the benefit of converting to a deflate tree? It has to do with how the code table is recorded. The code table lists each element together with its code. First look at how an ordinary Huffman code table is recorded. Again take 5, 8, 9, 10, 14, 15 as the set of weights, with the following Huffman tree as the example:

                 61
             /        \
          34            27
         /  \          /  \
       15    19      14    13
            /  \          /  \
           9    10       5    8

Taking a step left as 0 and a step right as 1, the code table is:

    element   15   14    9   10    5    8
    code      00   10  010  011  110  111

To be able to decode, we must record this code table. Now look at how the table can be recorded once the tree has been converted to a deflate tree. After conversion, the tree above becomes:

                 61
             /        \
          34            27
         /  \          /  \
       15    14      19    13
                    /  \  /  \
                   9   10 5   8

Again take a step left as 0 and a step right as 1. The codes of the elements change after the conversion, and the code table becomes:

    element   15   14    9   10    5    8
    code      00   01  100  101  110  111

Although the codes have changed, that does not matter: as long as we record the code table, we can restore the data. As noted earlier, the deflate transformation changes the codes but leaves every code length unchanged. Because of this, it is enough to record only the code length of each element in order to decode the data.

The code table is now recorded as follows: go through the layers from top to bottom and, within each layer, record the code length of each leaf from left to right. First record the two leaves on layer 2 (the root is layer 0), then the four leaves on layer 3. The code table is:

    element   15   14    9   10    5    8
    length     2    2    3    3    3    3

Set aside for a moment how to decode with this table and compare the two ways of recording. The table of lengths takes fewer bits than the table of codes. The lengths 2 and 3 each fit in 2 bits, so the six lengths take 6 * 2 = 12 bits in total. The codes themselves take 2 + 2 + 3 + 3 + 3 + 3 = 16 bits (15 and 14 have code length 2; 9, 10, 5 and 8 have code length 3). This is no accident: the number of binary digits needed to write down a code length is never more than the code length itself. For example, element 10 has a code of length 3, and the number 3 takes only 2 binary digits.

That is the advantage of recording code lengths. Why care about such a small saving? The deflate tree exists to serve a compression algorithm, and the technique is not complicated, so why not?
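The bit counts in this comparison can be reproduced with a few lines. This is only a sketch of the arithmetic: it assumes every length is stored in a fixed-width field just wide enough for the largest length, which is how the 2 bits per entry arise here. The variable names are made up for the example.

```python
# Code table of the deflate tree: element -> code.
code_table = {15: "00", 14: "01", 9: "100", 10: "101", 5: "110", 8: "111"}

# Recording the codes themselves costs one bit per code digit.
bits_for_codes = sum(len(code) for code in code_table.values())

# Recording only the lengths, each in a fixed-width binary field (assumption).
longest = max(len(code) for code in code_table.values())
field_width = longest.bit_length()  # 3 is 11 in binary, so 2 bits
bits_for_lengths = field_width * len(code_table)

print(bits_for_codes, bits_for_lengths)  # 16 12
```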

Now back to the question of how to decode with this code table. Decoding is the inverse of encoding, so first look at how codes are assigned in a deflate tree.

The codes in a deflate tree are assigned as follows:

Code of the leftmost leaf on layer n = (code of the leftmost leaf on layer n-1 + number of leaves on layer n-1) << 1.

Code of each following leaf on layer n = code of the previous leaf + 1.
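The two rules above can be sketched in a few lines. The names `assign_codes`, `leaves_on_layer` and `first_code` are made up for this illustration, and the table is assumed to list (element, code length) pairs layer by layer, left to right. When a layer has no leaves, the sketch keeps carrying the running value forward, following the code-assignment steps described in the DEFLATE specification (RFC 1951); this is a generalisation beyond the two rules as stated.

```python
def assign_codes(table):
    """table: list of (element, code_length) pairs, ordered layer by layer,
    left to right. Returns {element: code as a bit string}."""
    max_len = max(length for _, length in table)
    leaves_on_layer = [0] * (max_len + 1)
    for _, length in table:
        leaves_on_layer[length] += 1

    # first_code[n] is the code of the leftmost leaf on layer n.
    first_code = [0] * (max_len + 1)
    code = 0
    for n in range(1, max_len + 1):
        code = (code + leaves_on_layer[n - 1]) << 1
        first_code[n] = code

    result = {}
    next_code = first_code[:]  # next unused code on each layer
    for element, length in table:
        result[element] = format(next_code[length], "0{}b".format(length))
        next_code[length] += 1  # each following leaf is the previous code + 1
    return result

table = [(15, 2), (14, 2), (9, 3), (10, 3), (5, 3), (8, 3)]
assigned = assign_codes(table)
print(assigned)
# {15: '00', 14: '01', 9: '100', 10: '101', 5: '110', 8: '111'}
```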

Take the same tree as an example again:

                 61
             /        \
          34            27
         /  \          /  \
       15    14      19    13
                    /  \  /  \
                   9   10 5   8

The code of 15 is 00, so:

    code of 9  = (code of 15, the leftmost leaf on the previous layer
                  + number of leaves on the previous layer, which is 2) << 1
               = (00 + 2) << 1
               = 100
    code of 10 = code of 9 + 1  = 101
    code of 5  = code of 10 + 1 = 110
    code of 8  = code of 5 + 1  = 111

Now the decoding process can be described. Here is the code table again:

    element   15   14    9   10    5    8
    length     2    2    3    3    3    3

Because this table lists the leaves layer by layer from top to bottom, and from left to right within each layer, the code length of a leaf is exactly its layer number (with the root as layer 0). The first leaves appear on layer 2, and the first leaf must be the leftmost one, so its code is all zeros. By the assignment rules, the code of 15 is 00, the code of 14 is 01, the code of 9 is (00 + 2) << 1 = 100, and so on.

That is the deflate tree and deflate encoding. In fact, gzip does not record the deflate tree's code table in exactly this way, but the idea behind encoding and decoding with a deflate tree is the same. The table above records each element together with its code length. To compress better, gzip does not record the elements at all; it records only the code lengths, so the code table is represented by a sequence of lengths. If you want to know how this is implemented, read the gzip source code. It is fascinating and full of clever ideas.
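The decoding idea can be sketched as follows: rebuild the codes from the table of lengths alone, then read the bit stream one bit at a time until the bits collected so far match a code. This is a minimal illustration and not how gzip does it. The function names are made up, the bit stream is assumed to be a string of '0' and '1' characters, and the table is assumed to be ordered layer by layer, left to right.

```python
def codes_from_lengths(table):
    """Rebuild {code string: element} from (element, code_length) pairs
    listed layer by layer, left to right."""
    decode_map, code, prev_len = {}, 0, 0
    for element, length in table:
        code <<= length - prev_len  # moving down a layer appends zero bits
        decode_map[format(code, "0{}b".format(length))] = element
        code += 1                   # next leaf on the same layer
        prev_len = length
    return decode_map

def decode(bits, table):
    """Decode a string of '0'/'1' characters into a list of elements."""
    decode_map = codes_from_lengths(table)
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in decode_map:  # codes are prefix-free, so the first match wins
            out.append(decode_map[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return out

table = [(15, 2), (14, 2), (9, 3), (10, 3), (5, 3), (8, 3)]
message = decode("0010011101", table)
print(message)  # [15, 9, 8, 14]
```

Only the lengths 2, 2, 3, 3, 3, 3 and the order of the elements are needed to recover the codes 00, 01, 100, 101, 110, 111.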
