Principle and Implementation of the LZ77 Algorithm

Source: Internet
Author: User
1. Introduction

LZ77 is a dictionary-based data compression algorithm. It was published by two Israeli researchers, Abraham Lempel and Jacob Ziv, in their 1977 paper "A Universal Algorithm for Sequential Data Compression".

Statistical compression codes such as Huffman coding require prior knowledge of the source, namely its character frequencies, before compression can start. In most cases this prior knowledge is hard to obtain in advance, so a more general-purpose compression scheme is needed. LZ77 was designed to meet this need: its core idea is to exploit the repeated structure of the data itself. A simple example:

取之以仁义，守之以仁义者，周也。取之以诈力，守之以诈力者，秦也。

(Roughly: "What was won by benevolence and righteousness, and kept by benevolence and righteousness, was Zhou. What was won by deceit and force, and kept by deceit and force, was Qin.")

Fragments such as 取之以 ("won by"), 仁义 ("benevolence and righteousness"), 者, 守之以 ("kept by") and 诈力 ("deceit and force") are repeated, so a later occurrence can be expressed simply by pointing to where the fragment appeared before. To indicate positions, we define a relative position that splits the text into two parts and number the characters before it, starting from 1:

1 取 | 2 之 | 3 以 | 4 仁 | 5 义 | 6 ， | 7 守 | 8 之 | 9 以 | 10 仁 | 11 义 | 12 者 | 13 ， | 14 周 | 15 也 | 16 。

The message string after the relative position is 取之以诈力，守之以诈力者，秦也。 Where a piece of it matches the text before the relative position, it is encoded as the start and end index of the matching text; where nothing matches, the original characters are emitted unchanged. The message string after the relative position can therefore be encoded as [(1-3), (诈力), (6), (7-9), (诈力), (12), (6), (秦), (15-16)]. Here (1-3) stands for 取之以, (6) for the comma, (7-9) for 守之以, (12) for 者, and (15-16) for 也。
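The index-based encoding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the LZ77 algorithm itself: the function names `encode_with_references` and `decode_with_references`, the `min_len` threshold, and the greedy longest-match strategy are all choices made for this sketch, and a greedy search may group matches differently from a hand-made encoding.

```python
def encode_with_references(reference: str, message: str, min_len: int = 2):
    """Encode `message` as a list of tokens.

    A token is either a (start, end) pair of 1-based, inclusive indices
    into `reference`, or a single literal character that had no
    sufficiently long match. The greedy strategy is an assumption.
    """
    tokens = []
    pos = 0
    while pos < len(message):
        best_start, best_len = -1, 0
        # Try every start position in the reference and keep the longest match.
        for start in range(len(reference)):
            length = 0
            while (pos + length < len(message)
                   and start + length < len(reference)
                   and reference[start + length] == message[pos + length]):
                length += 1
            if length > best_len:
                best_start, best_len = start, length
        if best_len >= min_len:
            tokens.append((best_start + 1, best_start + best_len))
            pos += best_len
        else:
            tokens.append(message[pos])  # no usable match: emit a literal
            pos += 1
    return tokens


def decode_with_references(reference: str, tokens) -> str:
    """Rebuild the message from the tokens produced by the encoder."""
    parts = []
    for token in tokens:
        if isinstance(token, tuple):
            start, end = token
            parts.append(reference[start - 1:end])
        else:
            parts.append(token)
    return "".join(parts)


reference = "abcdef"
message = "abcxdef"
tokens = encode_with_references(reference, message)
print(tokens)  # [(1, 3), 'x', (4, 6)]
assert decode_with_references(reference, tokens) == message
```

Decoding only needs the reference text and the token list, which is why pointing back at earlier text is enough to restore the message without loss.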

The example above shows how index values can stand in for words in order to compress data. The core idea of the LZ77 algorithm is the same, although its actual compression procedure is somewhat more involved than this example.

2. Principle

This article mainly discusses how the LZ77 algorithm performs compression and decompression. For the mathematical proof that LZ77 is uniquely decodable and lossless (that is, decompression restores the original information with nothing lost), refer to the original paper [1].

Sliding window

As for how to describe repeated structure, the LZ77 algorithm gives a more precise mathematical formulation. First, define the length of a string $S$ as $\ell(S)$; then $S(1,j)$, $1 \le j \le \ell(S)$, is a prefix of $S$. For a prefix $S(1,j)$ and a position $i \le j$, $L(i)$ is the largest value of $\ell$ (with $\ell \le \ell(S) - j$) that satisfies the following condition:
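The definition of $L(i)$ can be sketched in Python. The sketch assumes the condition is the one given in the original paper [1], namely $S(i, i+\ell-1) = S(j+1, j+\ell)$: the $\ell$ characters starting at position $i$ equal the $\ell$ characters that immediately follow the prefix $S(1,j)$. The function names `match_length` and `best_match` are hypothetical, and positions are 1-based to agree with the notation above.

```python
def match_length(s: str, i: int, j: int) -> int:
    """Return L(i) for the prefix S(1, j), with 1-based i <= j.

    Assumed condition (from the original paper): S(i, i+l-1) == S(j+1, j+l),
    with l bounded by len(s) - j.
    """
    limit = len(s) - j
    length = 0
    # 1-based S(i + length) and S(j + 1 + length) are s[i - 1 + length]
    # and s[j + length] in 0-based Python indexing.
    while length < limit and s[i - 1 + length] == s[j + length]:
        length += 1
    return length


def best_match(s: str, j: int):
    """Return (p, L(p)), where p is the first position maximizing L(i) over 1 <= i <= j."""
    best_p, best_len = 0, 0
    for i in range(1, j + 1):
        length = match_length(s, i, j)
        if length > best_len:
            best_p, best_len = i, length
    return best_p, best_len


s = "00101011"
print([match_length(s, i, 3) for i in (1, 2, 3)])  # [1, 4, 0]
print(best_match(s, 3))  # (2, 4)
```

Note that the bound $\ell \le \ell(S) - j$ only limits how far the match runs toward the end of $S$; the matched region starting at $i$ may extend past position $j$, as it does for $i = 2$ in this example.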
