Analysis of Common Lossless Data Compression Algorithms

1 Introduction

Nowadays, information systems of all kinds handle ever larger volumes of data, and transferring and storing that data quickly and efficiently has become a central problem of data processing; data compression technology is an important way to address it. In fact, data compression has long been applied in many fields, from compression software such as WinRAR to the familiar MP3 format.

2 Data Compression Technology Overview

In essence, data can be compressed because the data itself contains redundancy. Data compression uses various algorithms to reduce this redundancy to a minimum while keeping distortion as small as possible, thereby improving transmission efficiency and saving storage space.
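As a quick illustration of why redundancy matters, here is a minimal sketch (not part of the original article) using Python's standard zlib module: highly repetitive input compresses to a tiny fraction of its size, while random bytes barely compress at all, because they contain almost no redundancy to remove.

```python
import os
import zlib

redundant = b"ABCD" * 2500        # 10,000 bytes with obvious repetition
random_like = os.urandom(10000)   # 10,000 bytes with almost no redundancy

for label, data in (("redundant", redundant), ("random", random_like)):
    packed = zlib.compress(data, level=9)
    assert zlib.decompress(packed) == data   # lossless round trip
    print(f"{label}: {len(data)} -> {len(packed)} bytes")
```

The repetitive input shrinks to a few dozen bytes, while the random input stays essentially the same size.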

Data compression technology is generally divided into lossless and lossy compression. With lossless compression, the compressed data can be reconstructed (restored, decompressed) so that the reconstructed data are exactly identical to the original. This method is used wherever the reconstructed signal must match the original exactly: text data, programs, and special applications such as fingerprint images and medical images. Lossless algorithms achieve only modest compression, with the compressed data typically occupying 1/2 to 1/5 of the original size. Typical lossless compression algorithms include Shannon-Fano coding, Huffman coding, arithmetic coding, run-length encoding, and LZW coding.

With lossy compression, the data reconstructed from the compressed form differ from the original, but not in a way that affects what the original data express, and the achievable compression ratios are much higher. Lossy compression is widely used for speech, image, and video data. Commonly used lossy compression techniques include PCM (pulse code modulation), predictive coding, transform coding (discrete cosine transform, wavelet transform, etc.), and interpolation and extrapolation (spatial subsampling, temporal subsampling, adaptive methods). Most of the new generation of data compression algorithms are lossy, such as vector quantization, subband coding, model-based compression, fractal compression, and wavelet compression.
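The article gives no code, but since it lists Huffman coding among the typical lossless algorithms, the following is a minimal illustrative sketch of the idea in Python (all names here are my own). It builds a prefix code by repeatedly merging the two least frequent subtrees, so frequent symbols end up with shorter codes:

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a {symbol: bitstring} Huffman code table for `text`."""
    freq = Counter(text)
    # Heap entries: (frequency, tie-breaker, {symbol: code-so-far}).
    # The tie-breaker keeps tuple comparison away from the dicts.
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                      # degenerate one-symbol input
        return {sym: "0" for sym in heap[0][2]}
    tie = len(heap)
    while len(heap) > 1:
        f1, _, low = heapq.heappop(heap)    # two least frequent subtrees
        f2, _, high = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

text = "this is an example of huffman coding"
codes = huffman_codes(text)
bits = "".join(codes[ch] for ch in text)
print(f"{len(text) * 8} bits raw -> {len(bits)} bits Huffman-coded")
```

Note that a real codec would also have to store the code table (or the symbol frequencies) alongside the bit stream so the decoder can rebuild it.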

3 Common Lossless Data Compression Algorithms

3.1 Run-Length Encoding

The idea behind this kind of data compression is: if a data item d occurs n times in succession in the input stream, the n consecutive data items are replaced by the single pair nd. Such a sequence of n consecutive occurrences of a data item is called a run of length n, and this compression method is called run-length encoding (RLE); its implementation process is shown in Figure 1. The RLE algorithm has the advantages of simple implementation and fast compression and decompression, and it can compress the data in a single scan of the original input. Its shortcomings are inflexibility and poor adaptability: the compression ratio fluctuates widely across file formats, and the average compression ratio is low. Practice shows that RLE can effectively compress raster images of low complexity.
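Since Figure 1 is not reproduced here, the following is a minimal runnable sketch of the nd-pair scheme described above, with an encoding of runs that is my own illustrative choice:

```python
def rle_encode(data):
    """Collapse each run of identical items into a (count, item) pair."""
    runs = []
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        runs.append((j - i, data[i]))       # run of length j - i
        i = j
    return runs

def rle_decode(runs):
    """Expand (count, item) pairs back into the original string."""
    return "".join(item * count for count, item in runs)

sample = "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB"
runs = rle_encode(sample)
print(runs)                                  # [(12, 'W'), (1, 'B'), ...]
assert rle_decode(runs) == sample            # exact (lossless) round trip
```

Note that in this naive form every run becomes a pair, so isolated characters double in size; practical RLE formats add an escape byte or run marker to avoid this, which is one source of the inflexibility mentioned above.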
