Algorithm Series (8): The RLE Run-Length Compression Algorithm

Last Update:2017-02-28 Source: Internet

Author: User

The RLE (Run-Length Encoding) compression algorithm is one of the earliest and simplest lossless data compression algorithms. Its basic idea is to scan the data as a linear sequence and split it into two cases: contiguous blocks of repeated data, and contiguous blocks of non-repeated data. In the first case, a run of repeated blocks is compressed by replacing it with a block-count attribute followed by a single copy of the block. For the second case, RLE has two processing methods. One treats non-repeated data the same way as repeated data, except that the block-count attribute is always 1. The other does no processing at all and copies the raw data directly into the compressed output.

The following sample data gives a more intuitive picture of the RLE algorithm. First, suppose the original data consists of 5 consecutive identical blocks:

[Block] [Block] [Block] [Block] [Block]

Then the compressed data is:

[5] [Block]

Next, the original data is made up of contiguous blocks of data that are not repeated:

[Block1] [Block2] [Block3] [Block4] [Block5]

According to the first processing method, the final compressed data is as follows:

[1] [Block1] [1] [Block2] [1] [Block3] [1] [Block4] [1] [Block5]

If you follow the second processing method, the final data is the same as the original data:

[Block1] [Block2] [Block3] [Block4] [Block5]
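
The two sample cases above can be reproduced with a short sketch. The helper below, rle_describe_runs, is a hypothetical name that does not come from the original article. It assumes one byte per block and printable bytes, follows the first processing method, and writes each run in the [count] [block] notation used above. Unlike a real encoder, it does not cap the count at 255, because it only illustrates the notation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the article): writes the runs of `data`
 * into `out` in the article's "[count] [block]" notation, one byte per block.
 * Returns the number of runs, or -1 if `out` is too small. */
int rle_describe_runs(const unsigned char *data, size_t n,
                      char *out, size_t outsize)
{
    size_t pos = 0;   /* read position in data */
    size_t used = 0;  /* characters already written to out */
    int runs = 0;

    assert(data != NULL && out != NULL);
    if (outsize == 0)
        return -1;
    out[0] = '\0';

    while (pos < n) {
        unsigned char value = data[pos];
        size_t count = 1;

        /* Extend the run while the next byte repeats the current one. */
        while (pos + count < n && data[pos + count] == value)
            count++;

        /* Separate runs with a single space, as in the article's examples. */
        int written = snprintf(out + used, outsize - used, "%s[%zu] [%c]",
                               runs > 0 ? " " : "", count, value);
        if (written < 0 || (size_t)written >= outsize - used)
            return -1;  /* output was truncated */

        used += (size_t)written;
        pos += count;
        runs++;
    }
    return runs;
}
```

Under these assumptions, the 5-byte input "AAAAA" produces the text [5] [A], and "ABCDE" produces [1] [A] [1] [B] [1] [C] [1] [D] [1] [E], matching the two examples.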

Blocks can be of arbitrary length. However, the longer the block, the lower the probability that it repeats consecutively, and the smaller the benefit of compression. For this reason, most RLE implementations use one byte as the data block.

Next, this article introduces several RLE implementations. The first is the simplest one: it handles non-repeated bytes the same way as repeated bytes, by placing a block-count attribute (one byte) with the value 1 in front of each such byte. The advantage of this approach is that the compression and decompression algorithms are simple, because the same pattern handles both cases of data. The disadvantage is that when the original data has a low repetition rate, the compressed data is longer than the original and compression achieves nothing. In the worst case, no two adjacent bytes are equal and the compressed data is twice the size of the original.

Compression works as follows. Scan the original data linearly. If the byte after the current byte is a duplicate, increase the repeat count and keep scanning until a different byte is found. Then write the block count and the byte value to the compressed data, and continue scanning from the new start byte until the original data ends. One point to note is that the block count is stored in one byte, so its maximum value is 255. When a run is longer than 255 bytes, it is cut off after the 255th byte, and the 256th byte onward is treated as a new run. The C implementation of this RLE compression algorithm is as follows:

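    /* Returns the number of compressed bytes written to outbuf,
       or -1 if outbuf is too small. */
    int rle_encode_n(unsigned char *inbuf, int insize, unsigned char *outbuf, int onubufsize)
    {
        unsigned char *src = inbuf;
        unsigned char *end = inbuf + insize;
        int i;
        int encsize = 0;

        while (src < end)
        {
            if ((encsize + 2) > onubufsize) /* The output buffer space is not enough. */
            {
                return -1;
            }
            unsigned char value = *src++;
            i = 1;
            /* Check the bound first so the last run never reads past the input. */
            while ((src < end) && (*src == value) && (i < 255))
            {
                src++;
                i++;
            }
            outbuf[encsize++] = (unsigned char)i; /* block count, 1 to 255 */
            outbuf[encsize++] = value;
        }

        return encsize;
    }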
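
Since the text describes decompression as simple but does not show it, here is a minimal sketch of a matching decoder. The name rle_decode_n, its parameters, and its error handling are assumptions made for illustration and are not part of the original article. It expects the [count] [value] byte pairs that the encoder above produces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counterpart to rle_encode_n (not from the article):
 * expands [count][value] byte pairs back into the original data.
 * Returns the decoded size, or -1 if the input is malformed or
 * outbuf is too small. */
int rle_decode_n(const unsigned char *inbuf, int insize,
                 unsigned char *outbuf, int outbufsize)
{
    int decsize = 0;
    int pos = 0;

    assert(inbuf != NULL && outbuf != NULL);

    /* Every record is exactly two bytes, so an odd size is malformed. */
    if (insize < 0 || (insize % 2) != 0)
        return -1;

    while (pos < insize) {
        int count = inbuf[pos++];
        unsigned char value = inbuf[pos++];

        if (count == 0)                     /* the encoder never emits an empty run */
            return -1;
        if (count > outbufsize - decsize)   /* not enough room for this run */
            return -1;

        /* Write `count` copies of the value. */
        for (int i = 0; i < count; i++)
            outbuf[decsize++] = value;
    }
    return decsize;
}
```

For example, the six compressed bytes {255, 'A', 45, 'A', 1, 'B'} decode to 300 copies of 'A' followed by one 'B', for 301 bytes in total. This also shows how a run longer than 255 bytes appears as two consecutive records with the same value.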
