Put simply, RLE (run-length encoding) compresses data by replacing a run of consecutive identical values with a compact count-and-value form.
The sections below apply it to byte streams.
For example, given the input data
BYTE pbyte[] = {1, 1, 1, 1, 1, 1};
the compressed data is 6, 1 (count, value),
which saves 4 bytes.
However, the pair cannot simply be substituted into the data stream. A special control character must mark it; otherwise the data cannot be decompressed.
For example, pbyte = {6, 1, 0, 1, 1, 1, 1, 1, 1};
Here the output would contain 6, 1 twice, and the decoder cannot tell whether a given 6, 1 is the original literal pair {6, 1} or the encoding of {1, 1, 1, 1, 1, 1}.
Therefore a control character is needed.
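The ambiguity can be reproduced with a minimal sketch in C. The function name naive_rle and the rule "runs of 3 or more become (count, value)" are assumptions made only for this illustration; they are not part of the original text.

```c
#include <stddef.h>

/* Naive substitution with no control byte: a run of 3 or more identical
   bytes becomes (count, value); shorter runs are copied unchanged.
   dst must have room for len bytes. Returns the number of bytes written. */
size_t naive_rle(const unsigned char *src, size_t len, unsigned char *dst)
{
    size_t i = 0, out = 0;
    while (i < len) {
        /* Measure the run starting at i (capped so it fits in one byte). */
        size_t run = 1;
        while (i + run < len && src[i + run] == src[i] && run < 255)
            run++;
        if (run >= 3) {
            dst[out++] = (unsigned char)run;
            dst[out++] = src[i];
        } else {
            for (size_t k = 0; k < run; k++)
                dst[out++] = src[i];
        }
        i += run;
    }
    return out;
}
```

With this sketch, the inputs {6, 1, 0, 1, 1, 1, 1, 1, 1} and {6, 1, 0, 6, 1} both produce 6, 1, 0, 6, 1, so no decoder could recover the original from the output alone.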
(1)
To achieve the maximum compression ratio, first scan the source data stream and use the byte value that appears least often as the control character.
For example, pbyte = {6, 1, 1, ...};
After scanning, 0 is the least frequent byte.
We use 0 as the control character, and every other byte represents itself. A 0 in the source data is written as 0, 0.
After compression, pbyte becomes
6, 1, 1 ......
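The scanning step can be sketched as a simple frequency count. The function name pick_ctrl is hypothetical, and breaking ties in favor of the lowest byte value is an arbitrary choice made for this sketch.

```c
#include <stddef.h>

/* Returns the byte value that occurs least often in src.
   Ties go to the lowest value (an arbitrary choice for this sketch). */
unsigned char pick_ctrl(const unsigned char *src, size_t len)
{
    size_t count[256] = {0};
    for (size_t i = 0; i < len; i++)
        count[src[i]]++;

    int best = 0;
    for (int v = 1; v < 256; v++)
        if (count[v] < count[best])
            best = v;
    return (unsigned char)best;
}
```

For a stream that contains no 0 at all, this returns 0, which matches the choice made in the example above.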
Decompression uses three bytes a, b, c:
a = the next byte, scanning the compressed data sequentially. If a is not the control character, it is written directly to the decompressed stream.
If a is the control character, b = the next byte. If b is also the control character, a single control character is written to the output stream.
Otherwise c = the next byte of the compressed stream, and b copies of c are written to the output stream.
Note: for counts greater than ctrlcode, an offset must be applied in the encoding.
For example, with CTRL = 2, n = 3 is corrected to 2.
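The decompression steps for this format might look like the following in C. This is a sketch under stated assumptions: the name esc_decode and the cap parameter are hypothetical, and the encoder is assumed never to emit a count equal to the control byte (as with ctrl = 0 and counts of at least 1), so the offset correction from the note is deliberately not applied here.

```c
#include <stddef.h>

/* Decodes the escape-byte format:
     x            (x != ctrl)  -> x
     ctrl, ctrl                -> one literal ctrl
     ctrl, n, c   (n != ctrl)  -> n copies of c
   Assumes the encoder never emits a count equal to ctrl, so no offset
   correction is applied. Returns the number of bytes written, or
   (size_t)-1 if the stream is truncated or dst (capacity cap) is too small. */
size_t esc_decode(const unsigned char *src, size_t len, unsigned char ctrl,
                  unsigned char *dst, size_t cap)
{
    size_t i = 0, out = 0;
    while (i < len) {
        unsigned char a = src[i++];
        if (a != ctrl) {                 /* ordinary byte: copy through */
            if (out >= cap) return (size_t)-1;
            dst[out++] = a;
            continue;
        }
        if (i >= len) return (size_t)-1;
        unsigned char b = src[i++];
        if (b == ctrl) {                 /* escaped control byte */
            if (out >= cap) return (size_t)-1;
            dst[out++] = ctrl;
            continue;
        }
        if (i >= len) return (size_t)-1;
        unsigned char c = src[i++];      /* run: b copies of c */
        if (cap - out < b) return (size_t)-1;
        for (unsigned k = 0; k < b; k++)
            dst[out++] = c;
    }
    return out;
}
```

With ctrl = 0, the stream 6, 1, 0, 0, 0, 6, 1 decodes to 6, 1, 0 followed by six 1s. Every input byte is compared against ctrl, which is the per-character check this method pays for.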
The method above gives the best compression ratio, but it is not fast, because every input character has to be checked.
(2)
Other encodings can be used to speed up decompression.
The main idea is to avoid checking every input character. With only a few checks, the compression ratio stays almost the same.
Let's look at the improved method.
On closer inspection, non-repeated characters can also be expressed as a count N followed by the data, where N means that N uncompressed bytes follow.
Take the same data as before.
pbyte = {6, 1, 0, 1, 1, 1, 1, 1, 1}
Select 0 as the control character, without scanning.
It compresses to 3, {6, 1, 0}, 0, 6, 1
where the fields are N, {N literal bytes}, ctrl, n, m (n copies of m).
Decompression is very simple.
Scan the data, reading one character at a time:
while (more data)
{
    N = read();
    if (N)
    {
        copy N characters to the output
    }
    else
    {
        n = read();
        m = read();
        write(n copies of m);
    }
}
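The pseudocode above could be written in C roughly as follows. The name blk_decode, the cap parameter, and the error return value are assumptions added for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Decodes the count-prefixed format:
     N, d1..dN   (N != 0) -> N literal bytes
     0, n, m              -> n copies of m
   Returns the number of bytes written, or (size_t)-1 if the stream is
   truncated or dst (capacity cap) is too small. */
size_t blk_decode(const unsigned char *src, size_t len,
                  unsigned char *dst, size_t cap)
{
    size_t i = 0, out = 0;
    while (i < len) {
        unsigned char N = src[i++];
        if (N != 0) {
            /* Literal block: copy N bytes without inspecting them. */
            if (len - i < N || cap - out < N) return (size_t)-1;
            memcpy(dst + out, src + i, N);
            i += N;
            out += N;
        } else {
            /* Run: control byte 0, then count n and value m. */
            if (len - i < 2) return (size_t)-1;
            unsigned char n = src[i++];
            unsigned char m = src[i++];
            if (cap - out < n) return (size_t)-1;
            memset(dst + out, m, n);
            out += n;
        }
    }
    return out;
}
```

Only the first byte of each block is tested, so literal bytes are copied in bulk rather than checked one at a time.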
(3) Optimization
For (1):
A run of three or fewer identical bytes gains nothing from the three-byte encoding (control, count, value).
Therefore runs with n <= 3 do not need to be compressed.
They are stored unchanged, for example 1, 1.
In addition, multiple consecutive control characters can also be compressed.
Consider CTRL = 0 and the source data
0, 0, 0, 0
Escaping each byte gives eight 0s.
The run encoding is 0, 4, 0, so control characters are worth compressing as soon as two or more occur consecutively.
For (2):
Only the compression encoding is optimized.
Example:
1, 2, 3, 4, 1, 1
If the short run is encoded as a run:
4, 1, 2, 3, 4, 0, 2, 1
This adds two bytes.
If it is folded into the literal block instead:
6, 1, 2, 3, 4, 1, 1
only one byte is added.
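The trade-off can be checked with a small encoder for the count-prefixed format. This is a sketch, not a definitive implementation: the name blk_encode and the min_run parameter (the shortest run that is still encoded as 0, n, m) are assumptions introduced here to compare the two choices.

```c
#include <stddef.h>

/* Encodes into: literal block N, d1..dN (1 <= N <= 255); run 0, n, m.
   Runs shorter than min_run are folded into the current literal block.
   dst needs room for up to 3 * len bytes in the worst case.
   Returns the number of bytes written. */
size_t blk_encode(const unsigned char *src, size_t len, size_t min_run,
                  unsigned char *dst)
{
    size_t i = 0, out = 0;
    size_t lit = 0;   /* index in dst of the open literal block's count byte */
    int open = 0;     /* nonzero while a literal block is open */

    while (i < len) {
        /* Measure the run starting at i (capped so it fits in one byte). */
        size_t run = 1;
        while (i + run < len && src[i + run] == src[i] && run < 255)
            run++;

        if (run >= min_run) {
            dst[out++] = 0;                     /* control byte */
            dst[out++] = (unsigned char)run;    /* n */
            dst[out++] = src[i];                /* m */
            open = 0;                           /* a run closes the literal block */
        } else {
            for (size_t k = 0; k < run; k++) {
                if (!open || dst[lit] == 255) { /* start a new literal block */
                    lit = out++;
                    dst[lit] = 0;
                    open = 1;
                }
                dst[lit]++;
                dst[out++] = src[i];
            }
        }
        i += run;
    }
    return out;
}
```

For the input 1, 2, 3, 4, 1, 1, calling it with min_run = 2 yields the 8-byte form 4, 1, 2, 3, 4, 0, 2, 1, while min_run = 3 yields the 7-byte form 6, 1, 2, 3, 4, 1, 1.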