Floating-point data lossy compression algorithm with full C code

Last Update:2018-05-03 Source: Internet

Author: User


A few years ago, while working on the algorithms for a photo-editing app,

I considered compressing the 3D LUT preset data,

mainly to improve the user experience.

There are plenty of open-source resources on the 3D LUT algorithm, so I will not explain the basics here.

If you are interested, have a look at the related implementation in the FFmpeg project.

My first contact with the 3D LUT algorithm was in 2014, when I reverse-engineered the VSCO Cam film filter algorithm.

Of course, at first I did not know that the algorithm was a 3D LUT.

I simply rewrote it version after version, optimizing as I went,

until one day I noticed that one constant looked particularly strange.

Some time later, while reading material on the 3D LUT algorithm, I found it strangely familiar,

and then it became clear what I had been looking at all along.

While building the app, I considered compressing the preset resources.

Because the project was rushed, I used an LZ compression algorithm, and naturally the compression ratio was not high.

The result was large preset files that took up a fair amount of space.

I had planned to write a compression algorithm specifically for floating-point data, but it kept being put off and nothing came of it.

Many people are curious about how film filter algorithms are implemented.

Many versions have circulated online, and as a senior security researcher I would like to describe the situation.

Early apps used a 2D LUT to simulate the effects of VSCO Cam.

The idea is relatively simple: interpolate in a 2D color mapping table, generally a 512*512*3 color table.

GPUImage contains a concrete implementation; if you are interested, take a look. I will not go into it here.
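As a rough illustration of the 2D color table idea, here is a minimal C sketch. It assumes the 512*512 table is an 8*8 grid of 64*64 tiles, where blue selects the tile and red and green index into it; that layout and the names lut2d_apply and lut2d_identity are assumptions made for this example, not taken from any specific project. To stay short it uses nearest-neighbor lookup rather than interpolation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: an 8x8 grid of 64x64 tiles. The blue level picks the tile,
 * red is the x offset inside the tile, green is the y offset. */
enum { LUT_SIZE = 512, TILE = 64, TILES_PER_ROW = 8 };

/* Map one 8-bit RGB color through the table (nearest neighbor, no blending).
 * lut holds LUT_SIZE * LUT_SIZE * 3 bytes in row-major RGB order. */
static void lut2d_apply(const uint8_t *lut, const uint8_t in[3], uint8_t out[3]) {
    assert(lut != NULL);
    int r = in[0] >> 2; /* 256 levels -> 64 levels */
    int g = in[1] >> 2;
    int b = in[2] >> 2;
    int x = (b % TILES_PER_ROW) * TILE + r;
    int y = (b / TILES_PER_ROW) * TILE + g;
    const uint8_t *p = lut + ((size_t)y * LUT_SIZE + (size_t)x) * 3;
    out[0] = p[0];
    out[1] = p[1];
    out[2] = p[2];
}

/* Fill lut with the identity mapping in the same layout. */
static void lut2d_identity(uint8_t *lut) {
    assert(lut != NULL);
    for (int y = 0; y < LUT_SIZE; ++y) {
        for (int x = 0; x < LUT_SIZE; ++x) {
            int b = (y / TILE) * TILES_PER_ROW + (x / TILE);
            uint8_t *p = lut + ((size_t)y * LUT_SIZE + (size_t)x) * 3;
            p[0] = (uint8_t)((x % TILE) * 255 / 63);
            p[1] = (uint8_t)((y % TILE) * 255 / 63);
            p[2] = (uint8_t)(b * 255 / 63);
        }
    }
}
```

A filter preset is then just a different table: fill it once, and every pixel costs one lookup. Blending between neighboring entries would smooth out the 64-level quantization.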

I still have the original VSCO Cam film algorithm on hand.

In recent years deep learning has taken off, and while implementing forward propagation on mobile phones

I ran into a similar problem once again:

model quantization, model compression, and so on.

The idea of model quantization is actually quite simple: for example, quantize 32-bit values to 16-bit

or to 8-bit, trading precision for better performance and smaller resources.

A single quantization step can both improve performance and shrink the model,

so it is certainly a good approach.
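To make the 32-bit to 8-bit idea concrete, here is a minimal C sketch of linear (min/max) quantization. The QuantParams struct and the function names are made up for this example, and real frameworks use more elaborate schemes; treat it as an illustration of the principle only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Parameters needed to map 8-bit codes back to approximate float values. */
typedef struct {
    float min;   /* value represented by code 0 */
    float scale; /* float distance between two adjacent codes */
} QuantParams;

/* Quantize n floats to 8-bit codes spread linearly over [min, max]. */
static QuantParams quantize_u8(const float *src, uint8_t *dst, size_t n) {
    assert(src != NULL && dst != NULL && n > 0);
    float lo = src[0], hi = src[0];
    for (size_t i = 1; i < n; ++i) {
        if (src[i] < lo) lo = src[i];
        if (src[i] > hi) hi = src[i];
    }
    QuantParams q;
    q.min = lo;
    q.scale = (hi > lo) ? (hi - lo) / 255.0f : 1.0f;
    for (size_t i = 0; i < n; ++i) {
        float v = (src[i] - lo) / q.scale + 0.5f; /* round to nearest code */
        if (v > 255.0f) v = 255.0f;
        dst[i] = (uint8_t)v;
    }
    return q;
}

/* Recover an approximate float from an 8-bit code. */
static float dequantize_u8(uint8_t code, QuantParams q) {
    return q.min + q.scale * (float)code;
}
```

The stored data shrinks to a quarter of its size, and each value comes back with an error of at most half a step (scale / 2).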

On iOS you can also consider memory mapping:

the file is mapped into the address space, which reduces memory consumption.

Of course, this approach depends on support from the operating system and on a file format that can be used directly once mapped.
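As a sketch of the memory-mapping approach, the following C code maps a file of raw 32-bit floats read-only using the POSIX mmap call. The function names and the assumption that the file is nothing but a flat float array are made up for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a file of raw floats read-only. Returns NULL on failure and stores
 * the number of floats in *count on success. */
static const float *map_weights(const char *path, size_t *count) {
    assert(path != NULL && count != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) return NULL;
    *count = (size_t)st.st_size / sizeof(float);
    return (const float *)p;
}

/* Release a mapping created by map_weights. */
static void unmap_weights(const float *w, size_t count) {
    if (w != NULL) munmap((void *)w, count * sizeof(float));
}
```

Pages are loaded on demand and can be dropped by the system under memory pressure, which is why this only works if the file can be used as-is, without a decompression pass.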

Without a doubt, I had run into the problem of compressing floating-point data again.

Most deep learning models today store their weights as 32-bit floating-point values.

Strangely enough, though, there seems to be no dedicated compression for floating-point data.

Can floating-point data really not be compressed?!

Not so. Back when I worked on image algorithms, I was particularly interested in frequency-domain algorithms,

because looking at the data from a transformed point of view is truly ingenious.

I have read a lot of material online and feel that few people explain the frequency domain in plain language.

So let me have a go at it myself.

The core of the frequency domain is frequency, that is, data following a certain regular pattern.

It is a bit like counting: eight 8s can be written out as 88888888, recorded as 88 (a count followed by the value), or, once the count is agreed on, recorded simply as 8.

That is frequency. So what is the frequency domain?

The frequency domain is a way of describing data by how much of each specific frequency it contains.

In other words, it is counting according to an agreed standard of expression.

The Fourier transform and the cosine transform are, in this sense, just means of expression.

It is a bit like agreeing on a signal with your girlfriend, such as a wink, whose meaning only the two of you understand.

That is enough explanation for now; anything further would not be suitable for children.
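The "eight 8s can be recorded as a single 8" picture can be checked with a small C sketch: an 8-point cosine transform (orthonormal DCT-II) of a constant signal leaves one non-zero coefficient. The function names are made up for this example, and the cosines are tabulated only so that the sketch needs no math library.

```c
#include <assert.h>
#include <stddef.h>

/* cos(j * pi / 16) for j = 0..8; other angles follow by symmetry. */
static const double COS_TAB[9] = {
    1.0, 0.9807852804032304, 0.9238795325112867, 0.8314696123025452,
    0.7071067811865476, 0.5555702330196022, 0.3826834323650898,
    0.1950903220161283, 0.0
};

/* cos(j * pi / 16) for any non-negative integer j. */
static double cos_pi16(int j) {
    j %= 32;                /* cosine has period 2*pi */
    if (j > 16) j = 32 - j; /* cos(2*pi - x) = cos(x) */
    return (j > 8) ? -COS_TAB[16 - j] : COS_TAB[j]; /* cos(pi - x) = -cos(x) */
}

/* Orthonormal 8-point DCT-II: out[k] says how much of "frequency k" is in in[]. */
static void dct8(const double in[8], double out[8]) {
    assert(in != NULL && out != NULL && in != out);
    for (int k = 0; k < 8; ++k) {
        double sum = 0.0;
        for (int n = 0; n < 8; ++n)
            sum += in[n] * cos_pi16((2 * n + 1) * k);
        /* 1/sqrt(8) for k == 0, sqrt(2/8) = 0.5 otherwise */
        out[k] = sum * (k == 0 ? 0.3535533905932738 : 0.5);
    }
}
```

Feeding in eight 8s gives a first coefficient of 64 / sqrt(8) and seven coefficients that are zero up to rounding, so a single number carries the whole signal.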

The most classic compression algorithm is JPEG, a format that has become a household name.

Although newer formats such as WebP and FLIF have appeared,

JPEG, like MP3, became the default label of an era.

JPEG is a compression algorithm based on the 8x8 DCT.

I will not go into the details here; if you are interested, look at how a JPEG codec works.

Example: https://github.com/cpuimage/TinyJPEG

That was a long lead-in. So, is it possible to do lossy compression of floating-point data based on the 8x8 DCT?

The answer is yes. It really is that simple and crude.

Data length: 8*8*8

The data is filled with the values 0 to 511 in order.

Here are the results for reference:

zlib compression:
miniz.c version: 10.0.2
Compressed from 2048 to 730 bytes
Decompressed from 730 to 2048 bytes

DCT + zlib compression:
miniz.c version: 10.0.2
Compressed from 2048 to bytes
decompressed from 2048 bytes

If the data follows a pattern that the DCT represents compactly, the compression ratio of DCT + zlib is very high.

If the data is random with no regularity, in most cases plain zlib achieves a somewhat higher compression ratio.
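Why regular data suits the DCT can be seen in a small C sketch. It transforms one 8x8 block, rounds every coefficient to a multiple of a step size, and transforms back. The rounding step is an assumption added here to make the lossy part explicit; the sample code in this article only calls DCT and IDCT from dct.h. The function names and the naive (non-fast) transform are likewise made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* cos(j * pi / 16) for j = 0..8; other angles follow by symmetry. */
static const double COS_TAB[9] = {
    1.0, 0.9807852804032304, 0.9238795325112867, 0.8314696123025452,
    0.7071067811865476, 0.5555702330196022, 0.3826834323650898,
    0.1950903220161283, 0.0
};

/* Entry (k, n) of the orthonormal 8-point DCT-II matrix. */
static double basis(int k, int n) {
    int j = ((2 * n + 1) * k) % 32;
    if (j > 16) j = 32 - j;
    double c = (j > 8) ? -COS_TAB[16 - j] : COS_TAB[j];
    return c * (k == 0 ? 0.3535533905932738 : 0.5);
}

/* Forward (inverse == 0) or inverse (inverse != 0) 2D DCT of one 8x8 block,
 * in place, done as a pass over rows followed by a pass over columns. */
static void dct8x8(float *blk, int inverse) {
    double tmp[64];
    assert(blk != NULL);
    for (int r = 0; r < 8; ++r)
        for (int k = 0; k < 8; ++k) {
            double s = 0.0;
            for (int n = 0; n < 8; ++n)
                s += blk[r * 8 + n] * (inverse ? basis(n, k) : basis(k, n));
            tmp[r * 8 + k] = s;
        }
    for (int c = 0; c < 8; ++c)
        for (int k = 0; k < 8; ++k) {
            double s = 0.0;
            for (int n = 0; n < 8; ++n)
                s += tmp[n * 8 + c] * (inverse ? basis(n, k) : basis(k, n));
            blk[k * 8 + c] = (float)s;
        }
}

/* Round each coefficient to the nearest multiple of step (the lossy part).
 * Returns how many coefficients became zero. */
static int quantize(float *coef, size_t n, float step) {
    int zeros = 0;
    assert(coef != NULL && step > 0.0f);
    for (size_t i = 0; i < n; ++i) {
        float q = coef[i] / step;
        long r = (long)(q < 0.0f ? q - 0.5f : q + 0.5f);
        coef[i] = (float)r * step;
        if (r == 0) ++zeros;
    }
    return zeros;
}
```

For a ramp block such as 0 to 63, most coefficients round to zero, and long runs of identical zero bytes are exactly what zlib compresses well. A larger step zeroes more coefficients at the cost of a larger reconstruction error; for random data few coefficients vanish, so the transform buys nothing.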

Another technique in JPEG coding is the choice of color space:

RGB is converted to YCbCr to obtain a higher compression ratio.

Of course, some information is lost along the way. It seems I have said a bit too much again.

I will stop here.
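For reference, the RGB to YCbCr step mentioned above can be sketched in C as follows. The coefficients are the full-range ones used by JFIF-style JPEG files; the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round to the nearest integer and clamp to the 8-bit range. */
static uint8_t clamp_u8(float v) {
    if (v < 0.0f) return 0;
    if (v > 255.0f) return 255;
    return (uint8_t)(v + 0.5f);
}

/* Convert one full-range 8-bit RGB pixel to YCbCr.
 * Y carries brightness, Cb and Cr carry color differences centered on 128. */
static void rgb_to_ycbcr(const uint8_t rgb[3], uint8_t ycc[3]) {
    assert(rgb != NULL && ycc != NULL);
    float r = rgb[0], g = rgb[1], b = rgb[2];
    ycc[0] = clamp_u8( 0.299f    * r + 0.587f    * g + 0.114f    * b);
    ycc[1] = clamp_u8(-0.168736f * r - 0.331264f * g + 0.5f      * b + 128.0f);
    ycc[2] = clamp_u8( 0.5f      * r - 0.418688f * g - 0.081312f * b + 128.0f);
}
```

Rounding and clamping to 8 bits is one place where information is lost: pure red (255, 0, 0), for instance, gives a Cr value of 255.5 before clamping.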

The complete sample code is attached:

    #ifdef __cplusplus
    extern "C" {
    #endif

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <string.h> /* memcmp, memcpy */
    #include "miniz.h"
    #include "dct.h"

    /* Compress a buffer with miniz, decompress it, and verify the round trip. */
    int test_miniz(const unsigned char *s_pStr, uLong data_len) {
        int cmp_status;
        uLong src_len = data_len;
        uLong cmp_len = compressBound(src_len);
        uLong uncomp_len = src_len;
        uint8_t *pCmp, *pUncomp;
        printf("miniz.c version: %s\n", MZ_VERSION);
        // Allocate buffers to hold compressed and uncompressed data.
        pCmp = (mz_uint8 *) malloc((size_t) cmp_len);
        pUncomp = (mz_uint8 *) malloc((size_t) src_len);
        if ((!pCmp) || (!pUncomp)) {
            printf("Out of memory!\n");
            free(pCmp); /* free(NULL) is a no-op */
            free(pUncomp);
            return EXIT_FAILURE;
        }
        // Compress the string.
        cmp_status = compress(pCmp, &cmp_len, (const unsigned char *) s_pStr, src_len);
        if (cmp_status != Z_OK) {
            printf("compress() failed!\n");
            free(pCmp);
            free(pUncomp);
            return EXIT_FAILURE;
        }
        printf("Compressed from %u to %u bytes\n", (mz_uint32) src_len, (mz_uint32) cmp_len);
        // Decompress.
        cmp_status = uncompress(pUncomp, &uncomp_len, pCmp, cmp_len);
        if (cmp_status != Z_OK) {
            printf("uncompress failed!\n");
            free(pCmp);
            free(pUncomp);
            return EXIT_FAILURE;
        }
        printf("Decompressed from %u to %u bytes\n", (mz_uint32) cmp_len, (mz_uint32) uncomp_len);
        // Ensure uncompress() returned the expected data.
        if ((uncomp_len != src_len) || (memcmp(pUncomp, s_pStr, (size_t) src_len))) {
            printf("Decompression failed!\n");
            free(pCmp);
            free(pUncomp);
            return EXIT_FAILURE;
        }
        free(pCmp);
        free(pUncomp);
        printf("Success.\n");
        return EXIT_SUCCESS;
    }

    /* Transform each 8x8 block with the DCT, run the miniz round trip on the
       coefficients, then apply the inverse DCT in place. */
    int test_dct_miniz(float *data, uLong len) {
        uLong nCount = len / 64; /* number of 8x8 blocks */
        float *in_data = data;
        for (uLong i = 0; i < nCount; i++) {
            DCT(in_data, in_data);
            in_data += 64;
        }
        int status = test_miniz((const unsigned char *) data, len * sizeof(float));
        float *out_data = data;
        for (uLong i = 0; i < nCount; i++) {
            IDCT(out_data, out_data);
            out_data += 64;
        }
        return status;
    }

    int main(int argc, char *argv[]) {
        printf("float data loss compression algorithm base DCT 8x8.\n");
        printf("DCT implementation by Thomas G. Lane.\n");
        printf("miniz implementation by Rich Geldreich.\n");
        // http://developer.download.nvidia.com/SDK/9.5/Samples/vidimaging_samples.html#gpgpu_dct
        printf("blog:http://cpuimage.cnblogs.com/\n");
        int is_debug_output = 1;
        const uLong data_len = 8 * 8 * 8; // eight 8x8 blocks
        float test_for_miniz[data_len];
        float test_for_dct[data_len];
        for (uLong i = 0; i < data_len; ++i) {
            test_for_miniz[i] = (float) i;
        }
        memcpy(test_for_dct, test_for_miniz, data_len * sizeof(float));
        printf("\nonly miniz:\n");
        test_miniz((const unsigned char *) test_for_miniz, data_len * sizeof(float));
        printf("\nwith dct:\n");
        test_dct_miniz(test_for_dct, data_len);
        if (is_debug_output) {
            // Report every value that changed after the DCT round trip.
            for (uLong i = 0; i < data_len; ++i) {
                if (test_for_miniz[i] != test_for_dct[i]) {
                    printf("index %lu: %f != %f\n", (unsigned long) i, test_for_miniz[i], test_for_dct[i]);
                }
            }
        }
        printf("\n Press any key to exit.\n");
        return EXIT_SUCCESS;
    }

    #ifdef __cplusplus
    }
    #endif

Project address: https://github.com/cpuimage/DCT_8X8

In addition, my thanks to the anonymous reader who tipped one yuan over the May 1 holiday.

Many trickles make a river ~

If you have other related questions or needs, feel free to contact me by email.

Email address:
[Email protected]


