CUDA implements JPEG image decoding to RGB data

Last Update:2015-06-30 Source: Internet

Author: User

Anyone who understands the JPEG data format can see that a scheme which splits an image into 8x8 pixel blocks and compresses each block lends itself well to parallel processing. In fact, NVIDIA's CUDA has shipped a JPEG codec example since v5.5. The example is part of the CUDA SDK samples; with a default CUDA installation it is located at "C:\ProgramData\NVIDIA Corporation\CUDA Samples\v7.0\7_CUDALibraries\jpegNPP" (the number after "v" depends on the installed version).
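As a rough illustration of why the 8x8 block layout parallelizes well, the sketch below maps a pixel to the block that contains it; every block can then be processed independently. The helper names `blocksAlongAxis` and `locatePixel` are hypothetical and are not part of the NVIDIA sample.

```cpp
// Hypothetical helper: number of 8x8 blocks needed to cover `pixels` samples
// along one axis (a partial block at the edge counts as a whole block).
inline int blocksAlongAxis(int pixels) { return (pixels + 7) / 8; }

// Position of a pixel expressed as (block index, offset inside the block).
struct BlockCoord {
    int blockX, blockY; // which 8x8 block the pixel falls into
    int inX, inY;       // offset of the pixel inside that block (0..7)
};

inline BlockCoord locatePixel(int x, int y) {
    return BlockCoord{ x / 8, y / 8, x % 8, y % 8 };
}
```

For a 640-pixel-wide image this gives 80 blocks per row, and each of them could in principle be handled by its own group of GPU threads.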

This example decodes a picture and then re-encodes it. Because decoding only produces YUV data, we have to add the YUV->RGB conversion ourselves in order to get RGB data out of the example, and that conversion is the focus of this article. In addition, the sample itself contains a bug: it cannot directly decode images whose horizontal and vertical compression ratios differ. This is discussed again below, together with a somewhat hacky fix. The bug has been reported to NVIDIA, and NVIDIA replied that it will be fixed in the next version (that is, the version after v7.0).

When reprinting, please cite the source: http://blog.csdn.net/weixinhum/article/details/46683509

OK, here we go.

Because we need to modify the demo source, first make a backup copy of the jpegNPP folder from the path above. Then open the VS project inside the folder. The main code of the project is in jpegNPP.cpp, which contains the following:

// Inverse DCT
for (int i = 0; i < 3; ++i)
{
    NPP_CHECK_NPP(nppiDCTQuantInv8x8LS_JPEG_16s8u_C1R_NEW(apdDCT[i], aDCTStep[i],
                                                          apSrcImage[i], aSrcImageStep[i],
                                                          pdQuantizationTables + oFrameHeader.aQuantizationTableSelector[i] * 64,
                                                          aSrcSize[i],
                                                          pDCTState));
}

This code already decodes the JPEG image into YUV data. The YUV data is stored in apSrcImage[0], apSrcImage[1] and apSrcImage[2], and the step of each channel (the width of one row in bytes) is stored in aSrcImageStep[0], aSrcImageStep[1] and aSrcImageStep[2]. With these known, we can delete all the code that follows the code above (that part deals with re-encoding the image) and write a CUDA function that converts YUV to RGB.
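To make the role of the step concrete, here is a minimal host-side sketch of how a sample is addressed in one plane. The `Plane` struct and the `sampleAt` and `chromaAt` helpers are hypothetical names introduced only for illustration, and the subsampling factors `hSub` and `vSub` are assumptions (for example 2 and 2 for 4:2:0 images); the sample itself works directly with the raw pointers and steps.

```cpp
#include <cstdint>

// Hypothetical description of one decoded plane: a base pointer plus the
// step, i.e. the number of bytes from the start of one row to the next.
struct Plane {
    const std::uint8_t *data;
    int step;
};

// Read the sample at (row, col). Rows are `step` bytes apart, which may be
// more than the visible width because of alignment padding.
inline std::uint8_t sampleAt(const Plane &p, int row, int col) {
    return p.data[row * p.step + col];
}

// Read the chroma sample that belongs to the luma pixel (row, col) when the
// chroma plane is subsampled by hSub horizontally and vSub vertically.
inline std::uint8_t chromaAt(const Plane &p, int row, int col, int hSub, int vSub) {
    return sampleAt(p, row / vSub, col / hSub);
}
```

The key point is that the row offset is always multiplied by the step, never by the image width.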

The approximate process is as follows:

Configure the OpenCV environment and include the header files. (This step is optional; it only makes it easy to check whether the converted image is correct. If you do not need it, skip it: knowing the output RGB data pointer and the data length is enough.)

#include <opencv2/core/core.hpp> // OpenCV header files
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/opencv.hpp>
using namespace std;

Write the code that converts YUV to RGB:

Add the following code after the demo project code above:

int pwidth = aSrcSize[0].width;
int pheight = aSrcSize[0].height;

IplImage *drawimg; // output image
drawimg = cvCreateImage(cvSize(pwidth, pheight), 8, 3);

Npp8u *host_img;   // host memory
Npp8u *device_img; // device (GPU) memory
size_t mPitch;
NPP_CHECK_CUDA(cudaMallocPitch(&device_img, &mPitch, pwidth * 3, pheight)); // allocate device memory for the RGB data

// unsigned char *imgdata = (unsigned char *)drawimg->imageData;
YCrCb2RGB(apSrcImage[0], apSrcImage[1], apSrcImage[2], pwidth, pheight,
          aSrcImageStep[0], aSrcImageStep[1], aSrcImageStep[2],
          drawimg->widthStep / sizeof(uchar), device_img, nMCUBlocksV, nMCUBlocksH);

// The kernel writes rows widthStep bytes apart, so copy widthStep * height bytes
// (this equals pwidth * pheight * 3 whenever a row needs no padding).
size_t img_bytes = (size_t)drawimg->widthStep * pheight;
NPP_CHECK_CUDA(cudaHostAlloc(&host_img, img_bytes, cudaHostAllocDefault)); // allocate page-locked host memory
NPP_CHECK_CUDA(cudaMemcpy(host_img, device_img, img_bytes, cudaMemcpyDeviceToHost)); // copy the processed image back to the host
drawimg->imageData = (char *)host_img;
cvShowImage("", drawimg);
cvWaitKey(0);
getchar();

for (int i = 0; i < 3; ++i) // release memory
{
    cudaFree(apSrcImage[i]);
    cudaFree(apdDCT[i]);
    cudaFreeHost(aphDCT[i]);
}
cudaFree(device_img);
cudaFreeHost(host_img);
cudaDeviceReset();
return EXIT_SUCCESS;
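To spot-check the bytes copied back to the host, a CPU reference of the per-pixel conversion can be useful. The sketch below is an assumption-laden illustration rather than the author's code: the names `clampToByte`, `Bgr` and `ycbcrToBgr` are hypothetical, the fixed-point coefficients follow the standard JPEG (JFIF) YCbCr to RGB formula scaled by 256 (the green coefficient 88 in particular is taken from that formula), and the output order is B, G, R because that is what OpenCV expects for display.

```cpp
#include <cstdint>

// Clamp an intermediate result to the 0..255 range of one byte.
inline std::uint8_t clampToByte(int v) {
    return static_cast<std::uint8_t>(v < 0 ? 0 : (v > 255 ? 255 : v));
}

struct Bgr {
    std::uint8_t b, g, r;
};

// Convert one full-range YCbCr triple to BGR with 8-bit fixed-point math:
//   B = Y + 1.772 * (Cb - 128)
//   G = Y - 0.344 * (Cb - 128) - 0.714 * (Cr - 128)
//   R = Y + 1.402 * (Cr - 128)
inline Bgr ycbcrToBgr(int y, int cb, int cr) {
    const int u = cb - 128;
    const int v = cr - 128;
    Bgr out;
    out.b = clampToByte(y + u + ((u * 198) >> 8));
    out.g = clampToByte(y - ((u * 88) >> 8) - ((v * 183) >> 8));
    out.r = clampToByte(y + v + ((v * 103) >> 8));
    return out;
}
```

Neutral chroma (Cb = Cr = 128) should leave gray values unchanged, which makes a convenient first sanity check before comparing whole images.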

Add a "cudaycrcb.cu" file that defines the YCrCb2RGB function. If you have questions about how to set up a .cu file, please refer to the previous article. In addition, the YCrCb2RGB function needs to be declared at the top of the jpegNPP.cpp file. The contents of the .cu file are as follows:

#include <stdio.h>
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include "Endianess.h"

// Clamp a value to the range [0, 255]
__device__ unsigned char judge(int value)
{
    if (value >= 0 && value <= 255)
    {
        return value;
    }
    else if (value > 255)
    {
        return 255;
    }
    else
    {
        return 0;
    }
}

// Kernel function: each thread converts one pixel
__global__ void YCrCb2RGBConver(unsigned char *device_y, unsigned char *device_cr, unsigned char *device_cb, unsigned char *device_img, int width, int height, int ystep, int crstep, int cbstep, int img_step, int nmcublocksv, int nmcublocksh)
{
    // int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int cols = blockIdx.x * blockDim.x + threadIdx.x;

    // Threads that fall outside the image do nothing
    if (row >= height)
    {
        return;
    }
    if (cols >= width)
    {
        return;
    }

    int Y = device_y[row * ystep + cols];
    int U = device_cr[row / nmcublocksh * crstep + cols / nmcublocksv] - 128;
    int V = device_cb[row / nmcublocksh * cbstep + cols / nmcublocksv] - 128;

    // Fixed-point conversion, coefficients scaled by 256.
    // The green line uses the standard JPEG (JFIF) coefficients 0.344 and 0.714.
    device_img[row * img_step + cols * 3 + 0] = judge(Y + U + ((U * 198) >> 8));
    device_img[row * img_step + cols * 3 + 1] = judge(Y - ((U * 88) >> 8) - ((V * 183) >> 8));
    device_img[row * img_step + cols * 3 + 2] = judge(Y + V + ((V * 103) >> 8));
}

// Host-side function that launches the kernel on the GPU
extern "C" int YCrCb2RGB(unsigned char *device_y, unsigned char *device_cr, unsigned char *device_cb, int width, int height, int ystep, int crstep, int cbstep, int img_step, unsigned char *device_data, int nmcublocksv, int nmcublocksh)
{
    cudaEvent_t start, stop;
    float time;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start, 0);

    // This part can be tuned
    dim3 threads(16, 16); // threads per block
    // dim3 threads(256, 40);
    dim3 blocks((width + threads.x - 1) / threads.x, (height + threads.y - 1) / threads.y); // blocks in the grid

    YCrCb2RGBConver<<<blocks, threads>>>(device_y, device_cr, device_cb, device_data, width, height, ystep, crstep, cbstep, img_step, nmcublocksv, nmcublocksh); // run the conversion on the GPU

    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(&time, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    printf("Kernel time: %f ms\n", time);
    return 0;
}
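The grid size used for the kernel launch is a rounded-up division of the image size by the thread block size. A minimal host-side sketch of that arithmetic, with the hypothetical helper names `divUp` and `gridFor`, might look like this:

```cpp
// Hypothetical helper: smallest number of blocks of `blockSize` threads
// that covers `n` items (integer division rounded up).
inline unsigned int divUp(unsigned int n, unsigned int blockSize) {
    return (n + blockSize - 1) / blockSize;
}

struct Grid {
    unsigned int x, y;
};

// Number of blocks needed in each dimension so that every pixel of a
// width x height image gets one thread.
inline Grid gridFor(unsigned int width, unsigned int height,
                    unsigned int threadsX, unsigned int threadsY) {
    return Grid{ divUp(width, threadsX), divUp(height, threadsY) };
}
```

With 16x16 threads per block, a 1920x1080 image needs 120 x 68 blocks. The last row of blocks extends past the bottom of the image, which is why the kernel returns early for threads whose row or column lies outside the image.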

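The declaration reads as follows:

extern "C" int YCrCb2RGB(unsigned char *device_y, unsigned char *device_cr, unsigned char *device_cb, int width, int height, int ystep, int crstep, int cbstep, int img_step, unsigned char *device_data, int nmcublocksv, int nmcublocksh);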

At this point, the goal stated in the title has been achieved. As for the bug in NVIDIA's demo mentioned earlier (a memory error when decoding images whose horizontal and vertical compression ratios differ), it is caused by the compression aspect ratio being obtained incorrectly, and it can be fixed in the following way.
