OpenCL learning step by step (3): storing the kernel file as binary

Last Update:2018-12-06 Source: Internet

Author: User


In tutorial 2, we used the convertToString function to read the kernel source file into a string, then called clCreateProgramWithSource to create the program object and clBuildProgram to compile it. We can also load a precompiled binary kernel file directly, which offers some confidentiality when you do not want to show the kernel source to others. In this tutorial, we compile the kernel source and store the result in a binary file, and we create a timer class to record how long the array addition takes on the CPU and on the GPU.

First, create the project gclTutorial and add the class gclFile to it. This class reads text kernel files and reads and writes binary kernel files.

class gclFile
{
public:
    gclFile(void);
    ~gclFile(void);

    // Open the OpenCL kernel source file (text mode)
    bool open(const char* fileName);

    // Read and write binary kernel files
    bool writeBinaryToFile(const char* fileName, const char* binary, size_t numBytes);
    bool readBinaryFromFile(const char* fileName);

    ...

};

The three functions in gclFile that read and write kernel files are implemented as follows:

bool gclFile::writeBinaryToFile(const char* fileName, const char* binary, size_t numBytes)
{
    FILE* output = NULL;
    output = fopen(fileName, "wb");
    if (output == NULL)
        return false;

    // Report failure if fewer bytes than requested were written
    size_t written = fwrite(binary, sizeof(char), numBytes, output);
    fclose(output);

    return written == numBytes;
}


bool gclFile::readBinaryFromFile(const char* fileName)
{
    FILE* input = NULL;
    size_t size = 0;
    char* binary = NULL;

    input = fopen(fileName, "rb");
    if (input == NULL)
    {
        return false;
    }

    // Seek to the end to get the file size
    fseek(input, 0L, SEEK_END);
    size = ftell(input);
    // Point back to the start of the file
    rewind(input);
    binary = (char*)malloc(size);
    if (binary == NULL)
    {
        fclose(input); // do not leak the file handle on failure
        return false;
    }
    fread(binary, sizeof(char), size, input);
    fclose(input);
    // Use assign with an explicit size: binary data may contain '\0' bytes
    source_.assign(binary, size);
    free(binary);

    return true;
}

bool gclFile::open(const char* fileName) //!< File name
{
    size_t size;
    char* str;

    // Open the file as a stream
    std::fstream f(fileName, (std::fstream::in | std::fstream::binary));

    // Check whether the file stream was opened
    if (f.is_open())
    {
        size_t sizeFile;
        // Get the file size
        f.seekg(0, std::fstream::end);
        size = sizeFile = (size_t)f.tellg();
        f.seekg(0, std::fstream::beg);

        str = new char[size + 1];
        if (!str)
        {
            f.close();
            return false;
        }

        // Read the file and null-terminate the buffer
        f.read(str, sizeFile);
        f.close();
        str[size] = '\0';

        source_ = str;

        delete[] str;

        return true;
    }

    return false;
}

Now, in main.cpp, we can use the open function of the gclFile class to read the kernel source file:

// The kernel file is add.cl
gclFile kernelFile;
if (!kernelFile.open("add.cl"))
{
    printf("failed to load kernel file\n");
    exit(1);
}

const char* source = kernelFile.source().c_str();
size_t sourceSize[] = { strlen(source) };

// Create a program object from one source string
cl_program program = clCreateProgramWithSource(
    context,
    1,          // number of source strings
    &source,
    sourceSize,
    NULL);

After compiling the kernel, we can use the following code to store the compiled kernel in the binary file vecadd.bin. In tutorial 4, we will load the binary kernel file directly.

// Store the compiled kernel file
char** binaries = (char**)malloc(sizeof(char*) * 1); // only one device
size_t* binarySizes = (size_t*)malloc(sizeof(size_t) * 1);

// First query the size of the binary, then the binary itself
status = clGetProgramInfo(program,
    CL_PROGRAM_BINARY_SIZES,
    sizeof(size_t) * 1,
    binarySizes, NULL);
binaries[0] = (char*)malloc(sizeof(char) * binarySizes[0]);
status = clGetProgramInfo(program,
    CL_PROGRAM_BINARIES,
    sizeof(char*) * 1,
    binaries,
    NULL);
kernelFile.writeBinaryToFile("vecadd.bin", binaries[0], binarySizes[0]);

// Release the temporary host buffers
free(binaries[0]);
free(binaries);
free(binarySizes);

We also create a timer class, gclTimer, to measure elapsed time. It uses QueryPerformanceFrequency to get the clock frequency and QueryPerformanceCounter to get the number of ticks that have elapsed, and from these it computes the elapsed time. The class is very simple:

class gclTimer
{
public:
    gclTimer(void);
    ~gclTimer(void);

private:
    double _freq;
    double _clocks;
    double _start;

public:
    void Start(void);            // start the timer
    void Stop(void);             // stop the timer
    void Reset(void);            // reset the timer
    double GetElapsedTime(void); // calculate the elapsed time
};

The following code times the array addition on the CPU:

gclTimer clTimer;
clTimer.Reset();
clTimer.Start();

// Compute the sum of buf1 and buf2 on the CPU
for (i = 0; i < bufsize; i++)
    buf[i] = buf1[i] + buf2[i];

clTimer.Stop();
printf("CPU costs time: %.6f ms\n", clTimer.GetElapsedTime() * 1000);

Similarly, add timer code around the GPU kernel execution and around copying the GPU result back to the CPU:

// Run the kernel. The range is 1-dimensional and the number of work items is bufsize.
cl_event ev;
size_t global_work_size = bufsize;

clTimer.Reset();
clTimer.Start();
clEnqueueNDRangeKernel(queue,
    kernel,
    1,
    NULL,
    &global_work_size,
    NULL, 0, NULL, &ev);
status = clFlush(queue);
waitForEventAndRelease(&ev);
// clWaitForEvents(1, &ev);

clTimer.Stop();
printf("kernel total time: %.6f ms\n", clTimer.GetElapsedTime() * 1000);

// Copy data back to host memory
cl_float* ptr;
clTimer.Reset();
clTimer.Start();
cl_event mapevt;
ptr = (cl_float*)clEnqueueMapBuffer(queue,
    buffer,
    CL_TRUE,
    CL_MAP_READ,
    0,
    bufsize * sizeof(cl_float),
    0, NULL, &mapevt, NULL);
status = clFlush(queue);
waitForEventAndRelease(&mapevt);
// clWaitForEvents(1, &mapevt);

clTimer.Stop();
printf("copy from device to host: %.6f ms\n", clTimer.GetElapsedTime() * 1000);

The final program output is as follows. When bufsize is 262144, the GPU has a CPU speed on my graphics card ... In the program directory, we can see that the file vecadd.bin has also been generated.

The complete code can be found in the project gclTutorial.

Download the code:

http://files.cnblogs.com/mikewolf2002/gclTutorial.zip

