OpenCL step by step (3): storing the kernel file as a binary


In tutorial 2, we used the convertToString function to read the kernel source file into a string, then called clCreateProgramWithSource to create the program object, and then called clBuildProgram to compile it. We can also load a binary kernel file directly, which offers some protection when you do not want to expose the kernel source to others. In this tutorial, we will store the compiled kernel in a binary file, and create a timer class to record how long the array addition takes on the CPU and on the GPU.

First, create the project gclTutorial and add the class gclFile to it. This class is mainly used to read text kernel files and to read and write binary kernel files.

class gclFile
{
public:
    gclFile(void);
    ~gclFile(void);

    // Open the OpenCL kernel source file (text mode)
    bool open(const char* filename);

    // Read and write binary kernel files
    bool writeBinaryToFile(const char* filename, const char* binary, size_t numBytes);
    bool readBinaryFromFile(const char* filename);

    ...

};

The three gclFile functions for reading and writing kernel files are:

bool gclFile::writeBinaryToFile(const char* filename, const char* binary, size_t numBytes)
{
    FILE* output = NULL;
    output = fopen(filename, "wb");
    if (output == NULL)
        return false;

    // Write the raw bytes of the compiled kernel
    fwrite(binary, sizeof(char), numBytes, output);
    fclose(output);

    return true;
}


bool gclFile::readBinaryFromFile(const char* filename)
{
    FILE* input = NULL;
    size_t size = 0;
    char* binary = NULL;

    input = fopen(filename, "rb");
    if (input == NULL)
    {
        return false;
    }

    // Seek to the end to get the file size
    fseek(input, 0L, SEEK_END);
    size = ftell(input);
    // Point to the start position of the file
    rewind(input);
    binary = (char*)malloc(size);
    if (binary == NULL)
    {
        fclose(input); // do not leak the file handle on failure
        return false;
    }
    fread(binary, sizeof(char), size, input);
    fclose(input);
    // Copy exactly size bytes; the binary may contain embedded '\0' bytes
    source_.assign(binary, size);
    free(binary);

    return true;
}

bool gclFile::open(const char* filename) //!< file name
{
    size_t size;
    char* str;

    // Open the file as a stream
    std::fstream f(filename, (std::fstream::in | std::fstream::binary));

    // Check whether the file stream was opened
    if (f.is_open())
    {
        size_t sizeFile;
        // Get the file size
        f.seekg(0, std::fstream::end);
        size = sizeFile = (size_t)f.tellg();
        f.seekg(0, std::fstream::beg);

        str = new char[size + 1];
        if (!str)
        {
            f.close();
            return false;
        }

        // Read the file
        f.read(str, sizeFile);
        f.close();
        str[size] = '\0';

        source_ = str;

        delete[] str;

        return true;
    }

    return false;
}
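The difference between `source_ = str` in open and `source_.assign(binary, size)` in readBinaryFromFile matters because a compiled kernel may contain '\0' bytes. The following is a minimal sketch of the same write/read round trip using C++ streams instead of the C stdio calls above. The helper names writeBytes and readBytes are hypothetical and are not part of gclFile.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: write numBytes raw bytes to fileName.
// Returns false if the file cannot be opened or the write fails.
bool writeBytes(const std::string& fileName, const char* data, std::size_t numBytes)
{
    std::ofstream out(fileName, std::ios::binary | std::ios::trunc);
    if (!out)
        return false;
    out.write(data, static_cast<std::streamsize>(numBytes));
    return out.good();
}

// Hypothetical helper: read the whole file into 'bytes'.
// The iterator-based assign copies every byte, including embedded '\0' bytes,
// which a plain C-string assignment would cut off.
bool readBytes(const std::string& fileName, std::string& bytes)
{
    std::ifstream in(fileName, std::ios::binary);
    if (!in)
        return false;
    bytes.assign(std::istreambuf_iterator<char>(in),
                 std::istreambuf_iterator<char>());
    return !in.bad();
}
```

Writing a five-byte payload such as {'a', '\0', 'b', '\0', 'c'} and reading it back with readBytes should yield a string of size 5, whereas constructing a std::string from the same payload as a C string stops at the first '\0'.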

Now, in main.cpp, we can use the open function of the gclFile class to read the kernel source file:

// The kernel file is add.cl
gclFile kernelFile;
if (!kernelFile.open("add.cl"))
{
    printf("failed to load kernel file\n");
    exit(0);
}

const char* source = kernelFile.source().c_str();
size_t sourceSize[] = {strlen(source)};

// Create a program object
cl_program program = clCreateProgramWithSource(
    context,
    1,
    &source,
    sourceSize,
    NULL);

After compiling the kernel, we can use the following code to store the compiled kernel in the binary file vecadd.bin. In tutorial 4, we will load the binary kernel file directly.

// Store the compiled kernel
char** binaries = (char**)malloc(sizeof(char*) * 1); // only one device
size_t* binarySizes = (size_t*)malloc(sizeof(size_t) * 1);

// First query the size of the binary, then the binary itself
status = clGetProgramInfo(program,
    CL_PROGRAM_BINARY_SIZES,
    sizeof(size_t) * 1,
    binarySizes, NULL);
binaries[0] = (char*)malloc(sizeof(char) * binarySizes[0]);
status = clGetProgramInfo(program,
    CL_PROGRAM_BINARIES,
    sizeof(char*) * 1,
    binaries,
    NULL);
kernelFile.writeBinaryToFile("vecadd.bin", binaries[0], binarySizes[0]);

// Release the temporary buffers
free(binaries[0]);
free(binaries);
free(binarySizes);

We will also create a timer class, gclTimer, to measure time. This class uses QueryPerformanceFrequency to get the clock frequency and QueryPerformanceCounter to get the number of ticks that have elapsed, and from these computes the elapsed time. The class is very simple:

class gclTimer
{
public:
    gclTimer(void);
    ~gclTimer(void);

private:
    double _freq;
    double _clocks;
    double _start;

public:
    void Start(void); // start the timer
    void Stop(void); // stop the timer
    void Reset(void); // reset the timer
    double GetElapsedTime(void); // calculate the elapsed time
};
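The implementation of gclTimer is not shown here. As a rough, portable sketch of the same idea, the class below replaces QueryPerformanceFrequency and QueryPerformanceCounter with std::chrono::steady_clock. The name ChronoTimer and its members are assumptions for illustration, not the original code; like the original, it reports elapsed time in seconds.

```cpp
#include <chrono>

// Sketch of a timer in the spirit of gclTimer, built on
// std::chrono::steady_clock rather than the Windows performance counter.
// This is an assumed implementation, not the one from the tutorial project.
class ChronoTimer
{
public:
    ChronoTimer() : elapsed_(0.0), running_(false) {}

    // Start the timer
    void start()
    {
        start_ = Clock::now();
        running_ = true;
    }

    // Stop the timer and accumulate the time since the last start()
    void stop()
    {
        if (running_)
        {
            elapsed_ += std::chrono::duration<double>(Clock::now() - start_).count();
            running_ = false;
        }
    }

    // Reset the accumulated time to zero
    void reset()
    {
        elapsed_ = 0.0;
        running_ = false;
    }

    // Accumulated time in seconds
    double getElapsedTime() const { return elapsed_; }

private:
    typedef std::chrono::steady_clock Clock;
    Clock::time_point start_;
    double elapsed_;
    bool running_;
};
```

Calling reset(), start(), the code to measure, then stop() leaves the measured duration in getElapsedTime(); multiplying by 1000 gives milliseconds. A steady clock is chosen because it never jumps backwards when the system time is adjusted.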

The following code times the array addition on the CPU:

gclTimer clTimer;
clTimer.Reset();
clTimer.Start();

// Compute the sum of buf1 and buf2 on the CPU
for (i = 0; i < bufsize; i++)
    buf[i] = buf1[i] + buf2[i];

clTimer.Stop();
printf("CPU costs time: %.6f ms\n", clTimer.GetElapsedTime() * 1000);

Similarly, we add timer code around the kernel execution on the GPU and around copying the GPU result back to the CPU:

// Run the kernel. The range is 1-dimensional and the number of work items is bufsize
cl_event ev;
size_t global_work_size = bufsize;

clTimer.Reset();
clTimer.Start();
clEnqueueNDRangeKernel(queue,
    kernel,
    1,
    NULL,
    &global_work_size,
    NULL, 0, NULL, &ev);
status = clFlush(queue);
waitForEventAndRelease(&ev);
// clWaitForEvents(1, &ev);

clTimer.Stop();
printf("kernel total time: %.6f ms\n", clTimer.GetElapsedTime() * 1000);

// Copy data back to host memory
cl_float* ptr;
clTimer.Reset();
clTimer.Start();
cl_event mapEvt;
ptr = (cl_float*)clEnqueueMapBuffer(queue,
    buffer,
    CL_TRUE,
    CL_MAP_READ,
    0,
    bufsize * sizeof(cl_float),
    0, NULL, &mapEvt, NULL);
status = clFlush(queue);
waitForEventAndRelease(&mapEvt);
// clWaitForEvents(1, &mapEvt);

clTimer.Stop();
printf("copy from device to host: %.6f ms\n", clTimer.GetElapsedTime() * 1000);

When the program runs with bufsize set to 262144, it prints the CPU and GPU timings measured on my graphics card ... In the program directory, we can see that the file vecadd.bin has also been generated.

The complete code can be found in the project gclTutorial.

Download the code:

http://files.cnblogs.com/mikewolf2002/gclTutorial.zip
