Function graph generator using GPU parallel computing in .NET


http://www.cnblogs.com/Ninputer/archive/2011/08/18/2145045.html

 

A few days ago, vczh, a highly skilled developer, wrote a function plotting program that can draw the graph of an equation f(x, y) = 0. The idea is to substitute the coordinates of each point in the image into f, which yields two single-variable equations, one in x and one in y. Newton's method then solves these equations to produce a set of points, which are drawn onto the image. His program can plot all kinds of amazing equations. However, it is very slow, because solving for the coordinates at every point with Newton's method is very time-consuming. Even with Parallel.For, the CPU struggles with the workload. After studying his program, I thought I could use a graphics card, which is good at parallel computing, to accelerate the iteration. OpenCL is a perfect fit for this task.

 

The work went quite smoothly. My version is adapted entirely from vczh's original program, with only a slight change in strategy. The steps are as follows:

    1. After the input function f is parsed, generate its partial derivatives ∂f/∂x and ∂f/∂y, then convert all three two-variable functions into valid OpenCL expressions.
    2. Implement Newton's method in OpenCL.
    3. Assign each point of the image to an OpenCL thread, and let a large number of parallel OpenCL threads compute the points.
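To make step 1 concrete, here is a minimal sketch of what the three generated functions could look like for one input. It assumes the equation is the unit circle x^2 + y^2 - 1 = 0, which is my own example rather than anything from the original program, and it is written in plain C rather than OpenCL C. The helpers `central_diff_x` and `central_diff_y` are hypothetical; they exist only to sanity-check the symbolic derivatives numerically.

```c
#include <assert.h>
#include <math.h>

/* f(x, y) for the assumed example equation x^2 + y^2 - 1 = 0. */
double func(double x, double y) { return x * x + y * y - 1.0; }

/* Symbolic partial derivatives a generator would emit for this f. */
double df_dx(double x, double y) { (void)y; return 2.0 * x; }
double df_dy(double x, double y) { (void)x; return 2.0 * y; }

/* Hypothetical helpers: central finite differences, used only as a check. */
double central_diff_x(double x, double y, double h)
{
    assert(h > 0.0);
    return (func(x + h, y) - func(x - h, y)) / (2.0 * h);
}

double central_diff_y(double x, double y, double h)
{
    assert(h > 0.0);
    return (func(x, y + h) - func(x, y - h)) / (2.0 * h);
}
```

Comparing `df_dx(1.5, 0.5)` against `central_diff_x(1.5, 0.5, 1e-4)` is a cheap way to catch a mistake in a derivative generator before the expressions are handed to the GPU.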

 

The OpenCL code is as follows:

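    /* The three function bodies are generated at run time from the input equation. */
    fp_t func(fp_t x, fp_t y) { return {dynamic generation}; }
    fp_t df_dx(fp_t x, fp_t y) { return {dynamic generation}; }
    fp_t df_dy(fp_t x, fp_t y) { return {dynamic generation}; }

    /* Newton's method for f(x, consty) = 0, starting from x = start. */
    fp_t solvex(fp_t start, const fp_t consty)
    {
        /* The loop bound was truncated in this copy; MAX_ITER is a placeholder. */
        for (int i = 0; i < MAX_ITER; i++)
        {
            /* Reconstructed from the uses of result below. */
            fp_t result = func(start, consty);

            /* Converged: |f| is within epsilon. */
            if (result <= epsilon && result >= -epsilon) { return start; }

            fp_t d = df_dx(start, consty);

            /* Derivative too close to zero: give up on this point. */
            if (d <= epsilon && d >= -epsilon) { return NAN; }
            else { start -= result / d; }
        }
        return NAN;
    }

    /* One work-item per pixel: map the pixel to a coordinate, then solve for x. */
    kernel void Computex(global write_only fp_t* points, int unit, int width, int cx, int cy, float origin_x, float origin_y)
    {
        int gx = get_global_id(0);
        int gy = get_global_id(1);
        uint write_loc = gx + gy * width;
        fp_t py = origin_y + (fp_t)(gy + 1 - cy) / unit;
        fp_t px = origin_x + (fp_t)(cx - gx - 1) / unit;
        points[write_loc] = solvex(px, py);
    }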
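For readers new to OpenCL's index space, the kernel body corresponds roughly to the body of a nested loop on the CPU, with one iteration per pixel. The sketch below is a sequential stand-in in plain C that reuses the kernel's index and coordinate arithmetic but stores the mapped coordinates instead of calling the solver. The function name `map_pixels`, the `height` parameter, and the two output arrays are my own additions. The reading of `unit` as pixels per unit length and of `(cx, cy)` as a pixel offset is an assumption based on how the kernel uses them.

```c
#include <assert.h>
#include <stddef.h>

/* Sequential stand-in for the 2D OpenCL index space: each (gx, gy) pair
   plays the role of one work-item. Parameter meanings are assumed from
   how the kernel uses them. */
void map_pixels(float *px_out, float *py_out, int width, int height,
                int unit, int cx, int cy, float origin_x, float origin_y)
{
    assert(px_out != NULL && py_out != NULL);
    assert(width > 0 && height > 0 && unit != 0);

    for (int gy = 0; gy < height; gy++) {       /* get_global_id(1) */
        for (int gx = 0; gx < width; gx++) {    /* get_global_id(0) */
            /* Row-major position of this pixel in the output buffer. */
            size_t write_loc = (size_t)gx + (size_t)gy * (size_t)width;
            /* Same coordinate arithmetic as the kernel. */
            py_out[write_loc] = origin_y + (float)(gy + 1 - cy) / unit;
            px_out[write_loc] = origin_x + (float)(cx - gx - 1) / unit;
        }
    }
}
```

On the GPU the two loops disappear: the runtime launches width × height work-items, and each one obtains its own `gx` and `gy` from `get_global_id`.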

This is the code for solving f(x, a) = 0; solving f(b, y) = 0 is essentially the same. fp_t is a typedef chosen according to the device, and may be float or double. Because not all OpenCL devices support double-precision floating point, the algorithm has to be written generically, with the precision controlled by macros.
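One way such macro control could look is sketched below in plain C, so it can be compiled and tried on the host. The macro names `USE_DOUBLE` and `EPSILON`, the tolerance values, the `max_iter` parameter, and the unit-circle example equation are all assumptions of mine, not taken from the program's source. In OpenCL C the double branch would additionally need `#pragma OPENCL EXTENSION cl_khr_fp64 : enable`.

```c
#include <assert.h>
#include <math.h>

/* Precision switch: define USE_DOUBLE at compile time to get double.
   USE_DOUBLE and EPSILON are placeholder names, not taken from the source. */
#ifdef USE_DOUBLE
typedef double fp_t;
#define EPSILON 1e-9
#else
typedef float fp_t;
#define EPSILON 1e-5f
#endif

/* Assumed example equation: x^2 + y^2 - 1 = 0. */
fp_t func(fp_t x, fp_t y) { return x * x + y * y - 1; }
fp_t df_dx(fp_t x, fp_t y) { (void)y; return 2 * x; }

/* Newton iteration for f(x, const_y) = 0, starting from `start`.
   Returns NAN if the derivative vanishes or max_iter is exhausted. */
fp_t solve_x(fp_t start, fp_t const_y, int max_iter)
{
    assert(max_iter > 0);
    for (int i = 0; i < max_iter; i++) {
        fp_t result = func(start, const_y);
        if (result <= EPSILON && result >= -EPSILON) return start;
        fp_t d = df_dx(start, const_y);
        if (d <= EPSILON && d >= -EPSILON) return NAN;
        start -= result / d;
    }
    return NAN;
}
```

Because the solver is written only in terms of `fp_t` and `EPSILON`, the same body serves both precisions. Calling `solve_x(0.5f, 0.0f, 32)` converges toward the root x = 1, while a start where the derivative is zero, or a row with no real root, yields NAN.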

 

The easiest way to use OpenCL in .NET is the Cloo library. Cloo fully wraps all of OpenCL (1.1) in an object model that is very easy to use from .NET. After using Cloo and OpenCL just once, I never want to put up with the cumbersome DirectX Compute Shader libraries again.

 

The full source code of my program has been uploaded to GitHub: https://github.com/Ninputer/opencl-plot. Click Download there to get all the code as a package. A binary package is also available for download.

 

To run this program, you must install an OpenCL implementation. On Windows, implementations are currently available mainly from NVIDIA, AMD, and Intel. If you have a relatively recent NVIDIA or AMD graphics card, you only need to install the latest driver package that includes OpenCL. The following graphics cards support double-precision floating point: NVIDIA GeForce 200, 400, and 500 series; AMD Radeon HD 5800, 5900, and 6900 series. The Radeon 6900 series does not yet support the official double-precision extension (cl_khr_fp64), so this program also supports the cl_amd_fp64 extension, which provides the same functionality. Graphics cards based on the G80 and RV770 architectures, as well as low-end AMD cards, support only single-precision floating point, so plots may be slightly less accurate.
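The choice between the two double-precision extensions can be made from the device's extension list, which OpenCL reports as a space-separated string (the value of `CL_DEVICE_EXTENSIONS` from `clGetDeviceInfo`). The following is a minimal sketch in plain C of one way to do that; it is not the program's actual logic, and the function names `has_extension` and `pick_fp64_extension` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` appears as a whole space-separated token in `list`. */
int has_extension(const char *list, const char *name)
{
    assert(list != NULL && name != NULL);
    size_t n = strlen(name);
    if (n == 0) return 0;
    const char *p = list;
    while ((p = strstr(p, name)) != NULL) {
        int starts = (p == list) || (p[-1] == ' ');
        int ends = (p[n] == '\0') || (p[n] == ' ');
        if (starts && ends) return 1;
        p += 1;  /* keep searching after a partial match */
    }
    return 0;
}

/* Prefer the Khronos extension, fall back to the AMD one, else NULL
   (meaning: fall back to single precision). */
const char *pick_fp64_extension(const char *extensions)
{
    if (has_extension(extensions, "cl_khr_fp64")) return "cl_khr_fp64";
    if (has_extension(extensions, "cl_amd_fp64")) return "cl_amd_fp64";
    return NULL;
}
```

The returned name could then be placed in an `#pragma OPENCL EXTENSION ... : enable` line at the top of the generated kernel source, with `fp_t` defined as double. A NULL result would mean compiling the kernel with `fp_t` as float.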

 

Users whose graphics cards do not support OpenCL can run the OpenCL computation on a multi-core CPU, which is still faster than the original C# version. With an Intel Core i3, i5, or i7 series CPU, you can use the Intel OpenCL SDK: http://software.intel.com/en-us/articles/opencl-sdk/. With other multi-core CPUs, you can use the AMD APP SDK: http://developer.amd.com/sdks/AMDAPPSDK/downloads/Pages/default.aspx

 

After the program starts, you can select the OpenCL platform and device. If multiple OpenCL platforms are installed, you can choose any of them. The program does not currently support multi-GPU technologies (SLI and CrossFire).

[Screenshot: NVIDIA CUDA platform]

[Screenshot: AMD APP platform]

[Screenshot: Intel OpenCL platform]

 

Enter an equation and let your imagination run free!

Note: While computing on the graphics card, it is best not to play games, browse the web with IE9, or do anything else that uses the GPU, because the GPU may be reset when the load is too heavy. Equations that require too much computation, or a low-end graphics card, may also cause a GPU reset. Run your experiments on Windows 7 or Vista, where the WDDM driver model is more stable (on XP a blue screen is likely).

 

Follow me on Weibo for the latest updates: http://weibo.com/ninputer

 



 

