OpenCL: OpenCL's Shader

Source: Internet
Author: User
Tags: constant, mathematical functions

I found some spare time to write a blog post, hoping it helps people who are new to this. Today I want to go over a few things about OpenCL program functions.

This post describes the program functions in OpenCL. A program function usually exists as source text, which the application loads through an interface such as clCreateProgramWithSource. Shader programming uses the same form to write code that runs on the GPU, so for clarity and convenience the source text of these program functions is called the OpenCL shader here.

Everything below is written in the shader.

1. The shader language is C-like, derived from the C99 standard (the C standard adopted in 1999)

Not supported:

standard header files, function pointers, recursion, variable-length arrays (Visual Studio does not support these either)

Additional types:

Vector types: char2, ushort4, int8, and so on. These are aligned to their own size.

Image types: image2d_t, image3d_t, sampler_t ...

Event type: event_t (associated with cl_event in the API)

2. Work-item and work-group related functions


3. Vector manipulation

The first half of the vector is lo and the latter half is hi

int4 v = (int4)7;  // same as (int4)(7, 7, 7, 7)

v = (int4)(1, 2, 3, 4);

int2 v2 = v.lo;  // (1, 2)

v2 = v.hi;  // (3, 4)

v2 = v.odd;  // (2, 4)

Vector arithmetic works element by element; abs, for example, is calculated for each element separately.
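To make the component selectors concrete, here is a small sketch in plain C rather than in OpenCL C. The names int4_model, int2_model, vec_lo, vec_hi, vec_even, vec_odd and vec_abs are made up for this illustration and are not part of OpenCL; they only mimic what v.lo, v.hi, v.even, v.odd and abs(v) select or compute on an int4.

```c
#include <stdlib.h>

/* Hypothetical plain-C stand-ins for OpenCL's int4 and int2. */
typedef struct { int s[4]; } int4_model;
typedef struct { int s[2]; } int2_model;

/* v.lo: the first half of the vector (components 0 and 1). */
int2_model vec_lo(int4_model v)
{
    int2_model r = { { v.s[0], v.s[1] } };
    return r;
}

/* v.hi: the second half of the vector (components 2 and 3). */
int2_model vec_hi(int4_model v)
{
    int2_model r = { { v.s[2], v.s[3] } };
    return r;
}

/* v.even: components at even indices (0 and 2). */
int2_model vec_even(int4_model v)
{
    int2_model r = { { v.s[0], v.s[2] } };
    return r;
}

/* v.odd: components at odd indices (1 and 3). */
int2_model vec_odd(int4_model v)
{
    int2_model r = { { v.s[1], v.s[3] } };
    return r;
}

/* Element-wise abs: each component is processed on its own. */
int4_model vec_abs(int4_model v)
{
    int4_model r;
    for (int i = 0; i < 4; ++i)
        r.s[i] = abs(v.s[i]);
    return r;
}
```

With v = (1, 2, 3, 4), vec_hi gives (3, 4) and vec_odd gives (2, 4), matching the values listed above.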

4. Address space qualifiers, written in front of a variable declaration, specify the address space in which the variable lives

__global

__local

__private

__constant

These four correspond, respectively, to the memory regions of the OpenCL architecture (device global memory, work-group local memory, work-item private memory, device constant memory).

The leading __ can be omitted. In OpenCL 1.2 a program-scope (global) variable must be constant: it has to be declared in the __constant address space and given a value at its declaration. Converting a pointer from one address space to another is not defined.

5. Type conversion

5.1 convert conversion: this converts the value of a variable to the target type, keeping its meaning

It is written in the form convert_destType<_sat><_roundingMode>,

for example:

float4 f4 = (float4)(1.0f, 2.0f, 3.0f, 4.0f);

int4 i4 = convert_int4_sat_rte(f4);

destType: the target type

_sat: out-of-range values are clamped to the largest or smallest value the target type can represent

_roundingMode:

_rte: round to the nearest even

_rtz: round toward zero

_rtp: round toward positive infinity

_rtn: round toward negative infinity

The full rules are more complicated; see http://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/convert_T.html.
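As a rough illustration of what _sat and _rte mean for a single float-to-int element, here is a plain-C model. The function name convert_int_sat_rte_model is made up for this sketch, it assumes a 32-bit int, and it is not how any OpenCL implementation is actually written.

```c
#include <limits.h>

/* Hypothetical model of convert_int_sat_rte for one element.
   Assumes int is 32 bits wide. */
int convert_int_sat_rte_model(float x)
{
    if (x != x)                    /* NaN: saturated conversion yields 0 */
        return 0;
    if (x >= 2147483648.0f)        /* _sat: clamp values above the int range */
        return INT_MAX;
    if (x <= -2147483648.0f)       /* _sat: clamp values below the int range */
        return INT_MIN;

    int t = (int)x;                /* truncation toward zero, as _rtz would do */
    float frac = x - (float)t;     /* leftover fraction, same sign as x */

    /* _rte: go to the nearest integer; on an exact tie pick the even one. */
    if (frac > 0.5f || (frac == 0.5f && (t & 1)))
        return t + 1;
    if (frac < -0.5f || (frac == -0.5f && (t & 1)))
        return t - 1;
    return t;
}
```

In this model 2.5f becomes 2 and 3.5f becomes 4 (ties go to the even neighbor), while 1e20f is clamped to INT_MAX instead of overflowing.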

5.2 as conversion: this reinterprets the existing bits as a new type

It is written as_destType,

where the types before and after the conversion must have the same size and destType is the target type. The conversion leaves the bits unchanged and reinterprets the value according to destType.

The as and convert conversions are fundamentally different.

For example:

float4 f4 = (float4)(1.0f, 2.0f, 3.0f, 4.0f);

int4 i4 = as_int4(f4);
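The difference from convert can be sketched in plain C. The helper as_int_model below is a made-up name for this illustration; it assumes float is a 32-bit IEEE 754 value and copies the bits instead of converting the number.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of as_int for one element: keep the 32 bits of the
   float and read them back as an integer. Assumes a 32-bit IEEE 754 float. */
int32_t as_int_model(float f)
{
    int32_t i;
    memcpy(&i, &f, sizeof i);   /* copies the bits, no numeric conversion */
    return i;
}
```

Under that assumption, as_int_model(1.0f) is 0x3F800000, whereas a value conversion of 1.0f to int gives 1.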

6. Built-in functions:

6.1 Mathematical functions (there are many)

See

http://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/
(the built-in functions section).

A short summary follows.

6.2 Work-group functions:

These are important for the interaction between work-items within a group.

Synchronization functions

void barrier(cl_mem_fence_flags flags)

Every work-item in a work-group must reach this barrier before any of them may continue past it. In other words, it is a synchronization point for all work-items: no matter which is fast and which is slow, each one stops here until all have arrived, and then they continue.

The parameter comes in two kinds:

CLK_LOCAL_MEM_FENCE and CLK_GLOBAL_MEM_FENCE

I did not fully understand this parameter. Roughly, it adds a memory fence that guarantees local memory or global memory is properly synchronized at this point. For the memory fence concept, see the OpenCL specification.

Asynchronous memory copy and prefetch functions

async_work_group_copy: performs an asynchronous memory copy between global and local memory, possibly using a DMA engine (a DMA transfer does not rely on traditional hardware interrupts, so it is fast). The function is asynchronous, so it returns an event_t event for synchronization.

Use the wait_group_events function to wait for that event to complete, for synchronization.

async_work_group_strided_copy: the documentation says it is used to gather data from src to dst, but what "gather" means there is not easy to understand. On careful analysis, the difference between this function and async_work_group_copy is the stride: it is also an asynchronous copy, but it can extract part of the fields from src into dst. For example, in graphics we often use one large array to hold colors, normals, texture coordinates, and so on, all joined together, such as {color1, color2, color3, tex0, tex1, color1, color2, color3, tex0, tex1, ...}. When we need to extract only the color information from it, this strided copy is what is needed.
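The gather pattern can be modeled in plain C as a synchronous loop. The function strided_gather_model and its parameter names are made up for this sketch; the real async_work_group_strided_copy is asynchronous, runs inside a kernel, and returns an event_t, none of which is modeled here.

```c
#include <stddef.h>

/* Hypothetical synchronous model of a strided gather:
   take one element from src every src_stride elements and pack
   the results contiguously into dst. */
void strided_gather_model(float *dst, const float *src,
                          size_t num_elements, size_t src_stride)
{
    for (size_t i = 0; i < num_elements; ++i)
        dst[i] = src[i * src_stride];
}
```

With records laid out as five floats each, as in the example above, a stride of 5 starting at src picks the first color component of every record, and starting at src + 1 picks the second. In this model, collecting all three color components would take one call per component.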


Http://www.cnblogs.com/jisi5789/archive/2013/05/22/3093354.html
