GPU Programming: OpenGL Core Techniques

About the author: Jiang Xiewei is a technology partner at an IT company, senior IT lecturer, CSDN community expert, guest editor, best-selling author, and national patent inventor. Published books: "Teach You Architecture 3D Game Engine" and "Unity3D Actual Combat Core Technical Details", both from Electronic Industry Press.

CSDN Video URL: http://edu.csdn.net/lecturer/144

Rendering is the core of a 3D game engine, and improving a game's visual quality means implementing rendering techniques through shader programming. Rendering is usually done through Direct3D or OpenGL. Popular engines such as Unity3D, Cocos2d-x, and UE4 render with OpenGL on mobile platforms, so mastering OpenGL rendering is important: it helps us understand how an engine works internally.

Shader code is mainly divided into a vertex shader and a fragment shader, and the values computed by the vertex shader are passed on to the fragment shader. The rest of this article describes the core of shader programming in detail.

Whenever we want to send data from the vertex shader to the fragment shader, we declare an output variable and an input variable that match each other. Declaring these one variable at a time is the easiest way to send data from one shader to another, but as an application grows, you will probably want to send more than a few variables, possibly including arrays and structs.

To help us organize these variables, GLSL offers interface blocks, which let us group them together. Declaring an interface block looks a lot like declaring a struct, except that it is declared with the in or out keyword, depending on whether the block is an input block or an output block.

#version 330 core
layout (location = 0) in vec3 position;
layout (location = 1) in vec2 texCoords;

uniform mat4 model;
uniform mat4 view;
uniform mat4 projection;

out VS_OUT
{
    vec2 TexCoords;
} vs_out;

void main()
{
    gl_Position = projection * view * model * vec4(position, 1.0f);
    vs_out.TexCoords = texCoords;
}

This time we declared an interface block with the block name VS_OUT and the instance name vs_out, which groups together all the output variables we want to send to the next shader stage. This is a trivial example, but you can imagine how it helps organize a shader's inputs and outputs.

We then also need to declare an input interface block in the next shader, which is the fragment shader. The block name (VS_OUT) must be the same as in the vertex shader, but the instance name (vs_out in the vertex shader) can be anything we like.

#version 330 core
out vec4 color;

in VS_OUT
{
    vec2 TexCoords;
} fs_in;

// Named tex rather than texture so that it does not hide the built-in texture() function
uniform sampler2D tex;

void main()
{
    color = texture(tex, fs_in.TexCoords);
}

As long as both interface block names are equal, their corresponding inputs and outputs are matched together. This is another useful feature that helps organize your code, and it proves especially useful when crossing between certain shader stages, such as the geometry shader.

After using OpenGL for a while, we have learned some pretty cool tricks, but also run into a few annoyances. For example, when using more than one shader, we continually have to set uniform variables, even though most of them are exactly the same for each shader. So why bother setting them over and over again?

OpenGL gives us a tool called uniform buffer objects that lets us declare a set of global uniform variables that remain the same across several shader programs. When using uniform buffer objects, we only have to set the relevant uniforms once. We still have to manually set the uniforms that are unique to each shader. Creating and configuring a uniform buffer object does take a bit of work.

Because a uniform buffer object is a buffer like any other, we can create one with glGenBuffers, bind it to the GL_UNIFORM_BUFFER buffer target, and store all the relevant uniform data in the buffer. There are certain rules for how the data of a uniform buffer object must be stored, which we will get to later. First, we take a simple vertex shader and store its projection and view matrices in a so-called uniform block:

#version 330 core
layout (location = 0) in vec3 position;

layout (std140) uniform Matrices
{
    mat4 projection;
    mat4 view;
};

uniform mat4 model;

void main()
{
    gl_Position = projection * view * model * vec4(position, 1.0);
}

In most of our earlier samples, we set the projection and view uniform matrices on every render iteration for each shader we used. This is a perfect example of where uniform buffer objects become useful, since these matrices now only have to be stored once.

Here we declared a uniform block called Matrices that stores two 4x4 matrices. Variables in a uniform block can be accessed directly, without the block name as a prefix. We then store these matrix values in a buffer somewhere in the OpenGL code, and each shader that declares this uniform block has access to the matrices.

You are probably wondering what the layout (std140) statement means. It says that the currently defined uniform block uses a specific memory layout for its content; this statement sets the uniform block layout.

The content of a uniform block is stored in a buffer object, which is basically nothing more than a reserved piece of memory. Because this piece of memory holds no information about what kind of data it contains, we need to tell OpenGL which parts of the memory correspond to which uniform variables in the shader.

Imagine that the following uniform block is in a shader:

layout (std140) uniform ExampleBlock
{
    float value;
    vec3 vector;
    mat4 matrix;
    float values[3];
    bool boolean;
    int integer;
};

What we want to know is the size (in bytes) and the offset (from the start of the block) of each of these variables, so that we can place them in the buffer in their respective order. The size of each element is clearly stated in OpenGL and corresponds directly to C++ data types, with vectors and matrices being (large) arrays of floats. What OpenGL does not clearly state is the spacing between the variables. This allows the hardware to position variables as it sees fit. Some hardware is able to place a vec3 adjacent to a float, for example. Not all hardware can handle this, and such hardware pads the vec3 to an array of 4 floats before appending the float. This is a great feature, but an inconvenient one for us.

The uniform memory layout that GLSL uses by default is called the shared layout. It is called shared because once the offsets are defined by the hardware, they are consistently shared between multiple programs. With a shared layout, GLSL is allowed to reposition the uniform variables for optimization as long as the variables' order remains intact. Because we don't know at what offset each uniform variable will be, we don't know how to precisely fill our uniform buffer. We can query this information with functions like glGetUniformIndices, but that is beyond the scope of this tutorial.

While a shared layout gives us some space-saving optimizations, we would need to query the offset of every uniform variable, which translates to a lot of work. The general practice is therefore not to use the shared layout but the std140 layout. The std140 layout explicitly states the memory layout for each variable type by defining their respective offsets through a set of rules. Because this is stated explicitly, we can work out the offset of each variable by hand.

Each variable has a base alignment, which is equal to the space the variable takes (including padding) within a uniform block; this base alignment is calculated using the std140 layout rules. Then, for each variable, we calculate its aligned offset, which is the byte offset of the variable from the start of the block. The aligned byte offset of a variable must be a multiple of its base alignment.
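
The rounding step behind an aligned offset can be sketched in a few lines of C++. This is a minimal illustration and not part of OpenGL: the helper name alignUp is hypothetical, and the sample values assume a 4-byte float followed by a member whose base alignment is 16 bytes.

```cpp
#include <cstddef>

// Round `offset` up to the next multiple of `alignment`.
// Hypothetical helper for illustration; not an OpenGL function.
constexpr std::size_t alignUp(std::size_t offset, std::size_t alignment)
{
    return (offset + alignment - 1) / alignment * alignment;
}

// A 4-byte float at offset 0 ends at byte 4. A following member with a
// base alignment of 16 cannot start there, so it is pushed to byte 16.
static_assert(alignUp(4, 16) == 16, "member with base alignment 16 starts at 16");

// An offset that is already a multiple of the alignment is left unchanged.
static_assert(alignUp(96, 16) == 96, "aligned offsets stay where they are");
```

The gap between the end of one member and the aligned offset of the next is the padding that has to be skipped when filling the buffer.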

The exact layout rules can be found in OpenGL's uniform buffer specification, but we will list the most common rules below. Each basic variable type in GLSL, such as int, float, and bool, is defined to be a 4-byte quantity, and each unit of 4 bytes is written as N.

Type                          Layout rule
Scalar (e.g. int or bool)     Each scalar has a base alignment of N.
Vector                        Either 2N or 4N. This means that a vec3 has a base alignment of 4N.
Array of scalars or vectors   Each element has a base alignment equal to that of a vec4.
Matrix                        Stored as an array of column vectors, where each vector has a base alignment equal to that of a vec4.
Struct                        Each member is laid out according to the rules above, and the struct is padded to a multiple of the size of a vec4.

As with most of OpenGL's specifications, this is easier to understand with an example. Taking the uniform block ExampleBlock introduced earlier, we calculate the aligned offset of each of its members using the std140 layout:

layout (std140) uniform ExampleBlock
{
                     // base alignment  // aligned offset
    float value;     // 4               // 0
    vec3 vector;     // 16              // 16  (must be a multiple of 16, so 4->16)
    mat4 matrix;     // 16              // 32  (column 0)
                     // 16              // 48  (column 1)
                     // 16              // 64  (column 2)
                     // 16              // 80  (column 3)
    float values[3]; // 16              // 96  (values[0]; a scalar in an array is aligned like a vec4)
                     // 16              // 112 (values[1])
                     // 16              // 128 (values[2])
    bool boolean;    // 4               // 144
    int integer;     // 4               // 148
};

As an exercise, try to calculate the offset values yourself and compare them to this listing. With the calculated offsets, based on the rules of the std140 layout, we can fill the buffer with the variable data at each offset using functions like glBufferSubData. While not the most efficient, the std140 layout does guarantee that the memory layout remains the same in every program that declares this uniform block.
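
One way to check hand-computed offsets is to mirror the block in a C++ struct and let the compiler verify them. The following is a sketch under stated assumptions: the names ExampleBlockCPU and PaddedFloat are hypothetical, alignas is used to reproduce the std140 padding, and float and std::int32_t are assumed to be 4 bytes, as on common desktop platforms.

```cpp
#include <cstddef>
#include <cstdint>

// One element of a std140 float array: 4 bytes of data padded to 16 bytes,
// because every array element takes the base alignment of a vec4.
struct alignas(16) PaddedFloat
{
    float value;
};

// Hypothetical CPU-side mirror of ExampleBlock (not part of any OpenGL API).
struct ExampleBlockCPU
{
    float value;                   // offset 0
    alignas(16) float vector[3];   // offset 16 (a vec3 has base alignment 16)
    alignas(16) float matrix[16];  // offset 32 (four columns of 16 bytes each)
    PaddedFloat values[3];         // offsets 96, 112, 128
    std::int32_t boolean;          // offset 144 (a GLSL bool occupies 4 bytes)
    std::int32_t integer;          // offset 148
};

// Compile-time checks against the offsets computed by hand.
static_assert(sizeof(PaddedFloat) == 16, "array stride");
static_assert(offsetof(ExampleBlockCPU, value) == 0, "value");
static_assert(offsetof(ExampleBlockCPU, vector) == 16, "vector");
static_assert(offsetof(ExampleBlockCPU, matrix) == 32, "matrix");
static_assert(offsetof(ExampleBlockCPU, values) == 96, "values");
static_assert(offsetof(ExampleBlockCPU, boolean) == 144, "boolean");
static_assert(offsetof(ExampleBlockCPU, integer) == 148, "integer");

// The block's data ends at byte 152; sizeof is rounded up to 160 because
// the struct itself is 16-byte aligned.
static_assert(sizeof(ExampleBlockCPU) == 160, "struct size includes tail padding");
```

If any offset were wrong, the file would fail to compile, which makes this a cheap guard against layout mistakes when a block definition changes.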

By adding the statement layout (std140) before the definition of the uniform block, we tell OpenGL that this uniform block uses the std140 layout. There are two other layouts to choose from, both of which require us to query each offset before filling the buffer. We have already seen the shared layout; the remaining one is the packed layout. With the packed layout, there is no guarantee that the layout remains the same between programs (it is not shared), because it allows the compiler to optimize uniform variables away from the uniform block, which may differ per shader.

We have discussed how to define uniform blocks in shaders and how to specify their memory layout, but we have not yet discussed how to actually use them.

First, we need to create a uniform buffer object, which is done with glGenBuffers. Once we have a buffer object, we bind it to the GL_UNIFORM_BUFFER target and allocate enough memory by calling glBufferData.

GLuint uboExampleBlock;
glGenBuffers(1, &uboExampleBlock);
glBindBuffer(GL_UNIFORM_BUFFER, uboExampleBlock);
glBufferData(GL_UNIFORM_BUFFER, 152, NULL, GL_STATIC_DRAW); // allocate 152 bytes (the last member, integer, ends at 148 + 4)
glBindBuffer(GL_UNIFORM_BUFFER, 0);

Now, whenever we want to update or insert data into the buffer, we bind to uboExampleBlock and use glBufferSubData to update its memory. We only have to update this uniform buffer once, and all shaders that use this buffer now use its updated data. But how does OpenGL know which uniform buffers correspond to which uniform blocks?

In the OpenGL context, a number of binding points are defined to which we can link a uniform buffer. Once we have created a uniform buffer, we link it to one of those binding points, and we also link the uniform block in the shader to the same binding point, effectively linking the two together. The following diagram illustrates this:

As you can see, we can bind multiple uniform buffers to different binding points. Because shader A and shader B both have a uniform block linked to the same binding point 0, their uniform blocks share the same uniform data found in uboMatrices. The requirement is that both shaders define the same Matrices uniform block.
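
The relationship between programs, uniform blocks, binding points, and buffers can be modelled with two lookup tables. The following is only a toy model to make the indirection concrete. The type BindingModel and all of its members are hypothetical; they do not correspond to real OpenGL functions or driver internals.

```cpp
#include <map>
#include <string>
#include <utility>

// Toy model of uniform buffer binding points (illustration only).
struct BindingModel
{
    // Context state: binding point -> uniform buffer object id.
    std::map<unsigned, unsigned> bufferAtPoint;
    // Per-program state: (program id, block name) -> binding point.
    std::map<std::pair<unsigned, std::string>, unsigned> pointOfBlock;

    // Link a buffer to a binding point.
    void bindBuffer(unsigned point, unsigned buffer)
    {
        bufferAtPoint[point] = buffer;
    }

    // Link a program's uniform block to a binding point.
    void bindBlock(unsigned program, const std::string& block, unsigned point)
    {
        pointOfBlock[std::make_pair(program, block)] = point;
    }

    // Return the buffer a block reads from, or 0 if nothing is linked.
    unsigned resolve(unsigned program, const std::string& block) const
    {
        auto blockIt = pointOfBlock.find(std::make_pair(program, block));
        if (blockIt == pointOfBlock.end()) return 0;
        auto bufferIt = bufferAtPoint.find(blockIt->second);
        return bufferIt == bufferAtPoint.end() ? 0 : bufferIt->second;
    }
};
```

In this model, two programs that both link their Matrices block to binding point 0 resolve to whatever buffer was linked to binding point 0, which is why they see the same data.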

To set a uniform block to a specific binding point, we call glUniformBlockBinding. It takes a program object as its first argument, followed by a uniform block index and the binding point to link to. The uniform block index is the location index of the defined uniform block in the shader. It can be retrieved by calling glGetUniformBlockIndex, which accepts a program object and the name of the uniform block. We can set the Lights uniform block from the diagram to binding point 2 as follows:

GLuint lights_index = glGetUniformBlockIndex(shaderA.Program, "Lights");
glUniformBlockBinding(shaderA.Program, lights_index, 2);

Note that we have to repeat this process for each shader.

From OpenGL version 4.2 onwards, it is also possible to store the binding point of a uniform block explicitly in the shader by adding another layout specifier, which saves us the calls to glGetUniformBlockIndex and glUniformBlockBinding. The following code sets the binding point of the Lights uniform block explicitly:

layout (std140, binding = 2) uniform Lights { ... };

Then we also need to bind the uniform buffer object to the same binding point, which can be done with either glBindBufferBase or glBindBufferRange.

glBindBufferBase(GL_UNIFORM_BUFFER, 2, uboExampleBlock);
// or
glBindBufferRange(GL_UNIFORM_BUFFER, 2, uboExampleBlock, 0, 152);

The function glBindBufferBase expects a target, a binding point index, and a uniform buffer object as its arguments. This call links uboExampleBlock to binding point 2, and from this point on both sides of the binding point are linked. You can also use the glBindBufferRange function, which expects an extra offset and size parameter, so that you can bind only a specific range of the uniform buffer to a binding point. Using glBindBufferRange, you could have multiple different uniform blocks linked to a single uniform buffer object.
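
When several uniform blocks share one buffer through glBindBufferRange, each range has to start at an offset that OpenGL accepts: for GL_UNIFORM_BUFFER, the offset must be a multiple of the implementation's GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, which can be queried with glGetIntegerv. The sketch below only does the offset arithmetic on the CPU. The function name packRanges is hypothetical, and the alignment of 256 used in the usage note is an assumed example value, not something this article specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Given the byte size of each uniform block, return the start offset of each
// block inside one shared buffer. Every offset is rounded up to a multiple of
// `offsetAlignment`. Hypothetical helper; it makes no OpenGL calls.
std::vector<std::size_t> packRanges(const std::vector<std::size_t>& blockSizes,
                                    std::size_t offsetAlignment)
{
    assert(offsetAlignment > 0);
    std::vector<std::size_t> offsets;
    std::size_t cursor = 0;
    for (std::size_t size : blockSizes)
    {
        // Round the cursor up to the next allowed range start.
        cursor = (cursor + offsetAlignment - 1) / offsetAlignment * offsetAlignment;
        offsets.push_back(cursor);
        cursor += size;
    }
    return offsets;
}
```

With an assumed alignment of 256, packRanges({152, 128}, 256) places the 152-byte ExampleBlock at offset 0 and a 128-byte block of two matrices at offset 256. Each offset and size pair would then be passed to glBindBufferRange for a different binding point.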

Now that everything is set up, we can start adding data to the uniform buffer. We could add all the data as a single byte array, or update parts of the buffer whenever we feel like it, using glBufferSubData. To update the uniform variable boolean, we could update the uniform buffer object as follows:

glBindBuffer(GL_UNIFORM_BUFFER, uboExampleBlock);
GLint b = true; // bools in GLSL are represented as 4 bytes, so we store the value in a 4-byte integer
glBufferSubData(GL_UNIFORM_BUFFER, 144, 4, &b); // 144 is the aligned offset of boolean
glBindBuffer(GL_UNIFORM_BUFFER, 0);

The same procedure applies to all the other uniform variables inside the uniform block, each with its own offset and size.
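
Instead of one glBufferSubData call per member, the whole block can also be assembled in a CPU-side byte array and uploaded in a single call. The sketch below builds such an array for ExampleBlock using the std140 offsets from the listing above. The helper names writeAt and buildExampleBlock are hypothetical, the sample values are arbitrary, and the upload itself is only described in the usage note because it needs a live OpenGL context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy `size` bytes from `data` into `buffer` at byte `offset`.
void writeAt(std::vector<unsigned char>& buffer, std::size_t offset,
             const void* data, std::size_t size)
{
    assert(offset + size <= buffer.size());
    std::memcpy(buffer.data() + offset, data, size);
}

// Assemble a 152-byte image of ExampleBlock at its std140 offsets.
std::vector<unsigned char> buildExampleBlock()
{
    std::vector<unsigned char> buffer(152, 0);   // padding bytes stay zero

    const float value = 1.5f;
    const float vec3Value[3] = {1.0f, 2.0f, 3.0f};
    float matrix[16] = {};                       // column-major 4x4
    for (int i = 0; i < 4; ++i) matrix[i * 5] = 1.0f; // identity diagonal
    const float values[3] = {10.0f, 20.0f, 30.0f};
    const std::int32_t boolean = 1;              // GLSL bool stored as 4 bytes
    const std::int32_t integer = 42;

    writeAt(buffer, 0, &value, sizeof(value));
    writeAt(buffer, 16, vec3Value, sizeof(vec3Value));
    writeAt(buffer, 32, matrix, sizeof(matrix));
    for (std::size_t i = 0; i < 3; ++i)          // array stride is 16 bytes
        writeAt(buffer, 96 + 16 * i, &values[i], sizeof(float));
    writeAt(buffer, 144, &boolean, sizeof(boolean));
    writeAt(buffer, 148, &integer, sizeof(integer));
    return buffer;
}
```

With the uniform buffer bound to GL_UNIFORM_BUFFER, the result could be uploaded with glBufferSubData(GL_UNIFORM_BUFFER, 0, buffer.size(), buffer.data()).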

Here is a simple example of using uniform buffer objects.

If we look back at all the previous code samples, we have continually been using 3 matrices: the projection, view, and model matrix. Of all those matrices, only the model matrix changes frequently. If we have multiple shaders that use this same set of matrices, we would probably be better off using uniform buffer objects.

We are going to store the projection and view matrix in a uniform block called Matrices. We are not going to store the model matrix in there, since the model matrix tends to change frequently between shaders, so we would not really benefit from uniform buffer objects.

#version 330 core
layout (location = 0) in vec3 position;

layout (std140) uniform Matrices
{
    mat4 projection;
    mat4 view;
};
uniform mat4 model;

void main()
{
    gl_Position = projection * view * model * vec4(position, 1.0);
}

Not much is going on here, except that we now use a uniform block with a std140 layout. In our sample application, we are going to display 4 cubes, each drawn with a different shader program. The 4 shader programs use the same vertex shader but a different fragment shader, each of which outputs a single, distinct color.

First, we set the uniform block of the vertex shaders to binding point 0. Note that we have to do this for each shader.

GLuint uniformBlockIndexRed = glGetUniformBlockIndex(shaderRed.Program, "Matrices");
GLuint uniformBlockIndexGreen = glGetUniformBlockIndex(shaderGreen.Program, "Matrices");
GLuint uniformBlockIndexBlue = glGetUniformBlockIndex(shaderBlue.Program, "Matrices");
GLuint uniformBlockIndexYellow = glGetUniformBlockIndex(shaderYellow.Program, "Matrices");

glUniformBlockBinding(shaderRed.Program, uniformBlockIndexRed, 0);
glUniformBlockBinding(shaderGreen.Program, uniformBlockIndexGreen, 0);
glUniformBlockBinding(shaderBlue.Program, uniformBlockIndexBlue, 0);
glUniformBlockBinding(shaderYellow.Program, uniformBlockIndexYellow, 0);

Next, we create the actual uniform buffer object and bind the buffer to binding point 0:

GLuint uboMatrices;
glGenBuffers(1, &uboMatrices);

glBindBuffer(GL_UNIFORM_BUFFER, uboMatrices);
glBufferData(GL_UNIFORM_BUFFER, 2 * sizeof(glm::mat4), NULL, GL_STATIC_DRAW);
glBindBuffer(GL_UNIFORM_BUFFER, 0);

glBindBufferRange(GL_UNIFORM_BUFFER, 0, uboMatrices, 0, 2 * sizeof(glm::mat4));

First we allocate enough memory for our buffer, which is equal to 2 times the size of glm::mat4. The size of GLM's matrix types corresponds directly to mat4 in GLSL. Then we link a specific range of the buffer, in this case the entire buffer, to binding point 0.

Now all that's left to do is fill the buffer. If we keep the field of view value of the projection matrix constant (so no more camera zoom), we only have to define it once in our application, which means we also only have to insert it into the buffer once. Because we already allocated enough memory in the buffer object, we can use glBufferSubData to store the projection matrix before we enter the game loop:

// GLM 0.9.6 and later expect the field of view in radians
glm::mat4 projection = glm::perspective(glm::radians(45.0f), (float)width / (float)height, 0.1f, 100.0f);
glBindBuffer(GL_UNIFORM_BUFFER, uboMatrices);
glBufferSubData(GL_UNIFORM_BUFFER, 0, sizeof(glm::mat4), glm::value_ptr(projection));
glBindBuffer(GL_UNIFORM_BUFFER, 0);

Here we store the projection matrix in the first half of the uniform buffer. Before we draw the objects in each render iteration, we then update the second half of the buffer with the view matrix:

glm::mat4 view = camera.GetViewMatrix();
glBindBuffer(GL_UNIFORM_BUFFER, uboMatrices);
glBufferSubData(GL_UNIFORM_BUFFER, sizeof(glm::mat4), sizeof(glm::mat4), glm::value_ptr(view));
glBindBuffer(GL_UNIFORM_BUFFER, 0);

And that's it for uniform buffer objects. Each vertex shader that contains the Matrices uniform block now reads the data stored in uboMatrices. So if we now draw 4 cubes using 4 different shaders, their projection and view matrices are the same.
