GPU Deep Dive (4):
Render to Vertex Buffer in OpenGL
Text: 2007/5/10, www.physdev.com.
GPU programming requires a solid theoretical foundation. If you have no background in this area, please learn the relevant basics first. We recommend reading the article "GPGPU: Basic Math Tutorial".
Overview:
PBO: pixel buffer object
FBO: framebuffer object
VBO: vertex buffer object
Two different implementation methods are described below. As graphics hardware continues to evolve, better solutions may appear, but these two methods are the ones in common use today.
Method 1: Read the texture in the vertex shader (vertex texture fetch).
See the following paper: http://developer.nvidia.com/object/using_vertex_textures.html
Advantages:
- No data has to be copied into a VBO.
- Texture data can be read flexibly.
- Low memory consumption.
Disadvantages:
- Supported only on NVIDIA hardware, Shader Model 3.0 and above.
- Only a few floating-point texture formats are supported (for example GL_RGBA_FLOAT32_ATI).
- Older video cards (the GeForce FX series and earlier) are not supported. The feature is available only from the GeForce 6 series onward.
- Somewhat slower. The GeForce 6600 GT reads a texture at 30 Mb/s. If the texture first has to be processed through an FBO, it is slower still. "Slow" here is relative to the second method described below; this method is still much faster than reading the texture back to CPU memory.
Method 2: Copy to a pixel buffer object (PBO)
PBO stands for pixel buffer object. A PBO can be bound directly as a VBO and rendered as a vertex array.
Advantages:
- Supports multiple PBO data formats, not just floating-point RGBA.
- Faster. Rendering to the FBO, copying the FBO to the PBO, and rendering the VBO runs at 58 Mb/s on a GeForce 6600 GT. Reportedly, the new G8-series cards can skip the copy step and render directly to a VBO, which is faster still.
Disadvantages:
- High memory consumption.
- Additional copy actions are required.
- When one pass renders to several FBO attachments, every attachment must use the same data format. For example, rendering position data to a floating-point RGB buffer and normal data to a byte RGB buffer in the same pass does not work.
Note: older GeForce FX cards do not support multiple render targets.
Implementation:
Method 1: Read the texture in the vertex shader.
Step 1: Create the FBO
Create an FBO, then attach to it the texture that will hold the vertex data.
Step 2: Create the VBO
Create a vertex buffer object (VBO) that holds texture coordinates referencing the FBO texture. Its purpose is to let the vertex shader locate and read the right data in the FBO texture. For example, this can be a position VBO with 2 float coordinates per vertex.
(How to create a VBO is shown further down in the text.)
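As a rough illustration of what such a texture-coordinate VBO might contain, the sketch below fills a CPU-side array with one (s, t) pair per texel of the FBO texture. The helper name fill_texcoord_indices, the use of normalized coordinates, and the choice of texel centres are assumptions of this sketch and are not taken from the original sample code. The resulting array would then be uploaded into the VBO.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the original sample): writes two floats
 * per vertex, the normalized (s, t) coordinates of the texel that holds
 * that vertex's data. Texel centres are used so that NEAREST filtering
 * selects exactly one texel. Returns the number of floats written. */
size_t fill_texcoord_indices(float *out, int tex_width, int tex_height)
{
    size_t n = 0;
    for (int y = 0; y < tex_height; ++y) {
        for (int x = 0; x < tex_width; ++x) {
            out[n++] = ((float)x + 0.5f) / (float)tex_width;   /* s */
            out[n++] = ((float)y + 0.5f) / (float)tex_height;  /* t */
        }
    }
    return n;
}
```

The caller must provide room for 2 * tex_width * tex_height floats.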
Step 3: Render to the FBO
(Details are given further below.)
Step 4: Render the VBO
Here the vertex program first takes the coordinates from the VBO, then uses them to look up the FBO texture and fetch the actual vertex data.
The following vertex program shows this:
Code:
!!ARBvp1.0
OPTION NV_vertex_program3;
# vertex.position is our index to the real vertex array
PARAM mvp[4] = { state.matrix.mvp };
TEMP real_position;
TEX real_position, vertex.position, texture[0], 2D;
DP4 result.position.x, mvp[0], real_position;
DP4 result.position.y, mvp[1], real_position;
DP4 result.position.z, mvp[2], real_position;
DP4 result.position.w, mvp[3], real_position;
END
The texture formats supported for vertex texturing are described below.
Quote (NVIDIA documentation):
Vertex textures are subject to a number of restrictions. They must use the GL_TEXTURE_2D target. Currently only the GL_LUMINANCE_FLOAT32_ATI and GL_RGBA_FLOAT32_ATI formats are supported. Both hold 32-bit floating-point data; the former has one channel and the latter has four. Note that if any other texture format or an unsupported filtering mode is used, the driver may fall back to software vertex processing. The following code is correct:
GLuint vertex_texture;
glGenTextures(1, &vertex_texture);
glBindTexture(GL_TEXTURE_2D, vertex_texture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST_MIPMAP_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE_FLOAT32_ATI, width, height, 0, GL_LUMINANCE, GL_FLOAT, data);
Method 2: Copy to a pixel buffer object (PBO).
Step 1: Create a VBO to serve as the pixel buffer:
Code:
GLuint vbo_vertices_handle;
glGenBuffersARB(1, &vbo_vertices_handle);
glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, vbo_vertices_handle);
// reserve four floats per vertex; no initial data is uploaded
glBufferDataARB(GL_PIXEL_PACK_BUFFER_EXT, vbo_vertices.size() * 4 * sizeof(float), NULL, GL_DYNAMIC_DRAW_ARB);
Step 2: Create the FBO.
Multiple render targets let us write vertex, normal, and binormal data at the same time. The following is an example of creating the FBO:
Code:
GLuint fb_handle;
glGenFramebuffersEXT(1, &fb_handle);
fbo_tex_vertices = NewFloatTex(tex_width, tex_height, 0);
fbo_tex_normals = NewFloatTex(tex_width, tex_height, 0);
The following code shows how to set up a floating-point texture.
Code:
/**
 * Sets up a floating point texture with NEAREST filtering.
 * (mipmaps etc. are unsupported for floating point textures)
 */
void setupTexture(const GLuint texID, int texSize_w, int texSize_h) {
    // make active and bind
    glBindTexture(textureParameters.texTarget, texID);
    // turn off filtering and wrap modes
    glTexParameteri(textureParameters.texTarget, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(textureParameters.texTarget, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(textureParameters.texTarget, GL_TEXTURE_WRAP_S, GL_CLAMP);
    glTexParameteri(textureParameters.texTarget, GL_TEXTURE_WRAP_T, GL_CLAMP);
    // define texture with floating point format
    glTexImage2D(textureParameters.texTarget, 0, textureParameters.texInternalFormat,
                 texSize_w, texSize_h, 0, textureParameters.texFormat, GL_FLOAT, 0);
    // check if that worked
    if (glGetError() != GL_NO_ERROR) {
        printf("glTexImage2D(): [FAIL]\n");
        // PAUSE();
        exit(ERROR_TEXTURE);
    } else if (mode == 0) {
        printf("glTexImage2D(): [PASS]\n");
    }
    // printf("Created a %i by %i floating point texture.\n", texSize, texSize);
}
Note: even if we create an RGB texture, its internal format may actually be RGBA. Reading the data with glReadPixels() may then be slowed down by format conversion. For this reason we use the RGBA format in most cases.
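If the source data only has three components per vertex, one way to follow this advice is to pad it to four components on the CPU before uploading it. The sketch below shows that padding step only. The helper name pad_rgb_to_rgba and the idea of filling the fourth component with a caller-supplied value are assumptions of this sketch, not part of the original sample.

```c
#include <stddef.h>

/* Hypothetical helper: expands n_texels tightly packed RGB float triples
 * into RGBA quadruples. The fourth component is set to w; 1.0f is a
 * common choice for positions, but that choice is an assumption here. */
void pad_rgb_to_rgba(const float *rgb, float *rgba, size_t n_texels, float w)
{
    for (size_t i = 0; i < n_texels; ++i) {
        rgba[4 * i + 0] = rgb[3 * i + 0];
        rgba[4 * i + 1] = rgb[3 * i + 1];
        rgba[4 * i + 2] = rgb[3 * i + 2];
        rgba[4 * i + 3] = w;
    }
}
```

The padded array costs one third more memory, which is part of the higher memory consumption listed among the disadvantages of this method.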
Step 3: Render to the FBO
The input textures contain the necessary data (such as vertex positions and normals). The results of the computation are written to the output texture.
The following code attaches the output texture to the FBO:
Code:
// glBindFramebufferEXT(GL_...
// attach texture to FBO
glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, attachmentpoints[0], textureParameters.texTarget, outTexID, 0);
// check if that worked
if (!checkFramebufferStatus()) {
    printf("glFramebufferTexture2DEXT(): [FAIL]\n");
    // PAUSE();
    exit(ERROR_FBOTEXTURE);
} else if (mode == 0) {
    printf("glFramebufferTexture2DEXT(): [PASS]\n");
}
The following code renders a quad that covers the whole output texture, which triggers the computation into the FBO.
Code:
// make quad filled to hit every pixel/texel
// (should be default but we never know)
glPolygonMode(GL_FRONT, GL_FILL);
if (textureParameters.texTarget == GL_TEXTURE_2D) {
    // render with normalized texcoords
    glBegin(GL_QUADS);
    glTexCoord2f(0.0, 0.0); glVertex2f(0.0, 0.0);
    glTexCoord2f(1.0, 0.0); glVertex2f(outTexSizeW, 0.0);
    glTexCoord2f(1.0, 1.0); glVertex2f(outTexSizeW, outTexSizeH);
    glTexCoord2f(0.0, 1.0); glVertex2f(0.0, outTexSizeH);
    glEnd();
} else {
    // render with unnormalized texcoords
    glBegin(GL_QUADS);
    glTexCoord2f(0.0, 0.0); glVertex2f(0.0, 0.0);
    glTexCoord2f(outTexSizeW, 0.0); glVertex2f(outTexSizeW, 0.0);
    glTexCoord2f(outTexSizeW, outTexSizeH); glVertex2f(outTexSizeW, outTexSizeH);
    glTexCoord2f(0.0, outTexSizeH); glVertex2f(0.0, outTexSizeH);
    glEnd();
}
gluOrtho2D is required! Without it, glReadPixels will run into an error at run time.
Code:
/**
 * Creates framebuffer object, binds it to reroute rendering operations
 * from the traditional framebuffer to the offscreen buffer
 */
void initFBO(void) {
    // create FBO (off-screen framebuffer)
    glGetIntegerv(GL_DRAW_BUFFER, &_currentDrawbuf); // save the current draw buffer
    glGenFramebuffersEXT(1, &fb);
    // bind offscreen framebuffer (that is, skip the window-specific render target)
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fb);
    // viewport for 1:1 pixel=texture mapping
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluOrtho2D(0.0, outTexSizeW, 0.0, outTexSizeH);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glViewport(0, 0, outTexSizeW, outTexSizeH);
}
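To see why this projection and viewport pair gives a 1:1 mapping between quad coordinates and texels, the two transforms can be applied by hand along one axis. The sketch below does that arithmetic on the CPU. The function name ortho_to_window is made up for this illustration. It assumes the standard orthographic mapping to [-1, 1] followed by the standard viewport transform.

```c
/* Hypothetical helper: maps a coordinate v along one axis through
 * gluOrtho2D(lo, hi, ...) and then through a viewport that starts at
 * `origin` and is `size` pixels wide (or high). */
float ortho_to_window(float v, float lo, float hi, float origin, float size)
{
    float ndc = 2.0f * (v - lo) / (hi - lo) - 1.0f;  /* projection to [-1, 1] */
    return (ndc + 1.0f) * 0.5f * size + origin;       /* viewport transform */
}
```

With lo = 0, hi = outTexSizeW, origin = 0 and size = outTexSizeW the function returns v unchanged, so a vertex at x = 128 lands on pixel column 128. With a projection that does not match the texture size, the quad no longer covers the output texture exactly.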
Step 4: Copy the FBO data to the PBO
Example:
Code:
/**
 * Copy from FBO to PBO
 */
void copyFromTextureToPBO(GLuint pboID, int texSize_w, int texSize_h) {
    glReadBuffer(attachmentpoints[0]);
    glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, pboID);
    glReadPixels(0, 0, texSize_w, texSize_h, textureParameters.texFormat, GL_FLOAT, 0);
    glReadBuffer(GL_NONE);
    glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, 0);
}
(vbo_vertices.size() = tex_width * tex_height)
Step 5: Render the VBO:
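The size relation above also fixes how large the buffer has to be. A small sketch of that calculation follows. The helper name pbo_size_bytes is hypothetical, and it assumes one float per channel, as in the RGBA float layout used throughout this article.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes a buffer needs in order to receive
 * a tex_width x tex_height readback with `channels` floats per texel. */
size_t pbo_size_bytes(int tex_width, int tex_height, int channels)
{
    return (size_t)tex_width * (size_t)tex_height * (size_t)channels * sizeof(float);
}
```

For a four-channel texture this is the same value as vbo_vertices.size() * 4 * sizeof(float), the size passed to glBufferDataARB when the buffer is created.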
Code:
glBindBufferARB(GL_ARRAY_BUFFER_ARB, vbo_vertices_handle);
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(4, GL_FLOAT, 4 * sizeof(float), (char *) 0);
glBindBufferARB(GL_ARRAY_BUFFER_ARB, vbo_normals_handle);
glEnableClientState(GL_NORMAL_ARRAY);
glNormalPointer(GL_FLOAT, 4 * sizeof(float), (char *) 0);
glDrawArrays(GL_TRIANGLES, 0, vbo_vertices.size());
glDisableClientState(GL_NORMAL_ARRAY);
glDisableClientState(GL_VERTEX_ARRAY);
Example
The example for this article computes a B-spline on the GPU. The techniques used include VBO, FBO, render to vertex buffer, Cg, and B-splines.
The implementation has three stages:
Stage 1: Generate the vertex data in the FBO with a fragment shader
Send the spline control points to the GPU as an input texture (the control point texture).
In the fragment processor, read the control point texture, compute the interpolated vertices with the B-spline interpolation function, and store the result in the output texture (the interpolation texture) attached to the FBO.
Stage 2: Copy the FBO to the PBO
Use glReadPixels() to copy the interpolation texture into the PBO.
Stage 3: Render the VBO
Use glDrawArrays() to render the spline. The PBO filled in the previous stage must of course be bound as the VBO.
The interpolation and the data copy both run entirely on the GPU. The final vertex data is rendered directly as a vertex array and never travels back to the CPU, so the whole process is very fast.
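The article does not list the interpolation function itself. As a CPU-side reference for checking GPU output, the sketch below evaluates a uniform cubic B-spline, which is one common choice; whether the sample uses exactly this basis is an assumption. The names Vec4, bspline_eval and bspline_fill are hypothetical. Each output element corresponds to one texel of the interpolation texture.

```c
typedef struct { float x, y, z, w; } Vec4;

/* Evaluates one segment of a uniform cubic B-spline at t in [0, 1]
 * from four consecutive control points p[0..3]. */
Vec4 bspline_eval(const Vec4 p[4], float t)
{
    float t2 = t * t, t3 = t2 * t;
    /* uniform cubic B-spline basis functions; they sum to 1 */
    float b0 = (1.0f - 3.0f * t + 3.0f * t2 - t3) / 6.0f;
    float b1 = (4.0f - 6.0f * t2 + 3.0f * t3) / 6.0f;
    float b2 = (1.0f + 3.0f * t + 3.0f * t2 - 3.0f * t3) / 6.0f;
    float b3 = t3 / 6.0f;
    Vec4 r;
    r.x = b0 * p[0].x + b1 * p[1].x + b2 * p[2].x + b3 * p[3].x;
    r.y = b0 * p[0].y + b1 * p[1].y + b2 * p[2].y + b3 * p[3].y;
    r.z = b0 * p[0].z + b1 * p[1].z + b2 * p[2].z + b3 * p[3].z;
    r.w = 1.0f;  /* positions are written with w = 1 */
    return r;
}

/* Fills out[0..count-1] with points spread evenly over all segments of
 * the spline defined by ctrl[0..n_ctrl-1]. Returns the number of points
 * written, or 0 if there are fewer than four control points. */
int bspline_fill(const Vec4 *ctrl, int n_ctrl, Vec4 *out, int count)
{
    int segs = n_ctrl - 3;
    if (segs < 1 || count < 1) return 0;
    for (int i = 0; i < count; ++i) {
        float u = (float)i / (float)count * (float)segs;
        int seg = (int)u;
        if (seg > segs - 1) seg = segs - 1;  /* guard against rounding */
        out[i] = bspline_eval(ctrl + seg, u - (float)seg);
    }
    return count;
}
```

A fragment program would perform the same per-point evaluation, with the control points read from the control point texture instead of an array.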
Conclusion
Statement:
This translation may be freely reproduced, provided that the original author information is retained and the article is credited to the Physics Development Network: www.physdev.com
The code in this article was tested on an NVIDIA GeForce 6600 video card. If you run into problems on your own video card, you can ask in the GPGPU/CUDA forum of the Physics Development Network (www.physdev.com).
The next step is to write a GPU particle system example, a GPU cloth system example, and a GPU hair system example.
References
http://oss.sgi.com/projects/ogl-sample/registry/EXT/pixel_buffer_object.txt
http://developer.nvidia.com/object/using_vertex_textures.html
http://wiki.delphigl.com/index.php/GLSL_Partikel
http://www.mathematik.uni-dortmund.de/~goeddeke/gpgpu/tutorial.html#arrays3
http://download.developer.nvidia.com/developer/SDK/Individual_Samples/samples.html
Download sample code: http://www.physdev.com/phpbb/viewtopic.php?t=144