Article Directory
- Comparison of different implementations of render to texture in OpenGL
OpenGL Pixel Buffer Object (PBO)
Related Topics: Vertex Buffer Object (VBO), Frame Buffer Object (FBO)
Download: pbounpack.zip, pbopack.zip
- Overview
- Creating PBO
- Mapping PBO
- Example: streaming texture uploads with PBO
- Example: asynchronous read-back with PBO
Overview
OpenGL PBO
The OpenGL ARB_pixel_buffer_object extension is very close to ARB_vertex_buffer_object. It simply extends ARB_vertex_buffer_object so that buffer objects can store pixel data as well as vertex data. A buffer object that stores pixel data is called a Pixel Buffer Object (PBO). The ARB_pixel_buffer_object extension borrows the entire VBO framework and API, and adds 2 additional "target" tokens. These tokens help the PBO memory manager (the OpenGL driver) determine the best location for the buffer object: system memory, AGP (shared) memory or video memory. The target tokens also state clearly which of 2 operations the bound PBO will be used for: GL_PIXEL_PACK_BUFFER_ARB to transfer pixel data to a PBO, or GL_PIXEL_UNPACK_BUFFER_ARB to transfer pixel data from a PBO.
For example, glReadPixels() and glGetTexImage() are "pack" pixel operations, and glDrawPixels(), glTexImage2D() and glTexSubImage2D() are "unpack" operations. When a PBO is bound with the GL_PIXEL_PACK_BUFFER_ARB token, glReadPixels() reads pixel data from the OpenGL framebuffer and writes (packs) the data into the PBO. When a PBO is bound with the GL_PIXEL_UNPACK_BUFFER_ARB token, glDrawPixels() reads (unpacks) pixel data from the PBO and copies it to the OpenGL framebuffer.
The main advantage of PBO is fast pixel data transfer to and from the graphics card through DMA (Direct Memory Access) without involving CPU cycles. The other advantage of PBO is that the DMA transfer can be asynchronous. Let's compare a conventional texture transfer method with one that uses a Pixel Buffer Object. The left side of the following diagram is the conventional way to load texture data from an image source (image file or video stream). The source is first loaded into system memory, and then copied from system memory to an OpenGL texture object with glTexImage2D(). Both transfer processes (load and copy) are performed by the CPU.
Texture loading without PBO
Texture loading with PBO
In contrast, in the right-side diagram, the image source can be loaded directly into a PBO, which is controlled by OpenGL. The CPU is still involved in loading the source into the PBO, but not in transferring the pixel data from the PBO to the texture object. Instead, the GPU (OpenGL driver) manages copying data from the PBO to the texture object. This means OpenGL performs a DMA transfer without wasting CPU cycles. Further, OpenGL can schedule an asynchronous DMA transfer for later execution. Therefore, glTexImage2D() can return immediately, and the CPU can do something else without waiting for the pixel transfer to finish.
There are 2 major PBO approaches to improving the performance of pixel data transfer: streaming texture update and asynchronous read-back from the framebuffer.
Creating PBO
As mentioned earlier, Pixel Buffer Object borrows all APIs from Vertex Buffer Object. The only difference is that there are 2 additional tokens for PBOs: GL_PIXEL_PACK_BUFFER_ARB and GL_PIXEL_UNPACK_BUFFER_ARB. GL_PIXEL_PACK_BUFFER_ARB is for transferring pixel data from OpenGL to your application, and GL_PIXEL_UNPACK_BUFFER_ARB is for transferring pixel data from an application to OpenGL. OpenGL refers to these tokens to determine the best memory space for a PBO, for example, video memory for uploading (unpacking) textures, or system memory for reading (packing) the framebuffer. However, these target tokens are solely hints. The OpenGL driver decides the appropriate location for you.
Creating a PBO requires 3 steps:
- Generate a new buffer object with glGenBuffersARB().
- Bind the buffer object with glBindBufferARB().
- Copy pixel data to the buffer object with glBufferDataARB().
If you pass a NULL pointer as the source array to glBufferDataARB(), the PBO only allocates memory space of the given data size. The last parameter of glBufferDataARB() is another performance hint that tells the PBO how the buffer object will be used. GL_STREAM_DRAW_ARB is for streaming texture upload and GL_STREAM_READ_ARB is for asynchronous framebuffer read-back.
Please check VBO for more details.
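The "given data size" passed to glBufferDataARB() is a plain byte count. As a minimal sketch, assuming tightly packed pixels of 4 bytes each (which matches the GL_BGRA / GL_UNSIGNED_BYTE combination used in the examples later in this article), it could be computed as follows. The helper name pbo_data_size is hypothetical and not part of OpenGL.

```c
#include <stddef.h>

/* Hypothetical helper (not an OpenGL function): number of bytes needed
   for a width x height image with 4 bytes per pixel, e.g. GL_BGRA with
   GL_UNSIGNED_BYTE. With 4-byte pixels every row is already 4-byte
   aligned, so no row padding is assumed. */
size_t pbo_data_size(size_t width, size_t height)
{
    return width * height * 4;
}
```

The result would be passed as the size argument of glBufferDataARB(), with a NULL source pointer if the buffer is only being allocated.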
Mapping PBO
PBO provides a memory mapping mechanism to map the OpenGL-controlled buffer object into the client's address space. The client can then modify a portion of the buffer object or the entire buffer by using glMapBufferARB() and glUnmapBufferARB().
    void* glMapBufferARB(GLenum target, GLenum access)
    GLboolean glUnmapBufferARB(GLenum target)
glMapBufferARB() returns a pointer to the buffer object on success. Otherwise it returns NULL. The target parameter is GL_PIXEL_PACK_BUFFER_ARB or GL_PIXEL_UNPACK_BUFFER_ARB. The second parameter, access, specifies what to do with the mapped buffer: read data from the PBO (GL_READ_ONLY_ARB), write data to the PBO (GL_WRITE_ONLY_ARB), or both (GL_READ_WRITE_ARB).
Note that if the GPU is still working with the buffer object, glMapBufferARB() will not return until the GPU finishes its job with the corresponding buffer object. To avoid this stall (wait), call glBufferDataARB() with a NULL pointer right before glMapBufferARB(). OpenGL will then discard the old buffer and allocate new memory space for the buffer object.
The buffer object must be unmapped with glUnmapBufferARB() after use. glUnmapBufferARB() returns GL_TRUE on success. Otherwise, it returns GL_FALSE.
Example: streaming texture uploads
Download the source and binary: pbounpack.zip.
This demo application uploads (unpacks) streaming textures to an OpenGL texture object using PBO. You can switch between the different transfer modes (single PBO, double PBOs and without PBO) by pressing the Space key, and compare the performance differences.
In the PBO modes, the texture sources are written directly to the mapped pixel buffer every frame. This data is then transferred from the PBO to a texture object using glTexSubImage2D(). By using PBO, OpenGL can perform an asynchronous DMA transfer between a PBO and a texture object, which significantly increases the texture upload performance. If asynchronous DMA transfer is supported, glTexSubImage2D() should return immediately, and the CPU can process other jobs without waiting for the actual texture copy.
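What "writing the texture source directly to the mapped pixel buffer" looks like on the CPU side can be sketched in plain C. The function below is a hypothetical stand-in for the demo's update routine, not the demo's actual code; it assumes a BGRA layout with 8 bits per channel and simply fills the buffer with a gray level derived from a frame counter. It only writes, sequentially, and never reads from the destination, which is the usual advice for memory mapped as write-only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the demo's pixel update routine.
   dst        : pointer obtained from mapping the PBO (any writable buffer works)
   size_bytes : size of the buffer in bytes
   frame      : frame counter, used here only to vary the gray level */
void update_pixels(uint8_t *dst, size_t size_bytes, unsigned frame)
{
    if (!dst)
        return;

    uint8_t level = (uint8_t)(frame & 0xFF);

    /* One BGRA pixel is 4 bytes; write each pixel exactly once, in order. */
    for (size_t i = 0; i + 3 < size_bytes; i += 4) {
        dst[i + 0] = level;  /* blue  */
        dst[i + 1] = level;  /* green */
        dst[i + 2] = level;  /* red   */
        dst[i + 3] = 255;    /* alpha */
    }
}
```

In the real application, dst would be the pointer returned by glMapBufferARB() for the GL_PIXEL_UNPACK_BUFFER_ARB target, and the buffer would be unmapped right after the call.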
Streaming texture uploads with 2 PBOs
To maximize the streaming transfer performance, you may use multiple Pixel Buffer Objects. The diagram shows 2 PBOs being used simultaneously: glTexSubImage2D() copies the pixel data from one PBO while the texture source is being written to the other PBO.
For the Nth frame, PBO 1 is used for glTexSubImage2D() and PBO 2 is used to receive the new texture source. For the (N+1)th frame, the 2 pixel buffers switch roles and continue to update the texture. Because of the asynchronous DMA transfer, the update and copy processes can be performed simultaneously. The CPU updates the texture source in one PBO while the GPU copies the texture from the other PBO.
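The role switching described above can be isolated from the OpenGL calls and checked on its own. The following is a minimal sketch in plain C; the type PboRoles and the function next_pbo_roles are hypothetical names introduced for illustration, and the arithmetic follows the index / nextIndex scheme used in this example.

```c
#include <assert.h>

/* Hypothetical helper type: which PBO plays which role in one frame. */
typedef struct {
    int copy_index;  /* PBO used as the source of glTexSubImage2D() */
    int fill_index;  /* PBO mapped and written by the CPU */
} PboRoles;

/* prev_copy_index : copy_index of the previous frame (0 or 1)
   pbo_mode        : 1 = single PBO, 2 = double PBOs */
PboRoles next_pbo_roles(int prev_copy_index, int pbo_mode)
{
    assert(pbo_mode == 1 || pbo_mode == 2);

    PboRoles roles;
    roles.copy_index = (prev_copy_index + 1) % 2;

    if (pbo_mode == 1)
        roles.fill_index = roles.copy_index;            /* same buffer for both */
    else
        roles.fill_index = (roles.copy_index + 1) % 2;  /* the other buffer */

    return roles;
}
```

With 2 PBOs the two indices are always different and swap every frame, so the CPU never writes to the buffer the GPU is copying from in that same frame.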
    // "index" is used to copy pixels from a PBO to a texture object
    // "nextIndex" is used to update pixels in the other PBO
    index = (index + 1) % 2;
    if (pboMode == 1)           // with 1 PBO
        nextIndex = index;
    else if (pboMode == 2)      // with 2 PBOs
        nextIndex = (index + 1) % 2;

    // bind the texture and PBO
    glBindTexture(GL_TEXTURE_2D, textureId);
    glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, pboIds[index]);

    // copy pixels from PBO to texture object
    // use offset instead of pointer
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height, GL_BGRA, GL_UNSIGNED_BYTE, 0);

    // bind PBO to update texture source
    glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, pboIds[nextIndex]);

    // Note that glMapBufferARB() causes a sync issue.
    // If the GPU is working with this buffer, glMapBufferARB() will wait (stall)
    // until the GPU finishes its job. To avoid waiting (idle), you can first call
    // glBufferDataARB() with a NULL pointer before glMapBufferARB().
    // If you do that, the previous data in the PBO will be discarded and
    // glMapBufferARB() returns a newly allocated pointer immediately,
    // even if the GPU is still working with the previous data.
    glBufferDataARB(GL_PIXEL_UNPACK_BUFFER_ARB, data_size, 0, GL_STREAM_DRAW_ARB);

    // map the buffer object into client's memory
    GLubyte* ptr = (GLubyte*)glMapBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, GL_WRITE_ONLY_ARB);
    if (ptr)
    {
        // update data directly on the mapped buffer
        updatePixels(ptr, data_size);
        glUnmapBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB); // release the mapped buffer
    }

    // It is a good idea to release PBOs by binding ID 0 after use.
    // Once bound with 0, all pixel operations are back to normal ways.
    glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, 0);
Example: asynchronous read-back
Download the source and binary: pbopack.zip.
This demo application reads (packs) the pixel data from the framebuffer (left side) into a PBO, then draws it back to the right side of the window after modifying the brightness of the image. You can toggle PBO on/off by pressing the Space key, and measure the performance of glReadPixels().
Conventional glReadPixels() blocks the pipeline and waits until all pixel data has been transferred. Only then does it return control to the application. In contrast, glReadPixels() with PBO can schedule an asynchronous DMA transfer and return immediately without stalling. Therefore, the application (CPU) can execute other work right away, while OpenGL (GPU) transfers the data with DMA.
Asynchronous glReadPixels() with 2 PBOs
This demo uses 2 pixel buffers. At frame N, the application reads the pixel data from the OpenGL framebuffer into PBO 1 using glReadPixels(), and processes the pixel data in PBO 2. The read and the processing can be performed simultaneously, because glReadPixels() to PBO 1 returns immediately and the CPU starts to process data in PBO 2 without delay. We alternate between PBO 1 and PBO 2 on every frame.
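One consequence of this schedule is that the data the CPU processes in a given frame was read from the framebuffer during the previous frame. The sketch below simulates the schedule in plain C, with two ints standing in for the two PBOs; simulate_readback is a hypothetical name, and no OpenGL calls are involved.

```c
/* Simulates the two-PBO read-back schedule with plain ints in place of PBOs.
   For each frame f, processed[f] receives the number of the frame whose
   pixels the CPU would see when it maps the "other" PBO, or -1 if that
   PBO has not been filled yet. */
void simulate_readback(int frame_count, int *processed)
{
    int slots[2] = { -1, -1 };  /* frame number stored in each stand-in PBO */
    int index = 0;

    for (int f = 0; f < frame_count; ++f) {
        index = (index + 1) % 2;
        int next_index = (index + 1) % 2;

        slots[index] = f;                   /* stands for glReadPixels() into PBO[index] */
        processed[f] = slots[next_index];   /* stands for mapping PBO[next_index] */
    }
}
```

Running it for a few frames gives -1 for the first frame and then f - 1 for every later frame f. In other words, the first frame has nothing valid to process, and afterwards the processed image lags one frame behind.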
    // "index" is used to read pixels from framebuffer to a PBO
    // "nextIndex" is used to update pixels in the other PBO
    index = (index + 1) % 2;
    nextIndex = (index + 1) % 2;

    // set the target framebuffer to read
    glReadBuffer(GL_FRONT);

    // read pixels from framebuffer to PBO
    // glReadPixels() should return immediately.
    glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, pboIds[index]);
    glReadPixels(0, 0, width, height, GL_BGRA, GL_UNSIGNED_BYTE, 0);

    // map the PBO to process its data by CPU
    glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, pboIds[nextIndex]);
    GLubyte* ptr = (GLubyte*)glMapBufferARB(GL_PIXEL_PACK_BUFFER_ARB, GL_READ_ONLY_ARB);
    if (ptr)
    {
        processPixels(ptr, ...);
        glUnmapBufferARB(GL_PIXEL_PACK_BUFFER_ARB);
    }

    // back to conventional pixel operation
    glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, 0);
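The processing step in the listing above is left unspecified. Since the demo is described as modifying the brightness of the image, one possible shape for it is sketched here in plain C. The function add_brightness and its parameters are hypothetical, not the demo's actual routine; it assumes 8-bit BGRA pixels, reads from the mapped source and writes the result to a separate destination buffer, so the mapped PBO itself is only read.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 0..255 range of one 8-bit channel. */
static uint8_t clamp_u8(int value)
{
    if (value < 0)
        return 0;
    if (value > 255)
        return 255;
    return (uint8_t)value;
}

/* Hypothetical brightness routine: adds `shift` to the B, G and R channels
   of every BGRA pixel in src and stores the result in dst. Alpha is copied
   unchanged. */
void add_brightness(const uint8_t *src, uint8_t *dst, size_t pixel_count, int shift)
{
    for (size_t i = 0; i < pixel_count; ++i) {
        const uint8_t *in = src + i * 4;
        uint8_t *out = dst + i * 4;

        out[0] = clamp_u8(in[0] + shift);  /* blue  */
        out[1] = clamp_u8(in[1] + shift);  /* green */
        out[2] = clamp_u8(in[2] + shift);  /* red   */
        out[3] = in[3];                    /* alpha */
    }
}
```

Here src would be the pointer returned by glMapBufferARB() with GL_READ_ONLY_ARB, and dst an ordinary array in system memory that is later drawn to the right side of the window.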
A PBO, that is, a Pixel Buffer Object, is likewise a GPU extension built on ARB_vertex_buffer_object. The buffer in question is, of course, a buffer on the GPU side. A PBO is similar to a VBO, except that it stores pixel data rather than vertex data. PBO borrows the VBO framework and all of its API functions, and adds two "target" tokens:
- GL_PIXEL_PACK_BUFFER_ARB: transfer pixel data to a PBO
- GL_PIXEL_UNPACK_BUFFER_ARB: obtain pixel data from a PBO
"Pack" and "unpack" here can be understood as "pass in" and "get out" respectively. Both can also be understood as copying, that is, as a transfer.
For example, glReadPixels() moves data from the framebuffer to memory, which can be understood as "pack"; glDrawPixels() moves data from memory to the framebuffer, which can be understood as "unpack"; glGetTexImage() moves data from a texture object to memory, which is "pack"; and glTexImage2D() moves data from memory to a texture object, which is "unpack".
The following figure shows the transfers between a PBO, the framebuffer and texture objects.
Figure 1: OpenGL PBO
The advantage of using PBO is fast pixel data transfer, which relies on DMA (Direct Memory Access) and needs no CPU intervention. Another advantage of PBO is that this DMA transfer is asynchronous. The following two figures compare texture transfer with PBO against the traditional approach.
Figure 2 shows the traditional process of loading image data from an image source (such as a file or video) into a texture object. The pixel data is first stored in system memory, and then glTexImage2D() copies it from system memory to the texture object. Both steps must be executed by the CPU. In Figure 3, the pixel data is loaded directly into a PBO. This step still has to be executed by the CPU, but the GPU performs the DMA transfer from the PBO to the texture object without CPU involvement. In addition, OpenGL can schedule the DMA transfer asynchronously instead of transferring the pixel data right away. As a result, glTexImage2D() in Figure 3 returns immediately rather than blocking, so the CPU can do other work without waiting for the pixel transfer to complete.
Figure 2: Texture loading without PBO
Figure 3: Texture loading with PBO
GL_PIXEL_PACK_BUFFER_ARB is used to transfer pixel data from OpenGL to the application; GL_PIXEL_UNPACK_BUFFER_ARB transfers pixel data from the application to OpenGL.
Generating a PBO
Generating a PBO takes three steps:
1. Use glGenBuffersARB() to generate a buffer object;
2. Use glBindBufferARB() to bind the buffer object;
3. Use glBufferDataARB() to copy pixel data to the buffer object.
If a NULL pointer is passed as the source array to glBufferDataARB(), the PBO only allocates memory of the given size. The last parameter of glBufferDataARB() is a performance hint that indicates how the buffer object will be used: GL_STREAM_DRAW_ARB indicates streaming texture upload, and GL_STREAM_READ_ARB indicates asynchronous framebuffer read-back.
Mapping a PBO
PBO provides a memory mapping mechanism that maps a buffer object controlled by OpenGL into the client's address space. The client can therefore use glMapBufferARB() and glUnmapBufferARB() to modify part or all of the buffer object's data.
    void* glMapBufferARB(GLenum target, GLenum access);
    GLboolean glUnmapBufferARB(GLenum target);
glMapBufferARB() returns a pointer to the buffer object. The target parameter is GL_PIXEL_PACK_BUFFER_ARB or GL_PIXEL_UNPACK_BUFFER_ARB. The access parameter specifies what may be done with the mapped buffer: GL_READ_ONLY_ARB, GL_WRITE_ONLY_ARB and GL_READ_WRITE_ARB mean, respectively, reading from the PBO, writing to the PBO, or both.
Note: if the GPU is still operating on the buffer object, glMapBufferARB() will not return until the GPU has finished with it. To avoid waiting, call glBufferDataARB() with a NULL pointer before calling glMapBufferARB(). OpenGL then discards the old buffer and allocates new space for the buffer object.
After the client has finished with the PBO, it should call glUnmapBufferARB() to unmap it. glUnmapBufferARB() returns GL_TRUE on success; otherwise it returns GL_FALSE.
____________________________________________________
Demo
The example program pbounpack.zip compares different ways of passing a texture to OpenGL:
- Use one PBO;
- Use two PBOs;
- Do not use PBO.
You can switch between the modes by pressing the Space key.
In the PBO modes, the texture source (pixels) of each frame is written directly into the mapped PBO. Then glTexSubImage2D() is called to pass the pixels in the PBO to the texture object. By using PBO, an asynchronous DMA transfer can take place between the PBO and the texture object, which can greatly improve pixel transfer performance.
Because glTexSubImage2D() returns immediately, the CPU can go straight on to other work without waiting for the actual pixel transfer.
Figure 4: Updating a texture with two PBOs
To maximize pixel transfer performance, multiple PBOs can be used. Figure 4 shows two PBOs in use at the same time: while glTexSubImage2D() copies pixel data from one PBO, new pixel data is written into the other PBO.
In the Nth frame, PBO 1 is used for glTexSubImage2D(), while PBO 2 receives the new texture source. In frame N + 1, the two PBOs exchange roles. Thanks to the asynchronous DMA transfer, pixel data can be updated and copied simultaneously: the CPU updates the texture source in one PBO while the GPU copies the texture from the other PBO.
The example program pbopack.zip reads (packs) pixel data from the left side of the window into a PBO, changes its brightness, and draws it on the right side of the window. By pressing the Space key, you can toggle PBO on and off and compare the performance of glReadPixels().
Traditionally, glReadPixels() blocks the rendering pipeline until all pixel data has been transferred back to the application. In contrast, glReadPixels() with PBO can schedule an asynchronous DMA transfer and return immediately without waiting. The CPU can therefore do other processing while OpenGL (GPU) transfers the pixel data.
Figure 5: Asynchronous glReadPixels() with two PBOs
This example program also uses two PBOs. In the Nth frame, the application reads pixel data from the framebuffer into PBO 1 and processes the pixel data in PBO 2. The read and the processing can take place at the same time, because glReadPixels() returns immediately and the CPU can start processing PBO 2 without delay. In the next frame, PBO 1 and PBO 2 exchange roles.
Comparison of different implementations of render to texture in OpenGL
Render to texture can be implemented in the following ways:
1. Render the image to the framebuffer, use glReadPixels() to read the required region into client memory, and then use glTexImage2D() to create the texture.
Disadvantage: relatively slow.
2. Render the image to the framebuffer, and then use glCopyTexImage2D() to create a texture directly from the framebuffer.
3. Render the image to the framebuffer, and then use glCopyTexSubImage2D() to copy the desired region from the framebuffer into part of an existing texture.
4. Use a pbuffer to render directly to the texture.
Required extensions:
WGL_ARB_extensions_string
WGL_ARB_render_texture
WGL_ARB_pbuffer
WGL_ARB_pixel_format
Disadvantages:
It can only be used on Windows.
Every pbuffer works in its own OpenGL context, which makes management troublesome.
Switching between pbuffers has a large overhead.
Usage: render with the pbuffer's device context (DC) and rendering context (RC) as the rendering device and rendering context.
5. Use the framebuffer object (FBO) extension: GL_EXT_framebuffer_object.