glTexImage2D is too slow for rendering 4K RGB24 data

Chakravarthi · March 26, 2019, 2:35pm

Hi All,

I’m using OpenGL in my application to render 4K RGB24 data. I call glTexImage2D to create the texture image with the parameters below:

glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB8, camera->width, camera->height, 0, GL_BGR, GL_UNSIGNED_BYTE, buffer);

This call takes 20 milliseconds, so I cannot render at 60 FPS. Below is a snapshot of the code for RGB24:

/******* Init part **********/
initializeOpenGLFunctions();
glEnable(GL_DEPTH_TEST);
glDisable(GL_DEPTH_TEST);
glEnable(GL_TEXTURE_2D);
glClearColor(0.0f, 0.0f, 0.0f, 0.0f);

/******* Render part **********/
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, texture);

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB8, camera->width, camera->height, 0, GL_BGR, GL_UNSIGNED_BYTE, g_buffer); //RGB24 working
glBegin(GL_TRIANGLE_STRIP);
glTexCoord2f(0, 1);
glVertex2f(-1, -1);
glTexCoord2f(1, 1);
glVertex2f(1, -1);

glTexCoord2f(0, 0);
glVertex2f(-1, 1);
glTexCoord2f(1, 0);
glVertex2f(1, 1);
glEnd();
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

Question:

How can I modify the above code to handle RGB24 with fragment and vertex shaders, so that the 20 milliseconds drop to about 5 milliseconds?
I want to use a shader to convert RGB24 to RGB32 and then render using the call below.

glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, camera->width, camera->height, 0, GL_BGRA, GL_UNSIGNED_BYTE, buffer);

GClements · March 26, 2019, 8:39pm

I suspect that much of the performance hit is from synchronisation. The implementation can’t overwrite a texture’s data while there are pending rendering commands which will use that data.
Either use two textures and alternate between them, or use a pixel buffer object (PBO). For the latter, upload the data to a buffer object then bind it to GL_PIXEL_UNPACK_BUFFER before calling glTexImage2D.

Also: see the wiki page for Buffer Object Streaming for tips on avoiding synchronisation.

Why use a shader? Create the texture with an internal format of GL_RGBA8 and upload GL_BGR data to it.

Chakravarthi · March 27, 2019, 9:20am

Either use two textures and alternate between them

Create the texture with an internal format of GL_RGBA8 and upload GL_BGR data to it.

I’m new to OpenGL and I’m not sure how to use two textures and alternate between them. Could you please help me with this?

Dark_Photon · March 27, 2019, 11:51am

Read this: Circular Buffer.

With 2 items in your buffer, this is called “ping-pong buffering”.

If you can do it with one, you can do it with 2 (or more). Just create multiple items (textures in this case), and then on successive frames, use the next item (texture) rather than immediately re-using the texture you just used. Wait a frame to render with it though.
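As a rough sketch of the bookkeeping this needs (not anyone's actual code from this thread): the hypothetical TextureRing helper below only tracks which slot to upload into and which slot, filled on the previous frame, to draw from. The names are made up for illustration, and the GL calls appear only as comments because they need a live context and textures created up front.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: tracks which of `count` texture slots receives this
// frame's upload and which one (filled on the previous frame) to draw from.
struct TextureRing {
    std::size_t count;      // number of textures in the ring (2 = ping-pong)
    std::size_t frame = 0;  // number of frames completed so far

    explicit TextureRing(std::size_t n) : count(n) { assert(n >= 2); }

    // Slot to fill with new pixel data on the current frame.
    std::size_t upload_slot() const { return frame % count; }

    // Slot that was filled on the previous frame.
    // Only meaningful once has_drawable() returns true.
    std::size_t draw_slot() const { return (frame + count - 1) % count; }

    // Nothing has been uploaded before the very first frame.
    bool has_drawable() const { return frame > 0; }

    // Call once at the end of every frame.
    void advance() { ++frame; }
};

// Intended per-frame use, assuming GLuint tex[2] was created beforehand:
//   glBindTexture(GL_TEXTURE_2D, tex[ring.upload_slot()]);
//   ... upload the new video frame into the bound texture ...
//   if (ring.has_drawable()) {
//       glBindTexture(GL_TEXTURE_2D, tex[ring.draw_slot()]);
//       ... draw the quad ...
//   }
//   ring.advance();
```

With two slots this alternates 0, 1, 0, 1 for uploads while drawing always lags one frame behind, which is the "wait a frame" part.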

Taking a step back to your original problem though, have you taken a close look at your system to see if you can reasonably expect it to upload uncompressed 4K frames at 60 Hz from the CPU to the GPU? That’s about 1.85 GB/sec sustained throughput, which is a significant fraction of a PCIe x16 link (v1-v3). What GPU and bus are you using? Also, can your frame source provide input frames consistently at that rate? And do you have the option of providing other input formats that are more bandwidth-efficient? Those may be options as well.
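The exact number depends on the pixel size and on what "4K" means. As a back-of-the-envelope check, assuming 3840x2160 frames (an assumption, the thread does not state the resolution): 24-bit pixels come to roughly 1.49 GB/s, and 32-bit pixels to roughly 1.99 GB/s, which is about 1.85 GiB/s, so the figure above seems to correspond to 4 bytes per pixel.

```cpp
#include <cstdint>

// Bytes per second needed to stream uncompressed frames.
constexpr std::uint64_t stream_bytes_per_sec(std::uint64_t width,
                                             std::uint64_t height,
                                             std::uint64_t bytes_per_pixel,
                                             std::uint64_t fps) {
    return width * height * bytes_per_pixel * fps;
}

// Assumption: "4K" is 3840x2160 (UHD); 4096x2160 would be about 7% more.
constexpr std::uint64_t kRgb24BytesPerSec  = stream_bytes_per_sec(3840, 2160, 3, 60);
constexpr std::uint64_t kRgba32BytesPerSec = stream_bytes_per_sec(3840, 2160, 4, 60);

// Checked at compile time so the figures quoted above cannot drift.
static_assert(kRgb24BytesPerSec  == 1492992000ULL, "RGB24: about 1.49 GB/s");
static_assert(kRgba32BytesPerSec == 1990656000ULL, "RGBA32: about 1.99 GB/s");
```

Either way the conclusion is the same: it is a sustained load in the low GB/s range, so the bus and the frame source are worth checking.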

To your original proposal though: beyond GClements’ recommendations, I would suggest you stop reallocating the texture storage for every frame you upload to the GPU. Pre-allocate the storage once with glTexStorage2D (or glTexImage2D). Then, when uploading new frames, merely upload new content into the existing texture storage with glTexSubImage2D. Also, give the driver a little time to upload a video frame to the texture before you tell it to render with it (i.e. upload video frame N on draw frame N, but wait to render with it until draw frame N+1). Finally, query your GL driver to make sure you are using the most efficient GL internal texel formats and texel upload formats for it. You can do this with glGetInternalformativ. For instance, try this for a few GL internal formats that you’re thinking about using:

  // target and intFormat are placeholders, e.g. GL_TEXTURE_2D and GL_RGBA8.
  // These queries need OpenGL 4.3 or ARB_internalformat_query2.
  GLint supported, preferred, format, type;

  glGetInternalformativ( target, intFormat, GL_INTERNALFORMAT_SUPPORTED , 1, &supported );
  glGetInternalformativ( target, intFormat, GL_INTERNALFORMAT_PREFERRED , 1, &preferred );
  glGetInternalformativ( target, intFormat, GL_TEXTURE_IMAGE_FORMAT     , 1, &format );
  glGetInternalformativ( target, intFormat, GL_TEXTURE_IMAGE_TYPE       , 1, &type   );

Be sure to try both GL_RGBA8 and GL_RGB8 formats. Some GPUs prefer the former over the latter.
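On the earlier point about not reallocating storage every frame, here is a minimal sketch of the decision logic, assuming a hypothetical TextureStorage helper (not part of OpenGL) that remembers the allocated size. The GL calls are shown only as comments since they need a live context; the sketch uses glTexImage2D with a null data pointer for allocation because storage created by glTexStorage2D is immutable and cannot be resized.

```cpp
// Hypothetical bookkeeping: remembers the size the texture storage was
// last allocated with, so allocation happens only when it has to.
struct TextureStorage {
    int width = 0;
    int height = 0;

    // Records the requested size. Returns true when storage must be
    // (re)allocated, i.e. on first use or when the frame size changed;
    // returns false when the existing storage can simply be refilled.
    bool set_size(int w, int h) {
        if (w == width && h == height) return false;
        width = w;
        height = h;
        return true;
    }
};

// Intended per-frame use with the texture already bound:
//   if (storage.set_size(w, h)) {
//       // Allocate only; a null pointer means no pixel data is uploaded.
//       glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB8, w, h, 0,
//                    GL_BGR, GL_UNSIGNED_BYTE, nullptr);
//   }
//   // Refill the existing storage with the new video frame.
//   glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, w, h,
//                   GL_BGR, GL_UNSIGNED_BYTE, buffer);
```

For a fixed-size camera stream the allocation branch runs exactly once, and every later frame goes through glTexSubImage2D only.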