Realtime updating texture content CPU<>GPU

enquel · June 24, 2022, 7:59pm

Still new to OpenGL. Say I want to procedurally update the contents of a texture with a CPU algorithm and have the update show up on the textured model as quickly as possible. What is the best approach? Can this all be done with an FBO? I would guess not, since as far as I understand all the computation would then have to be done in a fragment shader. Then again, glTexImage2D takes a pointer to the pixel data as input. I’ve read that glGetTexImage is slow, though it would certainly do the job.

GClements · June 25, 2022, 5:18am

Ideally, you’d update the texture using the GPU. If you need to modify an in-use texture from the CPU or read texture data back from the GPU, it’s preferable to use a pixel buffer object (PBO). In the latter case (GPU->CPU), use a fence to avoid stalling the CPU while waiting for the data to become available.
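To make the GPU->CPU case a little more concrete, here is a minimal sketch of the bookkeeping only, not a definitive implementation. The class `ReadbackRing` and its `start`/`ready` hooks are hypothetical names; the hooks stand in for the GL work (reading into a PBO and inserting a fence, then checking that fence with a zero timeout), so the sketch itself contains no GL calls.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Bookkeeping for a ring of asynchronous readbacks. The two hooks are
// hypothetical stand-ins for the GL work:
//   start(slot): e.g. glReadPixels into the slot's PBO, then glFenceSync
//   ready(slot): e.g. glClientWaitSync on the slot's fence with timeout 0
class ReadbackRing {
public:
    ReadbackRing(std::size_t slots,
                 std::function<void(std::size_t)> start,
                 std::function<bool(std::size_t)> ready)
        : inFlight_(slots, false),
          start_(std::move(start)),
          ready_(std::move(ready)) {}

    // Begin a readback in the first free slot. Returns false when every slot
    // is still in flight, so the caller can skip this frame instead of waiting.
    bool request() {
        for (std::size_t i = 0; i < inFlight_.size(); ++i) {
            if (!inFlight_[i]) {
                start_(i);
                inFlight_[i] = true;
                return true;
            }
        }
        return false;
    }

    // Return the slots whose data has arrived, without blocking. Each returned
    // slot is free again; real code would map its PBO here and copy the data.
    std::vector<std::size_t> poll() {
        std::vector<std::size_t> done;
        for (std::size_t i = 0; i < inFlight_.size(); ++i) {
            if (inFlight_[i] && ready_(i)) {
                inFlight_[i] = false;
                done.push_back(i);
            }
        }
        return done;
    }

private:
    std::vector<bool> inFlight_;
    std::function<void(std::size_t)> start_;
    std::function<bool(std::size_t)> ready_;
};
```

The idea would be to call `request()` when a readback is wanted and `poll()` once per frame; the CPU never waits on a fence, it just picks up whatever has finished.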

enquel · June 25, 2022, 4:57pm

Thank you very much @GClements! In the case of OpenGL ES, there is no way to update a CPU-processed texture other than by going through glGetTexImage(), yes?

GClements · June 25, 2022, 5:48pm

OpenGL ES itself doesn’t have any features which desktop OpenGL lacks. ES 3.2 has both PBOs and fences; ES 2 is much weaker than desktop OpenGL. There may be alternatives using EGL images which are faster than a texture read-modify-write cycle, but I’m not particularly familiar with EGL.

enquel · June 25, 2022, 7:02pm

@GClements thank you very much for the valuable information!

mhagain · June 27, 2022, 1:31pm

For this kind of requirement I would suggest permanently keeping a copy of the texture data in system (CPU-side) memory, and working on that; then update it to GPU memory when required.

This would let you avoid reading back from the GPU while a texture is potentially in use, and allow you to structure your code so that the update can be decoupled from dependencies on your rendering.
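A rough sketch of what such a system-memory copy could look like, assuming an RGBA8 texture. `ShadowTexture`, `Rect` and the `upload` callback are illustrative names only; the callback marks the spot where the actual GL upload would happen.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

struct Rect { int x, y, w, h; };

// CPU-side master copy of an RGBA8 texture plus the bounding box of
// everything changed since the last upload.
class ShadowTexture {
public:
    ShadowTexture(int width, int height)
        : width_(width), height_(height),
          pixels_(static_cast<std::size_t>(width) * height * 4),
          minX_(width), minY_(height), maxX_(-1), maxY_(-1) {}

    // Write one pixel into system memory and grow the dirty region.
    void setPixel(int x, int y, std::uint8_t r, std::uint8_t g,
                  std::uint8_t b, std::uint8_t a) {
        std::uint8_t* p = &pixels_[(static_cast<std::size_t>(y) * width_ + x) * 4];
        p[0] = r; p[1] = g; p[2] = b; p[3] = a;
        minX_ = std::min(minX_, x); maxX_ = std::max(maxX_, x);
        minY_ = std::min(minY_, y); maxY_ = std::max(maxY_, y);
    }

    // Pass the dirty rectangle and a pointer to its first pixel to `upload`,
    // then mark the copy clean. Returns false if nothing changed.
    // Rows in memory are still width_ pixels apart, so a real `upload` would
    // set GL_UNPACK_ROW_LENGTH to the texture width (where available) before
    // calling glTexSubImage2D for the rectangle.
    bool flush(const std::function<void(const Rect&, const std::uint8_t*)>& upload) {
        if (maxX_ < minX_) return false;
        Rect r{minX_, minY_, maxX_ - minX_ + 1, maxY_ - minY_ + 1};
        upload(r, &pixels_[(static_cast<std::size_t>(r.y) * width_ + r.x) * 4]);
        minX_ = width_; minY_ = height_; maxX_ = -1; maxY_ = -1;
        return true;
    }

private:
    int width_, height_;
    std::vector<std::uint8_t> pixels_;
    int minX_, minY_, maxX_, maxY_;
};
```

The CPU algorithm only ever touches `pixels_`, and `flush()` can be called whenever it suits the renderer, which is the decoupling described above.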

In the best case you wouldn’t even need a PBO or delayed frames with this kind of setup; a more intelligent driver may be able to detect the update pattern you’re using and optimize accordingly, e.g. by making a second copy of the texture, then swapping them out at the appropriate time (i.e. similar in concept to buffer orphaning). If you don’t want to rely on the driver being able to do this, it’s relatively simple to implement yourself in code (it’s basic double-buffering, but just with textures).
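A minimal sketch of double-buffering with textures, under the assumption that both handles were created elsewhere (e.g. with glGenTextures) with identical size and format. `TexturePair` is a hypothetical helper that only tracks which handle to draw with and which to upload into.

```cpp
#include <array>

// Two texture handles used in ping-pong fashion. This type makes no GL calls;
// it only answers "which texture do I sample?" and "which do I overwrite?".
class TexturePair {
public:
    TexturePair(unsigned a, unsigned b) : tex_{{a, b}} {}

    // Texture to bind for drawing this frame.
    unsigned front() const { return tex_[front_]; }

    // Texture that receives the next upload; not sampled by the current frame.
    unsigned back() const { return tex_[1 - front_]; }

    // Call once the upload into back() has been issued.
    void swap() { front_ = 1 - front_; }

private:
    std::array<unsigned, 2> tex_;
    int front_ = 0;
};
```

Per frame that would be: upload the new data into `back()`, call `swap()`, then draw with `front()`. If the GPU runs more than one frame behind, a longer ring of textures might be needed for the same effect.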

The critical bottleneck then becomes texture updating, and here you need to be careful to balance the natural desire to save bandwidth and save memory, vs the requirement to do the upload in as close to the preferred internal format as possible, so that the driver doesn’t need to do a (potentially very slow) software format conversion for you. When I last benchmarked this (over 10 years ago, on much older drivers) it was worth a 40x overall performance improvement in some cases - so it was certainly worth investing time and effort in.
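As one illustration of that trade-off: tightly packed 3-byte RGB saves memory, but if the driver stores the texture as 4 bytes per pixel it has to expand the data on every upload. The sketch below does that expansion once on the CPU instead. That a 4-byte layout is the preferred one is an assumption here, and the channel order a given driver likes best would have to be measured; `expandRgbToRgba` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand tightly packed RGB8 pixels to RGBA8 with opaque alpha, so that the
// upload can use a 4-byte-per-pixel layout (assumed to match what the driver
// stores internally). Trailing bytes that do not form a full pixel are ignored.
std::vector<std::uint8_t> expandRgbToRgba(const std::vector<std::uint8_t>& rgb) {
    const std::size_t pixels = rgb.size() / 3;
    std::vector<std::uint8_t> rgba(pixels * 4);
    for (std::size_t i = 0; i < pixels; ++i) {
        rgba[i * 4 + 0] = rgb[i * 3 + 0];
        rgba[i * 4 + 1] = rgb[i * 3 + 1];
        rgba[i * 4 + 2] = rgb[i * 3 + 2];
        rgba[i * 4 + 3] = 255;  // opaque alpha
    }
    return rgba;
}
```

Better still would be to keep the CPU-side copy in the 4-byte layout from the start, so that no conversion happens per update at all.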

enquel · July 1, 2022, 8:40pm

@mhagain thank you very much for the explanation!