I read back my rendered images using alternating PBOs. Although I render to an RGBA (BGRA) renderbuffer, I need an 8-bit RGB image in the end.
I thought this would be easy: I only need to use GL_BGR_EXT instead of GL_BGRA_EXT in the glReadPixels call, like this:
It works, but to my great surprise, it slows down my render thread considerably. Sometimes CPU utilization on that particular core gets above 50% even though there is not a lot going on. Once I change GL_BGR_EXT back to GL_BGRA_EXT, that ~50% falls back to something like 2%.
Can reading back in a format that is similar to, but different from, the one the buffer was defined with really cost this much? Am I better off writing my own code to convert from BGRA to BGR?
Unfortunately, you should write your own conversion function. Drivers are not very good at this simple task.
To add to/clarify what mfort said, you appear to be using a PBO here (since you use a “pointer” of 0, which is NULL and illegal for pointers).
The best-case scenario for PBOs, the one they are optimized for, is when the GPU is just doing a pure DMA from the texture to the buffer object. It may have to do some internal “swizzling,” but even that’s usually part of the hardware memcopy.
The moment you try to touch the format of the pixels, you’re in trouble. That has to be done on the CPU. So now what you have is a memcopy to BGRA format, then the CPU must modify the buffer object so that it is in BGR format.
And that buffer object is likely to be in graphics memory. So now the driver must download the data to the CPU, modify it, re-upload it to the buffer object, and only after all that can you get the data.
You’re using PBOs to avoid CPU stalls when downloading. So you’re going to have to do any format conversions yourself for maximum performance.
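For illustration, the CPU-side conversion can be as small as the sketch below. The function name bgra_to_bgr is made up for this example, and it assumes the data was read back as tightly packed GL_BGRA_EXT / GL_UNSIGNED_BYTE pixels (with 4 bytes per pixel every row is already a multiple of 4, so the default pack alignment adds no padding). It is a plain loop, not a tuned implementation; you would point src at the mapped PBO memory and dst at your own RGB buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: repack tightly packed BGRA8 pixels into BGR8.
 * src must hold pixel_count * 4 bytes (e.g. the mapped PBO contents),
 * dst must hold pixel_count * 3 bytes. */
static void bgra_to_bgr(const uint8_t *src, uint8_t *dst, size_t pixel_count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < pixel_count; ++i) {
        dst[0] = src[0]; /* blue  */
        dst[1] = src[1]; /* green */
        dst[2] = src[2]; /* red   */
        src += 4;        /* step over the alpha byte */
        dst += 3;
    }
}
```

The point of doing it this way is that glReadPixels keeps using the BGRA format the driver can copy directly, and the only extra CPU work is one linear pass over memory you were going to read anyway.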
Yes, you are right. I use PBOs. And from what you wrote, I now understand why performance was hit so severely. Although I can certainly do the RGBA to RGB conversion on the CPU, I still have one more question.
Would there be any advantage to add another rendering pass to an RGB texture, then read back that texture? It would add to the load on the GPU, but having to read back only 75% of the bytes might make it worth it. What do you think?
Would there be any advantage to add another rendering pass to an RGB texture, then read back that texture?
Probably not. RGB8 textures are usually implemented internally as RGBA8 textures, with the Alpha permanently masked off. So you’d run into effectively the same issue when you tried to read it back.
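As a side note on the 75% figure: the size of what glReadPixels writes also depends on GL_PACK_ALIGNMENT, which defaults to 4, so 3-byte pixels can pick up padding at the end of each row. The sketch below only does that arithmetic. The function names are made up for this example, and it assumes unsigned byte components and that GL_PACK_ROW_LENGTH and the skip parameters are left at their defaults.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes per row as written by glReadPixels for
 * byte-sized components, i.e. the raw row size rounded up to a multiple
 * of GL_PACK_ALIGNMENT (1, 2, 4 or 8). */
static size_t packed_row_bytes(size_t width, size_t bytes_per_pixel,
                               size_t pack_alignment)
{
    assert(pack_alignment == 1 || pack_alignment == 2 ||
           pack_alignment == 4 || pack_alignment == 8);
    size_t raw = width * bytes_per_pixel;
    return (raw + pack_alignment - 1) / pack_alignment * pack_alignment;
}

/* Hypothetical helper: total bytes for a width x height readback. */
static size_t readback_bytes(size_t width, size_t height,
                             size_t bytes_per_pixel, size_t pack_alignment)
{
    return packed_row_bytes(width, bytes_per_pixel, pack_alignment) * height;
}
```

For a 1920x1080 image this gives 8,294,400 bytes for BGRA and 6,220,800 bytes for BGR, which is exactly 75%. For a width such as 1001, a BGR row is 3003 bytes raw but 3004 bytes with the default alignment, so the buffer is slightly larger than width * height * 3.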
I added conversion to RGB on the CPU side, and it works beautifully. No more problems!