I’m looking to create “flag” textures. I’m doing this because I have a single texture, the channels of which contain three independent pieces of information which each amount to a yes/no. I need to do three passes, each of which starts by checking one of those flags, and I’d prefer not to have to look up all 128 bits per texel each time.
The obvious thing to do is to use MRT to output “flag” textures with fewer bpp which will be quick to look up. Thus I only have to look up all 128 bits once, rather than three times.
I only need three bits…but of course, OpenGL doesn’t support textures anywhere near that small, and even 1-channel 8-bit textures aren’t color-renderable.
I know the GL_RGBA internalFormat works, and GL_FLOAT_R32_NV is the same size and would also do. Is there a smaller one?
I looked at the glBitmap function, but I don’t think it’ll help at all…it doesn’t seem directly associated with a 1-bpp texture or anything. Too bad.
Even if the stencil buffer were better-supported, I couldn’t output to it in quite the way I’d like to, so that’s not an option.
OpenGL supports R3G3B2 textures, at least on ATI.
But RGBA4 or RGB5_A1 are about as low as you can expect to render to, via FBO.
Well, the thing is that you can create a color texture with any of the internal formats that GL supports, but when you check whether your FBO is complete, the check can still fail.
So you need to set up some code that tries formats from low bit depth up to high bit depth until FBO creation succeeds.
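A minimal sketch of that fallback loop, in C. To keep it runnable without a GL context, the actual completeness check is hidden behind a callback; `FormatCandidate`, `FormatProbe`, `pick_smallest_renderable` and `probe_min_bits` are all made-up names for illustration. In real code the probe would create a texture with the candidate internal format, attach it to an FBO, and check for completeness via glCheckFramebufferStatusEXT.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor. Real code would also carry the GL internal
   format enum (e.g. GL_RGBA4) taken from the GL headers. */
typedef struct {
    const char *name;
    unsigned    bits_per_texel;
} FormatCandidate;

/* Hypothetical probe: returns nonzero if an FBO using this format is
   complete. A real probe would create the texture, attach it, and
   check the framebuffer status. */
typedef int (*FormatProbe)(const FormatCandidate *fmt, void *user);

/* Candidates ordered from fewest to most bits per texel. */
static const FormatCandidate kCandidates[] = {
    { "R3_G3_B2",  8 },
    { "RGBA4",    16 },
    { "RGB5_A1",  16 },
    { "RGBA8",    32 },
};
static const size_t kCandidateCount =
    sizeof(kCandidates) / sizeof(kCandidates[0]);

/* Return the first candidate the probe accepts, or NULL if none works. */
static const FormatCandidate *pick_smallest_renderable(
    const FormatCandidate *candidates, size_t count,
    FormatProbe probe, void *user)
{
    assert(candidates != NULL && probe != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (probe(&candidates[i], user)) {
            return &candidates[i];
        }
    }
    return NULL;
}

/* Stand-in probe for testing only: pretends the driver accepts any
   format with at least *user bits per texel. */
static int probe_min_bits(const FormatCandidate *fmt, void *user)
{
    unsigned min_bits = *(const unsigned *)user;
    return fmt->bits_per_texel >= min_bits;
}
```

Calling pick_smallest_renderable(kCandidates, kCandidateCount, probe_min_bits, &min_bits) with min_bits set to 16 returns the RGBA4 entry. Do the probing once at startup and cache the result, since creating and destroying FBOs is not free.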
R3G3B2, RGBA4 or RGB5_A1 are very likely not supported. You can try a 16 bit intensity format, or 16 bit float format.
Such formats will likely be converted to RGBA8 by the driver. NVidia has published a paper listing all supported pixel formats; it’s a very interesting read.
Here ya go (near bottom of page on left):
If you are doing this render operation infrequently (say once at load or something), and your render target setup won’t support anything smaller than RGB8 (which seems likely), then you can always compress the results or change format after the render op. Download the data with glGetTexImage, then re-upload it with a different format. Using S3TC, for example, can reduce the size several fold (6x for DXT1, I think). The compression might change some values a bit, so you may not get back exactly the values you put in, but for a simple 3 booleans per image it should work fine.
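For what it’s worth, the 6x figure checks out against tightly packed RGB8. DXT1 stores each 4x4 block of texels in 8 bytes, so the arithmetic can be sketched in C like this (the function names are made up for illustration, and a driver may well pad RGB8 to 32 bits internally, in which case the saving is closer to 8x):

```c
#include <assert.h>
#include <stddef.h>

/* Size of an uncompressed image with tightly packed texels. */
static size_t uncompressed_bytes(size_t width, size_t height,
                                 size_t bytes_per_texel)
{
    return width * height * bytes_per_texel;
}

/* Size of a DXT1 image: 8 bytes per 4x4 block, with partial blocks
   at the edges rounded up to whole blocks. */
static size_t dxt1_bytes(size_t width, size_t height)
{
    assert(width > 0 && height > 0);
    size_t blocks_x = (width  + 3) / 4;
    size_t blocks_y = (height + 3) / 4;
    return blocks_x * blocks_y * 8;
}
```

For a 512x512 image that gives 786432 bytes as RGB8, 1048576 bytes as RGBA8, and 131072 bytes as DXT1.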
Sadly, that isn’t an option in this case. It’s merely one phase of a sequence of numerous GPGPU-style renders, and we’re minimizing readback between them as much as possible.
Perhaps glCopyTexSubImage, though. I don’t know how that works between formats, but it might be worth a shot…
What about using, for example, a 16-bit integer format and storing several flags in one pixel, rather than only three values per RGB texel (say 16x3)? I don’t know exactly what you are trying to do, so it might not be useful at all.
On second thought, what about using the stencil buffer somehow? If the pixel shader needs to know the value of the “flag” and do one thing or another based on it, try using one pixel shader where the stencil is true (1) and another where it is false (0). That can actually benefit from early stencil testing, so there is no need for dynamic branching in the shader.
The code I’m writing needs to work on both Windows and Linux, and at one time in the past we discovered that attempting to use a stencil buffer caused the Linux drivers to spit out NaNs into the color buffer. Which is bad. So, I’m staying away from the stencil buffer for now.
Integer formats aren’t available since I’m not on a G80 card. However, the notion of packing multiple flags into a single texel is interesting. I’ll consider that. It might improve cache hits.
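A rough sketch of how the packing could work without integer formats, written in C so the arithmetic can be checked on the CPU. The assumption here is an ordinary 8-bit normalized channel: the flags go in as bits of a byte, and the decode side uses only multiply, divide and floor, which is what a fragment shader without integer ops would be limited to. `pack_flags`, `unpack_flag_float` and `floor_nonneg` are made-up names.

```c
#include <assert.h>

/* Pack up to 8 yes/no flags into one 8-bit channel value.
   flags[i] ends up in bit i. */
static unsigned char pack_flags(const int *flags, int count)
{
    assert(flags != 0 && count >= 0 && count <= 8);
    unsigned char packed = 0;
    for (int i = 0; i < count; ++i) {
        if (flags[i]) {
            packed |= (unsigned char)(1u << i);
        }
    }
    return packed;
}

/* floor() for non-negative inputs, standing in for the shader's floor. */
static float floor_nonneg(float x)
{
    return (float)(long)x;
}

/* Recover flag `bit` from a normalized [0,1] channel value using only
   float arithmetic, mimicking a shader without integer operations. */
static int unpack_flag_float(float normalized, int bit)
{
    assert(bit >= 0 && bit < 8);
    /* Undo the 0..255 -> 0..1 normalization, rounding to nearest. */
    float value = floor_nonneg(normalized * 255.0f + 0.5f);
    /* Divide by 2^bit and floor to discard the lower bits. */
    float scale = 1.0f;
    for (int i = 0; i < bit; ++i) {
        scale *= 2.0f;
    }
    float shifted = floor_nonneg(value / scale);
    /* mod(shifted, 2) written as x - 2 * floor(x / 2). */
    float lowest = shifted - 2.0f * floor_nonneg(shifted / 2.0f);
    return lowest >= 0.5f;
}
```

With this layout one RGBA8 texel holds 32 flags, so the three flags of about ten source pixels fit in a single fetch. Whether that actually helps the cache depends on the access pattern of the later passes.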
NVidia, Quadro 4500.
We didn’t really delve into the problem all that much at the time. FYI, it was using GL_DEPTH24_STENCIL8_EXT with an FBO.