right way to check for hardware accelerated floating point blending


What’s the right way to check whether a given piece of hardware can do accelerated 16-bit floating-point blending?

AFAIK there is no extension I could query to check for that, right? So shouldn’t glEnable(GL_BLEND) set an error if blending isn’t supported in hardware? (Or does that trigger a software fallback?)

The NVIDIA 62xx don’t support it; the 66xx and 68xx do, as far as I know. Is it safe to check the GL_RENDERER string? I don’t consider that a good option, since it will fail for future, unknown renderers.
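
To make the trade-off concrete, a renderer check of the kind described here could be sketched roughly as below. The function name renderer_has_fp16_blend and the substring list are hypothetical; the list only encodes the 66xx/68xx claim above, and real GL_RENDERER strings may well be formatted differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the renderer string contains one of the
 * model substrings this thread believes can blend into float16 targets.
 * The list is an assumption taken from the post above, not from any spec. */
int renderer_has_fp16_blend(const char *renderer)
{
    static const char *const known[] = { "GeForce 66", "GeForce 68" };
    size_t i;

    if (renderer == NULL)
        return 0;
    for (i = 0; i < sizeof known / sizeof known[0]; ++i) {
        if (strstr(renderer, known[i]) != NULL)
            return 1;
    }
    return 0; /* unknown or future hardware ends up here */
}
```

With a current context the argument would be (const char *)glGetString(GL_RENDERER). Any renderer not in the list, including every future one, ends up on the non-blending path, which is exactly the weakness of this approach.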

We had a thread on this a couple of weeks ago. I’m in the same position of having to do a renderer ID check, since there doesn’t seem to be a way to query this info. Here’s the thread:

Float blending thread

The spec doesn’t mention blending or bilinear filtering, so by omission they’re required to work. Still, the lack of any conformant hardware (even the 6800 only handles these in 16-bit float, not 32-bit) makes me think this is a bug in the spec.

I’d love to be able to do something like this (these tokens are a wish, not something that exists):
glGetIntegerv(GL_MAX_FLOAT_BILINEAR_DEPTH_EXT, &maxFloatBilinearDepth);
glGetIntegerv(GL_MAX_FLOAT_BLENDING_DEPTH_EXT, &maxFloatBlendingDepth);


Your only option is to make a small test on startup and see what performance you get. If it’s not good enough, you can choose your non-blending path.
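
A startup test along these lines could be sketched as below. It is only an outline under some assumptions: the callback type, both function names and the slowdown threshold are made up for illustration, timespec_get needs a C11 library, and the callback is expected to draw a few quads into the float16 target and call glFinish() so the GPU work is finished before the clock is read.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical callback that draws one test frame and ends with glFinish(). */
typedef void (*draw_frame_fn)(void *user);

/* Average wall-clock seconds per frame, or -1.0 on bad arguments. */
double time_frames(draw_frame_fn draw, void *user, int frames)
{
    struct timespec t0, t1;
    int i;

    if (draw == NULL || frames <= 0)
        return -1.0;
    timespec_get(&t0, TIME_UTC);
    for (i = 0; i < frames; ++i)
        draw(user);
    timespec_get(&t1, TIME_UTC);
    return ((double)(t1.tv_sec - t0.tv_sec)
            + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9) / frames;
}

/* Accept the blending path if it costs at most max_slowdown times the same
 * draw without blending. The threshold is a guess that needs tuning. */
int blend_path_ok(double blended_sec, double plain_sec, double max_slowdown)
{
    if (blended_sec < 0.0 || plain_sec <= 0.0)
        return 0;
    return blended_sec <= plain_sec * max_slowdown;
}
```

The idea would be to call time_frames twice, once with GL_BLEND enabled and once without, and feed both results to blend_path_ok. A software fallback should show up as a slowdown far beyond any sensible threshold.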

Another way is to not use floating-point blending at all. Instead, draw to a (rectangle) texture, then in the next pass read the previous pass’s result at the current fragment position and do the blending yourself in the fragment program. If you want, you can do full 32-bit blending this way, and I think it isn’t slower than native blending (or is it?). This way you could do float blending on NV3x hardware.
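
A minimal sketch of that idea, written as a GLSL fragment shader embedded in a C string, might look like the following. The names manual_blend_fs, prevPass and blend_channel are made up, gl_Color stands in for whatever colour the pass really computes, and the shader assumes the viewport starts at the origin and matches the texture size. The equation mirrors glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA). Note that you need two textures and have to swap them between passes, because sampling the texture you are currently rendering into gives undefined results.

```c
/* Fragment shader source: fetch the previous pass at this pixel and blend
 * by hand. Rectangle textures use unnormalized coordinates, so
 * gl_FragCoord.xy addresses the texel under the current fragment. */
const char *const manual_blend_fs =
    "#extension GL_ARB_texture_rectangle : enable\n"
    "uniform sampler2DRect prevPass;\n"
    "void main()\n"
    "{\n"
    "    vec4 dst = texture2DRect(prevPass, gl_FragCoord.xy);\n"
    "    vec4 src = gl_Color;\n"
    "    gl_FragColor = src * src.a + dst * (1.0 - src.a);\n"
    "}\n";

/* CPU reference for one channel of the same equation, handy for checking
 * results. Nothing is clamped, so values above 1.0 survive. */
float blend_channel(float src, float dst, float src_alpha)
{
    return src * src_alpha + dst * (1.0f - src_alpha);
}
```

For accumulating several lights, the last shader line would simply become src + dst, the equivalent of glBlendFunc(GL_ONE, GL_ONE).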


I think your approach would be slower than native float16 blending, but I don’t know by how much. You’ll always have an additional texture (the previous pass’s output) as input to your shader that has to be read (though with native blending the framebuffer has to be read instead, so that’s not really a disadvantage). But if you render to float32 targets you’ll use twice the bandwidth of float16 targets, even without blending.
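
As a rough back-of-the-envelope check of the bandwidth point, assuming four-channel (RGBA) targets and counting only one write per pixel, the cost of a full-screen pass can be worked out like this. The function name rgba_pass_bytes is made up for the example.

```c
#include <stddef.h>

/* Bytes written by one full-screen pass into an RGBA float target.
 * bits_per_channel is 16 for float16 and 32 for float32. */
size_t rgba_pass_bytes(size_t width, size_t height, size_t bits_per_channel)
{
    return width * height * 4 * (bits_per_channel / 8);
}
```

At 1024x768 this gives 6291456 bytes (6 MiB) per pass for float16 and 12582912 bytes (12 MiB) for float32. Reading the destination back, whether by native blending or by a texture fetch in the shader, adds a read of the same size on top.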

Has anybody done detailed benchmarks? It would be very interesting to see how big the difference is.

A few days ago I worked on HDR rendering of a scene to a float16 buffer on GeForce FX hardware. I did it with FBOs, rectangle textures and so on. One disadvantage of float buffers, I think, is that you are forced to use a shader even for what would otherwise be a very simple ambient pass. In my program I had three lights with full ambient, diffuse, specular and attenuation, which isn’t so fast even with normal rendering; with the float16 buffer, and without the final tone mapping, it was about 25% slower. The blending is of course slower when you do full 32-bit blending instead of 16-bit. Someone on this board posted that the blending operation is computed in the memory controller, which may be true. I do it in a fragment program instead, which costs extra performance.