I’m seeing a huge performance drop (roughly 40x) when copying data from a pbuffer to a texture with certain floating-point internal formats.
Because I’m trying to keep the code portable, I’m using glCopyTexSubImage2D rather than render-to-texture. That said, I have also tried render-to-texture on Windows, with the same results. I also tried different drivers on both Linux and Windows, using a GF6800-GT.
First I tried an fp rgba texture rectangle and a pbuffer with rgba nv float components. This works fine.
Because I’m writing a realtime gpgpu application, any increase in performance is welcome. I only needed one fp channel, so in order to decrease the memory bandwidth I used an fp red texture rectangle and a pbuffer with a red nv float component. Again this works fine, and I got a noticeable speedup due to the decreased bandwidth.
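To put a rough number on the bandwidth difference, here is a minimal sketch of the arithmetic. It assumes 32-bit floats per channel and a 512x512 surface, both of which are illustrative assumptions rather than my actual setup, and `copy_bytes` is a hypothetical helper, not part of any API.

```c
#include <stddef.h>

/* Hypothetical helper: bytes moved by one full-surface copy, given the
 * surface size, the number of channels, and the bits per channel.
 * Assumes bits_per_channel is a multiple of 8 (e.g. 16 or 32). */
size_t copy_bytes(size_t width, size_t height,
                  size_t channels, size_t bits_per_channel)
{
    return width * height * channels * (bits_per_channel / 8);
}
```

Under those assumptions, `copy_bytes(512, 512, 4, 32)` gives 4194304 bytes for rgba, while `copy_bytes(512, 512, 1, 32)` gives 1048576 bytes for a single red channel, so the one-channel copy moves a quarter of the data.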
Finally I arrived at the point where the application needs texture filtering and mipmaps, so I decided to use ati fp textures. I created an fp rgba texture and a pbuffer with rgba ati float components. Again this works fine.
Note: this also works when specifying a pbuffer with rgba nv float components.
Then I tried to reduce the bandwidth again. I created an fp intensity texture and a pbuffer with rgba ati float components. If I understood the spec correctly, the copy should take the intensity from the red color channel. This is where I experienced the huge performance drop.
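To spell out what I expect the copy to do, here is a small CPU-side model of the rgba-to-intensity conversion as I read the spec (I = R). It only illustrates the expected result; it is not what the driver actually runs, and `rgba_to_intensity` is a hypothetical name.

```c
#include <stddef.h>

/* CPU model of the expected conversion: for each texel, the intensity
 * value is taken from the red component of the rgba source, and the
 * green, blue and alpha components are ignored.
 * rgba holds 4 floats per texel; intensity holds 1 float per texel. */
void rgba_to_intensity(const float *rgba, float *intensity,
                       size_t texel_count)
{
    for (size_t i = 0; i < texel_count; ++i) {
        intensity[i] = rgba[4 * i]; /* red is component 0 of each texel */
    }
}
```

If the copy behaves this way, the intensity texture should end up holding exactly the red values that were rendered into the pbuffer.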
Also, in the document about NVIDIA OpenGL Texture Formats, I noticed that NVIDIA hardware substitutes ati_float_rgba16 for ati_float_rgb16. So I thought that specifying an fp rgb texture and a pbuffer with rgba ati float components would give the same speed as an fp rgba texture. But I still see the same huge performance drop.
How do I set up the pbuffer to allow a fast copy of only one fp channel to an ati_float_texture?