RTT/pbuffer questions

I’ve read the RTT spec a couple of times, but it still leaves me confused. Issue 10 and the use of CopyTexSubImage concern me most.

Basically, I want to implement this functionality (in a fragment program):
textureA <= textureA OP textureB OP textureC

The spec clearly states that I can’t have a texture bound to the pbuffer while rendering to it. At first I thought I might use a double-buffered pbuffer and bind the texture to one of the buffers, but issue 14 says that’s a no-no.

So my questions are: can I render to the pbuffer and make a CopyTexSubImage call to copy the contents to textureA when textureA is a floating-point texture? If textureA is bound to the pbuffer at the time of the CopyTexSubImage call, will this mess everything up? Or should I have another texture, let’s say A2, and copy from the pbuffer (bound to A) to A2?

Like this,
textureA <= textureA2 OP textureB OP textureC
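The two-texture scheme above is essentially ping-ponging: read from one copy, write to the other, then swap roles. Here is a minimal CPU-side sketch of that bookkeeping only, with no OpenGL calls. The names `Texture`, `PingPong`, and `pass` are hypothetical, and `in + b * c` is an arbitrary stand-in for OP, which the post does not specify.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a texture: a flat array of floats.
using Texture = std::vector<float>;

// Two copies of "textureA": one is read, the other is written, then they swap.
struct PingPong {
    std::array<Texture, 2> buf;
    int src = 0;  // index of the copy currently used as input

    const Texture& source() const { return buf[src]; }
    Texture& target() { return buf[1 - src]; }
    void swap() { src = 1 - src; }
};

// One pass: target <= source OP b OP c, where OP is assumed to be
// "in + b * c" purely for illustration.
void pass(PingPong& a, const Texture& b, const Texture& c) {
    const Texture& in = a.source();
    Texture& out = a.target();
    assert(in.size() == out.size());
    assert(in.size() == b.size() && in.size() == c.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = in[i] + b[i] * c[i];
    }
    a.swap();  // the freshly written copy becomes the next pass's input
}
```

After each `pass`, `source()` holds the latest result, so the input is never the buffer being written.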

I wanted to see if someone had tried this before, before I start coding.
All help/thoughts greatly appreciated.
/Mathias

The spec clearly states that I can’t have a texture bound to the pbuffer while rendering to it. At first I thought I might use a double-buffered pbuffer and bind the texture to one of the buffers, but issue 14 says that’s a no-no.

The spec doesn’t forbid this; it just says the results are undefined. And, as real life shows, you may ignore both restrictions. Issue 14 has been identified as safely ignorable. As for rendering to a bound pbuffer: as long as you don’t sample the pbuffer too far from the pixel currently being written (within a 1-pixel distance, probably), it will work as expected. On my GF3 this worked well:

  PBuffer.MakeCurrent();      /* make the pbuffer the current render target */
  PBuffer.ReleaseTexImage();
  PBuffer.BindTexImage();
  /* now bind the pbuffer's texture to a texture unit */
  /* and render to the pbuffer */

In one GDC’03 presentation, Nvidia shows an example that uses this technique; that should be enough to stop you worrying about the spec violation.

So my questions are, can I render to the pbuffer and make a CopyTexSubImage call to copy the contents to textureA(…)

I’ve had bad experiences with this. I had a pbuffer created with the RTT flags. When I tried to CopyTexSubImage from it, my app slowed down to single-digit FPS.

You may try this (not tested):
Let’s assume you render to a 640x480 window.
Create one double-buffered 640x960 RTT pbuffer. When rendering to it, split it logically in half using viewport & scissor. This gives you 4 full-screen-size render targets (the front and back buffers, each split in two), all sharing a single depth+stencil buffer. This sharing should be critical for performance (I can’t be sure, but I’d bet that switching the depth render target flushes and wastes any “early Z culling”, “hierarchical Z compression”, etc. hardware). The drawback is that you will have to render depth twice (once for the upper half and once for the lower half). And let’s hope the scissor won’t introduce any clipping problems.
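To make the split concrete, here is a small sketch of the rectangle arithmetic for the two halves, assuming they are stacked vertically with the lower half starting at y = 0. `Rect` and `halfRect` are hypothetical names, not part of any API.

```cpp
#include <cassert>

// Hypothetical helper type: a rectangle in the (x, y, width, height) form
// that glViewport and glScissor take.
struct Rect {
    int x, y, width, height;
};

// Returns the rectangle of the lower (half == 0) or upper (half == 1) half
// of a pbuffer whose two logical render targets are stacked vertically.
Rect halfRect(int pbufferWidth, int pbufferHeight, int half) {
    assert(pbufferHeight % 2 == 0);
    assert(half == 0 || half == 1);
    const int h = pbufferHeight / 2;
    return Rect{0, half * h, pbufferWidth, h};
}
```

For the 640x960 pbuffer, `halfRect(640, 960, 1)` gives the upper 640x480 region. The same four values would go to both `glViewport` and `glScissor`, with `GL_SCISSOR_TEST` enabled so that clears stay inside the selected half.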

Originally posted by MZ:
In one GDC’03 presentation, Nvidia shows an example that uses this technique; that should be enough to stop you worrying about the spec violation.

You don’t say. Thanks, I’ll have to try that.
Maybe I should stop being so overcautious at times.

The drawback is that you will have to render depth twice (once for upper and once for lower halves). And let’s hope the scissor won’t introduce any clipping problems.
That sounds like a good idea, especially since I don’t use depth testing.

Cool, lots of new ideas. Thanks again!

If those operations commute, you can use blending.
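As an illustration of why commutativity matters here, the sketch below models additive blending, i.e. `glBlendFunc(GL_ONE, GL_ONE)` with the default blend equation, on the CPU. It assumes OP is addition, which the thread does not state. `Buffer` and `blendAdd` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a render target: a flat array of floats.
using Buffer = std::vector<float>;

// Models additive blending (source factor GL_ONE, destination factor GL_ONE):
// each destination value becomes src + dst.
void blendAdd(Buffer& dst, const Buffer& src) {
    assert(dst.size() == src.size());
    for (std::size_t i = 0; i < dst.size(); ++i) {
        dst[i] += src[i];
    }
}
```

Blending B and then C into A gives the same result as blending C and then B, so A can be accumulated into directly without ever being sampled as a texture. Whether blending is available for a floating-point pbuffer depends on the hardware and should be checked separately.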