NV_texture_barrier

barthold · September 3, 2009, 4:37pm

I’m surprised nobody has noticed this yet in the OpenGL 3.2 beta drivers. You guys are slacking.

http://developer.download.nvidia.com/opengl/specs/GL_NV_texture_barrier.txt

Only supported on 190.57 so far. Have fun!

Barthold
(with my NVIDIA hat on)

Alfonse_Reinheart · September 3, 2009, 5:35pm

It took me a little while to figure out the point of that extension, what with NV_fence and ARB_sync. But it’s about the cache invalidation, isn’t it? Well, that, and that the barrier (probably?) does not halt until the GPU has executed the previous commands.

So basically, you can guarantee that reading and writing to the same image works as long as individual draw calls are separated by one of these barriers, and that draw calls not so separated never read and write from/to the same pixels.
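The ordering rule described above can be sketched as a tiny command recorder. This is only an illustration under stated assumptions: `command_log`, `emit`, and `record_feedback_passes` are hypothetical names, and the recorder stands in for real GL calls so the sketch compiles without a context. In a real renderer, each 'D' would be a glDraw* call that samples from and renders to the same texture, and each 'B' would be a call to glTextureBarrierNV().

```c
#include <stddef.h>

/* Hypothetical stand-in for the GL command stream.
   'D' = a draw that reads and writes the same texture,
   'B' = a texture barrier (glTextureBarrierNV in real code). */
typedef struct {
    char   ops[64];
    size_t count;
} command_log;

/* Appends one op and keeps the buffer NUL-terminated; silently drops
   ops once the buffer is full. */
void emit(command_log *log, char op)
{
    if (log->count + 1 < sizeof log->ops) {
        log->ops[log->count++] = op;
        log->ops[log->count] = '\0';
    }
}

/* Records `passes` draws with a barrier between each consecutive pair,
   so no two draws that touch the same pixels are left unseparated. */
void record_feedback_passes(command_log *log, int passes)
{
    for (int i = 0; i < passes; ++i) {
        if (i > 0)
            emit(log, 'B');   /* barrier before every draw except the first */
        emit(log, 'D');
    }
}
```

Within a single draw the rule still applies: fragments of that draw must not read and write overlapping pixels, which the recorder does not model.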

I’m not quite sure how you get fragment program blending if you can’t read and write to the same pixel during the fragment program.

ScottManDeath · September 3, 2009, 6:01pm

Cool. Works like a charm. Now, finally, I won’t get those artifacts that sometimes appear.

I am doing volume rendering using geometry shaders/texture arrays and ping-pong to do blending/blurring. For the first few slices, I get read/write hazards, but only if I use MRT to accelerate my computation. If I use multiple passes (which involves attaching different textures to the same FBO), I do not get the artifacts. In practice this is not a problem, since the first few slices are at the boundaries of the volume, which are typically transparent through the transfer function. However, it is good to have the option to make it correct. =)

Perf hit is roughly 10 to 15% when calling glTextureBarrierNV 1500 times, once after each rendered slice. I could possibly get away with only calling it for the first dozen draw calls.
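A minimal sketch of that trade-off, assuming the hazards really are confined to the first few slices (an observation from this one renderer, not something the extension promises). The function names and the `guarded_slices` parameter are hypothetical; in real code a true result from `barrier_after_slice` would be followed by a call to glTextureBarrierNV().

```c
/* Returns 1 if a barrier should be issued after slice `slice_index`
   (0-based), 0 otherwise. Only the first `guarded_slices` slices are
   protected; later slices are assumed (not guaranteed) to be hazard-free. */
int barrier_after_slice(int slice_index, int guarded_slices)
{
    return slice_index < guarded_slices;
}

/* Counts how many barriers a run of `slice_count` slices would issue. */
int count_barriers(int slice_count, int guarded_slices)
{
    int barriers = 0;
    for (int i = 0; i < slice_count; ++i)
        barriers += barrier_after_slice(i, guarded_slices);
    return barriers;
}
```

With 1500 slices, guarding all of them gives the 1500 barriers mentioned above, while guarding only the first 12 gives 12. Skipping barriers gives up the defined-results guarantee for the unguarded slices, so this is only appropriate where visible artifacts there are acceptable.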

Ilian_Dinev · September 4, 2009, 1:11am

Tried it, and it works perfectly. Can finally do custom blending as long as fragments from the same draw-call don’t overlap (problem easily solved for some effects with a depth-only pass for that draw-call). Can let ping-pong use only one texture, too. Thanks!

ScottManDeath · September 4, 2009, 9:48am

What about a way of automatically executing a glTextureBarrierNV after each instance has been rendered using any of the glDraw*Instanced calls? That way, I could do all my rendering in a single draw call, w/o having to submit my 2k draw calls individually, each with a corresponding glTextureBarrierNV.

Alternatively, one could also specify an index that triggers a glTextureBarrierNV, similar to the NV_primitive_restart extension. This would also help to reduce the CPU overhead for a complex render target ping-pong class of applications.

Alfonse_Reinheart · September 4, 2009, 10:23am

What about a way of automatically executing a glTextureBarrierNV after each instance has been rendered using any of the glDraw*Instanced calls?

Egh. OpenGL already has enough glDraw* calls. I get the idea and the intent behind it. But OpenGL glDraw* calls aren’t expensive enough to need something like that.

ScottManDeath · September 4, 2009, 2:55pm

I agree, more clutter is to be avoided. But having a special index to trigger a glTextureBarrierNV would accomplish it w/o having to add new draw calls. This would also handle the instancing case.

Alfonse_Reinheart · September 4, 2009, 3:37pm

Primitive restart functionality was not added due to API convenience. It was added because hardware supported it. I highly doubt hardware supports having an index trigger a texture and framebuffer cache flush. And doing it in software would likely be even slower than multiple glDraw* calls.

glDraw* calls are fairly fast as is, so it isn’t a big deal.