Any way to do depth testing per fragment?


I’m drawing a bunch of fuzzy lines in 3D space, and they can cross and re-cross each other, so it isn’t possible to sort them by depth. If one line is drawn closer to the camera and another line then passes behind it, the second line is not drawn properly where it passes underneath the first: it fails the depth test where the fuzzy edges of the first line are.

Is there any way to check the depth values from within the fragment shader? It seems like this should be possible, but maybe not, given the parallel nature of fragment shaders. If I could compare the two fragments’ depths, I could blend them appropriately. It wouldn’t work once you had a lot of overlapping translucent values, but it would probably be good enough.



Not easily. Or efficiently. You can’t read from a texture which is bound as a framebuffer attachment. For read-modify-write, you can use image atomic operations, but there’s a performance cost due to the atomicity requirement. But at best, that’s going to let you handle two layers correctly (i.e. you can handle one fragment rendered either in front of or behind a previous fragment, but then you have to decide which of the two depths to store, and the choice will affect the handling of a third fragment which lies between the first two).

Common approaches to order-independent transparency are “depth peeling” (which involves multiple render passes, each processing the Nth-nearest (or farthest) fragment), or maintaining a linked list of fragments for each pixel.
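To make the depth-peeling idea concrete, here’s a minimal sketch of the fragment shader for peel pass N. All names (`prevDepth`, `fragColor`, `outColor`) are illustrative; it assumes the depth buffer captured in pass N-1 is bound as a texture:

```glsl
#version 430
// Depth-peeling pass N: reject everything at or in front of the layer
// peeled in the previous pass, so the normal GL_LESS depth test then
// selects the next-nearest layer.
uniform sampler2D prevDepth;   // depth buffer from the previous peel pass
in vec4 fragColor;
out vec4 outColor;

void main() {
    float prev = texelFetch(prevDepth, ivec2(gl_FragCoord.xy), 0).r;
    if (gl_FragCoord.z <= prev)
        discard;               // already handled by an earlier peel
    outColor = fragColor;
}
```

Each pass re-renders the geometry, and the peeled layers are then composited in depth order, so the cost scales with the number of layers you peel.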

Thanks – that’s what I was afraid of. Do you know of any good references to the linked-list approach? How is that different from read-modify-write? All the GPU threads would be accessing the same list, wouldn’t they?

Each pixel has its own linked list (though they are all “allocated” from the same buffer). This is managed through the use of atomics. When a shader invocation “allocates” a new node from the buffer, it does so by bumping an atomic counter. The index it gets becomes the “pointer” for that linked list node (since the buffer is just an array of nodes). It then does an atomic-swap of this “pointer” into the pixel the invocation corresponds to. It takes the “pointer” value it got from the swap and stores it in the linked list node’s “next” field.

So each invocation will have a unique linked list node, and data races as to who puts their node into the pixel first don’t matter, since the swap is atomic. And since each invocation has its own unique linked list node, they can poke at that data without caring about what other invocations are doing.
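A rough sketch of the build pass described above, with illustrative names throughout (`nodeCounter`, `headPtr`, `NodeBuffer`); a separate resolve pass would walk each pixel’s list, sort the nodes by depth, and blend:

```glsl
#version 430
// Per-pixel linked-list build pass (sketch).
layout(binding = 0) uniform atomic_uint nodeCounter;          // shared bump allocator
layout(binding = 0, r32ui) uniform coherent uimage2D headPtr; // per-pixel head "pointer"

struct Node {
    vec4  color;
    float depth;
    uint  next;
};
layout(binding = 0, std430) buffer NodeBuffer {
    Node nodes[];
};

in vec4 fragColor;

void main() {
    // 1. "Allocate" a node by bumping the atomic counter; the returned
    //    index is this node's "pointer" into the buffer.
    uint idx = atomicCounterIncrement(nodeCounter);
    // 2. Atomically swap our index into this pixel's head pointer. The
    //    value we get back is the old head, which becomes our "next" --
    //    so races over insertion order are harmless.
    uint prev = imageAtomicExchange(headPtr, ivec2(gl_FragCoord.xy), idx);
    nodes[idx].color = fragColor;
    nodes[idx].depth = gl_FragCoord.z;
    nodes[idx].next  = prev;
}
```

In practice you also clear `headPtr` to a sentinel value (e.g. `0xFFFFFFFF`) each frame and guard against `idx` running past the node buffer’s capacity.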

Thanks – sounds like I have some things to learn before I can tackle this approach, but I like the idea of it.