Hi,
I would like to use the conservative depth options on the fragment shader.
Ideally I want to depth cull the fragments to stop the fragment shader doing the work. The issue is I also need to use discard and other operations (like image[Load|Store]()) in the the fragment shader. I have added layout (depth_unchanged) out float gl_FragDepth; to the GLSL and never write to gl_FragDepth. But I can see that the fragment shader is still being run when it should be depth culled.
Do I need to set anything else to enable conservative depth?
Is there other operations that will disable the conservative depth that I need to be aware of?
I did fine a Vulkan reference to if the shader also writes to storage resources!
I have been testing this code against both Windows AMD and nVidia GPUs. Both showing the same behaviour. So I assume I have missed something.
Thanks.
Conservative depth is a way to tell the hardware that your shader won’t modify the fragment’s depth value in a way that would change the results of the depth test. However, that’s all it does.
Discarding a fragment must still discard fragment writes to the depth and stencil buffers. Which means that those tests and writes must not have happened before the fragment shader executes. So using discard in your shader will shut off early depth tests no matter what.
The tool you want is not conservative depth. You want to force early depth tests on in the fragment shader, which changes the meaning of discard in a fragment shader. That is also a GL 4.2 feature, like conservative depth.
So what is the point of telling the GPU that? Why or when should you use it?
The way I read the advantages was so an early-Z-cull could happens but leave the late-Z-write. Which is what I am after. I am not even wanting to change the z-value used.
If I enable early depth testing, I get early-Z-writes, which then causes all the primitives z-values to be updated. This has the desired effect of culling the fragment processing of all the later fragments. But also rejects deeper/later fragments behind the earlier discarded fragments.
The point of conservative depth is that it means that the implementation can sometimes determine the result of the depth test without executing the fragment shader. If it can determine that the depth test will fail based solely upon the interpolated value, it can avoid executing the fragment shader. If the depth test fails, the fragment is discarded and the depth buffer isn’t updated, so the actual value written to gl_FragDepth is irrelevant.
E.g. if the glDepthFunc setting is GL_LESS, glFragDepth has layout(depth_greater), and the interpolated value is greater than the value in the depth buffer, then you know that the depth test fails without having to execute the fragment shader to get the actual value written to glFragDepth. Because you know that it’s greater than the interpolated value and that any value greater than the interpolated value will result in the depth test failing.
Unless I’ve missed (or misunderstood) something, conservative depth test is only useful if the qualifier on glFragDepth has the opposite sense to glDepthFunc. Because that’s the only case where a failure based upon the interpolated value guarantees a failure based upon the shader-calculated value. Note that correctly predicting that the depth test passes doesn’t help you because in that case you end up running the fragment shader so you can just use the calculated value.
The bottom line is that you can’t “optimise” a conditional discard, as you don’t know whether the fragment is discarded without running the fragment shader. The early_fragment_tests qualifier isn’t an optimisation, it’s a mechanism to assert that you want the performance benefits even if it means the depth and/or stencil buffers get updated when the fragment shader executes a discard. Optimisation implies that you get the same result but faster, and this doesn’t do that.
If you use early_fragment_tests, there’s no point in writing gl_FragDepth because that variable will be ignored. The interpolated value will be written to the depth buffer and used to determine whether the depth test passes or fails.
I found this blog that touches on the topic: Issues using Conservative Depth to stop the fragment shader runnig
From its information, my issue that that I am using imageStore(), which is disabling early-z-culling. 
If only I was using Vulkan, then VK_AMD_shader_early_and_late_fragment_tests would help. 