Why is early z-cull disabled when touching z-values in a fragment shader? Is there a hardware reason for that?
For front-to-back rendering algorithms it would be ideal to be able to write a z-value (let’s say 0) in the fragment program to exclude a fragment from further computations (for example, once a certain opacity has been accumulated). For that, early z-cull would have to stay enabled while the fragment shader writes z-values. Currently, to have both early z-culls and z-writes, two rendering passes are required.
Should there be an API or a glEnable/glDisable(GL_EARLY_DEPTH_TEST) call to let the user decide whether early z-culls are enabled or disabled?
BTW: Early z-culls do not seem to work on NV30 with drivers 43.35 (at least when using any fragment program). Can anyone confirm this?
[This message has been edited by KlausE (edited 04-02-2003).]
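For readers unfamiliar with the use case in the question, here is a minimal CPU-side sketch of front-to-back compositing with early termination once enough opacity has accumulated. It only illustrates the algorithm, not any GPU API; the function name, the scalar "color", and the threshold value are assumptions made for the example.

```python
def composite_front_to_back(samples, opacity_threshold=0.95):
    """Composite (color, alpha) samples sorted front to back.

    Stops as soon as the accumulated opacity reaches the threshold,
    which is the point where later fragments could be culled.
    Returns (accumulated_color, accumulated_alpha, samples_processed).
    """
    color_acc = 0.0
    alpha_acc = 0.0
    processed = 0
    for color, alpha in samples:
        if alpha_acc >= opacity_threshold:
            break  # everything behind this point is (nearly) invisible
        weight = (1.0 - alpha_acc) * alpha  # remaining transparency times sample opacity
        color_acc += weight * color
        alpha_acc += weight
        processed += 1
    return color_acc, alpha_acc, processed


# Ten identical half-opaque white samples, hypothetical data for illustration.
result = composite_front_to_back([(1.0, 0.5)] * 10, opacity_threshold=0.9)
```

On the GPU, the "break" is what writing a near z-value and relying on early z-cull is meant to achieve for later fragments at the same pixel.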
Early z cull relies on knowing early on that fragments will fail the depth test, typically for small regions of the screen at a time. If your fragment program is going to change the z of its fragments, then the hardware cannot tell whether a fragment (or more likely a group of fragments from a primitive) will fail the test without evaluating the shader. You might want to try the z invariance shader trick for fragment programs and see if that fixes early z cull for fragment rasterization, but that part is just a wild guess.
[This message has been edited by dorbie (edited 04-02-2003).]
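The region-based rejection described in the reply above can be sketched roughly as follows. This is a simplified model, assuming a GL_LESS depth function and a coarse buffer that stores the farthest depth per screen tile; real hardware details differ and are not documented in this thread, and the function name is made up for the example.

```python
def can_reject_tile(tile_depths, prim_min_depth, shader_writes_depth):
    """Decide whether a primitive can be culled for a whole tile before shading.

    tile_depths: depth values currently stored for the pixels of the tile.
    prim_min_depth: nearest depth the primitive can have inside the tile.
    shader_writes_depth: True if the fragment program outputs its own depth.
    """
    if shader_writes_depth:
        # The final depth is unknown until the shader has run,
        # so nothing can be concluded ahead of time.
        return False
    # With GL_LESS a fragment passes only if it is nearer than the stored value.
    # If even the nearest point of the primitive is not nearer than the
    # farthest stored depth, every fragment in the tile must fail.
    return prim_min_depth >= max(tile_depths)


# Hypothetical tile contents for illustration.
hidden = can_reject_tile([0.2, 0.3], 0.5, shader_writes_depth=False)
unknown = can_reject_tile([0.2, 0.3], 0.5, shader_writes_depth=True)
```

The second call shows the situation from the question: once the shader may replace the depth, the conservative answer is always "cannot reject".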
Thanks Dorbie for the quick response.
Hm, that sounds reasonable. However, the hardware should not care about me changing the depth value in the fragment program, as the depth value I want to write into the depth buffer is for the next rendering pass. The decision should be based on the current contents of the depth buffer and the depth value from the primitive.
So a flag telling the hardware not to care about z-value changes in a fragment program would be really handy. That would save the additional rendering pass for setting z-values.
I haven’t heard about the z invariance shader trick. What is it, and where can I find information about it?
Typically, early Z means that the test can happen before shading. It does not mean that the HW can keep testing the Z value while shading is ongoing. If you want functionality like this, you should try using a KIL instruction in the shader. It may not improve your performance, due to HW restrictions or because it creates a bubble in the pipeline, but it will help whenever the HW can do something better. (Also keep in mind that this turns off early Z.)
Interesting that a KIL also disables early Z.
Does the driver disable early z-culls as soon as a fragment program that changes z-values is downloaded? (If yes, why not provide an API to tell the driver not to do that?)
Or is a different path used in the hardware for such shaders?
I see no reason why KIL by itself would disable early Z.
Even though you write Z for the next pass, the Z that you output is what the spec says is going to be tested against in the Z test. Thus, it can’t know what value to test until after the fragment shader, if you touch it in the fragment shader.
I should have been clearer. KIL doesn’t stop all early depth testing, just some of it in certain circumstances. It doesn’t turn off hierarchical Z on ATI chips, but it will typically switch off early depth testing, because the HW only tests either before or after shading. If you KIL and do an early depth test, you could end up writing a Z for a fragment that isn’t there. (The same issue exists with alpha testing.) This can be avoided if you aren’t updating the depth value.
As for why this isn’t exposed as an extension: the driver typically already enables early Z in all the cases the HW can handle while still producing a sensible result. In the particular case you mention, the only depth value it could test against early is the interpolated depth value. Since you are explicitly setting the depth value, that value is not the correct one to test against.
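The KIL hazard described in the reply above can be illustrated with a small model of a single fragment. It assumes a GL_LESS depth function and treats "early" as test-and-write before shading; this is a sketch of the reasoning only, not of how any particular chip is built, and the function and parameter names are invented for the example.

```python
def depth_after_fragment(stored_z, frag_z, shader_kills,
                         test_before_shading, depth_write=True):
    """Return the depth-buffer value after processing one fragment."""
    if not (frag_z < stored_z):
        return stored_z  # fails GL_LESS in either ordering
    if not depth_write:
        return stored_z  # depth buffer is read-only, ordering is irrelevant
    if test_before_shading:
        # Depth was tested and written before the shader ran;
        # a later KIL can no longer undo that write.
        return frag_z
    # Test after shading: a killed fragment never reaches the depth write.
    return stored_z if shader_kills else frag_z


# A fragment at z = 0.4 that the shader kills, against a cleared buffer (1.0).
early = depth_after_fragment(1.0, 0.4, shader_kills=True, test_before_shading=True)
late = depth_after_fragment(1.0, 0.4, shader_kills=True, test_before_shading=False)
```

With depth writes enabled the two orderings disagree for a killed fragment, which is why the early path is typically switched off; with `depth_write=False` they agree, matching the remark that the problem goes away if the depth value isn't updated.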
Klaus, you cannot both write a new depth value and test against the ‘primitive’ value. The depth value you write is the depth value used for the test. The majority of people want it this way because they want to both test & write using the same depth value.
Conceptually, the depth test and write is a framebuffer operation that happens after the shader. You don’t have any mechanism to alter it in the way you suggest; there is only one depth result register.
P.S. My invariance suggestion really relates to vertex programs, with OPTION ARB_position_invariant, it was a brain fart. A neuron misfired somewhere and I figured you might be breaking coarse z by not guaranteeing this. It’s an orthogonal issue anyway. The rest of the earlier post still stands.
[This message has been edited by dorbie (edited 04-04-2003).]
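The point made in this last reply, that the value written by the shader is also the value tested, can be contrasted with the behavior requested earlier in the thread. The sketch below assumes a GL_LESS depth function; the second function is purely hypothetical and models the requested "test with the primitive's depth, write the shader's depth" behavior, which the replies say the API does not offer.

```python
def gl_depth_stage(stored_z, interpolated_z, shader_z=None):
    """Model of the depth stage as described in the thread.

    A single depth value is both tested and written: the shader output
    if the shader writes depth, otherwise the interpolated depth.
    Returns (passed, new_stored_z).
    """
    test_z = interpolated_z if shader_z is None else shader_z
    if test_z < stored_z:
        return True, test_z
    return False, stored_z


def hypothetical_split_stage(stored_z, interpolated_z, shader_z):
    """Hypothetical behavior: test with the interpolated depth, write the shader's."""
    if interpolated_z < stored_z:
        return True, shader_z
    return False, stored_z


# Stored depth 0.2, primitive at 0.3, shader writes 0.0 (illustrative values).
actual = gl_depth_stage(0.2, 0.3, shader_z=0.0)
wanted = hypothetical_split_stage(0.2, 0.3, shader_z=0.0)
```

In the first model the fragment passes and overwrites the buffer with 0.0, while in the hypothetical one it is rejected by the primitive's own depth. Only the second could be decided before the shader runs, which is why the two cannot be combined with a single depth result.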