Depth prepass causes z-fighting.

grumbler · June 12, 2018, 11:34pm

Cannot get depth prepass to work as calculated depth values occasionally differ. Depth test fails in draw pass sporadically (sometimes for some view angles all depth values pass equality check in draw pass, sometimes nearly half will fail).

changing depth near and far from 0.1-1000.0 to 0.5-100.0 makes no perceivable difference in amount of z fighting (besides visibly cutting off some of the scene).
depth buffer format: float32
validation layer has nothing to complain about (LunarG from VulkanSDK 1.1.73.0).
pipelines are nearly identical (differences: color attachments, descriptor sets for textures, depth write, depth compare op - LESS for prepass and EQUAL later).
no cache used for pipeline nor shader compilation (shaders always recompiled via shaderc from VulkanSDK 1.1.73.0).
shaders compiled with target environment vulkan and set warnings as errors.
gl_Position is decorated with “invariant” for both and both have the exact same code calculating it.
depth prepass has no fragment shader (adding a dummy one that does nothing except declare “layout(early_fragment_tests) in;” makes no difference).
draw pass has fragment shader with “layout(early_fragment_tests) in;” and never does anything with depth.
did not notice anything weird with the draw calls and associated state with NSight - except:
glsl decompilation in it has different code appended to gl_Position calculation (both: “gl_Position.y = -gl_Position.y;” and only one has “gl_Position.z = 2.0 * gl_Position.z - gl_Position.w;”)

I suspect the discrepancy in the decompile to be NSight specific as SPIR-V does not seem to have any of that extra code for either shader (also, see the opening statement). But it is suspicious - something must look different to NSight for it to trigger only for one.

I am out of ideas how to proceed. Ideas?

Minimal shaders that cause problems (slight deviations in the calculated result with identical input [same vertex buffer is sent to both]). Code is captured as shaderc receives it (to be sure i don’t send some wrong code and NSight also agrees - with the discrepancy as described earlier):

// depth prepass
#version 450
#pragma shader_stage(vertex)
#extension GL_ARB_separate_shader_objects : enable

layout(location=0) in vec3 inPos;

invariant gl_Position;
layout(push_constant) uniform Push { vec4 proj, pos, rot; } par;

vec4 projection(vec3 v) { return vec4(v.xy * par.proj.xy, v.z * par.proj.z + par.proj.w, -v.z); }
vec3 qrot(vec4 q, vec3 v) { return v + 2.0 * cross(q.xyz, cross(q.xyz, v) + q.w * v); }
vec4 qinv(vec4 q) { return vec4(-q.xyz, q.w); }
vec3 transInv(vec3 v, vec4 pos, vec4 rot) { return qrot(qinv(rot), (v - pos.xyz) / pos.w); }

vec3 projAndGetPos() {
    vec3 pos = inPos * (32767.0 / 1024.0);
    gl_Position = projection(transInv(pos, par.pos, par.rot));
    return pos;
}

void main() {
    projAndGetPos();
}

// draw pass
#version 450
#pragma shader_stage(vertex)
#extension GL_ARB_separate_shader_objects : enable

layout(location=0) in vec3 inPos;
layout(location=1) in vec2 inSelColorTex;
layout(location=2) in vec4 inNormSelCover;
layout(location=3) in vec4 inTexSet;

invariant gl_Position;
layout(push_constant) uniform Push { vec4 proj, pos, rot; } par;
layout(location=0) out Frag { vec3 pos; float selTex; vec4 color; vec3 normal; float selCover; vec3 tocam; flat vec4 texSet; } sOut;

vec4 projection(vec3 v) { return vec4(v.xy * par.proj.xy, v.z * par.proj.z + par.proj.w, -v.z); }
vec3 qrot(vec4 q, vec3 v) { return v + 2.0 * cross(q.xyz, cross(q.xyz, v) + q.w * v); }
vec4 qinv(vec4 q) { return vec4(-q.xyz, q.w); }
vec3 transInv(vec3 v, vec4 pos, vec4 rot) { return qrot(qinv(rot), (v - pos.xyz) / pos.w); }

vec3 projAndGetPos() {
    vec3 pos = inPos * (32767.0 / 1024.0);
    gl_Position = projection(transInv(pos, par.pos, par.rot));
    return pos;
}

void main() {
    sOut.pos = projAndGetPos();
    sOut.selTex = inSelColorTex.y;
    sOut.color = vec4(0.0);
    sOut.normal = inNormSelCover.xyz;
    sOut.selCover = inNormSelCover.w;
    sOut.tocam = par.pos.xyz - sOut.pos;
    sOut.texSet = inTexSet * 255.0;
}

live3v1l · June 16, 2018, 5:10am

Disassemble the compiled shader code to SPIR-V assembly and check for inconsistencies.
Do you still see z-fighting if you clear the depth buffer after the first pass and change the depth compare op to LESS?

krOoze · June 16, 2018, 9:21am

That would be NDC conversion from Vulkan to OpenGL. Maybe a hack in their codebase?

Make sure NaNs won’t happen. NaN compares unequal to everything, including itself.
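To illustrate the NaN point on the CPU side, here is a minimal C++ sketch. The helper names `depthTestEqual` and `makeNaN` are made up for this example and only mimic an EQUAL compare op on two float depths; they are not part of any Vulkan API.

```cpp
#include <limits>

// Hypothetical stand-in for a depth test with compare op EQUAL.
bool depthTestEqual(float fragmentDepth, float storedDepth) {
    return fragmentDepth == storedDepth;
}

// Returns a quiet NaN to feed into the comparison.
float makeNaN() {
    return std::numeric_limits<float>::quiet_NaN();
}
```

With IEEE 754 semantics (no fast-math flags), `depthTestEqual(makeNaN(), makeNaN())` is false even though both arguments come from the same expression, so a NaN depth could never pass an EQUAL test.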

SFLOAT depth buffer can be weird for multiple other reasons. Does it work with UNORM format?

grumbler · June 16, 2018, 10:00am

SPIR-V is rather difficult to read - i did not see anything weird. I primarily looked to make sure the compiler did not add any code (y-invert, clipspace adjustments) and that my invariant decoration was intact. As far as i can tell - no, it did not add anything that my glsl did not say. If anyone is more proficient reading that ‘gibberish’ or has an idea what to look out for: https://pastebin.com/2Kx7LGsN (depth prepass), https://pastebin.com/WJhymWkG (draw pass).
Tried that for sanity's sake: no z-fighting. The depth pass and draw pass are subpasses of the same renderpass, so i used vkCmdClearAttachments in the draw pass.

Renderpass creation (pardon the custom wrapper):


        GlwRenderPass::Desc desc;
        desc.attach(formatDepth).color(VK_ATTACHMENT_LOAD_OP_CLEAR, VK_ATTACHMENT_STORE_OP_DONT_CARE).layout(VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL);
        desc.attach(formatAccum).color(VK_ATTACHMENT_LOAD_OP_CLEAR, VK_ATTACHMENT_STORE_OP_STORE).layout(VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL);
        desc.attach(formatColor).color(VK_ATTACHMENT_LOAD_OP_DONT_CARE, VK_ATTACHMENT_STORE_OP_DONT_CARE).layout(VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL);
        desc.attach(formatParam).color(VK_ATTACHMENT_LOAD_OP_DONT_CARE, VK_ATTACHMENT_STORE_OP_DONT_CARE).layout(VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL);
        desc.subpass().bindDepth(0);
        desc.subpass().bindDepth(0).bindColor(1).bindColor(2).bindColor(3);
        desc.dependency(VK_SUBPASS_EXTERNAL, VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT         , VK_ACCESS_MEMORY_READ_BIT
                       ,0                  , VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT);
        desc.dependency(0                  , VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT
                       ,1                  , VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT        , VK_ACCESS_SHADER_READ_BIT);
        desc.dependency(1                  , VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT
                       ,VK_SUBPASS_EXTERNAL, VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT         , VK_ACCESS_MEMORY_READ_BIT);
        if(!passMain.init(desc)) return false;

Should be overly restrictive enough. Also, i have confirmed via NSight that what is written is what driver got.

For sanity went over the two failing test draw calls in NSight again:

pipeline: pipeline object id and subpass differs (0 = depth prepass, 1 = draw pass). Ok.
renderpass: same renderpass object id, current subpass differs. Ok.
FBO: no differences. Ok.
Input assembly: prepass uses only position, draw uses also the rest of vertex data. Ok.
VS: interface differs (the expected extra vertex attributes and presence of outputs). Ok.
Rasterization state: no differences. Ok.
Pix Ops: depth op and write enable differs. Draw pass has color attachments. Ok.
The rest is identical (ex: exact same push constants etc) - unless i am going blind.

grumbler · June 16, 2018, 10:11am

Possibly. And i suspect it to be misbehaving for some reason (it now adds the z and y adjustments to only one of the shaders).

NaN issue is unlikely to produce z-fighting patterns while the correct model with correct projection shows up.

Tried VK_FORMAT_D16_UNORM. Expectedly there is A LOT less z-fighting - but it is still there.
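One plausible reading of the D16 result: quantizing to 16 bits maps most pairs of nearly equal floats to the same stored value, but a pair that straddles a rounding boundary still ends up different. The C++ sketch below assumes a simple scale-and-round-to-nearest conversion; `toUnorm16` and `oneUlpBelow` are hypothetical helpers, and real hardware may round differently.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: quantize a depth in [0, 1] to 16-bit UNORM by
// scaling and rounding to nearest (an assumption, not the driver's code).
std::uint16_t toUnorm16(float depth) {
    double scaled = static_cast<double>(depth) * 65535.0;
    return static_cast<std::uint16_t>(std::lround(scaled));
}

// The representable float immediately below `depth`.
float oneUlpBelow(float depth) {
    return std::nextafter(depth, 0.0f);
}
```

Under these assumptions, `0.25f` and `oneUlpBelow(0.25f)` both quantize to 16384, while `0.5f` quantizes to 32768 and `oneUlpBelow(0.5f)` to 32767. That would match "a lot less, but still there".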

grumbler · June 16, 2018, 10:27am

Example image to clarify how it looks: screenshot 2018 06 16 20 17 13 957 on Postimages (model via surface nets with triangle sizes evident from z-fighting patterns)

The patterns are stable (ie. if i do not move then the exact same image is drawn).
The patterns are heavily dependent on camera (projection, location, rotation) and change a lot.
There are views that do not have any errors.

krOoze · June 16, 2018, 3:13pm

You said they fail the EQUAL test. That is not really “z-fighting”, is it?

Shouldn’t your 0-1 subpass dependency use dstStage=VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT and srcAccess=VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT?

Could always be the driver ignoring Invariant, I guess. Do you have access to a GPU from different vendor?

grumbler · June 18, 2018, 12:48pm

Yes, it is z-fighting. The semi-random disagreement of equality is the cause of the visual z-fighting effect. Z-fighting describes the visual effect, not any specific cause of it (insufficient depth precision for the close/intersecting geometry OR insufficient offset for co-planar geometry OR my case - unexpected variance in vertex position).
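A minimal C++ sketch of why an EQUAL test is so sensitive: it simulates a single pixel where the prepass writes depth with LESS and the draw pass then tests with EQUAL. The function names are made up for this illustration, and the cleared depth of 1.0 is an assumption.

```cpp
#include <cmath>

// Compare ops as plain float comparisons (hypothetical helpers).
bool passesLess(float fragmentDepth, float storedDepth) {
    return fragmentDepth < storedDepth;
}

bool passesEqual(float fragmentDepth, float storedDepth) {
    return fragmentDepth == storedDepth;
}

// Simulates one pixel: the prepass stores its depth if it passes LESS
// against an assumed clear value of 1.0, then the draw pass tests EQUAL.
bool pixelSurvives(float prepassDepth, float drawDepth) {
    float stored = 1.0f;
    if (passesLess(prepassDepth, stored)) {
        stored = prepassDepth;
    }
    return passesEqual(drawDepth, stored);
}
```

`pixelSurvives(0.75f, 0.75f)` is true, but `pixelSurvives(0.75f, std::nextafter(0.75f, 1.0f))` is false: a difference of a single representable step in the draw pass depth is enough to reject the fragment.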

Yes. It should. Fixed. Sadly, it was not the cause of the problem (did not expect it to be: synchronization errors are unlikely to result in 100% stable artifacts. Good catch though, i had never noticed it).

The invariance section in the Vulkan spec does not seem to cover this. However, SPIR-V does: invariant does not look to be ignorable. If it is a driver bug then it must be a very special case one, as i cannot be the only one to expect invariance to work. (Note: updated my drivers at the beginning of this month. Win7.)

Nope.

live3v1l · June 20, 2018, 5:38am

I can test it on my GPUs if you provide executable

grumbler · June 20, 2018, 10:24am

Thanks for the offer, but unfortunately the project has too many dependencies for that to be done at this time :(.

There has been some slight development.

I managed to find the spot where the vertex position calculations diverge (due to the driver inexplicably ignoring “invariant” decoration).

    // same for both depth prepass and draw pass
    vec3 pos = inPos * (32767.0 / 64.0);
    gl_Position = projection(transInv(pos, par.pos, par.rot));

If i use “sOut.pos = pos;” in draw pass then vertex coordinates will diverge. If i use “sOut.pos = vec3(0.0);” then everything works as intended (except that i actually do require the position in real code).

Possibly precision loss due to register spill differences? Either way, at least i can say fairly certainly that there is nothing else going on and the driver just ignores or fails to adhere to “invariant”, which as far as i can tell is a bug (Win7 x64, GeForce GTX 660, driver 397.93). This is pretty much the worst case scenario in every way. Would love that to not be the case, but i cannot see any other explanation.
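As a general illustration (not a claim about what this particular driver does), two compilations of the same source expression can legitimately round differently, for example when one fuses a multiply and add and the other does not. The C++ sketch below shows this on the CPU; the function names and sample values are chosen for the example only.

```cpp
#include <cmath>

// a * b is rounded to float first, then c is added (two roundings).
float mulThenAdd(float a, float b, float c) {
    volatile float product = a * b; // volatile keeps the compiler from fusing
    return product + c;
}

// a * b + c evaluated with a single rounding at the end.
float fusedMulAdd(float a, float b, float c) {
    return std::fma(a, b, c);
}

// Sample inputs: the exact product a * a lies halfway between two floats.
float sampleA() { return 1.0f + std::ldexp(1.0f, -12); }    // 1 + 2^-12
float sampleC() { return -(1.0f + std::ldexp(1.0f, -11)); } // -(1 + 2^-11)
```

For these inputs `mulThenAdd(sampleA(), sampleA(), sampleC())` gives exactly 0, while `fusedMulAdd` with the same arguments gives 2^-24. Preventing this kind of variation between shaders is what “invariant” is meant to do, which is why differing results despite it look like a bug.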

The only workaround i have found is: BOTH shaders must emit ‘pos’ (that way the driver will HAPPEN to generate the same code - “invariant” does nothing and can be omitted randomly or just everywhere).

Anyone know some full-code minimal vulkan example or something that could be easy to modify for bug verification/reproduction purposes?

krOoze · June 21, 2018, 2:23pm

Hmmm, try to simplify the expressions. Try just:


void main() {
    vec3 pos = inPos * (32767.0 / 1024.0);
    gl_Position = projection(transInv(pos, par.pos, par.rot));
    ...
}

To be fair it does seem to do some weird stuff in SPIR-V inside projAndGetPos in the depth-prepass vs the draw.

Heh, it’s weird that the SPIR-V shows OpConstant %6 511.984, but 32767.0 / 1024.0 is 31.999 on my calculator…

grumbler · June 22, 2018, 6:06am

Did not specifically mention it, but when i was hunting for the cause of the divergence i tried pretty much everything, including expanding ALL my functions into plain code in main and separating the ‘pos’ calculation for ‘projection’ from the ‘sOut.pos’ calculation. None of it matters as common sub-expression elimination will rewrite it all to the same code anyway, as expected (ie. no driver errors there it seems).

Also, it seemed (ie. it could be pure coincidence) that nvidia drivers are able to optimize (eliminate dead code, for example) across shader boundaries, as i was able to trigger/solve the error purely with changes in the fragment shader (using ‘sOut.pos’ from the vertex shader vs not using it).

Those definitely should not differ in any way … unless dead code elimination rewrote those (while still adhering to the ‘invariant’ restrictions). Did not notice differences, but i might have looked in the wrong place without realizing (SPIR-V is a bit hard to read, to say the least). Investigating …

My apologies. There was a scale change between the posts of source and spir-v, the division changed to 64.0 (~= 511.984). The change was so minor that i forgot to point it out later when posting SPIR-V.
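The two scale factors can be checked directly. Both are exactly representable in 32-bit float, so the constant seen in the SPIR-V is consistent with the division by 64.0. The constant names below are made up for this sketch.

```cpp
// Scale applied to inPos in the GLSL posted at the top of the thread.
constexpr float scaleInPostedGlsl = 32767.0f / 1024.0f;  // 31.9990234375

// Scale applied to inPos in the version the SPIR-V dumps came from.
constexpr float scaleInPostedSpirv = 32767.0f / 64.0f;   // 511.984375
```

So 511.984 in the disassembly is `32767.0 / 64.0`, and 31.999 on the calculator is `32767.0 / 1024.0`.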

grumbler · June 22, 2018, 6:23am

projAndGetPos investigation. The function seems to be the last in both spir-v pastebin posts (%31). As far as i can see:

depth prepass one starts at line 184.
draw pass one starts at line 248.
both are completely identical (inc. all % reference numbers).

krOoze · June 22, 2018, 9:52am

Well, I am giving increasingly desperate suggestions… The gl_Position actually seems to be written outside the projAndGetPos function in the SPIR-V. And main() seems unused; but I won’t pretend to understand auto-generated SPIR-V.

SDK 77 is out, any luck? You may also try the precompiled master build from the KhronosGroup/glslang Releases page on GitHub.