Improving deferred shading

Sounds about right.

What do they do differently?

I ask because I have a piece of DX sample code that depends on the scene depth, and I just can’t get it to work with OpenGL.

Thanks,
Jan.

OpenGL and D3D define clip space differently. In OpenGL, the vertex positions output by the vertex shader are clipped so that -w <= x <= w, -w <= y <= w, -w <= z <= w. In other words, normalized device coordinates (NDC, after the w divide) are in the range [-1, 1] for x, y and z. As part of the viewport transform, z is then mapped to the range [n, f] (the parameters to glDepthRange) to give the window-space depth Zwin that ends up in the depth buffer:
Zwin = Zndc * (f-n)/2 + (n+f)/2

D3D treats z differently in that it is clipped so 0 <= z <= w. In other words, the NDC z axis ranges from 0 to 1, and the depth range mapping is
Zwin = Zndc * (f-n) + n
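
To make the difference concrete, here is a minimal CPU-side sketch of the two depth-range mappings above. It is only an illustration in Python, not shader code; the function names are made up for this example, and n and f are the depth-range values (0 and 1 by default), not the clip-plane distances.

```python
def gl_window_depth(z_ndc, n=0.0, f=1.0):
    """OpenGL convention: NDC z in [-1, 1] is mapped to [n, f]."""
    return z_ndc * (f - n) / 2.0 + (n + f) / 2.0


def d3d_window_depth(z_ndc, n=0.0, f=1.0):
    """D3D convention: NDC z in [0, 1] is mapped to [n, f]."""
    return z_ndc * (f - n) + n


if __name__ == "__main__":
    # The near and far planes end up at the same window depths in both
    # APIs, but the NDC values that produce them differ.
    print("GL  near/far:", gl_window_depth(-1.0), gl_window_depth(1.0))
    print("D3D near/far:", d3d_window_depth(0.0), d3d_window_depth(1.0))
```

So a shader ported from D3D that treats the sampled depth as NDC z needs an extra 2*d - 1 step under OpenGL, assuming the default depth range.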

You will notice that for this reason projection matrices for D3D and OpenGL generated using the same near and far clip distance look different.
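
As a rough illustration of why the matrices differ, the sketch below compares only the z row of a glFrustum-style projection with that of a right-handed D3D-style projection. The function names are hypothetical, and both assume an eye looking down -z with positive near/far distances.

```python
def gl_ndc_z(z_eye, near, far):
    """NDC z under the OpenGL convention (near -> -1, far -> +1)."""
    a = -(far + near) / (far - near)
    b = -2.0 * far * near / (far - near)
    z_clip = a * z_eye + b
    w_clip = -z_eye
    return z_clip / w_clip


def d3d_ndc_z(z_eye, near, far):
    """NDC z under a right-handed D3D-style convention (near -> 0, far -> 1)."""
    a = far / (near - far)
    b = near * far / (near - far)
    z_clip = a * z_eye + b
    w_clip = -z_eye
    return z_clip / w_clip


if __name__ == "__main__":
    near, far = 1.0, 100.0
    for z in (-near, -10.0, -far):
        print(z, gl_ndc_z(z, near, far), d3d_ndc_z(z, near, far))
```

Same near and far distances, different a and b terms, which is why copying a D3D projection matrix (or code that inverts it) into an OpenGL program gives wrong depths.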

Btw, have you ever thought about using a depth texture as the depth buffer and reconstructing the world coordinates of the pixels from that depth texture? It has some advantages:

  1. the depth texture format is not bound to the color buffer format of the g-buffer
  2. you have extra room in your g-buffer rendertargets
  3. reconstruction is easy:

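uniform mat4 u_unproject;    // maps window coordinates back to eye space
uniform sampler2D tex_depth; // depth texture that was used as the depth buffer
varying vec2 texcoord;       // screen-space texture coordinate in [0, 1]

// inside main():
float depth = texture2D(tex_depth, texcoord).r; // window-space depth in [0, 1]
vec4 pos = u_unproject * vec4(gl_FragCoord.xy, depth, 1.0);
pos.xyz /= pos.w; // perspective divide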

But I’m unsure whether you lose early-Z culling if you render to a depth texture…

Another tip: you can use the depth bounds test when rendering the light quads for limited lights. Just project the bounding box of the light to the screen, then use the x/y extents as the scissor rect (or for the size of the quad). The z extents of the bbox can be used for the depth bounds test. This should help especially for (largely) occluded lights.
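
A minimal sketch of that idea, done on the CPU in Python just to show the arithmetic. It assumes an eye-space box that lies entirely in front of the eye, a gluPerspective-style projection, and the default depth range; perspective and light_bounds are names invented for this example.

```python
import math
from itertools import product


def perspective(fovy_deg, aspect, near, far):
    """gluPerspective-style matrix, rows of a column-vector convention."""
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def light_bounds(box_min, box_max, proj, viewport):
    """Project the 8 box corners; return (scissor rect, depth bounds)."""
    vx, vy, vw, vh = viewport
    xs, ys, zs = [], [], []
    for x, y, z in product(*zip(box_min, box_max)):
        clip = [r[0] * x + r[1] * y + r[2] * z + r[3] for r in proj]
        if clip[3] <= 0.0:
            # A plain w divide is meaningless here; the box would have to
            # be clipped against the near plane first.
            raise ValueError("box corner is behind the eye")
        xs.append(vx + (clip[0] / clip[3] * 0.5 + 0.5) * vw)
        ys.append(vy + (clip[1] / clip[3] * 0.5 + 0.5) * vh)
        zs.append(clip[2] / clip[3] * 0.5 + 0.5)  # window depth, range [0, 1]
    scissor = (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))
    return scissor, (min(zs), max(zs))


if __name__ == "__main__":
    proj = perspective(90.0, 1.0, 1.0, 100.0)
    rect, zrange = light_bounds((-1, -1, -3), (1, 1, -2), proj, (0, 0, 800, 600))
    print("scissor x, y, w, h:", rect)
    print("depth bounds zmin, zmax:", zrange)
```

The rect would go to glScissor and the z pair to glDepthBoundsEXT, after clamping to the viewport and to [0, 1]. Projecting a box this way is conservative, so the result can be larger than the light really needs.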

that was actually my initial question, but none of the suggested approaches worked for me.

how would i setup mat4 unproject?

thanks!

Early-Z should be functional; however, you may see a small performance drop vs. a regular depth buffer from the loss of Z-compression (because it’ll need to be decompressed for the texture unit to read it).

This would result in a slightly larger than necessary box. I’d recommend using this math to get tight bounds:
http://www.gamasutra.com/features/20021011/lengyel_06.htm

Or you can check the GS code in my deferred shading demo for a shader implementation of the above (including z-bounds).

It’s simply the inverse of the modelview-projection matrix, plus you need to bake in a scale-bias to put gl_FragCoord.xy into [-1…1] range. You could alternatively use the texture coordinate instead of gl_FragCoord, which would make the scale-bias texCoord * 2.0 - 1.0.

here’s what i tried, modelview & projection matrices being the same as when i rendered the geometry into the MRT:

const float Depth = texture2D( u_DepthTexture, v_Coordinates ).r;
vec4 Position = gl_ModelViewProjectionMatrixInverse * vec4( (v_Coordinates * 2.0) - 1.0, Depth, 1.0 );
Position.xyz /= Position.w;

nothing. is there a way to check what the output is? gl_FragColor = Position is not very helpful, as it’s clamped.

the gl_ModelViewProjectionMatrixInverse is definitely the wrong matrix. It would project into world space. Another disadvantage is that the full matrix multiplication creates more instructions than the previous approaches. A solution based on ray direction and distance (calculated from the Z buffer) is much easier to debug: render the direction as color, and two perpendicular regular red/green gradients should appear.
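
For reference, a minimal sketch of the ray-based reconstruction, written as plain Python so the numbers can be checked. It assumes a standard OpenGL perspective projection and the default depth range, and it scales the ray by eye-space z rather than by Euclidean distance, which is one common variant and not necessarily what was meant above. The function names are invented for this example.

```python
def eye_z_from_depth(depth, near, far):
    """Turn a [0, 1] depth-buffer value back into eye-space z (negative)."""
    z_ndc = 2.0 * depth - 1.0
    return -2.0 * far * near / ((far + near) - z_ndc * (far - near))


def reconstruct_from_ray(ray_dir, depth, near, far):
    """Scale an eye-space view ray so that it ends at the stored depth.

    ray_dir is the interpolated direction through the pixel; its length
    does not matter as long as its z component is non-zero.
    """
    z_eye = eye_z_from_depth(depth, near, far)
    t = z_eye / ray_dir[2]
    return (ray_dir[0] * t, ray_dir[1] * t, z_eye)


if __name__ == "__main__":
    # Depth 0 is the near plane, depth 1 the far plane.
    print(eye_z_from_depth(0.0, 1.0, 100.0), eye_z_from_depth(1.0, 1.0, 100.0))
    print(reconstruct_from_ray((0.1, -0.05, -1.0), 0.5, 1.0, 100.0))
```

In a shader the ray would typically come from a varying set up at the corners of the fullscreen quad, which is what makes it easy to visualize as a color.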

A light bounding box with scissor test won’t be faster than a light bounding volume, because a volume rejects more pixels, through the better shape and the Z rejection (Z-fail with back faces or Z-pass with front faces).
The depth bounds test can’t be used for deferred lights, because the fragment’s depth is independent of the rendered light volume or fullscreen quad depth.

I upload u_unproject once per frame (view).
It contains the following matrix:

u_unproject = (V * P)^-1

where V is the matrix that transforms from NDC to viewport coordinates (along with the depth-range mapping). P is the projection in use.
Using this matrix, you end up in eyespace.
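
Since the setup was asked about, here is a hedged sketch of how that matrix could be built on the CPU, in Python for easy checking. It assumes column vectors, a gluPerspective-style P and the usual viewport/depth-range transform for V; all helper names are invented for this example.

```python
import math


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def mat_vec(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


def mat_inverse(m):
    """Gauss-Jordan elimination with partial pivoting on a 4x4 matrix."""
    aug = [list(m[i]) + [1.0 if i == j else 0.0 for j in range(4)]
           for i in range(4)]
    for col in range(4):
        pivot = max(range(col, 4), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(4):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[4:] for row in aug]


def perspective(fovy_deg, aspect, near, far):
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def viewport_matrix(x, y, width, height, depth_near=0.0, depth_far=1.0):
    """V: NDC -> window coordinates, including the depth-range mapping."""
    return [
        [width / 2.0, 0.0, 0.0, x + width / 2.0],
        [0.0, height / 2.0, 0.0, y + height / 2.0],
        [0.0, 0.0, (depth_far - depth_near) / 2.0, (depth_far + depth_near) / 2.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def make_unproject(proj, viewport):
    """u_unproject = (V * P)^-1"""
    return mat_inverse(mat_mul(viewport, proj))


def unproject(u_unproject, win_x, win_y, depth):
    """What the shader does: multiply, then divide by w."""
    p = mat_vec(u_unproject, [win_x, win_y, depth, 1.0])
    return [p[0] / p[3], p[1] / p[3], p[2] / p[3]]


if __name__ == "__main__":
    proj = perspective(60.0, 800.0 / 600.0, 1.0, 100.0)
    vp = viewport_matrix(0, 0, 800, 600)
    # Center of the screen at the near plane (depth 0).
    print(unproject(make_unproject(proj, vp), 400.0, 300.0, 0.0))
```

The resulting matrix would be uploaded with something like glUniformMatrix4fv, keeping in mind that OpenGL expects column-major storage unless the transpose flag is set.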

Another disadvantage is that the full matrix multiplication creates more instructions than the previous approaches.

Sad but true :) Maybe, if you take a close look at the numbers, there’s a shortcut…

A solution based on raydirection and distance…

The nice thing about it is that it works for orthographic projections out of the box, where raydirection+distance wouldn’t.

The depth bounds test can’t be used for deferred lights, because the fragment’s depth is independent of the rendered light volume or fullscreen quad depth.

The depth bounds test can be very well used for the deferred lights, since it only depends on the depth already in the framebuffer, not on the fragment’s depth. Not available on ATI, though.

For this I usually just multiply it with a constant to bring the expected range down to [0…1]. Like if your world is about 1000 units large, you just output position * 0.001. Alternatively you can output something like (position.xyz > min) && (position.xyz < max) which should highlight a box around the specified coords for intersecting geometry if your math is right.
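
Both debug outputs can be mimicked on the CPU to see what the framebuffer would show. This is only a Python sketch of the idea, with invented function names; in the shader the same expressions would be written to gl_FragColor.

```python
def debug_scaled(position, scale=0.001):
    """Scale a position into color range; the clamp mimics the framebuffer."""
    return tuple(min(max(c * scale, 0.0), 1.0) for c in position)


def debug_in_box(position, box_min, box_max):
    """White if the position lies strictly inside the box, else black."""
    inside = all(lo < c < hi for c, lo, hi in zip(position, box_min, box_max))
    return (1.0, 1.0, 1.0) if inside else (0.0, 0.0, 0.0)


if __name__ == "__main__":
    pos = (250.0, 500.0, -100.0)
    print(debug_scaled(pos))
    print(debug_in_box(pos, (200.0, 400.0, -200.0), (300.0, 600.0, 0.0)))
```

Note that negative coordinates still clamp to black with the plain scale, so for eye-space positions (where z is negative) an extra bias or a sign flip is probably needed before anything shows up.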

Btw, I think you may need to use 2.0 * depth - 1.0 instead of just depth because IIRC sampling a depth texture returns values in [0…1].
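
To see how much the missing remap matters, here is a small numeric sketch in Python. It assumes a gluPerspective-style projection, the default depth range, and texture coordinates covering the screen in [0, 1]; the helper names are invented. It also inverts the projection by hand instead of using a full inverse matrix, which is only valid for this symmetric perspective case.

```python
import math


def perspective_terms(fovy_deg, aspect, near, far):
    """Non-zero entries of a gluPerspective-style matrix: (sx, sy, a, b)."""
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    a = (far + near) / (near - far)
    b = 2.0 * far * near / (near - far)
    return f / aspect, f, a, b


def project_to_texcoord(eye, terms):
    """Forward path: eye-space point -> (u, v, depth), each in [0, 1]."""
    sx, sy, a, b = terms
    w = -eye[2]
    ndc = (sx * eye[0] / w, sy * eye[1] / w, (a * eye[2] + b) / w)
    return tuple(0.5 * c + 0.5 for c in ndc)


def reconstruct(u, v, depth, terms, remap_depth=True):
    """Inverse path, as a fragment shader would do it per pixel."""
    sx, sy, a, b = terms
    x_ndc = 2.0 * u - 1.0
    y_ndc = 2.0 * v - 1.0
    # The sampled depth is in [0, 1]; NDC z is in [-1, 1].
    z_ndc = 2.0 * depth - 1.0 if remap_depth else depth
    z_eye = -b / (a + z_ndc)
    return (x_ndc * -z_eye / sx, y_ndc * -z_eye / sy, z_eye)


if __name__ == "__main__":
    terms = perspective_terms(60.0, 4.0 / 3.0, 1.0, 100.0)
    u, v, d = project_to_texcoord((0.5, -0.25, -5.0), terms)
    print("with remap:   ", reconstruct(u, v, d, terms, remap_depth=True))
    print("without remap:", reconstruct(u, v, d, terms, remap_depth=False))
```

With the remap the original point comes back; without it the reconstructed position ends up much farther away, which on screen can easily look like "nothing".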

Yes, but that’s what we want isn’t it?

The difference is about 3 instructions. Probably not critical in a generally bandwidth heavy application (which deferred shading typically is) but saving instructions is of course always good. Depending on what you’re doing though the matrix trick could be better, like if you need to multiply the result with a matrix anyway, like for shadowmapping, they could be baked into the same matrix.

Typo? The inverse of the model-view-projection matrix obviously projects from clip space to model space. View space may also be more convenient than world space. skynet got it right.
