When and how does OpenGL calculate F_depth (the depth value)?

I have been following LearnOpenGL, and in the depth testing chapter the author says:

Depth testing is done in screen space after the fragment shader has run.

Meaning at this point the projection has already been done. This article gives us the projection matrix used by OpenGL, and the row that affects the z-coordinate of a point is:

[ 0    0    -(f+n)/(f-n)    -2fn/(f-n) ]  

Note that this matrix is constructed to map the ‘pyramidal’ frustum to a unit cube, meaning the z-coordinate has also been mapped to [0,1] once this matrix is applied.

Then, the author in the depth value precision chapter tells us:
These z-values in view space can be any values between frustum’s near and far plane and we need some way to transform them to [0,1].
My question is: why is this needed at this point, when we had already mapped z while applying the projection matrix?

Also, he says:
a linear depth buffer like this:

F_depth = (z - near)/(far - near)

is never used; for correct projection properties a non-linear depth equation is used:

F_depth = (1/z - 1/near)/(1/far - 1/near)

But, as we have seen, z is mapped into range using:

[ 0    0    -(f+n)/(f-n)    -2fn/(f-n) ]    

Which appears to be linear.

All these contradicting statements are making me really confused about when the depth for fragments is calculated and compared, and what equation is actually used to compute it. In my understanding, nothing more should be calculated for depth after the OpenGL projection matrix is applied, but after reading this I’m really confused. Any clarification would be greatly appreciated. Thanks in advance.

You should probably do the math on some hypothetical values to confirm it. You’ll likely find yourself with values quite far outside of the [0, 1] range.

Because you haven’t. “View/eye space”, as explained in that chapter, is the space before you apply the projection matrix.

You forgot to divide by the W component of the vertex position.

This isn’t necessarily true. If the fragment shader enables early fragment tests explicitly with layout(early_fragment_tests) or if the implementation can determine that it doesn’t matter when the depth test is performed (fragment shader doesn’t write to gl_FragDepth, use discard, or have side effects such as atomics or imageStore), the depth test and write will be performed before the fragment shader is executed. If the depth test fails, the fragment shader won’t be executed.
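For reference, here is a minimal sketch of the explicit opt-in; the shader itself is a made-up example (the layout qualifier requires GLSL 4.20 or ARB_shader_image_load_store):

    /* Hypothetical fragment shader source, embedded as a C string.
       The layout qualifier forces the depth test (and write) to run
       before this shader; fragments that fail are never shaded. */
    const char *fs_src =
        "#version 420 core\n"
        "layout(early_fragment_tests) in;\n"
        "out vec4 color;\n"
        "void main() { color = vec4(1.0); }\n";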

After clipping and projective division, x, y and z are all in the range [-1,1]. x and y are then mapped to screen coordinates by the viewport transformation; z is (typically) mapped to [0,1] (or some subrange of that) by the transformation established by glDepthRange. That can be changed via glEnable(GL_DEPTH_CLAMP), which disables near and far plane clipping; this only makes sense if you’re using a floating-point depth buffer.

It’s linear in homogeneous coordinates, but assigning a non-constant value to w means that it’s rational in NDC. However, as x, y and z all have the same (non-constant) divisor, a plane in eye space is planar in NDC, i.e. depth is an affine function of screen space x and y.

If you expand out the transformation for the z and w components, you get
zclip = -(f+n)/(f-n) * zeye - 2*f*n/(f-n) * weye
wclip = -zeye
Conversion to NDC via projective division gives (assuming weye = 1)
zNDC = zclip / wclip
= (f+n)/(f-n) + 2*f*n/(f-n) / zeye
IOW, NDC z is the reciprocal of eye-space z with scaling and translation to map the near plane to -1 and the far plane to +1.
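To see this numerically, here is a small C sketch (the near/far values are arbitrary picks for illustration) that pushes a few eye-space depths through the two equations above:

    #include <stdio.h>

    /* Third row of the projection matrix applied to (zeye, weye=1),
       followed by projective division. */
    static double z_ndc(double z_eye, double n, double f)
    {
        double z_clip = -(f + n) / (f - n) * z_eye - 2.0 * f * n / (f - n);
        double w_clip = -z_eye;
        return z_clip / w_clip;
    }

    int main(void)
    {
        double n = 1.0, f = 100.0;                   /* arbitrary planes */
        printf("%f\n", z_ndc(-n, n, f));             /* -1.000000 (near) */
        printf("%f\n", z_ndc(-f, n, f));             /*  1.000000 (far)  */
        printf("%f\n", z_ndc(-(n + f) / 2.0, n, f)); /*  0.980198 (mid)  */
        return 0;
    }

Note that the midpoint of the eye-space range lands near +1 rather than at 0; that is the non-linearity in action.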

Could you explain to me what you mean by:

when we multiply M·T, where M is the matrix and T is the point, the point is still [x y z 1] and the last row of M is still [0 0 -1 0], so there is a constant value in w; how would it be rational in one space and linear in another?

Also,
I have one more confusion: the F_depth formula must then have some sort of mistake:

F_depth = (1/z - 1/near)/(1/far - 1/near)

Consider near, far > 1, and assume this depth calculation is done after the projection. Then:
|1/near| < 1 always,
but |1/z| >= 1 always,
and |1/far - 1/near| < 1 always,
which means F_depth > 1 always?

That last row sets wclip to -zeye. Normalised device coordinates (NDC) are obtained by dividing clip coordinates by the w component, and zeye typically isn’t constant. That’s what gives you a perspective projection: the scale factor is inversely proportional to |z|. An orthographic projection has [0 0 0 1] as the last row, so wclip = weye, which is invariably the constant 1.

In that equation, z is zeye. It’s composing the projection transformation, conversion from clip space to NDC (projective division), and the depth range transformation all into one.

Also, note that the formula assumes glDepthRange(0,1). This is the initial state, but it can be changed. E.g. if you’re rendering a view from inside a cockpit, it’s quite common to reserve the closest depth values for rendering the cockpit and use different near/far planes. The cockpit needs a small near distance (less than a metre), but using the same near distance for the environment will result in a very coarse depth resolution.
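As a sketch of that cockpit idiom (the 0.1 split point and the draw_* helpers are hypothetical, and a current GL context is assumed; each pass would also load its own projection matrix with suitable near/far planes):

    glDepthRange(0.0, 0.1);   /* cockpit pass: small near distance      */
    draw_cockpit();           /* hypothetical helper                    */
    glDepthRange(0.1, 1.0);   /* environment pass: larger near distance */
    draw_environment();       /* hypothetical helper                    */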

The transformation from zeye to the value written to the depth buffer has 3 steps.

  1. The projection transformation:
    zclip = C*zeye+D*weye
    wclip = -zeye
    where C=-(f+n)/(f-n), D=-2fn/(f-n).
  2. Projective division:
    zNDC = zclip/wclip
  3. Depth range:
    depth = n + (f-n)*(zNDC+1)/2
    where glDepthRange(n,f).

Projective division is fixed, but the other two steps are under program control.
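Composed as code, a minimal C sketch of the whole chain (assuming weye = 1; the names are mine, with n,f the projection planes from step 1 and N,F the glDepthRange values from step 3):

    /* zeye -> depth buffer value, steps 1-3 composed. */
    double depth_from_eye_z(double z_eye, double n, double f,
                            double N, double F)
    {
        double C = -(f + n) / (f - n);
        double D = -2.0 * f * n / (f - n);
        double z_clip = C * z_eye + D;            /* step 1 (weye = 1) */
        double w_clip = -z_eye;
        double z_ndc  = z_clip / w_clip;          /* step 2 */
        return N + (F - N) * (z_ndc + 1.0) / 2.0; /* step 3 */
    }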

No. For points which lie inside the clip volume, -zeye>near => |1/zeye|<1/near (assuming weye=1, which is invariably the case).

Could you please give me more insight (a demonstration if possible, but maybe that’d be a big request?) into what sort of composition this is? It seems to be different from functional composition or matrix composition.

Also,

This seems to be the step used to map NDC to [0,1], because I followed the projection matrix construction here and here and they only mention steps 1 and 2.
How does this function map NDC to [0,1]? I’ve analysed it like this:

depth = n + (f-n)*(zNDC+1)/2

We know zNDC = [-1,1], so:

n + (f-n)*(([-1,1]+1)/2)
= n + (f-n)*([0,2]/2)
= n + (f-n)*[0,1]
= n + [0, f-n]
= [n, f]

So depth is mapped to [n,f], which is possibly not equal to [0,1]?

It’s just the three steps outlined previously, substituting the variables calculated in each step into the following step.

The main point is that the combination of the first two steps gives zNDC=-D/zeye-C, i.e. zNDC is basically the reciprocal of zeye, scaled and translated so that zeye=-near => zNDC=-1 and zeye=-far => zNDC=1. The last step is just an affine mapping (scale and translate) from [-1,1] to the chosen depth range.
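To check those endpoints, substitute C = -(f+n)/(f-n) and D = -2fn/(f-n) into zNDC = -D/zeye - C:

    zeye = -near: zNDC = D/n - C = -2f/(f-n) + (f+n)/(f-n) = (n-f)/(f-n) = -1
    zeye = -far:  zNDC = D/f - C = -2n/(f-n) + (f+n)/(f-n) = (f-n)/(f-n) = +1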

Correct.

Note that glDepthRange clamps its arguments to [0,1], but it doesn’t require n<f.

Sorry if I’m not getting something that might look simple to you, but isn’t depth supposed to be in [0,1]? You said previously that the depth buffer can only hold values in [0,1], but here [-1,1] is being mapped to [n,f], where n and f are the near and far planes’ respective coordinate values, which could be anything.

No, it’s not. Depth near/far maps the [-1, 1] range to the [0, 1] range. This is part of the viewport transform, which turns NDC-space positions into window space positions.

So, starting from [x y z 1] view-space coords, up to the point they get transformed to NDC, everything is mapped to [-1,1]; then an additional mapping is done for the depth value to map [-1,1] to [0,1]. Disregarding the viewport transformation for the moment (which probably happens later?), I was trying to find out how this mapping is done:

  1. The LearnOpenGL resource is telling me it is done with this equation:
    F_depth = (1/z - 1/near)/(1/far - 1/near)
  2. GClements above is telling me (as I understand it) that this is the depth range transformation and is calculated using:
    depth = n + (f-n)*(zNDC+1)/2

But, as I pointed out, that may not map [-1,1] to [0,1], since n and f can have arbitrary values. So I’m basically still confused about how the [-1,1] to [0,1] mapping happens; which equation is used?

View space is not transformed directly into NDC space. View space is transformed into clip-space, where the W component is not necessarily 1. The transformation into NDC space divides the clip-space XYZ by W, then clamps to [-1, 1].

But the viewport transformation is what maps the depth from [-1, 1] to [0,1]. You can’t disregard the very thing you’re talking about.

You seem to have a great deal of trouble distinguishing all of the spaces involved. The spaces are:

  1. Eye/Camera/view space. Here, W = 1 (generally).

  2. Clip-space. The result of transforming the previous by the perspective matrix. W is generally not 1.

  3. NDC space. The result of dividing the clip-space XYZ by clip-space W, which after clipping results in all positions being on the range [-1, 1] in all three axes.

The perspective matrix sets up the clip-space W in such a way that this division performs the final part of the projection: dividing by the eye/camera/view-space Z component.

  4. Window space. The result of transforming the NDC space position into window coordinates. The XY of window space is within the range of the viewport, and the Z is within the range [0, 1] (or more specifically, the depth near/far values, which are not the same near/far values as those in the projection matrix).

Okay, so, mapping [-1,1] to [0,1] is the window transformation, which is equivalent to:

this wiki formula you’ve linked to?

Which is equivalent to:

the LearnOpenGL formula to map [-1,1] to [0,1]?

Are these all the same thing?

And I understood from your explanation above that the XY coords after the NDC transformation aren’t altered by the viewport transformation: but the page you’ve linked to does alter them.

And by this:

do you mean only the min and max values for the window-transformed depth values?

Why would you get that impression? The formulas he typed referred specifically to depth computations because that’s what this entire discussion is about. You shouldn’t take that to mean that nothing else is going on.

The values passed to glDepthRange and equivalents are called “near” and “far”. So to talk about them, you have to use terminology that overlaps with the “near” and “far” values from the perspective matrix, even though they do different things.

So, the n and f in this formula aren’t near and far plane values? I just read Alfonse’s comment and thought that might be the case.

But if the n and f below are not near and far plane values, then how was the formula:
F_depth = (1/z - 1/near)/(1/far - 1/near)
derived by composition?

GClements has said above:

After clipping and projective division, x, y and z are all in the range [-1,1]. x and y are then mapped to screen coordinates by the viewport transformation, z is (typically) mapped to [0,1] (or some subrange of that) by the transformation established by glDepthRange .

So, according to this, mapping z from NDC [-1,1] to [0,1] is the depth range transformation, and its formula is:
depth = n + (f-n)*(zNDC+1)/2
But you say that it is the window/viewport transformation and provide this formula. These aren’t the same, are they? Could you please clarify this one last thing?

LearnOpenGL is assuming that you never call glDepthRange, so the initial state (equivalent to glDepthRange(0,1)) persists. IOW, it’s oversimplifying.

Correct. The viewport transformation maps the [-1,1] NDC range to the [n,f] range established by glDepthRange.

Note the reference pages for glFrustum and glDepthRange use nearVal and farVal for different things. For glFrustum, they specify the -zeye values which map to -1 and +1 in NDC. For glDepthRange they specify the depth values to which NDC -1 and +1 are mapped.

So to clarify: if we call the parameters passed to glFrustum (or equivalent, e.g. gluPerspective or glm::perspective) zNear and zFar and the parameters passed to glDepthRange depthNear and depthFar, then
zeye=-zNear => zNDC=-1 => depth=depthNear
zeye=-zFar => zNDC=1 => depth=depthFar

The mapping from zeye to zNDC isn’t linear (or affine), it’s (assuming weye=1) zNDC=-D/zeye-C where C,D are the values in the third row of the projection matrix and are calculated from zNear and zFar so that [-zNear,-zFar] maps to [-1,1]. The viewport mapping (from zNDC to depth) is affine (i.e. f(x)=a*x+b) with the scale and offset calculated to map [-1,1] to [depthNear,depthFar].
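Spelled out, that affine mapping is depth = a*zNDC + b with

    a = (depthFar - depthNear)/2
    b = (depthFar + depthNear)/2

so zNDC = -1 gives b - a = depthNear, and zNDC = +1 gives a + b = depthFar.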

So the final depth range transformation is linear? And the formula given in LearnOpenGL:
F_depth = (1/z - 1/near)/(1/far - 1/near)
is only non-linear because it also contains the perspective divide within it (somehow composed, as you had mentioned)?

I tried to compose the formula as you had mentioned, however I get something different:

Also, Alfonse has mentioned:

View space is not transformed directly into NDC space. View space is transformed into clip-space , where the W component is not necessarily 1. The transformation into NDC space divides the clip-space XYZ by W, then clamps to [-1, 1].

But the viewport transformation is what maps the depth from [-1, 1] to [0,1]. You can’t disregard the very thing you’re talking about.

Is the depth range transformation equivalent to the window/viewport transformation (at least for the z-axis)?

Yes.

That’s correct if you use n=0,f=1 in step 3 (so n,f are the near/far values for step 1). You’ll note that if you put zeye=-n you get depth=0 and zeye=-f gives depth=1.

Overall, the expression is:
depth = (N*n-F*f)/(n-f) + f*n*(N-F)/((n-f)*zeye)
where n,f are the near/far planes used to generate the projection matrix in step 1 and N,F are the near/far depth values passed to glDepthRange in step 3.

If you put N=0,F=1, it simplifies to:
depth = f/(f-n) + f*n/((f-n)*zeye)
= f/(f-n) * (1+n/zeye)
= (1+n/zeye) / ((f-n)/f)
= (1+n/zeye) / (1-n/f)
which is what you have.
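For completeness, writing z = -zeye for the positive distance in front of the camera shows this really is the LearnOpenGL formula:

    (1 + n/zeye)/(1 - n/f) = (1 - n/z)/(1 - n/f)
                           = ((z-n)/z) / ((f-n)/f)
                           = f*(z-n) / (z*(f-n))

    (1/z - 1/n)/(1/f - 1/n) = ((n-z)/(n*z)) / ((n-f)/(n*f))
                            = f*(n-z) / (z*(n-f))
                            = f*(z-n) / (z*(f-n))

Both reduce to the same expression.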