As far as I know, the z-buffer’s values are non-linear in [0, 1], with 0 == clipNear and 1 == clipFar.
How can I transform the z-buffer’s values to a linear scale (i.e. so that a value delta of 0.1 corresponds to the same distance regardless of the values involved)?
Not if you’re limited to OpenGL 2 (and don’t have the ARB_framebuffer_object extension).
If the clip-space coordinates don’t have constant W, the depth-buffer values will be non-linear with respect to eye-space Z. For a perspective projection, clip-space W is typically equal to eye-space -Z.
The values in the depth buffer are equal to (Z/W+1)/2, where Z and W are the clip-space Z and W, linearly interpolated across the primitive.
If you know how both clip-space Z and W relate to eye-space Z (i.e. you know the projection matrix and you also know that eye-space W is always 1), then you can obtain eye-space Z from the depth values.
For a projection matrix of the form
[ ? ? ? ? ]
[ ? ? ? ? ]
[ 0 0 A B ]
[ 0 0 -1 0 ]
and assuming Weye = 1, you have
Zclip = A*Zeye + B
Wclip = -Zeye
Conversion to NDC involves dividing clip-space X,Y,Z by W:
Zndc = Zclip/Wclip
Finally, NDC Z (which lies in the range -1 to +1) is converted to depth (in the range 0 to 1) by
depth = (Zndc+1)/2
(Actually, it can be slightly more complex than that if the depth range is changed with glDepthRange()).
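The steps above can be run in reverse to recover Zeye from a depth value. The following is a minimal sketch, assuming the default glDepthRange(0, 1) and a projection matrix whose last two rows are [0 0 A B] and [0 0 -1 0]; the function names are hypothetical and not part of any OpenGL API.

```c
/* Depth-buffer value in [0, 1] -> NDC Z in [-1, 1].
   Assumes the default depth range, i.e. glDepthRange(0, 1). */
static double depth_to_ndc_z(double depth)
{
    return 2.0 * depth - 1.0;
}

/* NDC Z -> eye-space Z, assuming Weye = 1, Zclip = A*Zeye + B
   and Wclip = -Zeye.
   From Zndc = (A*Zeye + B) / (-Zeye) it follows that
   Zeye = -B / (Zndc + A). */
static double ndc_z_to_eye_z(double z_ndc, double a, double b)
{
    return -b / (z_ndc + a);
}
```

For example, ndc_z_to_eye_z(depth_to_ndc_z(d), A, B) gives the eye-space Z for a value d read back from the depth buffer; it is negative for anything in front of the viewpoint.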
If the matrix is a typical perspective matrix (generated with e.g. gluPerspective(), glFrustum(), or equivalent), then
A = (zNear+zFar)/(zNear-zFar)
B = 2*zNear*zFar/(zNear-zFar)
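Inverting the general relations above gives Zndc = 2*depth - 1 and Zeye = -B/(Zndc + A). Substituting these values of A and B, the equation for Zeye becomes
Zeye = -zFar*zNear/(depth*zNear + (1-depth)*zFar)
or, equivalently,
Zeye = -zFar*zNear/(zFar + depth*(zNear - zFar))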
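As a rough illustration of how the closed form can be used to get a linear scale, here is a small sketch. It assumes a gluPerspective()/glFrustum()-style matrix and the default glDepthRange(0, 1); both function names are made up for this example.

```c
/* Eye-space Z from a depth-buffer value in [0, 1].
   Assumes a standard perspective matrix built from z_near and z_far
   and the default glDepthRange(0, 1). The result is negative. */
static double eye_z_from_depth(double depth, double z_near, double z_far)
{
    return -z_far * z_near / (z_far + depth * (z_near - z_far));
}

/* Linear depth: 0 at the near plane, 1 at the far plane, so that equal
   deltas correspond to equal distances along the view axis. */
static double linear_depth01(double depth, double z_near, double z_far)
{
    double dist = -eye_z_from_depth(depth, z_near, z_far); /* positive */
    return (dist - z_near) / (z_far - z_near);
}
```

With zNear = 1 and zFar = 100, a stored depth of 0.5 maps to an eye-space Z of roughly -1.98, which shows how much of the depth range is spent close to the near plane.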
Note that typically A and B are both negative, and Zeye is negative for anything that’s visible, with Zeye becoming increasingly negative as you get farther from the viewpoint (i.e. +Zeye is pointing out of the screen).
As your diagram suggests, Zeye is the distance in front of the viewpoint, not the distance from the viewpoint. The distance from the viewpoint is the magnitude of the eye-space position vector (Xeye,Yeye,Zeye).
They can be determined from the window coordinates (i.e. the row and column indices within the data returned by glReadPixels).
Given window coordinates (Xwin,Ywin), where (0,0) is the lower-left corner of the window, (width,height) is the upper-right corner and (0.5,0.5) is the centre of the lower-left pixel, the conversion from NDC to window coordinates is given by the viewport transformation, set with glViewport(Xv,Yv,Wv,Hv):
Xwin = Xv + Wv*(Xndc+1)/2
Ywin = Yv + Hv*(Yndc+1)/2
The first time a context is bound to a window, the viewport is automatically set to cover the entire window, as if by glViewport(0,0,width,height); thereafter, the viewport must be set explicitly if the window is resized.
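A sketch of the viewport transformation and its inverse might look like the following. The Vec2 type and the function names are assumptions made for this example; the parameters xv, yv, wv, hv stand for the values passed to glViewport().

```c
typedef struct { double x, y; } Vec2; /* hypothetical helper type */

/* NDC (-1..+1) -> window coordinates, for glViewport(xv, yv, wv, hv). */
static Vec2 ndc_to_window(Vec2 ndc, double xv, double yv, double wv, double hv)
{
    Vec2 win = { xv + wv * (ndc.x + 1.0) / 2.0,
                 yv + hv * (ndc.y + 1.0) / 2.0 };
    return win;
}

/* Window coordinates -> NDC (inverse of the above). */
static Vec2 window_to_ndc(Vec2 win, double xv, double yv, double wv, double hv)
{
    Vec2 ndc = { 2.0 * (win.x - xv) / wv - 1.0,
                 2.0 * (win.y - yv) / hv - 1.0 };
    return ndc;
}

/* Window coordinates of the centre of the pixel at (col, row), with
   row 0 at the bottom, as in the data returned by glReadPixels(). */
static Vec2 pixel_centre(int col, int row)
{
    Vec2 win = { col + 0.5, row + 0.5 };
    return win;
}
```

So for a pixel read back with glReadPixels(), window_to_ndc(pixel_centre(col, row), Xv, Yv, Wv, Hv) would give its NDC X and Y.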
For a perspective transformation, Wclip is proportional to Zeye, which you’ve already calculated. Typically, it’s equal to -Zeye. If you’ve managed to calculate Zeye correctly from Zndc, then you must already know Wclip.
Conversion from eye coordinates to clip coordinates is given by the projection matrix. In the general case, you would need to invert that matrix. But a projection matrix generated by gluPerspective() or glFrustum() always has the form
[Sx 0 Kx 0]
[ 0 Sy Ky 0]
[ ? ? ? ?]
[ ? ? ? ?]
so the conversion is
Xclip = Sx * Xeye + Kx * Zeye
Yclip = Sy * Yeye + Ky * Zeye
and the inverse is
Xeye = (Xclip - Kx * Zeye) / Sx
Yeye = (Yclip - Ky * Zeye) / Sy
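Putting the pieces together, eye-space X or Y could be recovered from the NDC coordinate and Zeye roughly as follows. This sketch assumes Wclip = -Zeye (the typical perspective case), so that Xclip = Xndc * Wclip; the function name is hypothetical, and s and k stand for Sx/Kx or Sy/Ky from the projection matrix.

```c
/* Eye-space X (or Y) from the corresponding NDC coordinate and Zeye.
   Assumes Wclip = -Zeye, and that the relevant matrix row has the form
   [s 0 k 0] (for X) or [0 s k 0] (for Y). */
static double eye_coord_from_ndc(double ndc, double z_eye, double s, double k)
{
    double w_clip = -z_eye;      /* typical perspective projection */
    double clip = ndc * w_clip;  /* undo the perspective divide */
    return (clip - k * z_eye) / s;
}
```

Calling it once with (Xndc, Zeye, Sx, Kx) and once with (Yndc, Zeye, Sy, Ky) gives Xeye and Yeye. For a symmetric frustum, such as one from gluPerspective(), Kx and Ky are zero.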