I notice that in the derivation of the OpenGL orthographic projection matrix, as detailed in either the Angel-Shreiner text or Akenine-Moller (Real-Time Rendering), the z-component ends up with a negative sign, which results in a mirror reflection of the model about the z=0 plane.

Moller comments that this “converts from the right-handed viewing coordinate system (looking down the negative z-axis) to left-handed normalized device coordinates.”

This serves to confuse the crap out of me.
Does the orthographic matrix contain the Z-coordinate mirror-reflection to “undo” another reflection which is built into the graphics HW pipeline?

Why is the “canonical view cube” mirror reflected w.r.t. z-coordinate?

We are taught to think of the camera as looking, by default, from the origin in the direction of the negative Z axis, but does this take the reflection into account? After the reflection, are we looking from the origin in the direction of the +Z axis?

Can someone please explain what is going on here and why?

The PROJECTION matrix just transforms from EYE-SPACE to CLIP-SPACE. The perspective divide takes you from CLIP-SPACE to NDC-SPACE (the “canonical view cube”).
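To make the two steps concrete, here is a minimal sketch in plain Python. It assumes the matrix layout documented for glFrustum; the helper names (frustum, transform, perspective_divide) and the near/far values are just mine for illustration.

```python
def frustum(left, right, bottom, top, near, far):
    """Row-major form of the matrix that glFrustum is documented to build."""
    return [
        [2.0 * near / (right - left), 0.0, (right + left) / (right - left), 0.0],
        [0.0, 2.0 * near / (top - bottom), (top + bottom) / (top - bottom), 0.0],
        [0.0, 0.0, -(far + near) / (far - near), -2.0 * far * near / (far - near)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def transform(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

def perspective_divide(clip):
    """CLIP-SPACE -> NDC-SPACE: divide x, y, z by w."""
    x, y, z, w = clip
    return [x / w, y / w, z / w]

proj = frustum(-1.0, 1.0, -1.0, 1.0, 1.0, 10.0)

# Points on the near and far planes, straight ahead of the eye (negative Z).
clip_near = transform(proj, [0.0, 0.0, -1.0, 1.0])
clip_far = transform(proj, [0.0, 0.0, -10.0, 1.0])

ndc_near = perspective_divide(clip_near)
ndc_far = perspective_divide(clip_far)

# NDC z comes out at about -1 on the near plane and about +1 on the far plane.
print(ndc_near[2], ndc_far[2])
```

Note that clip-space w equals -z_eye here, so w grows with distance in front of the eye; that is what the divide uses to produce the perspective foreshortening.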

EYE-SPACE is right-handed (with +Z pointing “behind” you). NDC-SPACE is left-handed (with +Z pointing “in front” of you). So the PROJECTION matrix has to handle this Z flip.
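For the orthographic case from the original question, the flip can be seen with a few lines of plain Python. This is only a sketch, assuming the matrix documented for glOrtho; the helper names and the near/far values of 1 and 10 are arbitrary choices of mine.

```python
def ortho(left, right, bottom, top, near, far):
    """Row-major form of the matrix that glOrtho is documented to build."""
    return [
        [2.0 / (right - left), 0.0, 0.0, -(right + left) / (right - left)],
        [0.0, 2.0 / (top - bottom), 0.0, -(top + bottom) / (top - bottom)],
        # The negative scale on z is the flip being asked about.
        [0.0, 0.0, -2.0 / (far - near), -(far + near) / (far - near)],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

proj = ortho(-1.0, 1.0, -1.0, 1.0, 1.0, 10.0)

# w stays 1 for an orthographic projection, so clip space and NDC coincide.
ndc_near = transform(proj, [0.0, 0.0, -1.0, 1.0])   # eye-space z = -near
ndc_far = transform(proj, [0.0, 0.0, -10.0, 1.0])   # eye-space z = -far

# Approximately -1 for the near plane and +1 for the far plane.
print(ndc_near[2], ndc_far[2])
```

So a point further away along eye-space -Z gets a larger NDC z, while x and y keep their orientation. That is exactly the change of handedness.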

If you want to follow the derivation through, here’s one place you can look:

I’m not completely sure. “Issues” #3 here might provide a clue why: ARB_clip_control

In some sense, it’s intuitive for NDC and window-space depth to increase from closer-to-further (near -> far) depths from the eye. But that may just be because I’ve worked with OpenGL for a while.

All that said, these OpenGL conventions originated decades ago. In modern OpenGL, however, you can use different conventions if you want to: ARB_clip_control is part of core OpenGL now (since version 4.5). Also observe that modern OpenGL no longer defines a MODELVIEW and a PROJECTION matrix. So do whatever you want on the shader side with positions, and just provide the position to OpenGL in whatever CLIP-SPACE region you say you’re going to provide it in.

See the ARB_clip_control extension spec for some reasons why you might want to use a different convention, for instance when rendering to a floating-point depth buffer.
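As an illustration of "a different convention", here is a rough sketch of an orthographic matrix that targets a 0..1 NDC depth range instead of -1..1. It assumes the application has called glClipControl(GL_LOWER_LEFT, GL_ZERO_TO_ONE); the function name ortho_zero_to_one is made up for this example and is not an OpenGL API.

```python
def ortho_zero_to_one(left, right, bottom, top, near, far):
    """Orthographic matrix (row-major) mapping eye z in [-near, -far] to NDC z in [0, 1].

    Hypothetical helper: x and y are handled as in glOrtho, only the z row differs.
    """
    return [
        [2.0 / (right - left), 0.0, 0.0, -(right + left) / (right - left)],
        [0.0, 2.0 / (top - bottom), 0.0, -(top + bottom) / (top - bottom)],
        [0.0, 0.0, -1.0 / (far - near), -near / (far - near)],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

proj = ortho_zero_to_one(-1.0, 1.0, -1.0, 1.0, 1.0, 10.0)

z_near = transform(proj, [0.0, 0.0, -1.0, 1.0])[2]    # about 0 on the near plane
z_far = transform(proj, [0.0, 0.0, -10.0, 1.0])[2]    # about 1 on the far plane
print(z_near, z_far)
```

In a shader-based program this matrix would simply be uploaded as a uniform; OpenGL only cares that the resulting clip-space position lies in the region you declared.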

What’s “going on” is that the GPU does calculations. Numbers go in, numbers come out. What those numbers “mean” is largely up to the programmer. Even more so with “modern” OpenGL (3+ core profile and ES) which doesn’t embody certain historical conventions in the specification. Legacy OpenGL enshrines “object space”, “eye space”, “clip space” and “normalised device coordinates” (NDC) in the specification, whereas modern OpenGL only has clip space and NDC; other spaces only exist at the programmer’s choice.

Conventionally, NDC has the positive Z axis “in front” of the viewpoint. In modern OpenGL this convention only exists insofar as the initial state is equivalent to glDepthRange(0,1) and glDepthFunc(GL_LESS), meaning that rendering with depth-testing results in fragments with lower NDC Z values obscuring fragments with greater NDC Z values. But those settings can be changed if you so desire.
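A small simulation may help show that "in front" is only a matter of state. The sketch below uses the depth-range mapping from the spec, z_w = (f - n)/2 * z_ndc + (f + n)/2, and a toy single-pixel depth test. The function names and the fragment values are invented for illustration, and it assumes the depth buffer is cleared to match the comparison (glClearDepth(1) for GL_LESS, glClearDepth(0) for GL_GREATER).

```python
import operator

def window_depth(z_ndc, range_near=0.0, range_far=1.0):
    """Map NDC z in [-1, 1] to window depth, as glDepthRange(range_near, range_far) does."""
    return (range_far - range_near) / 2.0 * z_ndc + (range_far + range_near) / 2.0

def visible_fragment(ndc_depths, range_near, range_far, passes, clear_depth):
    """Run a depth test over fragments hitting one pixel; return the index left visible."""
    stored = clear_depth
    winner = None
    for index, z_ndc in enumerate(ndc_depths):
        depth = window_depth(z_ndc, range_near, range_far)
        if passes(depth, stored):
            stored = depth
            winner = index
    return winner

fragments = [0.5, -0.2, 0.9]  # NDC z of three fragments covering the same pixel

# Initial state: glDepthRange(0, 1), glDepthFunc(GL_LESS), depth cleared to 1.
default_state = visible_fragment(fragments, 0.0, 1.0, operator.lt, 1.0)

# Both settings flipped: glDepthRange(1, 0), glDepthFunc(GL_GREATER), depth cleared to 0.
both_flipped = visible_fragment(fragments, 1.0, 0.0, operator.gt, 0.0)

# Only the comparison flipped: now the largest NDC z wins instead.
func_flipped = visible_fragment(fragments, 0.0, 1.0, operator.gt, 0.0)

print(default_state, both_flipped, func_flipped)
```

With the default state the fragment with the lowest NDC z (index 1) survives. Flipping both the range and the function gives the same picture, while flipping only one of them makes the greatest NDC z (index 2) win.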

In legacy OpenGL, where vertices are automatically transformed by the model-view and projection matrices, the functions commonly used to construct the projection matrix (glOrtho, glFrustum, gluOrtho2D and gluPerspective) all flip the Z coordinate, meaning that eye space conventionally has the negative Z axis “in front” of the viewpoint, and so is right-handed. Again, this can be changed by constructing a different projection matrix or by changing the glDepthRange and/or glDepthFunc settings. The fixed-function pipeline will work fine if the model-view and/or projection matrix have a negative determinant (i.e. mirroring).
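The "negative determinant" remark can be checked directly. Below is a quick sketch, assuming the matrix documented for glOrtho; the determinant helper is a plain cofactor expansion written for this example.

```python
def ortho(left, right, bottom, top, near, far):
    """Row-major form of the matrix that glOrtho is documented to build."""
    return [
        [2.0 / (right - left), 0.0, 0.0, -(right + left) / (right - left)],
        [0.0, 2.0 / (top - bottom), 0.0, -(top + bottom) / (top - bottom)],
        [0.0, 0.0, -2.0 / (far - near), -(far + near) / (far - near)],
        [0.0, 0.0, 0.0, 1.0],
    ]

def determinant(m):
    """Determinant by cofactor expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * determinant(minor)
    return total

proj = ortho(-1.0, 1.0, -1.0, 1.0, 1.0, 10.0)
det_proj = determinant(proj)

# Same matrix with the z row negated, i.e. without the mirror.
unflipped = [row[:] for row in proj]
unflipped[2] = [-value for value in unflipped[2]]
det_unflipped = determinant(unflipped)

# det_proj is negative (a mirroring); det_unflipped is positive.
print(det_proj, det_unflipped)
```

The matrix is upper triangular, so the determinant is just the product of the diagonal, and the single negative entry on the z row is what makes it negative.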