OpenGL: camera to world matrix needs inversion about z-axis. Why?

To move the camera around I have 4 transforms:

  • one to rotate the cam around x
  • one to rotate the cam around y
  • one to translate the cam in the x/y plane perpendicular to the camera’s view direction
  • one translation along the z axis to move the camera towards or away from that plane (zoom).

In code (in the vertex shader):

cam_to_world = rotatey(roty) * rotatex(rotx) * translate(vec3(tx, ty, 0)) * translate(vec3(0, 0, zoom));

The thing is that this doesn’t work: a camera moved 5 units away from the origin along the positive z-axis doesn’t see an object centred at the origin unless I apply a z-scale of -1 right at the start.

mat4 scale_z = mat4(1);
scale_z[2][2] = -1;
cam_to_world = scale_z * rotatey(roty) * rotatex(rotx) * translate(vec3(tx, ty, 0)) * translate(vec3(0, 0, zoom));

I don’t understand why.

The only explanation I could find is that, since OpenGL is right-handed, translating the camera 5 units along the positive z-axis (let’s ignore rotation and the x/y translations) does move the camera away from the origin, but the camera then looks away from that origin. In other words, to look at an object centred at the origin, we would need to rotate the camera 180 degrees about the y-axis or scale its z-axis by -1.

Could someone confirm or explain, please?

Full code given below:

mat4 rotatex(float theta)
{
    mat4 rot = mat4(1);

    rot[0] = vec4(1, 0, 0, 0);
    rot[1] = vec4(0, cos(theta), sin(theta), 0);
    rot[2] = vec4(0, -sin(theta), cos(theta), 0);

    return rot;
}

mat4 rotatey(float theta)
{
    mat4 rot = mat4(1);

    rot[0] = vec4(cos(theta), 0, -sin(theta), 0);
    rot[1] = vec4(0, 1, 0, 0);
    rot[2] = vec4(sin(theta), 0, cos(theta), 0);

    return rot;
}

mat4 trans(vec3 tr)
{
    mat4 rot = mat4(1);

    // the translation goes in the fourth column
    rot[3] = vec4(tr, 1);

    return rot;
}

mat4 persp()
{
    #define M_PI 3.14159265
    float fov = 70;
    float near = 0.1;
    float far = 100;
    float image_aspect_ratio = 1.0;
    float angle = tan(fov * 0.5 / 180.0 * M_PI);
    float right = angle * near * image_aspect_ratio;
    float left = -right;
    float top = angle * near;
    float bottom = -top;

    mat4 persp = mat4(1);

    persp[0] = vec4(2 * near / (right - left), 0, 0, 0);
    persp[1] = vec4(0, 2 * near / (top - bottom), 0, 0);
    persp[2] = vec4((left + right) / (right - left), (top + bottom) / (top - bottom), -(far + near) / (far - near), -1);
    persp[3] = vec4(0, 0, -2 * far * near / (far - near), 0);

    return persp;
}

void main()
{
    mat4 P; //!< the perspective matrix
    mat4 V; //!< the camera-to-world matrix
    mat4 M; //!< the object-to-world matrix

    // This order gives you Maya-like control
    // rotate the camera around the y axis
    // rotate the camera around the x axis
    // move the camera in x & y
    // move away from the origin (zoom)
    mat4 scale_z = mat4(1);
    scale_z[2][2] = -1;
    V = scale_z * rotatey(roty) * rotatex(rotx) * trans(vec3(tx, ty, 0)) * trans(vec3(0, 0, zoom));

    P = persp();

    M = mat4(1); // use the identity matrix in this example

    vec3 eye = vec3(0, 0, 10); //vec3(V[3][0], V[3][1], V[3][2]);
    vec3 s = normalize(eye - vp); // position of the eye - vertex pos
    shade_col = vec3(max(0, vp[0]), max(0, vp[1]), max(0, vp[2])); //vec3(0.18 * abs(dot(s, vn)));

    // we need to take the inverse cam-to-world matrix because what we need here
    // is to express points in world space into camera space. Thus we need to apply
    // the world-to-cam matrix to the vertices in world space. After this transformation
    // we can apply the perspective transformation matrix.
    gl_Position = P * inverse(V) * M * vec4(vp, 1); // our final obj-to-world * world-to-cam * persp vertex transform
}
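
As a sanity check on persp(), here is a small sketch in plain Python rather than shader code. The helper names (persp, mul_vec, ndc_depth) are made up for this illustration. It rebuilds the same matrix in row-major form and looks at where a few eye-space depths end up, assuming the usual OpenGL convention that the camera looks down its negative z-axis.

```python
import math

def persp(fov_deg=70.0, near=0.1, far=100.0, aspect=1.0):
    """Row-major copy of the shader's persp(); rows[r][c] is the GLSL persp[c][r]."""
    top = math.tan(math.radians(fov_deg) * 0.5) * near
    right = top * aspect
    left, bottom = -right, -top
    return [
        [2 * near / (right - left), 0.0, (left + right) / (right - left), 0.0],
        [0.0, 2 * near / (top - bottom), (top + bottom) / (top - bottom), 0.0],
        [0.0, 0.0, -(far + near) / (far - near), -2 * far * near / (far - near)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def mul_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def ndc_depth(z_eye):
    """Return (NDC depth, clip-space w) for a point on the view axis at eye-space z."""
    x, y, z, w = mul_vec(persp(), [0.0, 0.0, z_eye, 1.0])
    return z / w, w

for z_eye in (-0.1, -100.0, 5.0):
    print(z_eye, ndc_depth(z_eye))
```

Under these assumptions, points at z = -near and z = -far map to NDC depths of about -1 and +1, while a point at positive eye-space z gets a negative w and would be clipped.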

Not related to your problem, but is there any reason you’re doing this in shader code rather than in your application code? Let’s say you’re drawing 100,000 vertices. Do this in shader code and you need to calculate these matrices 100,000 times. They don’t change per-vertex though, do they? So calculate them once per-frame in your application code and upload them as uniforms.
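
A minimal sketch of that suggestion, in plain Python with only the standard library (a real application would more likely use a math library). The function names are illustrative, the scale_z factor is left out, and the upload step is only mentioned in a comment because it depends on the GL binding you use.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 row-major matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def rotate_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def rotate_y(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]

def translate(x, y, z):
    return [[1, 0, 0, x], [0, 1, 0, y], [0, 0, 1, z], [0, 0, 0, 1]]

def camera_to_world(rotx, roty, tx, ty, zoom):
    """Same composition order as the shader: rotate y, rotate x, pan, zoom."""
    m = mat_mul(rotate_y(roty), rotate_x(rotx))
    m = mat_mul(m, translate(tx, ty, 0))
    return mat_mul(m, translate(0, 0, zoom))

def to_column_major(m):
    """Flatten to the 16-float column-major layout that GLSL mat4 uniforms expect."""
    return [float(m[r][c]) for c in range(4) for r in range(4)]

# Done once per frame on the CPU, not once per vertex:
cam = camera_to_world(0.0, 0.0, 0.0, 0.0, 5.0)
flat = to_column_major(cam)
# `flat` is the kind of array you would pass to glUniformMatrix4fv
# with transpose set to GL_FALSE (the binding and uniform name are up to you).
```

The shader would then only multiply by the uploaded matrix (or by its inverse, also computed once in the application) instead of rebuilding it for every vertex.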

mhagain is right, you should move your code into the app and pass the resulting matrix to the shader.

Apart from that, this link should explain everything.

The trick is that all the vertices must end up inside a cube of side 2 centered at the origin.
The projection matrix transforms the frustum (with its apex at the origin) into that cube, so the camera transformation needs to bring the camera viewpoint to the origin.
So the camera transformation is an inverse transformation. When you move the camera 10 units to the right, you are not really moving the camera to the right, you are moving the world 10 units to the left. When you rotate the camera upward, you are actually rotating the whole scene downward. Well, I hope the link makes it clearer.
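
To make the "inverse transformation" point concrete, here is a small sketch in plain Python (the helper names are made up, and it only handles rigid transforms, i.e. rotation plus translation). It inverts a camera-to-world matrix and applies the result to a world-space point.

```python
def translate(x, y, z):
    """4x4 row-major translation matrix."""
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y], [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]

def rigid_inverse(m):
    """Invert a rigid transform [R | t]: the inverse is [R^T | -R^T t]."""
    rt = [[m[c][r] for c in range(3)] for r in range(3)]   # transposed rotation
    t = [m[r][3] for r in range(3)]
    nt = [-sum(rt[r][c] * t[c] for c in range(3)) for r in range(3)]
    return [rt[0] + [nt[0]], rt[1] + [nt[1]], rt[2] + [nt[2]],
            [0.0, 0.0, 0.0, 1.0]]

def mul_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

# Camera moved 10 units to the right in world space.
cam_to_world = translate(10.0, 0.0, 0.0)
world_to_cam = rigid_inverse(cam_to_world)

# A point at the world origin, expressed in camera space.
origin_in_cam = mul_vec(world_to_cam, [0.0, 0.0, 0.0, 1.0])
print(origin_in_cam)
```

From the camera's point of view the origin ends up 10 units to the left, which is the "moving the world instead of the camera" effect described above.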

Also note the functions rotatex and rotatey. One is a clockwise rotation, the other one is anticlockwise.

I know, and appreciate the comment. I have only done this as an exercise.

I also know about the projection matrix (remapping to the unit cube) and what the world-to-cam matrix does.

What I’d like to know is why scale_z is necessary when building the cam-to-world matrix.

Thank you.

The answer is: YOU DON’T.

I made a mistake. I was simply drawing the back face of the geometry before the front faces. I just didn’t pay attention to the order in which the vertices were declared. Sorry everyone but thanks for your contribution.
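
For anyone who wants to check this numerically, here is a small sketch in plain Python. The helper names are made up, and rotations and the x/y translation are left at zero. It compares the two cam-to-world matrices from the question, assuming the usual OpenGL convention that negative eye-space z is in front of the camera.

```python
def mat_mul(a, b):
    """Multiply two 4x4 row-major matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def mul_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def translate(x, y, z):
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y], [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]

SCALE_Z = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, -1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]

zoom = 5.0

# Cam-to-world matrices: without and with the z mirror.
v_plain = translate(0.0, 0.0, zoom)
v_mirror = mat_mul(SCALE_Z, translate(0.0, 0.0, zoom))

# Their inverses (world-to-cam). SCALE_Z is its own inverse.
w_plain = translate(0.0, 0.0, -zoom)
w_mirror = mat_mul(translate(0.0, 0.0, -zoom), SCALE_Z)

def eye_z(world_to_cam, point):
    """Eye-space z of a world-space point given as [x, y, z]."""
    return mul_vec(world_to_cam, point + [1.0])[2]

origin, front, back = [0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, -1.0]
for name, w in (("plain", w_plain), ("mirror", w_mirror)):
    print(name, eye_z(w, origin), eye_z(w, front), eye_z(w, back))
```

In both cases the origin sits at eye-space z = -5, in front of the camera, so the mirror is not what makes the object visible. What scale_z does change is which side of the object ends up nearer to the camera, which would fit the front-face/back-face explanation above, although the exact render state isn't shown in this thread.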