Hello.

I have been studying computer vision for a while and am now approaching OpenGL for the first time. I am trying to develop a small application, but there is a mismatch between what I studied and what I see in the OpenGL definitions.

Until now, I have been dealing with a so-called “camera matrix”, usually denoted P (ref: Multiple View Geometry in Computer Vision, R. Hartley and A. Zisserman, Cambridge University Press, 2000; and many other books). This is a 3x4 matrix (3 rows, 4 columns) that performs the projection from 3D points to 2D projected points. Since a 3D point in homogeneous coordinates is a 4-vector and a 2D point is a 3-vector, P must map 4-vectors to 3-vectors, which is why it is 3x4.

If X = (x, y, z, w) is the 3D point and x = (x, y, w) is its projection onto the image plane, both in homogeneous coordinates, then x = PX.
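
To make the notation concrete, here is a minimal sketch of x = PX in plain Python. The matrix values are my own assumption for illustration (a simple pinhole camera with a hypothetical focal length f, principal point at the image origin, and the camera at the world origin); they are not taken from the book. P is written with 3 rows and 4 columns so that it maps a homogeneous 4-vector to a homogeneous 3-vector.

```python
def project(P, X):
    """Multiply a 3x4 camera matrix P by a homogeneous 4-vector X."""
    if len(P) != 3 or any(len(row) != 4 for row in P):
        raise ValueError("P must have 3 rows and 4 columns")
    if len(X) != 4:
        raise ValueError("X must be a homogeneous 4-vector")
    # Each output component is the dot product of one row of P with X.
    return [sum(p * c for p, c in zip(row, X)) for row in P]


# Hypothetical pinhole camera (assumed values, for illustration only).
f = 2.0
P = [
    [f,   0.0, 0.0, 0.0],
    [0.0, f,   0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

X = [1.0, 2.0, 4.0, 1.0]   # 3D point (1, 2, 4) in homogeneous coordinates
x = project(P, X)          # homogeneous 2D point (x, y, w)

# Divide by the last component to get ordinary image coordinates.
u, v = x[0] / x[2], x[1] / x[2]
print(x, (u, v))
```

Note that the result has only three components: the depth of the point is folded into w and cannot be recovered separately after the division.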

Now, as far as I have seen, OpenGL uses only 4x4 matrices to define rotations, reshaping, translations and projections. What is the relationship between the camera matrix P that I am used to and the OpenGL matrices? And why is the projection matrix also a 4x4 matrix?

Thank you