Object-to-pixel coordinate transformations

Hi all,

I hope someone can help me with my problem.

Suppose that I have a transformation matrix R and a vector T, and that I use R and T to model the object-to-eye transformation of a point U in object space:

V = R*U+T

Suppose that I have four real numbers Fx, Fy, Cx, Cy. They represent the intrinsic camera parameters:
-Fx = focal length * pixel dimension in the horizontal direction
-Fy = focal length * pixel dimension in the vertical direction
-Cx, Cy = coordinates of the principal point

R and T are computed by solving the camera pose estimation problem.

In order to obtain the pixel coordinates and render a cube (the cube is rendered, or "registered", perfectly on a planar rectangular target viewed by the camera; I print the target on plain stiff paper), I do the following:

(eq 1)
x = -Fx*(V_x/V_z)+Cx
y = -Fy*(V_y/V_z)+Cy

I then use custom-made routines to draw lines in order to render a wire-frame cube.
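For reference, here is a minimal C sketch of this custom projection, assuming R is stored row-major as R[3][3] and T as T[3] (the function name project is mine, just for illustration):

static void project(const float R[3][3], const float T[3], const float U[3],
                    float Fx, float Fy, float Cx, float Cy,
                    float *x, float *y)
{
    float V[3];
    for (int i = 0; i < 3; i++)   /* V = R*U + T */
        V[i] = R[i][0]*U[0] + R[i][1]*U[1] + R[i][2]*U[2] + T[i];
    *x = -Fx * (V[0] / V[2]) + Cx;   /* eq. (1) */
    *y = -Fy * (V[1] / V[2]) + Cy;
}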

Now I would like to switch to OpenGL, but I am facing a lot of problems.
First, the projection matrix. I have studied the OpenGL coordinate transformation pipeline in order to understand how I should set up the projection matrix. To obtain equation (1), I do the following:

the frustum planes are defined as:
r=right plane
l=left plane
t=top plane
b=bottom plane
f=far plane
n=near plane
the screen width is w, the screen height is h. The viewport
is set at (0,0) with width w and height h. The center of the
viewport is (w/2, h/2) = (ox, oy).

OK, now let

r-l = (w*n)/Fx
t-b = (h*n)/Fy
r+l = (1 - (2*Cx)/w) * ((w*n)/Fx)
t+b = (1 - (2*Cy)/h) * ((h*n)/Fy)

I set up the projection matrix in the standard way, using n = 0.1, f = 1000.0 and the values above:

glViewport(0, 0, 320, 240);
glMatrixMode(GL_PROJECTION);
float intrinsic[16];
float near, far;
near = 0.1f;
far = 1000.0f;
memset(intrinsic, 0, sizeof(intrinsic));
float rml = (320.0f*near)/Fx;                               /* r - l */
float tmb = (240.0f*near)/Fy;                               /* t - b */
float rpl = (1.0f - (2.0f*Cx)/320.0f) * ((320.0f*near)/Fx); /* r + l */
float tpb = (1.0f - (2.0f*Cy)/240.0f) * ((240.0f*near)/Fy); /* t + b */
intrinsic[0]  = (2.0f*near)/rml;
intrinsic[5]  = (2.0f*near)/tmb;
intrinsic[8]  = rpl/rml;
intrinsic[9]  = tpb/tmb;
intrinsic[10] = -(far + near)/(far - near);
intrinsic[11] = -1.0f;
intrinsic[14] = -(2.0f*far*near)/(far - near);
glLoadMatrixf(intrinsic);

After all the transformations, the pixel coordinates should be exactly (or at least very near to) the same as those obtained with equation (1).

But they are not!!!

I must say that I have to do the following in order to set up
the modelview matrix:

float glR[16];
glR[12] = T[0];
glR[13] = -T[1];           /* note the negated y component */
glR[14] = T[2];
float S[3] = {-1, 1, -1};  /* negates the first and last rows of R */
for (int i = 0; i < 3; i++)
{
    for (int j = 0; j < 3; j++)
    {
        glR[j*4 + i] = R[i][j] * S[i];  /* transpose into column-major order */
    }
}

I then load the modelview matrix:

glMatrixMode( GL_MODELVIEW );
glLoadMatrixf(glR);

I don't know why I have to put -T[1] (negate the y component
of the translation vector) and negate the first and last rows
of the matrix to obtain a good result.

I say "good" because the rendered cube does not coincide with the one rendered by my custom routines. The discrepancy is such that it is not a problem of line-drawing precision; it is a problem of coordinate transformation or something like that. The cube seems to sit too high on the paper, but if I translate the object up or down I do not get the expected result.
Ah, the cube has its lower face at Z=0 and its upper face at Z=0.5 (I have also tried Z=-0.5), so the lower face should lie on the planar target, but it does not!!!

I really need your help!!!

Best regards,
Luca

If I understand you correctly, you have a video feed and you want to place a 3D object on it, but you also want to calculate the camera position from that same video.

First, have you considered that the OpenGL coordinate system is bottom-up (just like in math) and the origin is in the bottom-left corner?

Second, the modelview matrix is the composition of the model and view matrices. In order to get any result, first you must place your camera (view), then multiply by the model transform.

Third, try to use the gluPerspective call. It takes fovY, aspect ratio, near and far.

When you calculate your camera matrix, what form is it in? Is it just a matrix, or is it a position + direction + up-vector? There is a gluLookAt call which can help you set up a correct view matrix; see the sketch below.
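A minimal sketch of those two calls (GLU is the desktop helper library; the fovY, aspect and eye/center/up values here are placeholders, not taken from your setup):

#include <GL/glu.h>

void setup_camera(void)
{
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(45.0, 320.0/240.0, 0.1, 1000.0);  /* fovY, aspect, near, far */

    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    gluLookAt(0.0, 0.0, 5.0,   /* eye position */
              0.0, 0.0, 0.0,   /* look-at point */
              0.0, 1.0, 0.0);  /* up vector */
}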

Yes, I want to place a 3D object on a planar target seen by a camera (the camera of my N73, for example). This is the registration problem in augmented reality.

What I want to know is how exactly OpenGL ES transforms a vertex v = (0.5, 0.5, 0.5) (for example) to obtain pixel coordinates. I need the exact transformations (signs, constants, matrix-vector multiplications…), because otherwise I will not be able to render the object correctly.

Here follows a more detailed discussion.

Ok, suppose that I load a modelview matrix with glLoadMatrix()
and that I set up a projection matrix (after passing the near, far, right, left, top and bottom planes to glFrustum).
Suppose that I set up a viewport like glViewport(0, 0, w, h).

I have the vertex v = (0.5, 0.5, 0.5); what will be the pixel coordinates? How does OpenGL ES compute them?

From what I have understood, OpenGL ES does the following:
v’ = M*v,
where M is the modelview matrix M = [m_ij], with i = 0…3, j = 0…3
(a 4x4 matrix).

First, I would like to know exactly how OpenGL ES multiplies M by v (the matrix-vector multiplication)…

After this, there is the transformation:
v’’ = P*v’

where P is the projection matrix (as defined in the OpenGL ES reference).
After this, there is the division by w (normalized device coordinates):
v’’ = (x,y,z,w), v’’’ = (x/w,y/w,z/w)=(x’’’,y’’’,z’’’)

After this there is the viewport transformation:

x_pixel = (w/2)* x’’’ + w/2
y_pixel = (h/2)* y’’’ + h/2
z_pixel = .5 * z’’’ + 0.5 (used for z-buffering)
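To make this concrete, here is a small C sketch of that pipeline, assuming column-major 4x4 matrices (as OpenGL stores them) and a viewport at (0,0); the helper names mul44 and to_pixels are mine, not part of any API:

static void mul44(const float A[16], const float v[4], float out[4])
{
    /* out = A * v, with A in column-major order: element (i,j) is A[j*4+i] */
    for (int i = 0; i < 4; i++)
        out[i] = A[i]*v[0] + A[4+i]*v[1] + A[8+i]*v[2] + A[12+i]*v[3];
}

static void to_pixels(const float M[16], const float P[16],
                      const float v[4], float w, float h, float out[3])
{
    float eye[4], clip[4];
    mul44(M, v, eye);     /* v’  = M * v  (modelview) */
    mul44(P, eye, clip);  /* v’’ = P * v’ (projection) */
    float ndc[3] = { clip[0]/clip[3], clip[1]/clip[3], clip[2]/clip[3] };
    out[0] = (w/2)*ndc[0] + w/2;   /* viewport transform */
    out[1] = (h/2)*ndc[1] + h/2;
    out[2] = 0.5f*ndc[2] + 0.5f;   /* depth in [0,1], used for z-buffering */
}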

In my application, the matrix that transforms an object from world to eye (camera) coordinates (suppose the object-to-world transformation is the identity rotation with a zero translation vector) is obtained by solving some equations. Let it be R and T.
In order to render the 3D object correctly, I do the following:
v’ = R*v + T = (x’, y’, z’)
After this, I obtain the pixel coordinates:

(1)
x = -Fx * x’/z’ + Cx
y = -Fy * y’/z’ + Cy

Generally speaking, we have the following equations:

x = a*x’/z’ + b
y = c*y’/z’ + d

In my application, a = -Fx, c = -Fy, b = Cx and d = Cy.
In OpenGL ES we have:
a = -(w*n)/(r-l)
b = -(w/2)*(r+l)/(r-l) + w/2
c = -(h*n)/(t-b)
d = -(h/2)*(t+b)/(t-b) + h/2

So, we have to set
r-l = (w*n)/Fx
t-b = (h*n)/Fy
r+l = (1 - (2*Cx)/w) * ((w*n)/Fx)
t+b = (1 - (2*Cy)/h) * ((h*n)/Fy)

to get equivalence between my application and OpenGL ES.
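As a sanity check of this substitution, a few lines of C can plug the chosen frustum extents back into a, b, c, d and compare against -Fx, Cx, -Fy, Cy (the intrinsic values below are made up just for the test):

#include <stdio.h>

int main(void)
{
    float w = 320, h = 240, n = 0.1f;
    float Fx = 500, Fy = 500, Cx = 160, Cy = 120;  /* hypothetical intrinsics */
    float rml = (w*n)/Fx, tmb = (h*n)/Fy;          /* r - l, t - b */
    float rpl = (1 - (2*Cx)/w) * rml;              /* r + l */
    float tpb = (1 - (2*Cy)/h) * tmb;              /* t + b */
    float a = -(w*n)/rml, b = -(w/2)*(rpl/rml) + w/2;
    float c = -(h*n)/tmb, d = -(h/2)*(tpb/tmb) + h/2;
    printf("a = %g (want %g), b = %g (want %g)\n", a, -Fx, b, Cx);
    printf("c = %g (want %g), d = %g (want %g)\n", c, -Fy, d, Cy);
    return 0;
}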

If OpenGL ES does what the reference says it does, the final pixel coordinates should be very near those computed with (1), but they are not! I need to negate the first and third columns of the OpenGL matrix and negate the y part of the translation vector. Even after this, the rendering is slightly wrong, and the problem is evident (it is not a product of some truncation error…). The object seems to be rendered on the planar target, but its position is slightly wrong…

At first glance your modelview matrix is incorrect. You never initialise the last row; it should be set to (0 0 0 1).

Before I set it, I do
memset(glR, 0, sizeof(glR));
glR[15] = 1.0;
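Putting the fragments from this thread together, the whole modelview setup would look something like this sketch (load_modelview is a hypothetical helper; R and T are stored as in the earlier posts):

#include <string.h>
#include <GLES/gl.h>   /* or <GL/gl.h> on desktop */

void load_modelview(const float R[3][3], const float T[3])
{
    float glR[16];
    float S[3] = {-1, 1, -1};  /* the empirical row negations from above */
    memset(glR, 0, sizeof(glR));
    glR[15] = 1.0f;            /* last row = (0 0 0 1) */
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            glR[j*4 + i] = R[i][j] * S[i];  /* transpose into column-major */
    glR[12] = T[0];
    glR[13] = -T[1];           /* the empirical y negation from above */
    glR[14] = T[2];
    glMatrixMode(GL_MODELVIEW);
    glLoadMatrixf(glR);
}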

OpenGL projection works this way… for a given:

  1. modelview (MV) and projection (P) matrix, both mat4x4
  2. viewport (vport), array of 4 integers
  3. point coordinate on the model, (vec4 p)… (x, y, z, 1)

Compute mvp = MV * P;
Transform point: tp = mvp * p;
float oow = 1.0/tp.w;
tp.x *= oow;
tp.y *= oow;
tp.z *= oow;

Apply the viewport transformation:
out.x = vport[0] + (1 + tp.x) * vport[2] / 2;
out.y = vport[1] + (1 + tp.y) * vport[3] / 2;
out.z = (1 + tp.z) / 2; // z is range 0…1

out.xyz is the screen coordinates, or better, window coordinates, of point p on the model, projected using the current modelview and projection matrices.

Uh! I think the order should be: P*MV*p. First you transform from world to camera coordinates, then you apply the projection matrix… Are you sure the order is MV*P*p?

Small correction: it’s P * MV.

Ok, I can't get it to work! It is simple math, but it does not work!

The black hole of the LHC is already here?!?!

Sigh!