# Perspective transformation matrix problem

I want to enable the user to add a point to the screen. With an orthographic projection matrix (with left=-0.5, right=0.5, top=0.5, bottom=-0.5), I simply need to normalise the coordinates (as in step 1) and send them to the vertex shader (step 2):

1. A point is given by the user at (lx, ly). I normalise the coordinates of that point and get (nx, ny)=(lx/screenWidth - 0.5, -ly/screenHeight + 0.5).
2. The normalised point is sent to the vertex shader through a buffer, and with a fixed Z position. Let’s say the point is (nx, ny, Z).
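The normalisation in step 1 can be sketched as follows. This is a minimal illustration, not the original poster's code; the function name `normalise_window_point` is hypothetical, and it assumes window coordinates with the origin at the top-left and y increasing downwards, which is what the sign flip on `ly` suggests.

```python
def normalise_window_point(lx, ly, screen_width, screen_height):
    """Map a window-space point to the [-0.5, 0.5] range that matches
    an orthographic projection with left=-0.5, right=0.5, bottom=-0.5, top=0.5.

    Assumes (lx, ly) has its origin at the top-left corner with y pointing down.
    """
    nx = lx / screen_width - 0.5
    ny = -ly / screen_height + 0.5  # flip y so that "up" is positive
    return nx, ny


if __name__ == "__main__":
    # The centre of an 800x600 window maps to the origin.
    print(normalise_window_point(400, 300, 800, 600))
```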

From my understanding, what happens after that is:

3. Vertex shader transformation occurs. We get (a, b, c, w)=MVP*(nx, ny, Z, 1.0)
4. Division by w. We are now in the NDC space, and we have a new point (x, y, z, 1) = (a/w, b/w, c/w, w/w)
5. That point is then transformed to window coordinates and rendered. Let’s say that point is (sx, sy).
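The three steps above (MVP multiplication, division by w, viewport transform) can be reproduced on the CPU to check where a point should end up. The sketch below is only an illustration: the helper names are hypothetical, matrices are assumed to be 4x4 lists of rows, and the viewport is assumed to be set with `glViewport(0, 0, width, height)`. Note that OpenGL window coordinates have their origin at the bottom-left, unlike typical mouse coordinates.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of 4 rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def project_to_window(mvp, point, width, height):
    """Return (NDC point, window point) for an object-space point."""
    # Clip coordinates: (a, b, c, w) = MVP * (x, y, z, 1)
    a, b, c, w = mat_vec(mvp, [point[0], point[1], point[2], 1.0])
    # Perspective divide: NDC coordinates, in [-1, 1] for visible points
    x, y, z = a / w, b / w, c / w
    # Viewport transform, assuming glViewport(0, 0, width, height)
    sx = (x + 1.0) * 0.5 * width
    sy = (y + 1.0) * 0.5 * height  # origin at the bottom-left
    return (x, y, z), (sx, sy)


if __name__ == "__main__":
    identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    print(project_to_window(identity, (0.0, 0.0, 0.0), 800, 600))
```

Comparing the NDC output of such a helper against the expected (nx, ny) is one way to tell whether an offset comes from the projection or from the viewport step.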

The problem is that I can’t figure out how to do this process properly with a perspective projection matrix. What happens is: the user gives me a point in window coordinates and I normalise it, getting (nx, ny). But when such a point is rendered, it’s drawn somewhere else due to the perspective distortion. I need some way to, given a point (nx, ny), find another point (wx, wy) and feed that point instead of (nx, ny) to the shader, so that, in the end (step 5), the point rendered to the screen is in the same place as (nx, ny) in terms of window coordinates.

I’ve managed, with some math, to write a point (Px, Py, Z) in terms of (nx, ny, Z) and the MVP matrix such that the transformed (Px, Py, Z) equals (nx, ny, Z) in step 4. There is still an offset when rendering occurs, though, which may or may not be introduced between step 4 and step 5. I’m not sure where or why it happens, but I’ve checked that (Px, Py, Z) transformed to NDC has the same x and y as (nx, ny).

A simple picture illustrating the issue: https://imgur.com/a/M57s0XZ

Can anyone help me find (wx, wy) given the MVP matrix and (nx, ny, Z)? Any help is appreciated.

The main difference between an orthographic and a perspective projection is that for the orthographic projection, NDC x and y depend only upon eye-space x and y, while for a perspective projection they also depend upon z. Specifically, upon x/z and y/z.

A perspective transformation created with gluPerspective() (or with glFrustum() with left=-right and bottom=-top) will have the structure

``````
[A 0  0 0]
[0 B  0 0]
[0 0  C D]
[0 0 -1 0]

``````
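For reference, the matrix structure above can be built as in the following sketch, using the formulas documented for `gluPerspective()`. The Python function `perspective` is a hypothetical helper written for illustration, with the matrix stored as a list of rows.

```python
import math


def perspective(fovy_deg, aspect, near, far):
    """Build a gluPerspective-style projection matrix as a list of rows."""
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)  # cotangent of half the vertical FOV
    A = f / aspect
    B = f
    C = (far + near) / (near - far)
    D = 2.0 * far * near / (near - far)
    return [
        [A, 0.0, 0.0, 0.0],
        [0.0, B, 0.0, 0.0],
        [0.0, 0.0, C, D],
        [0.0, 0.0, -1.0, 0.0],  # makes wc = -zo
    ]


if __name__ == "__main__":
    for row in perspective(60.0, 800 / 600, 0.1, 100.0):
        print(row)
```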

Composing a scale transformation with this (but no translation) won’t change the structure, just the values (of which only A and B are relevant here). Essentially, you get

xc = A * xo
yc = B * yo
wc = -zo
=>
xn = -A * xo / zo
yn = -B * yo / zo
=>
xo / zo = -xn / A
yo / zo = -yn / B
=>
xo = -zo * xn / A
yo = -zo * yn / B

Note the negative sign: conventional projection matrices negate the Z coordinate, so eye space has the positive Z axis pointing out of the screen while NDC has the positive Z axis pointing into the screen.

So extract the A and B values from the perspective matrix, divide NDC x/y by them, and multiply by -zo, where zo is the fixed eye-space Z value you chose (negative for points in front of the camera).
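A minimal sketch of that inversion, assuming the matrix has the gluPerspective-style structure shown earlier and that any model-view transform is the identity (or a pure scale already folded into A and B). The function names are hypothetical. The forward function is included only to show that the round trip returns the requested NDC point.

```python
def eye_xy_for_ndc(xn, yn, zo, A, B):
    """Eye-space (xo, yo) that projects to NDC (xn, yn) at eye-space depth zo.

    zo is negative for points in front of the camera under the usual convention.
    """
    xo = -zo * xn / A
    yo = -zo * yn / B
    return xo, yo


def ndc_xy(xo, yo, zo, A, B):
    """Forward direction: clip-space x and y divided by wc = -zo."""
    wc = -zo
    return A * xo / wc, B * yo / wc


if __name__ == "__main__":
    A, B, zo = 1.5, 2.0, -4.0
    xo, yo = eye_xy_for_ndc(0.25, -0.5, zo, A, B)
    print((xo, yo), ndc_xy(xo, yo, zo, A, B))
```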

Dealing with the viewport transformation is the same for either orthographic or perspective projections.

Thank you for your reply. I tried implementing what you said but it doesn’t seem to work. Here’s what I’ve done:

I have a point p0 that I got from the user and normalised.

So xn = p0.x and yn = p0.y.
A = m and B = m (I’m using row-major ordering for my matrices).
I have a fixed Z float.

Thus the (Xo, Yo) should be:
(-Z * xn / A, -Z * yn / B).

Did I do something incorrectly in the part of the code described above? Or is the problem elsewhere?

I’m a bit confused because if we substitute xn by -A * xo / zo in xo = -zo * xn / A we get xo = 1

Note that these need to be NDC coordinates, i.e. ranging between -1 and +1 at the edges of the viewport. If these are in the -0.5 to +0.5 range you mentioned earlier, you’ll need to double them.
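As an illustration of that range conversion, the sketch below doubles the -0.5 to +0.5 values and, alternatively, maps window coordinates straight to NDC. Both function names are hypothetical, and the window origin is assumed to be at the top-left with y pointing down, as in the normalisation from the question.

```python
def half_range_to_ndc(nx, ny):
    """Convert coordinates in [-0.5, 0.5] to NDC in [-1, 1] by doubling."""
    return 2.0 * nx, 2.0 * ny


def window_to_ndc(lx, ly, screen_width, screen_height):
    """Map window coordinates (top-left origin, y down) directly to NDC."""
    xn = 2.0 * lx / screen_width - 1.0
    yn = 1.0 - 2.0 * ly / screen_height
    return xn, yn


if __name__ == "__main__":
    # The bottom-right corner of an 800x600 window is NDC (1, -1) either way.
    print(half_range_to_ndc(0.5, -0.5))
    print(window_to_ndc(800, 600, 800, 600))
```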

[QUOTE=WormLice;1291309]
I’m a bit confused because if we substitute xn by -A * xo / zo in xo = -zo * xn / A we get xo = 1[/QUOTE]

No, you get xo=xo:

xo = -zo * xn / A
= -zo * (-A * xo / zo) / A
= -(-A * xo * zo / zo) / A
= (A * xo * zo / zo) / A
= (A * xo) / A
= xo

Yep, that was exactly what I was doing wrong.

Sorry, a dumb math moment kicked in.

Thank you very much, everything is now working.