The fourth component of OpenGL vertex coordinates

dletozeun · November 12, 2006, 1:10pm

hello,

I am trying to understand how OpenGL transforms vertex coordinates.
I have read the part of the OpenGL specification about matrices, but the explanation of w, the fourth vertex coordinate, is very brief! In fact there is no explanation, just a matrix…

Can someone explain why OpenGL needs four coordinates in a 3-dimensional world?

thank you.

songho · November 12, 2006, 5:24pm

Originally posted by dletozeun:
Can someone explain why OpenGL needs four coordinates in a 3-dimensional world?

The fourth coordinate is used to create a perspective look. For example, straight railroad tracks appear to get narrower the further they are from your eye. Now, consider how we produce such a perspective view in computer graphics.

In 3D Cartesian coordinates (x,y,z), two parallel lines never meet (intersect). Therefore, Cartesian coordinates alone cannot describe the point where parallel lines appear to meet in a perspective view.

However, there is a solution: add one more coordinate, w. The result is called homogeneous coordinates, (x,y,z,w). In homogeneous coordinates, two parallel lines meet at infinity, where w=0. Homogeneous coordinates are a fundamental concept in computer graphics, used for example when projecting a 3D scene onto a 2D screen.

Here is a short proof, and an explanation of why they are called “homogeneous”:
Homogeneous Coordinates
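
To make the claim about parallel lines concrete, here is a minimal sketch in 2D homogeneous coordinates. It is an illustration only, not taken from the linked article. It assumes a line a*x + b*y + c = 0 is stored as the triple (a, b, c), and it uses the standard fact that the intersection of two such lines is their cross product. The function name `intersect` is made up for this example.

```python
def intersect(line1, line2):
    """Intersect two 2D lines given as (a, b, c), meaning a*x + b*y + c = 0.

    Returns a homogeneous point (x, y, w), computed as the cross product
    of the two line triples.
    """
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    return (b1 * c2 - c1 * b2,
            c1 * a2 - a1 * c2,
            a1 * b2 - b1 * a2)


# Two parallel lines: y = x + 1 and y = x + 3
parallel = intersect((1, -1, 1), (1, -1, 3))

# Two crossing lines: y = x and y = -x + 2
crossing = intersect((1, -1, 0), (1, 1, -2))

print(parallel)  # (-2, -2, 0)
print(crossing)  # (2, 2, 2)
```

The parallel pair produces w = 0: a point at infinity whose (x, y) part points along the direction the two lines share. The crossing pair produces (2, 2, 2), which is the ordinary Cartesian point (1, 1) after dividing by w.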

k_szczech · November 13, 2006, 12:22am

A simple example:

rotate 3D vector

 x' = x*cos(a) + y*sin(a) + z*0
 y' = x*-sin(a) + y*cos(a) + z*0
 z' = x*0 + y*0 + z*1
A rotation can be described by a 3x3 matrix:
[ cos(a),  sin(a), 0 ]
[ -sin(a), cos(a), 0 ]
[ 0,       0,      1 ]
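
A minimal sketch of that 3x3 rotation in plain Python, using exactly the matrix written above. The function name `rotate_z` and the use of radians are assumptions made for this example.

```python
import math


def rotate_z(v, a):
    """Apply the 3x3 rotation matrix from the post to v = (x, y, z).

    The angle a is in radians (an assumption of this sketch).
    """
    m = [[math.cos(a), math.sin(a), 0.0],
         [-math.sin(a), math.cos(a), 0.0],
         [0.0, 0.0, 1.0]]
    # Each output component is the dot product of one matrix row with v.
    return tuple(sum(m[row][col] * v[col] for col in range(3))
                 for row in range(3))


rotated = rotate_z((1.0, 0.0, 0.0), math.pi / 2)
print(rotated)
```

With the signs as written in the post, rotating (1, 0, 0) by 90 degrees lands (up to rounding) on (0, -1, 0). Three components are enough here because every output is a weighted sum of x, y and z only.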

translate 3D vector

 x' = x + dx = x*1 + y*0 + z*0 + w*dx,   w=1
 y' = y + dy = x*0 + y*1 + z*0 + w*dy,   w=1
 z' = z + dz = x*0 + y*0 + z*1 + w*dz,   w=1
A translation can be described by a 4x4 matrix:
[ 1, 0, 0, dx ]
[ 0, 1, 0, dy ]
[ 0, 0, 1, dz ]
[ 0, 0, 0, 1  ]

As you can see, you need the 4th component: with w equal to 1, the matrix adds exactly (dx, dy, dz), which is what glTranslate does.
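
The same translation as a small Python sketch, to show what w does in the multiplication. The helper names `translation` and `mat4_mul_vec4` are made up for this example; this is only the matrix arithmetic, not how OpenGL implements it internally.

```python
def translation(dx, dy, dz):
    """Build the 4x4 translation matrix shown above (row-major nested lists)."""
    return [[1, 0, 0, dx],
            [0, 1, 0, dy],
            [0, 0, 1, dz],
            [0, 0, 0, 1]]


def mat4_mul_vec4(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return tuple(sum(m[row][col] * v[col] for col in range(4))
                 for row in range(4))


T = translation(5, 6, 7)

point = mat4_mul_vec4(T, (1, 2, 3, 1))      # w = 1: a position
direction = mat4_mul_vec4(T, (1, 2, 3, 0))  # w = 0: a direction

print(point)      # (6, 8, 10, 1)
print(direction)  # (1, 2, 3, 0)
```

With w = 1 the point moves by (5, 6, 7). With w = 0 the fourth column is multiplied by zero, so the vector is left unchanged, which is the usual way to represent a pure direction.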

dletozeun · November 13, 2006, 9:23am

Thank you both for your replies.

I didn’t know about homogeneous coordinates; I had a more “physicist’s” approach to perspective in 3D rendering…

I will look for more information about this in order to understand how perspective is set up through this coordinate system…
But I don’t understand everything, because this article: homogeneous coordinates

doesn’t talk about 3D coordinates, only 2D Cartesian coordinates and 2D homogeneous coordinates…
What happens to the z coordinate in OpenGL??

songho · November 13, 2006, 10:43am

dletozeun,
Homogeneous coordinates are a way of representing N-dimensional coordinates with N+1 components. For example, physical 3D space can be represented with (x,y,z) in Cartesian coordinates, which most people are familiar with, but with (x,y,z,w) in homogeneous coordinates. In the same way, a 2D plane becomes (x,y,w) in homogeneous coordinates.
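
As a small sketch of that N to N+1 mapping for the 3D case, the two helpers below convert in both directions. The names `to_homogeneous` and `to_cartesian` are invented for this example, and the conversion rule assumed is the usual one: divide x, y and z by w.

```python
def to_homogeneous(p, w=1.0):
    """Cartesian (x, y, z) -> homogeneous (x*w, y*w, z*w, w), for nonzero w."""
    x, y, z = p
    return (x * w, y * w, z * w, w)


def to_cartesian(h):
    """Homogeneous (x, y, z, w) -> Cartesian (x/w, y/w, z/w)."""
    x, y, z, w = h
    if w == 0:
        # w = 0 is a point at infinity; it has no Cartesian equivalent.
        raise ValueError("cannot convert a point with w = 0")
    return (x / w, y / w, z / w)


print(to_cartesian((1, 2, 3, 1)))   # (1.0, 2.0, 3.0)
print(to_cartesian((2, 4, 6, 2)))   # (1.0, 2.0, 3.0)
```

Both (1, 2, 3, 1) and (2, 4, 6, 2) describe the same 3D point: scaling all four components by the same nonzero factor does not change the point they represent.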

Homogeneous coordinates make perspective mathematically possible in computer graphics.

Search for fine art paintings from the Renaissance period (with the keyword “perspective”). You will find amazing works that use the concept of perspective, and ingenious works that use an irregular viewing frustum. Artists at that time already knew that two parallel lines can intersect. :eek:

dletozeun · November 13, 2006, 12:13pm

Thank you songho… but I don’t think that answers my question…
You say that (x,y,z) coordinates become (x,y,z,w) in homogeneous coordinates.

OK, but the problem is the same: how do we transform (x,y,z,w) into (xs,ys), the coordinates on the 2D screen?

Here we have to transform 4D coordinates into 2D coordinates!?
I don’t understand why we use 4D coordinates…

In my opinion, if I wanted to give perspective to the projection of an object on the screen, I would do this:

(xs,ys) = (x/z, y/z)

(I suppose that the eye is at (0,0,0) and is looking along the z axis, toward positive z.)

where (xs,ys) are the screen coordinates and (x,y,z) are the coordinates of the 3D point that I have to project.

Maybe I am confusing something… please help me!

Bob · November 13, 2006, 8:34pm

You’re almost there. The screen coordinate has not 2 dimensions, but 3: x and y to determine its position on the screen, and z for the depth (used for the z-buffer).

So (x, y, z, w) is transformed by the modelview and projection matrices. The resulting coordinate is then normalized by the perspective division to (x/w, y/w, z/w, w/w) = (x/w, y/w, z/w, 1). (x/w, y/w, z/w) is then mapped by the viewport transform to (sx, sy, sz), which is pixel (sx, sy) with depth value sz.
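
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the point is already in eye space (modelview applied), the projection matrix has the form documented for glFrustum, the viewport starts at (0, 0), and the depth range is the default [0, 1]. The function names `frustum`, `mat4_mul_vec4` and `to_window` are made up for this example.

```python
def frustum(l, r, b, t, n, f):
    """Perspective matrix in the form documented for glFrustum (row-major)."""
    return [[2 * n / (r - l), 0, (r + l) / (r - l), 0],
            [0, 2 * n / (t - b), (t + b) / (t - b), 0],
            [0, 0, -(f + n) / (f - n), -2 * f * n / (f - n)],
            [0, 0, -1, 0]]


def mat4_mul_vec4(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return tuple(sum(m[row][col] * v[col] for col in range(4))
                 for row in range(4))


def to_window(eye, proj, width, height):
    """Eye-space (x, y, z, w) -> window (sx, sy, sz)."""
    # 1. Projection matrix: eye coordinates -> clip coordinates.
    x, y, z, w = mat4_mul_vec4(proj, eye)
    # 2. Perspective division: clip -> normalized device coordinates in [-1, 1].
    nx, ny, nz = x / w, y / w, z / w
    # 3. Viewport transform (assumed origin (0, 0), depth range [0, 1]).
    sx = (nx + 1) * width / 2
    sy = (ny + 1) * height / 2
    sz = (nz + 1) / 2
    return (sx, sy, sz)


proj = frustum(-1, 1, -1, 1, 1, 10)
near_center = to_window((0, 0, -1, 1), proj, 640, 480)
far_center = to_window((0, 0, -10, 1), proj, 640, 480)
off_axis = to_window((1, 1, -2, 1), proj, 640, 480)
print(near_center, far_center, off_axis)
```

The last row of the matrix sets the clip-space w to -z of the eye-space point (OpenGL's eye looks down the negative z axis), so the division by w is the x/z, y/z idea from the earlier post, applied to all three components. A point on the near plane gets depth 0, a point on the far plane gets depth close to 1, and that depth value is what the z-buffer stores.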

dletozeun · November 13, 2006, 11:52pm

OK! Thank you Bob, now I understand!
I forgot the z-buffer!