From clip space to Normalized Device Coordinates

Hello, I was going through the OpenGL Red Book (Chapter 5: Viewing Transformations, Clipping and Feedback). It mentions that homogeneous clip coordinates (x, y, z, w) are divided by ‘w’, which leaves visible points between -1 and +1. As I understand it, this works like fitting an object into a canonical bounding box, where ‘w’ is a scale factor. Is ‘w’ found as follows:

• Step 1: Calculate the center (center[0], center[1], center[2])
• Step 2: Find the bounding box coordinates (Xmin, Xmax, Ymin, Ymax, Zmin, Zmax)
• Step 3: Find ‘w’ = scale/2, where scale = max(Xmax - Xmin, Ymax - Ymin, Zmax - Zmin)
• Step 4: Apply the following transformation to all vertices: (x - center[0])/w, (y - center[1])/w, (z - center[2])/w
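
For reference, the four steps above can be written out as a short Python sketch. This is only the bounding-box fit described in the question, not what OpenGL does. The function name fit_to_canonical_box and the sample points are made up for illustration, and the center is assumed to be the midpoint of the bounding box, since the steps do not say how it is computed.

```python
def fit_to_canonical_box(vertices):
    """Scale and translate 3-D points so they fit inside [-1, 1] on each axis.

    Assumption: the center is the midpoint of the axis-aligned bounding box.
    """
    # Step 2: bounding box
    mins = [min(v[i] for v in vertices) for i in range(3)]
    maxs = [max(v[i] for v in vertices) for i in range(3)]
    # Step 1: center (assumed to be the box midpoint)
    center = [(lo + hi) / 2.0 for lo, hi in zip(mins, maxs)]
    # Step 3: one scale factor shared by every vertex
    w = max(hi - lo for lo, hi in zip(mins, maxs)) / 2.0
    # Step 4: translate, then divide by the shared factor
    return [tuple((v[i] - center[i]) / w for i in range(3)) for v in vertices]


points = [(0.0, 0.0, 0.0), (4.0, 2.0, 1.0), (2.0, 1.0, 0.5)]
fitted = fit_to_canonical_box(points)
```

Note that a single value of w is computed once for the whole object here. The sketch also divides by zero if all the points coincide.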

Please let me know whether my calculation of ‘w’ is right.
Thanks in advance

“w” isn’t found, it’s determined by the application. If you use e.g. glVertex4f(), or glVertexPointer() or glVertexAttribPointer() with a size of 4, the w component is part of the position passed to OpenGL from the application. If you supply 3 or fewer components, then the w component is set to 1.
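
As a rough illustration of that default, here is a Python sketch of how a position with fewer than 4 components is padded. to_homogeneous is a hypothetical helper, not an OpenGL call; it only mimics the rule that a missing z becomes 0 and a missing w becomes 1.

```python
def to_homogeneous(*components):
    """Pad a 2-, 3- or 4-component position to (x, y, z, w).

    A missing z defaults to 0 and a missing w defaults to 1.
    """
    if not 2 <= len(components) <= 4:
        raise ValueError("expected 2, 3 or 4 components")
    defaults = (0.0, 0.0, 0.0, 1.0)
    # Keep what was supplied and fill the rest from the defaults.
    return tuple(components) + defaults[len(components):]


p2 = to_homogeneous(1.0, 2.0)            # like glVertex2f(1, 2)
p3 = to_homogeneous(1.0, 2.0, 3.0)       # like glVertex3f(1, 2, 3)
p4 = to_homogeneous(1.0, 2.0, 3.0, 2.0)  # like glVertex4f(1, 2, 3, 2)
```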

Transforming a 4-vector by a 4x4 matrix uses the w component from the input vector and determines the w component of the output vector. Most transformations leave the w component alone, but a perspective projection produces a vector whose w component is proportional to the z component of the input.

As the conversion from clip coordinates to normalised device coordinates divides by w, setting w proportional to z makes the resulting coordinates inversely proportional to z (i.e. objects appear smaller as they get farther from the viewpoint).
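
To make this concrete, here is a small Python sketch that builds a matrix with the same layout as the one gluPerspective produces and applies it to two eye-space points at different depths. The helper names perspective and transform, and the sample values, are assumptions for illustration only.

```python
import math


def perspective(fovy_deg, aspect, z_near, z_far):
    """Row-major 4x4 matrix laid out like the gluPerspective matrix."""
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (z_far + z_near) / (z_near - z_far),
         2.0 * z_far * z_near / (z_near - z_far)],
        [0.0, 0.0, -1.0, 0.0],  # this row makes w_clip = -z_eye
    ]


def transform(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))


proj = perspective(90.0, 1.0, 1.0, 100.0)
near_pt = transform(proj, (1.0, 1.0, -2.0, 1.0))   # 2 units in front of the eye
far_pt = transform(proj, (1.0, 1.0, -20.0, 1.0))   # 20 units in front of the eye

# Same eye-space x, but the farther point ends up closer to the center in NDC.
ndc_x_near = near_pt[0] / near_pt[3]
ndc_x_far = far_pt[0] / far_pt[3]
```

Each point carries its own w (2 and 20 here), so the divisor differs from vertex to vertex.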

Clipping is performed in clip coordinates, before division by w. Primitives are clipped to the volume defined by -w ≤ x ≤ w, -w ≤ y ≤ w, -w ≤ z ≤ w. In homogeneous coordinates, the clip volume is a hyper-pyramid with its apex at (0,0,0,0) and whose base is the cube with vertices (±1,±1,±1,1). Projecting this volume to Euclidean space gives the cube that spans -1 to +1 on each axis, but clipping is done using homogeneous coordinates (this is required so that other attributes such as texture coordinates, colours, etc. are interpolated correctly).
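
The clip test itself is easy to sketch in Python. inside_clip_volume is a hypothetical helper, and treating points exactly on the boundary as inside is an assumption of this sketch.

```python
def inside_clip_volume(clip):
    """True if a clip-space point satisfies -w <= x, y, z <= w."""
    x, y, z, w = clip
    return all(-w <= c <= w for c in (x, y, z))


visible = inside_clip_volume((1.0, -1.5, 0.5, 2.0))   # every component within [-2, 2]
too_wide = inside_clip_volume((3.0, 0.0, 0.0, 2.0))   # x exceeds w
behind = inside_clip_volume((0.0, 0.0, 0.0, -1.0))    # negative w can never pass
```

The test compares each component against that vertex's own w, without any division.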

Thanks for your reply, but I’m still a bit unclear. As I understand it, converting from (Xclip, Yclip, Zclip) to (Xndc, Yndc, Zndc) requires some scaling factor ‘w’. Apart from fitting an object into a canonical box (as I wrote above), I don’t see any other way the scale factor ‘w’ could be determined.

Please let me know.

[QUOTE=Lee_Jennifer_82;1279675]Thanks for your reply. But I’m still a bit unclear. As I understand, converting from (Xclip, Yclip, Zclip) to (Xndc, Yndc, Zndc) requires some scaling factor ‘w’.[/QUOTE]
You can’t convert from (Xclip, Yclip, Zclip) to (Xndc, Yndc, Zndc). Clip space is 4-dimensional; you convert from (Xclip, Yclip, Zclip, Wclip) to (Xndc, Yndc, Zndc) by dividing by Wclip.
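
A minimal Python sketch of that division, with clip_to_ndc as a made-up name. The two sample points share the same x, y and z but have different w, which is why the first three components alone are not enough.

```python
def clip_to_ndc(clip):
    """Perspective divide: (x, y, z, w) in clip space to (x/w, y/w, z/w)."""
    x, y, z, w = clip
    if w == 0.0:
        raise ZeroDivisionError("w_clip is zero, so there is no NDC position")
    return (x / w, y / w, z / w)


a = clip_to_ndc((1.0, 2.0, 3.0, 4.0))
b = clip_to_ndc((1.0, 2.0, 3.0, 8.0))  # same x, y, z, different w
```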

Thanks for the reply. I understand about homogeneous coordinates in clip space, i.e. (Xclip, Yclip, Zclip, Wclip). But my question is how ‘Wclip’ is determined by the system. Is it determined differently from the scale factor used when an object is bound to its canonical view volume?

Thanks again.

[QUOTE]I understand about homogeneous coordinates in clipped space, i.e. (Xclip, Yclip, Zclip, Wclip). But my question is how ‘Wclip’ is determined by the system[/QUOTE]

If you can ask that question, then you don’t really understand homogeneous coordinates.

The mathematics of perspective projection involves division. The idea with homogeneous coordinates is to encode the division as part of the coordinate system. Thus, Wclip is the value you need to divide the other three values by to do perspective projection.

The reason we do it that way is that it allows us to define perspective projection as a linear transformation in a homogeneous space, and linear transformations can be encoded as a matrix multiplication. So a perspective projection matrix outputs 4 components, with the last being what is needed to complete the perspective projection.
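
As a worked example, take the matrix layout that gluPerspective uses (an assumption here; any perspective matrix with a bottom row of (0, 0, -1, 0) behaves the same way). The symbols are introduced for this example only: c = cot(fovy/2), a is the aspect ratio, n and f are the near and far distances, and the subscripts e and c mark eye and clip coordinates.

```latex
\begin{pmatrix} x_c \\ y_c \\ z_c \\ w_c \end{pmatrix}
=
\begin{pmatrix}
c/a & 0 & 0 & 0 \\
0 & c & 0 & 0 \\
0 & 0 & \dfrac{f+n}{n-f} & \dfrac{2fn}{n-f} \\
0 & 0 & -1 & 0
\end{pmatrix}
\begin{pmatrix} x_e \\ y_e \\ z_e \\ 1 \end{pmatrix}
\quad\Longrightarrow\quad
w_c = -z_e

x_{ndc} = \frac{x_c}{w_c} = -\frac{c}{a}\,\frac{x_e}{z_e},
\qquad
y_{ndc} = \frac{y_c}{w_c} = -c\,\frac{y_e}{z_e}
```

The matrix multiplication is linear. The only non-linear step is the final division, and the divisor is each vertex's own eye-space depth.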

Thanks for the clarification, but I don’t think I managed to explain what I really want to know. Normalized device coordinates bound an object within -1 and +1, and this is done by dividing by Wclip. My question is whether the perspective division (by ‘Wclip’ in this case) works simply like binding an object to a canonical view volume (as I explained) or not. If not, how does it differ from that?

It’s part of the vertex position, which is always 4-dimensional.

If you’re using a vertex shader, the vertex position in clip space is the value written to gl_Position, which is a 4-component vector (i.e. it includes a W component). If you try to assign something other than a vec4 to gl_Position, you’ll get an error when compiling the shader.

If you’re using the fixed-function pipeline, then the W component of the object-space coordinates is either provided by the application or is implicitly set to 1 (if the application provides fewer than 4 components). The model-view and projection matrices transform 4-D object coordinates to 4-D eye coordinates to 4-D clip coordinates. The vertex position has 4 components at each stage until after clipping, when the X, Y and Z components are divided by the W component to obtain a 3-D vector in normalised device coordinates.
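
The chain described above can be sketched in Python as follows. The two matrices are hypothetical examples: a model-view that translates by -5 along Z, and a perspective projection with a 90 degree field of view, near = 1 and far = 9. The names mat_vec and object_to_ndc are made up for this sketch.

```python
def mat_vec(m, v):
    """Multiply a row-major 4x4 matrix by a 4-component vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))


# Hypothetical model-view matrix: move the object 5 units along -Z.
model_view = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, -5.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Hypothetical perspective projection (90 degree FOV, near = 1, far = 9).
projection = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, -1.25, -2.25],
    [0.0, 0.0, -1.0, 0.0],
]


def object_to_ndc(xyz):
    """Object -> eye -> clip (all 4-D), then divide by w to get 3-D NDC."""
    obj = tuple(xyz) + (1.0,)            # w implicitly set to 1
    eye = mat_vec(model_view, obj)       # still 4-D, w stays 1
    clip = mat_vec(projection, eye)      # 4-D, w now depends on eye-space z
    ndc = tuple(c / clip[3] for c in clip[:3])
    return clip, ndc


front_clip, front_ndc = object_to_ndc((1.0, 1.0, 1.0))
back_clip, back_ndc = object_to_ndc((1.0, 1.0, -1.0))
```

The two vertices belong to the same object, yet they are divided by different values of w (4 and 6).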

I’m not confused about the fact that clip coordinates are 4-D homogeneous coordinates that are transformed to 3-D Cartesian coordinates in normalized device coordinates. Please take a look at the uploaded file below, which I took as a screenshot from

It shows that in NDC the view volume is centered at the origin and lies within -1 to +1 along x, y and z. This is done directly by dividing by Wclip.
So my question is: does transforming from clip coordinates to NDC work the same way as binding an object to a canonical volume?

The middle image (the diagram for eye coordinates with the “clip coord.” title added) is plain wrong.

You can’t realistically draw diagrams of 4-D spaces. You can get away with it for eye coordinates because the w component is normally 1 (the fixed-function lighting calculations won’t work if the model-view transformation is projective).

If you want to visualise homogeneous coordinates, stick to 2-D, i.e. (X,Y,W). For 3-D, you just have to rely on algebra.
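
For the 2-D case, a few lines of Python are enough to experiment with. The helpers project and in_wedge are made up for this sketch. The sample points all lie on one ray through the origin of (X, Y, W) space, and the clip region is the 3-D analogue of the hyper-pyramid: a pyramid with its apex at the origin.

```python
def project(p):
    """Map homogeneous (X, Y, W) to Euclidean (X/W, Y/W); assumes W != 0."""
    x, y, w = p
    return (x / w, y / w)


def in_wedge(p):
    """2-D analogue of the clip volume: -W <= X <= W and -W <= Y <= W."""
    x, y, w = p
    return -w <= x <= w and -w <= y <= w


# Three homogeneous points on the same ray through the origin.
ray = [(0.5 * w, -1.0 * w, w) for w in (1.0, 2.0, 4.0)]
projected = [project(p) for p in ray]
```

Every point on the ray lands on the same Euclidean position, and the pyramid's cross-section at W = 1 is the square from -1 to +1.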

Yes, I think the middle diagram is wrong; it should be bounded by -Wclip to +Wclip, shouldn’t it? Anyway, my question was whether, when the transformation from clip coordinates to NDC is made, Wclip just works like a scale factor, the way we fit an object into a canonical volume in programming. Am I right?