Pixel footprint in the world

Admittedly, this is not directly a GL question. But I guess the advanced forum gathers the smartest people… :wink:

Well, what I need is to determine the footprint of a pixel in the world at a certain distance from the viewer. That is, I want to determine how large a pixel is at this distance (in world or object coordinates). I need its side lengths and could also use the area.

I tried to just compute the inverse matrix chain:


V viewport-transform
P projection
C camera transform (world to eye)
O Object transform (object to world)

and then just multiply (1,0,0,0) and (0,1,0,0) (“one pixel”) with it. This didn’t work out when there was scaling involved in either C or O.

I ended up with this method:

1. Using the inverse chain X, project the three vertices P1(0,0,0,1), P2(1,0,0,1) and P3(0,1,0,1) into object space and divide them by w.

2. Calculate dx = P2 - P1 and dy = P3 - P1.

3. Determine the lengths ldx of dx and ldy of dy. This gives us the side lengths of a pixel in object coordinates, placed at the near plane.

4. Divide ldx and ldy by the near-plane distance.

Now ldx and ldy act as factors which I can simply multiply with some distance to get the pixel size at that distance.

Is there a more elegant way to achieve the same? In particular, I’d like to know a method which is completely agnostic of the projection type and can handle any kind of scaling…

thanks in advance!

Your initial approach does not seem to be wrong. Maybe you messed up somewhere.

But you could also try using gluProject and gluUnProject to get what you want. Basically, you could take the upper left corner of the 3D pixel and gluProject it to screen coordinates. Add the width of a pixel (1, 1) to the (x, y) screen-space coordinates and gluUnProject these back to object space. Now you have two (x, y, z) vectors giving the upper left and lower right corners of the screen-aligned pixel at the desired distance.

For performance reasons it might be wise to investigate the formulas and write a specialized solution.

[ www.trenki.net | vector_math (3d math library) | software renderer ]

That method won’t work when there’s rotation, either. It’ll return dx and dy values in the ranges [0…realDX] and [0…realDY].

How about the x/z y/z approach, used decades ago in software rendering of simple 3D triangles? (It’s actually x*const1X/z and y*const1Y/z; you precompute const1X and const1Y from the viewport size and FOV.) It won’t support orthographic projection (which of course is useless to support; you’d better keep a flag “bool IsOrtho”).

Or you could continue the way you started, but take the rotation and translation of the camera out of the calculations and position the object right in front of the camera, varying only the distance. That distance equals length(cameraPos - objectPos), where both cameraPos and objectPos are in world space. So keep only the viewport and projection matrices you use to render, while the model-view matrix is just a translation.
Then use a custom gluProject for those 2 points, and you get the area (after squaring the distance).
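A minimal sketch of that idea (the helper names are mine; it assumes a column-major perspective projection matrix, as glGetFloatv(GL_PROJECTION_MATRIX, …) returns it): project two points one world unit apart at the given distance straight ahead and read the footprint off their pixel separation.

```c
#include <math.h>

/* Project a point (x, 0, z) through a column-major projection matrix
 * and map the result to a window x coordinate for a viewport of the
 * given pixel width. */
static float project_winx(const float P[16], float x, float z, float vpWidth)
{
    float clipx = P[0]*x + P[8]*z + P[12];   /* y is 0, so column 1 drops out */
    float clipw = P[3]*x + P[11]*z + P[15];
    float ndcx  = clipx / clipw;             /* perspective divide */
    return (ndcx * 0.5f + 0.5f) * vpWidth;   /* viewport transform */
}

/* World units covered by one pixel at 'dist' straight in front of the
 * camera: project two points one world unit apart and invert their
 * pixel separation. */
float units_per_pixel(const float P[16], float vpWidth, float dist)
{
    float x0 = project_winx(P, 0.0f, -dist, vpWidth);
    float x1 = project_winx(P, 1.0f, -dist, vpWidth);
    return 1.0f / fabsf(x1 - x0);
}
```

For a symmetric 90° perspective projection and a 1024-pixel-wide viewport this comes out as 2*dist/1024, i.e. exactly 1.0 world unit per pixel at dist = 512.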

Your initial approach does not seem to be wrong. Maybe you messed up somewhere.

Well, I guess it would have been too easy then :wink: I think it has something to do with my lacking knowledge of homogeneous coordinates, projection and how they relate to each other.

Your solution with gluUnProject is basically what I’m doing now.

How about the x/z y/z approach, used decades ago in software rendering of simple 3D triangles?

I know too little about software rasterizers. But dx/dz and dy/dz (in object space or world space!) seem to be the numbers I’m looking for.

you’d better keep a flag “bool IsOrtho”

That’s exactly how I do it right now. To detect an orthographic projection, I check the lower left 3 elements of the projection matrix for 0.0. If they all are 0.0, I have an orthographic projection.
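That check could be sketched like this (assuming the matrix is stored column-major, as glGetFloatv returns it; the first three bottom-row entries then sit at indices 3, 7 and 11, and they are (0,0,0) for ortho versus (0,0,-1) for a standard perspective matrix):

```c
#include <stdbool.h>

/* true if the projection matrix is orthographic: for an ortho matrix the
 * bottom row is (0,0,0,1), for a perspective one it is (0,0,-1,0). */
bool is_ortho(const float P[16])
{
    return P[3] == 0.0f && P[7] == 0.0f && P[11] == 0.0f;
}
```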

Or you could continue the way you started, but just take-out rotation and translation of the camera out of the calculations, and position the object right in front of the camera, just vary the distance.

This is basically a transfer of the problem into eye space. It should be solvable in object space, too.

Eye space or object space, it doesn’t matter (the distance is the same). The projection and viewport are what produce different results for the same distance. But as I said, if your camera or object rotation is non-zero, you’ll get invalid results. You can do it your way, but then rotate the second of those 2 points in such a way that they form a billboard-like line.
I’m just always trying to find ways to get the same results with fewer wasted cycles, so I try to discard slower (but perfectly correct) algorithms. Maybe you’re writing a proc to choose a LOD for a mesh, and if that’s the case, you’ll need to squeeze cycles. Though the support for ortho makes me doubt it, and moreover the LOD selection can be done very quickly by just measuring the squared distance between object and camera.
Maybe it would help to know why you need this function, in what environment (why support all projection settings), and how slow we can let it be.

Indeed, it’s needed for a LoD level selection mechanism. I need to compute the LoD level for several thousand objects per frame, so the per-object work should be fairly lightweight. The current method is very lightweight: just a multiplication of the distance by a per-object factor. The factor is computed once per frame, so that part can be heavyweight if needed.

The whole thing should make no assumptions about the type of projection, because I basically just get a projection matrix handed over. So no information about type, FOV, near/far distance etc. is available.

The way I’m doing it now is already close to what I need. I’m just curious whether there’s a mathematically “correct” (and maybe even simpler) solution.

I think this is the way LOD is selected in games:

typedef struct {
    float LOD1distance; // distance is squared here!
    float LOD2distance; // distance is squared here!
    float LOD3distance; // distance is squared here!
    Mesh* LOD0mesh;
    Mesh* LOD1mesh;
    Mesh* LOD2mesh;
//  Mesh* LOD3mesh;     // beyond LOD3distance, the object is not drawn
} ModelLOD;

ModelLOD* modelLod;

Vec3 cameraPos; // in world-space
Vec3 objectPos; // in world-space

Vec3 dist1 = cameraPos - objectPos;

float dist2 = dist1.x*dist1.x + dist1.y*dist1.y + dist1.z*dist1.z;

Mesh* mesh = modelLod->LOD0mesh; // this is the LOD

if(dist2 > modelLod->LOD1distance) mesh = modelLod->LOD1mesh;
if(dist2 > modelLod->LOD2distance) mesh = modelLod->LOD2mesh;
if(dist2 > modelLod->LOD3distance) mesh = NULL;

The code doesn’t handle drawing sniper scopes; there it’s cheapest to simply have a “float inverse_ZoomFactor = 1.0” and multiply dist2 by it before those “if(…)”. Also, to support different viewport sizes, take the viewport height into account like this:

inverse_ZoomFactor = (viewport_height/1024.0) / ZoomFactor;
(assuming the artist chose those LOD1distance, LOD2distance and LOD3distance values at a 1280x1024 viewport size)

Otherwise, I think the mathematically correct way would be just what I said: use the projection and viewport matrices, put the camera at <0,0,dist> (the unsquared distance) and transform those 2 points with the 3 matrices. (You can optimize it all to use just 1 matrix, only modify the z of both vertices, then multiply the 2 vertices by the matrix.)

[ btw, I’m not a matrix-guru, rather a newbie - so I can be completely wrong ]

For an ortho view it’s just the extents passed to glOrtho/glViewport.

For a proj view:
far_plane_extents = 2 * far_plane_dist * tan(fov/2)
units_per_pixel = (depth / far_plane_dist) * (far_plane_extents / view_extents)

Where ‘extents’ and fov are either horizontal or vertical (I decide based on aspect).

I calc & cache the extents when I set up the view. This reduces the units_per_pixel calc to a couple of conditions (is ortho|persp, is aspect > 1) and a single division.

When I need the pixel size, I call DDX and DDY on its world position. That way I get the pixel size with the slope taken into account.