A new challenge we are facing is how to zoom-fit world-space geometry combined with screen-space labels (simple 2D textures).

We have a perfectly working zoom fit for geometry under both perspective and orthographic projection, and we tried projecting the label corners onto the camera near plane. The result is not good, because the camera position is then computed from something whose size does not change with the zoom level (the labels).

The questions:

Does a closed mathematical solution to this problem exist?

Should we accept running many iterations (fitting the geometry can be slow) to get closer to a perfect fit?

Probably. It would help to define exactly what you mean by “zoom fit”. Which parameter are you changing to obtain the fit? The field-of-view angle or the distance between the viewpoint and the model? The former is simpler, but the latter seems more likely.

If you have an eye-space position (x,y,z), the screen-space position (relative to the centre) will be (k*x/(z+dz), k*y/(z+dz)) where k is proportional to cot(fov/2) and dz is the amount by which you move the viewpoint along the eye-space Z axis. If the screen-space offset of the label corner relative to the origin is (ox,oy), then you have
k*x/(z+dz)+ox=tx
k*y/(z+dz)+oy=ty
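where tx and ty are the “target” screen-space positions, i.e. the positions of the edge of the viewport. You can solve those for either k or dz (depending upon which you intend to vary). Repeat for each label corner and take the most restrictive value: the smallest k (widest view angle) if you are varying the view angle, or the largest dz (farthest viewpoint) if you are moving the viewpoint.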
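As a rough sketch of the moving-viewpoint case, the two equations can be solved for dz per label corner. This is only an illustration under some assumptions that are not in the post above: the function name `required_dz` is made up, z is taken as a positive depth in front of the viewpoint (matching the k*x/(z+dz) form), the viewport is symmetric so the targets are ±tx and ±ty, and each label offset on its own is smaller than the viewport. Each corner gives a minimum dz, and since every corner has to fit, the sketch keeps the largest one.

```python
def required_dz(corners, k, tx, ty):
    """Smallest dz that keeps every label corner inside [-tx, tx] x [-ty, ty].

    corners: iterable of (x, y, z, ox, oy), where (x, y, z) is the eye-space
    anchor (z assumed to be a positive depth) and (ox, oy) is the fixed
    screen-space offset of the label corner.
    """
    best = float("-inf")
    for x, y, z, ox, oy in corners:
        if abs(ox) >= tx or abs(oy) >= ty:
            # No viewpoint distance can help if the offset alone is off-screen.
            raise ValueError("label offset alone exceeds the viewport")
        # From k*x/(z+dz) + ox = +/-tx: depth needed at each viewport edge.
        depth = max(
            k * x / (tx - ox), -k * x / (tx + ox),
            k * y / (ty - oy), -k * y / (ty + oy),
        )
        best = max(best, depth - z)
    return best


if __name__ == "__main__":
    corners = [(1.0, 0.0, 0.0, 0.5, 0.0), (-1.0, 0.0, 1.0, 0.0, 0.0)]
    print(required_dz(corners, k=1.0, tx=1.0, ty=1.0))
```

Plain geometry vertices can go through the same function with ox = oy = 0, so geometry and labels end up in a single pass.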

If you’re varying the view angle (so dz=0), you can add the offsets (scaled by Z) in eye space, find the eye-space bounding box, and calculate the view angle from the bounding box. This replaces division by multiplication, which may be more efficient.
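For the view-angle case, a minimal sketch could look like the following. The name `max_k` is hypothetical, z is again assumed to be a positive depth, and the viewport is assumed symmetric (targets ±tx, ±ty). It solves k*x/z + ox = ±tx per corner and keeps the smallest k, i.e. the widest view angle that any corner needs. How k maps back to the field of view depends on your projection setup; if tx and ty are normalised so that k = cot(fov/2) exactly, then fov = 2*atan(1/k).

```python
import math


def max_k(corners, tx, ty):
    """Largest k (with dz = 0) that keeps every label corner in the viewport.

    corners: iterable of (x, y, z, ox, oy) with z > 0 assumed.
    """
    best = math.inf
    for x, y, z, ox, oy in corners:
        if abs(ox) >= tx or abs(oy) >= ty:
            raise ValueError("label offset alone exceeds the viewport")
        # For each axis, only the viewport edge on the same side as the
        # point constrains k; a coordinate of 0 imposes no limit.
        for coord, offset, target in ((x, ox, tx), (y, oy, ty)):
            if coord > 0:
                best = min(best, (target - offset) * z / coord)
            elif coord < 0:
                best = min(best, (target + offset) * z / -coord)
    return best


if __name__ == "__main__":
    corners = [(1.0, 0.0, 2.0, 0.5, 0.0), (0.0, 3.0, 3.0, 0.0, 0.25)]
    print(max_k(corners, tx=1.0, ty=1.0))
```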

For the case where you’re moving the viewpoint, without the labels you can calculate dz from the eye-space bounding box. With the labels, you can’t do that; as the viewpoint moves farther away from the model, the effect of eye-space Z becomes less significant while the effect of the label offset remains unchanged. You can’t transform the label offset to eye-space because it varies with dz (in a non-linear manner), which is what you’re trying to find.

@Alberto,
You don’t have a problem … unless I miss a point: I draw lots of labels in ‘pixel space’ with an orthographic projection that is created from the same parameters as the viewport (screen pixel width, screen pixel height, etc.). I realize that I’ve never tried to draw twice in the same screen area, with two viewports overlapping each other, so I cannot tell whether that poses a problem or helps in your case. The approach noted at the start is spot-on and stays in good sync with the cursor position. Using an orthographic projection makes the camera/view matrix superfluous.
… I’m obviously missing a point.