Probably. It would help to define exactly what you mean by “zoom fit”. Which parameter are you changing to obtain the fit? The field-of-view angle or the distance between the viewpoint and the model? The former is simpler, but the latter seems more likely.
If you have an eye-space position (x,y,z), the screen-space position (relative to the centre) will be (k*x/(z+dz), k*y/(z+dz)), where k is proportional to cot(fov/2) and dz is the amount by which you move the viewpoint along the eye-space Z axis. If the screen-space offset of the label corner relative to the origin is (ox,oy), then you have
k*x/(z+dz)+ox=tx
k*y/(z+dz)+oy=ty
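where tx and ty are the “target” screen-space positions, i.e. the positions of the edge of the viewport. You can solve those for either k or dz (depending upon which you intend to vary). Repeat for each label corner and each viewport edge, then take the most restrictive value: the smallest k if you’re varying the view angle, or the largest dz if you’re moving the viewpoint.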
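As a rough sketch of that per-corner solve (not a definitive implementation), the Python below assumes things the description above doesn't state: z is a positive depth in front of the viewer, screen coordinates are centred so the viewport spans [-half_w, half_w] by [-half_h, half_h], and each label corner is passed as a tuple (x, y, z, ox, oy). The names fit_k, fit_dz and _constraints are made up for illustration.

```python
import math

def _constraints(corners, half_w, half_h):
    """Yield (c, z, room) for every corner/edge pair that can limit the fit.

    c is the eye-space coordinate measured towards that edge (always > 0);
    room is the screen-space distance to the edge left over once the label
    offset has been subtracted (tx - ox in the equations above).
    """
    for x, y, z, ox, oy in corners:
        for coord, off, half in ((x, ox, half_w), (y, oy, half_h)):
            for s in (1.0, -1.0):  # +1: right/top edge, -1: left/bottom edge
                room = half - s * off
                if room <= 0:
                    raise ValueError("label offset alone exceeds the viewport")
                if s * coord > 0:  # points on the other side never hit this edge
                    yield s * coord, z, room

def fit_k(corners, half_w, half_h):
    """Largest k (narrowest view angle) that fits every corner, with dz = 0."""
    # k*c/z + offset <= half  =>  k <= room*z/c; the smallest bound wins.
    return min((room * z / c for c, z, room in _constraints(corners, half_w, half_h)),
               default=math.inf)

def fit_dz(corners, k, half_w, half_h):
    """Smallest dz (closest viewpoint) that fits every corner, for a fixed k."""
    # k*c/(z+dz) + offset <= half  =>  dz >= k*c/room - z; the largest bound wins.
    return max((k * c / room - z for c, z, room in _constraints(corners, half_w, half_h)),
               default=-math.inf)
```

For example, with corners = [(1.0, 0.5, 5.0, 20.0, 10.0), (-2.0, 0.0, 4.0, -30.0, 0.0)] and a viewport of half-size 400 by 300, fit_k is limited by the second corner, which ends up exactly on the left edge. Note that the sketch doesn't check that z+dz stays positive for every point.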
If you’re varying the view angle (so dz=0), you can add the offsets (scaled by Z) in eye space, find the eye-space bounding box, and calculate the view angle from the bounding box. This replaces division with multiplication, which may be more efficient.
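Once you have k, turning it back into a view angle is a one-liner. The sketch below assumes the constant of proportionality is the viewport half-height, i.e. k = half_h * cot(fov_y/2) with screen coordinates in pixels and fov_y the vertical field of view in radians; that convention is my assumption, so adjust it if your projection is set up differently. The function names are hypothetical.

```python
import math

def k_from_fov(fov_y, half_h):
    """k for a vertical field of view fov_y (radians), assuming k = half_h * cot(fov_y/2)."""
    return half_h / math.tan(fov_y / 2.0)

def fov_from_k(k, half_h):
    """Inverse of k_from_fov: the vertical field of view (radians) for a given k."""
    return 2.0 * math.atan2(half_h, k)
```

A smaller k corresponds to a wider view angle, which is why the smallest k over all labels is the one that fits everything.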
For the case where you’re moving the viewpoint, without the labels you can calculate dz from the eye-space bounding box. With the labels, you can’t do that; as the viewpoint moves farther away from the model, the effect of eye-space Z becomes less significant while the effect of the label offset remains unchanged. You can’t transform the label offset to eye space because it varies with dz (in a non-linear manner), which is what you’re trying to find. So in that case, solve the two equations above for dz for each label corner rather than working from a bounding box.