# Modelview matrix pairing

Why is the model matrix often “paired” with the viewing matrix in tutorials and examples, as in the following?

``````
gl_Position = proj_matrix * (mv_matrix * in_vertex);

``````

Would there be something wrong if I were to “pair” the projection and viewing matrices instead?

``````
gl_Position = proj_view_matrix * (model_matrix * in_vertex);

``````

The reason I’d like to do this is that the model matrix varies across models, but the viewing and projection matrices generally stay the same. I could multiply those two on the CPU and upload their product to the shader program objects. So why is the modelview matrix canon in books?
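
As a rough check of the idea, the sketch below compares the two pairings on the CPU. The helper names (`mat_mul`, `mat_vec`, `translation`, `perspective`) and all numeric values are made up for illustration; matrices are indexed `[row][column]` and multiply column vectors, as in the GLSL lines above. It only shows that both groupings give the same clip-space position in exact arithmetic; it says nothing about precision or about which is cheaper in a real renderer.

```python
def mat_mul(a, b):
    # 4x4 matrix product on nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def mat_vec(m, v):
    # 4x4 matrix times a 4-component column vector.
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def translation(x, y, z):
    return [[1, 0, 0, x], [0, 1, 0, y], [0, 0, 1, z], [0, 0, 0, 1]]

def perspective(n, f):
    # Symmetric frustum, 90-degree field of view, aspect ratio 1 (illustrative only).
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, -(f + n) / (f - n), -2 * f * n / (f - n)],
            [0, 0, -1, 0]]

proj = perspective(1.0, 9.0)
view = translation(0, 0, -5)       # camera sits 5 units back along +Z
model = translation(2, 1, 0)
vertex = [0.5, 0.5, 0.0, 1.0]

# Pairing used in tutorials: proj * (modelview * vertex)
mv = mat_mul(view, model)          # one CPU multiply per object
clip_mv = mat_vec(proj, mat_vec(mv, vertex))

# Pairing proposed in the question: proj_view * (model * vertex)
proj_view = mat_mul(proj, view)    # one CPU multiply per frame
clip_pv = mat_vec(proj_view, mat_vec(model, vertex))

same = all(abs(a - b) < 1e-9 for a, b in zip(clip_mv, clip_pv))
print(clip_mv, clip_pv, same)
```

The difference between the two schemes is therefore not in the result but in what intermediate space (camera space or world space) is available to the shader, which is what the replies below discuss.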

Would there be something wrong if I were to “pair” the projection and viewing matrices?

Yes, there is.

Also, it’s sometimes useful (and required when doing deferred-style rendering) to reverse-transform from window space to some form of pre-projection space. Going to camera space is a lot easier than going to world space, since you can isolate the zNear/zFar values from a pure perspective matrix and make the computations much simpler.

``````
viewing_matrix = M' * T(-view_origin)
model_matrix = T(model_origin) * M''

``````

you get:

``````
viewing_matrix * model_matrix = M' * T(-view_origin) * T(model_origin) * M'' = M' * T(model_origin - view_origin) * M''

``````
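
The translation part of that identity can be checked numerically. This is a minimal sketch with hypothetical helper names and made-up origins; `M'` and `M''` are left out because they sit outside the two translations and are not affected by merging them.

```python
def mat_mul(a, b):
    # 4x4 matrix product on nested lists, indexed [row][column].
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def translation(v):
    # Translation matrix T(v) for column vectors.
    x, y, z = v
    return [[1, 0, 0, x], [0, 1, 0, y], [0, 0, 1, z], [0, 0, 0, 1]]

view_origin = (10, 0, 3)     # illustrative integer positions
model_origin = (12, 5, 3)

neg_view = tuple(-c for c in view_origin)
relative = tuple(m - v for m, v in zip(model_origin, view_origin))

lhs = mat_mul(translation(neg_view), translation(model_origin))  # T(-view_origin) * T(model_origin)
rhs = translation(relative)                                      # T(model_origin - view_origin)
print(lhs == rhs)
```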

What if I “premultiply” T(-view_origin) * T(model_origin) on the CPU (actually just add two integer vectors together) and put the result in the translation column of the model matrix? The model matrix has to be updated for every model anyway. We had a discussion about this over in the maths forum once. The view origin would then stay “fixed” at 0 and models would move closer to or farther from it. The M’ matrix would serve as the viewing matrix and be premultiplied with the projection matrix on the CPU. Models too far away would be culled in some way, maybe by contribution culling or some other cull. They could also get clipped if there was no cull. Integers lose no precision.
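
To illustrate the precision point, here is a small sketch that assumes positions end up as 32-bit floats on the GPU. The helper `to_float32` and the coordinates are invented for the example; the rounding is emulated with Python's `struct` module. Subtracting the origins as integers first keeps a small offset that is lost when each origin is rounded to float32 before the subtraction.

```python
import struct

def to_float32(x):
    # Round a Python float (a double) to the nearest 32-bit float and back.
    return struct.unpack("f", struct.pack("f", float(x)))[0]

# Hypothetical integer world coordinates far from the world origin.
model_origin = (100_000_003, 0, 0)
view_origin = (100_000_000, 0, 0)

# World-space route: round each origin to float32, then subtract.
float_offset = tuple(
    to_float32(to_float32(m) - to_float32(v))
    for m, v in zip(model_origin, view_origin)
)

# Camera-relative route: subtract exactly as integers, then convert the small result.
int_offset = tuple(to_float32(m - v) for m, v in zip(model_origin, view_origin))

print(float_offset)  # the 3-unit offset is lost to float32 rounding
print(int_offset)    # the 3-unit offset survives
```

Near 10^8 adjacent float32 values are 8 units apart, which is why the 3-unit difference disappears in the first route.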

The alternative is matrix multiplications on the CPU or GPU to form the modelview matrix.

Why not just upload the zNear/zFar values along with the matrix? They could be part of the same uniform block. Are 2 more floats a big deal?

I don’t understand what you’re saying. What is “T(…)” and what are the “M’” values?

Why not just upload the zNear/zFar values along with the matrix? They could be part of the same uniform block. Are 2 more floats a big deal?

Because they’re already part of the matrix. The only reason you can’t use them with what you’re doing is because you composed that matrix with another, thus destroying the values. To put it another way, if you were doing things right, you’d already have them.

And it’s not the zNear/zFar specifically that are important; it’s the Z row of the perspective projection matrix that matters.
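
A minimal sketch of what “the Z row matters” can mean in practice, assuming a standard `gluPerspective`-style matrix indexed `[row][column]`. The function names and numbers are illustrative only. With a pure perspective matrix, camera-space depth can be recovered from an NDC depth using just the two Z-row terms; once a view matrix is multiplied in, that row no longer holds the pure-perspective terms.

```python
import math

def mat_mul(a, b):
    # 4x4 matrix product on nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def perspective(fovy_deg, aspect, near, far):
    # gluPerspective-style projection for column vectors.
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return [[f / aspect, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
            [0.0, 0.0, -1.0, 0.0]]

def eye_z_from_ndc(proj, z_ndc):
    # z_ndc = (a * z_eye + b) / (-z_eye)  =>  z_eye = -b / (z_ndc + a)
    a, b = proj[2][2], proj[2][3]
    return -b / (z_ndc + a)

proj = perspective(60.0, 16.0 / 9.0, 0.5, 100.0)

# Forward: project a camera-space depth to NDC.
z_eye = -7.25
z_clip = proj[2][2] * z_eye + proj[2][3]
w_clip = -z_eye
z_ndc = z_clip / w_clip

# Backward: recover the camera-space depth from the Z row alone.
recovered = eye_z_from_ndc(proj, z_ndc)

# Compose with a view translation: the Z row's last term is no longer the pure term.
view = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, -5.0], [0.0, 0.0, 0.0, 1.0]]
proj_view = mat_mul(proj, view)
z_row_changed = proj_view[2][3] != proj[2][3]
print(recovered, z_row_changed)
```

A view matrix that also rotates would spread the change across the whole row, not just the last term.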

Yes, there are in fact ways to work around all of these issues. But why over-complicate things? Isn’t it much more simple to just transform from model to camera-space, do your lighting there, and go to clip-space? It’s easier for everyone to understand what’s going on. It’s easier to explain. It’s easier to test and debug and work with.

In general, you don’t gain anything by doing what you suggest. Your perspective projection matrix is usually a lot more constant than your camera matrix, whereas your model-to-world transforms are likely to change on a frame-to-frame basis, as are your world-to-camera transforms. If your scene is animating in any significant way, then you’re going to need to update the model matrices every frame.

If the camera moves along with them, you update exactly one matrix. If we did it your way, every time the camera moves, we’d have to update two matrices. I’ll be generous and assume that you’re storing this data in uniform blocks (because if you’re not, then the camera+perspective matrix must be updated in every program you use).

Is it possible that, for your particular needs, you could get something? Yes. If your camera is never animated and is more often than not fixed, you could gain something out of this. You would get exactly one less CPU matrix multiply for every object (the multiply with the camera matrix), and one less vector/matrix multiply for each light.

Somehow, I don’t think you’ll notice any performance difference from those. This smells of premature optimization.

Ugluk, with all respect to what Alfonse said, you should try what you have suggested. Only you know what your code looks like, and only you can profile it. Everything you said is reasonable. You have just one proj-view matrix calculation per frame, but maybe thousands of modelview matrices. So just keep going, and be aware of potential calculation imprecision. Profile your application and fairly judge which is better. If you are on Windows with an NV/AMD card, maybe GLProfiler could help.

I don’t understand what you’re saying. What is “T(…)” and what are the “M’” values?

I decompose the model and the viewing matrices each into a product of a matrix M (some arbitrary matrix) and T, a translation matrix.

If we did it your way, every time the camera moves, we’d have to update two matrices.
You would get exactly one less CPU matrix multiply for every object (the multiply with the camera matrix), and one less vector/matrix multiply for each light.

The rationale then is, matrix upload is a greater evil than matrix multiplication.

The rationale then is, matrix upload is a greater evil than matrix multiplication.

Except that doing what you suggest needs this decomposition step, or you risk the perils of world space.

Odds are neither one is going to show up as anything more than a blip on your profiling data. You will have much lower hanging fruit than your matrix stuff, so you’ll have plenty to do before worrying about the performance of this. But if I were to pick a default, it would be the one that is both simpler mathematically and requires fewer OpenGL state changes.

So that’s the answer to your question, “Why is the model matrix often ‘paired’ with the viewing matrix in tutorials and examples?” You can do it however you want in your own code. But this is the default because it’s easier to explain and understand, and requires fewer state changes.
