Collada Animation - Need help with getting it correct.

Let me start off by saying a thank you to Dark Photon already for his post here:

Need help with skeletal animation - OpenGL - Khronos Forums. It’s a few years old, but it really helped me.

I have a very similar issue: I can’t get the correct matrices and apply them to my OpenGL project, which is a basic Collada viewer. This is not homework, and I have no training in this; I am just doing it as a hobby, but I can’t seem to grasp what is happening. I have read many sources, and I have read Dark Photon’s reply many times. It makes good sense, but my code is not working, likely because I misunderstand some fundamental concept.

I am going to do my best to explain my issue as clearly as I can using the terms that Dark Photon used (or at least my understanding of them).

So the first thing I am doing is extracting the inverse bind matrices (IBM) from the controller section of the Collada file.

Now, am I right in saying that for a basic example I could simply invert the IBM to get the joint’s world matrix? To me that makes sense, though perhaps it is wrong; in these Collada files the rest pose will be the same as the bind pose.

Viewing the bone locations and mesh, everything looks good, so I assume this part is correct. Next I take the Animation Matrices (A) from the animation section of the Collada document.

And I multiply as follows (working on a bone with 1 parent)

G = (ParentBone.JointWorldMatrix * AnimationMatrix) * (ThisBone.JointWorldMatrix * AnimationMatrix)

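In the above, I am assuming that the AnimationMatrix is different for each multiplication. That is, ParentBone.JointWorldMatrix is multiplied by the parent’s animation matrix for this frame, and ThisBone.JointWorldMatrix by this bone’s animation matrix. Is that right?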

Then I get the Inverse Bind Pose of this bone

IBP = (ThisBone.IBM * ParentBone.IBM)

Then I calculate the Final Transformation

Could I please get a definition for what exactly this is doing?

Is it the transformation from 0,0,0? If not what is it from / to

F = G * IBP

And I get an unexpected result, likely because I don’t fully understand one of the matrices above, or maybe because something is wrong with the whole process.

Also some other questions:

What is the animation matrix exactly? Is it how to animate the bone in the bone’s local space (bone-space/joint-space)?

Ok. I’m confused what you mean by the term “inverse bind matrix (IBM)”. This could mean one of:

  1. The inverse bind pose transform ($B_N^{-1}$), which takes you from OBJECT-SPACE down to JOINT-SPACE for joint N in the bind pose. Or in math terms, where joint A is the root joint: $B_N^{-1} = O_N^{-1} \cdots O_C^{-1} \, O_B^{-1} \, O_A^{-1}$

  2. The inverse orientation transform ($O_N^{-1}$), which takes you from JOINT-SPACE for the parent joint of joint N down to JOINT-SPACE for joint N in the bind pose.

Now, am I right in saying that for a basic example I could simply invert the IBM to get the joint’s world matrix?

If we assume your IBM means “inverse bind pose transform”, then that would give you the bind pose transforms ($B_N$), which take you from JOINT-SPACE for joint N in the bind pose to OBJECT-SPACE (where object-space is the coordinate frame that your animated character is defined in, where the origin is in or near the character).

If we assume your IBM means “inverse orientation transform”, then that would give you the orientation transform ($O_N$), which takes you from JOINT-SPACE for joint N to the JOINT-SPACE of the parent joint of joint N in the bind pose.

In both cases, there’s no WORLD-SPACE involved here. So I’m puzzled by you calling it a joint world transform.
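For reference, the two candidate meanings can be written side by side. This is only a restatement of the definitions above in LaTeX, where joint A is the root and $p(N)$ is shorthand introduced here for the parent of joint N:

$$
\begin{aligned}
B_N &= O_A \, O_B \, O_C \cdots O_N && \text{(bind pose: joint-space of } N \to \text{object-space)} \\
B_N^{-1} &= O_N^{-1} \cdots O_C^{-1} \, O_B^{-1} \, O_A^{-1} && \text{(inverse bind pose: object-space} \to \text{joint-space of } N) \\
B_N &= B_{p(N)} \, O_N && \text{(a joint's bind pose is its parent's bind pose times its own orientation transform)}
\end{aligned}
$$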

…next I take the Animation Matrices (A) from the animation section of the Collada document.

And I multiply as follows (working on a bone with 1 parent)

G = (ParentBone.JointWorldMatrix * AnimationMatrix) * (ThisBone.JointWorldMatrix * AnimationMatrix)

If we assume your IBM means “inverse bind pose transform”, then…

This doesn’t look right. Remember that the bind pose transforms (what you termed the JointWorldMatrix) are not the orientation transforms for a single joint (e.g. $O_A$). They’re the product of orientation transforms all the way up the joint tree (e.g. $O_A O_B O_C \cdots$), where A is the root joint.

To make that more clear, let me rewrite your expression above with what it’s equivalent to (if ParentBone is joint A, the root joint):

$G = (O_A A_A) \, ((O_A O_B) A_B)$

See the problem? What you really want is this:

$G = (O_A A_A) \, (O_B A_B)$

HOWEVER, if we assume your IBM means “inverse orientation transform”, and thus inverting that yields “orientation” transforms, then rewriting your expression reveals:

$G = (O_A A_A) \, (O_B A_B)$

which is what you want.

So the key question here is what are the matrices you’re calling IBM? If they’re inverse bind pose transforms ($B_N^{-1}$), then the above equation is wrong. If they’re inverse orientation transforms ($O_N^{-1}$), then the equation is right.
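As a rough illustration of why the two readings give different results, here is a minimal Python sketch. The two-joint chain, its offsets, and all helper names (`matmul`, `translation`, `rotation_z`, `origin_of`) are made up for this example and do not come from any COLLADA file. Matrices act on column vectors, so products read right to left.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists (row-major storage)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(x, y, z):
    """4x4 translation matrix (translation in the last column)."""
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]

def rotation_z(degrees):
    """4x4 rotation about the Z axis."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def origin_of(m):
    """Object-space position of a joint's origin (the translation column)."""
    return [m[0][3], m[1][3], m[2][3]]

# Hypothetical chain: joint A is the root, joint B is its child.
O_A = translation(0.0, 1.0, 0.0)   # orientation transform of A
O_B = translation(0.0, 2.0, 0.0)   # orientation transform of B, relative to A
A_A = rotation_z(90.0)             # animation transform of A for this frame
A_B = rotation_z(0.0)              # B is not animated in this frame

# Intended form: G = (O_A * A_A) * (O_B * A_B)
G_B = matmul(matmul(O_A, A_A), matmul(O_B, A_B))

# If the "joint world matrix" is really the bind pose transform B_B = O_A * O_B,
# then O_A is applied twice and joint B lands in the wrong place.
B_B = matmul(O_A, O_B)
G_B_wrong = matmul(matmul(O_A, A_A), matmul(B_B, A_B))

print("correct:", origin_of(G_B))
print("doubled:", origin_of(G_B_wrong))
```

With these made-up numbers the intended form places joint B at roughly (-2, 1, 0), while the doubled form places it at roughly (-3, 1, 0).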

Then I get the Inverse Bind Pose of this bone

IBP = (ThisBone.IBM * ParentBone.IBM)

As you’ve probably figured out, this is only right if your IBMs are inverse orientation transforms, but not if they’re inverse bind pose transforms. So double-check that.

Then I calculate the Final Transformation

Could I please get a definition for what exactly this is doing?

Is it the transformation from 0,0,0? If not what is it from / to

F = G * IBP

We’re using OpenGL’s matrix multiply convention here (where you read right-to-left). So in step-by-step form, here’s what’s going on:

  1. Start with OBJECT-SPACE in the bind pose (the space where the character was rigged)
    • inverse bind pose transform (aka $B_N^{-1}$, aka IBP)
  2. Now we’re in the JOINT-SPACE for joint N in the bind pose.
    • global transform for joint N (aka $G_N$)
  3. Now we’re in the OBJECT-SPACE for the animated pose. Success!

As to what exactly this is doing (when you loop over all the joints, computing F for each), it’s taking the joint skeleton for your character and animating it within the animated character’s local OBJECT-SPACE. As I mentioned, this OBJECT-SPACE is a coordinate frame where the character is typically near the coordinate origin.
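The two steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the two-joint chain and every helper name are hypothetical, and `rigid_inverse` assumes the matrices contain rotation and translation only (no scale or shear).

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists (row-major storage)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(x, y, z):
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]

def rotation_z(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def rigid_inverse(m):
    """Invert a rigid transform [R | t] as [R^T | -R^T t]."""
    rot_t = [[m[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(rot_t[i][k] * m[k][3] for k in range(3)) for i in range(3)]
    return [rot_t[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def transform_point(m, p):
    """Apply a 4x4 transform to a 3D point (implicit w = 1)."""
    return [sum(m[i][k] * p[k] for k in range(3)) + m[i][3] for i in range(3)]

# Hypothetical chain: A is the root, B is its child.
O_A = translation(0.0, 1.0, 0.0)
O_B = translation(0.0, 2.0, 0.0)
IBP_B = rigid_inverse(matmul(O_A, O_B))   # object-space -> joint-space of B

def skinning_matrix_for_b(A_A, A_B):
    """F = G * IBP for joint B, read right to left."""
    G_B = matmul(matmul(O_A, A_A), matmul(O_B, A_B))
    return matmul(G_B, IBP_B)

F_rest = skinning_matrix_for_b(rotation_z(0.0), rotation_z(0.0))
F_anim = skinning_matrix_for_b(rotation_z(90.0), rotation_z(0.0))

vertex = [0.0, 3.0, 0.0]   # sits exactly on joint B in the bind pose
print(transform_point(F_rest, vertex))   # unchanged when nothing is animated
print(transform_point(F_anim, vertex))   # follows joint B in the animated pose
```

A useful sanity check when debugging: with all animation transforms set to identity, every F should come out as the identity matrix, so the mesh stays in its bind pose.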

What is the animation matrix exactly? Is it how to animate the bone in the bone’s local space (bone-space/joint-space)?

Yes.

Thank you for the detailed reply.

By IBM I am indeed referring to the inverse bind pose matrix; sorry for the confusion. In the Collada spec I believe this is labeled as IBM or INV_BIND_MATRIX.

I think I also used object space and world space wrongly in my original post. I was thinking they were more or less the same thing, if the character was rigged at the origin.

So if I am reading your post correctly, along with what I understand of the Collada spec, my real question is: if I only have the inverse bind pose transform, how do I get the inverse orientation transform so that I can do the correct calculation? Or is a different equation needed?

Ok.

I think I also used object space and world space wrongly in my original post. I was thinking they were more or less the same thing, if the character was rigged at the origin.

I see. For a good diagram that illustrates, see this link:

except here they call OBJECT-SPACE “LOCAL-SPACE” instead.

Also, the OpenGL Programming Guide has a great chapter that covers this (or at least it did in the versions I have).

In OpenGL coordinate systems, OBJECT-SPACE is the space where each object (or group of objects) is at or near the origin. So what you’re describing is this space. WORLD-SPACE is the space in which all objects (including the camera) are placed relative to each other in a shared coordinate system. And EYE-SPACE is the space where the camera (the eyepoint) is at the origin.
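To make the three spaces concrete, here is a small Python sketch with made-up numbers. The object placement, the camera position, and the helper names are all assumptions for illustration, and only translations are used to keep it short.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists (row-major storage)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(x, y, z):
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]

def transform_point(m, p):
    """Apply a 4x4 transform to a 3D point (implicit w = 1)."""
    return [sum(m[i][k] * p[k] for k in range(3)) + m[i][3] for i in range(3)]

# MODELING transform: places the object in the shared world (here at x = 5).
model = translation(5.0, 0.0, 0.0)      # OBJECT-SPACE -> WORLD-SPACE

# VIEWING transform: the inverse of the camera's placement in the world.
# The camera is assumed to sit at z = +10 with no rotation.
view = translation(0.0, 0.0, -10.0)     # WORLD-SPACE -> EYE-SPACE

p_object = [0.0, 0.0, 0.0]              # the object's own origin
p_world = transform_point(model, p_object)
p_eye = transform_point(matmul(view, model), p_object)

print(p_world)   # where the object sits in the world
print(p_eye)     # the same point relative to the camera
```

In this setup the point ends up 10 units down the negative Z axis of EYE-SPACE, which is in front of the camera under the usual OpenGL convention.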

So if I am reading your post correctly, along with what I understand of the Collada spec, my real question is: if I only have the inverse bind pose transform, how do I get the inverse orientation transform so that I can do the correct calculation? Or is a different equation needed?

Assuming your understanding of Collada is correct, you can extract the individual inverse orientation transforms ($O_N^{-1}$) by multiplying the inverse bind pose transform for joint N by the bind pose transform for its parent. You can get the latter by inverting the parent’s inverse bind pose.
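A minimal Python sketch of that extraction, under the assumption that the IBMs really are inverse bind pose transforms. The sample joints and helper names are hypothetical, and `rigid_inverse` only handles rotation plus translation; a general 4x4 inverse would be needed if the file contains scale.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists (row-major storage)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(x, y, z):
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]

def rotation_z(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def rigid_inverse(m):
    """Invert a rigid transform [R | t] as [R^T | -R^T t]."""
    rot_t = [[m[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(rot_t[i][k] * m[k][3] for k in range(3)) for i in range(3)]
    return [rot_t[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def inverse_orientation(ibm_joint, ibm_parent):
    """O_N^-1 = B_N^-1 * B_parent, where B_parent is the inverted parent IBM."""
    return matmul(ibm_joint, rigid_inverse(ibm_parent))

# Build made-up IBMs from known orientation transforms so the result can be checked.
O_A = matmul(translation(0.0, 1.0, 0.0), rotation_z(30.0))    # root joint
O_B = matmul(translation(0.0, 2.0, 0.0), rotation_z(-45.0))   # child of A
ibm_A = rigid_inverse(O_A)                 # B_A^-1
ibm_B = rigid_inverse(matmul(O_A, O_B))    # B_B^-1

O_B_inv = inverse_orientation(ibm_B, ibm_A)
expected = rigid_inverse(O_B)
max_error = max(abs(O_B_inv[i][j] - expected[i][j])
                for i in range(4) for j in range(4))
print("max difference from the known O_B^-1:", max_error)
```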

That said, I have no experience with how Collada has structured their skeletal transforms. But I’d find it surprising if they expected you to do all this to extract key transforms needed to compute your skinning matrices. It just seems silly and inefficient.

So you might investigate whether what they’re calling joint “animation” transforms are actually joint global transforms (G). That might make more sense because then you’d just post-multiply those by the inverse bind pose transforms ($B^{-1}$) to get the skinning matrices.

If I say that the animation matrix given is the Global Transform Matrix (G) and do

AnimationMatrix * ThisBone.InverseBindMatrix * ParentBone.InverseBindMatrix

Or in the way you wrote it

$G \, B_1^{-1} \, B_0^{-1}$

I get the wrong result but it is almost correct.

This is what happens in my simple test if it sheds any light.

I have an arrow pointing towards me (Z). The arrow is made of two bones, one for the shaft and one for the head.

The head is a child of the shaft.

The shaft rotates about Z by, say, 70 degrees; once that is done, the head rotates by the reverse amount, clearing the rotation.

What I am seeing by doing the above is that the shaft rotates correctly, and then the head rotates about the correct axis but does not clear the rotation; instead it adds to the parent’s. So I end up with a 140 degree rotation.


By doing

$O = B_1^{-1} \, B_0$

And

$F = (O \, A) \, (B_1^{-1} \, B_0^{-1})$

I get the right rotations, but the arrow head is not offset by the parent, so it sits halfway down the shaft.

That makes sense, as I didn’t get the parent’s orientation matrix. If the parent is the root bone, would the orientation matrix just be the inverse bind pose matrix?

BlueStreak wrote: If I say that the animation matrix given is the Global Transform Matrix (G) and do

AnimationMatrix * ThisBone.InverseBindMatrix * ParentBone.InverseBindMatrix

Or in the way you wrote it

$G \, B_1^{-1} \, B_0^{-1}$

I get the wrong result but it is almost correct.

That’s good. Because this math seems close to but not exactly what I was suggesting.

At the end of my message, I was suggesting that if we assume:

  1. joint “animation” transforms are actually joint global transforms (G), and
  2. the inverse bind matrices (IBM) are inverse bind pose transforms ($B^{-1}$).

Then what you’d want is:

ThisBone.AnimationMatrix * ThisBone.InverseBindMatrix

which would be:

$G \, B^{-1}$

By doing

$O = B_1^{-1} \, B_0$

No, I believe that would give you:

$O_1^{-1} = B_1^{-1} \, B_0$

Write it out with the $B_1^{-1}$ and $B_0$ terms expanded to check this (that is: $O_B^{-1} \, O_A^{-1} \, O_A = ?$). I think you may instead want:

$O_1 = B_0^{-1} \, B_1$
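Expanding both products makes the difference explicit. Here joint 0 is taken to be the root A and joint 1 its child B, so that $B_0 = O_A$ and $B_1 = O_A O_B$:

$$
\begin{aligned}
B_1^{-1} \, B_0 &= (O_A O_B)^{-1} \, O_A = O_B^{-1} \, O_A^{-1} \, O_A = O_B^{-1} \\
B_0^{-1} \, B_1 &= O_A^{-1} \, (O_A O_B) = O_B
\end{aligned}
$$

So the first product yields the inverse orientation transform of joint 1, and the second yields the orientation transform itself.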

That makes sense, as I didn’t get the parent’s orientation matrix. If the parent is the root bone, would the orientation matrix just be the inverse bind pose matrix?

No. Remember that inverse bind pose is the product of the “inverse orientation transforms” walking down the skeleton from root. By contrast, the bind pose is the product of the “orientation transforms” walking “up” the skeleton “toward” root.

So, the orientation matrix for the root joint/bone is its bind pose matrix, not its inverse bind pose matrix.
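Putting the pieces together for the two-bone arrow test, here is a hedged Python sketch. It assumes the IBMs are inverse bind pose transforms and that the animation matrices are joint-local transforms (A), which this thread has not confirmed for COLLADA. The joint positions, helper names, and rigid-only inverse are all assumptions made for the example.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists (row-major storage)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(x, y, z):
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]

def rotation_z(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def rigid_inverse(m):
    """Invert a rigid transform [R | t] as [R^T | -R^T t]."""
    rot_t = [[m[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(rot_t[i][k] * m[k][3] for k in range(3)) for i in range(3)]
    return [rot_t[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

# Made-up IBMs: the shaft (root) sits at the object-space origin,
# the head joint one unit further along +Z.
ibm_shaft = rigid_inverse(translation(0.0, 0.0, 0.0))
ibm_head = rigid_inverse(translation(0.0, 0.0, 1.0))

B_shaft = rigid_inverse(ibm_shaft)      # bind pose of the root
B_head = rigid_inverse(ibm_head)        # bind pose of the head

O_shaft = B_shaft                       # root: orientation transform == bind pose
O_head = matmul(ibm_shaft, B_head)      # O_1 = B_0^-1 * B_1

A_shaft = rotation_z(70.0)              # shaft turns 70 degrees about Z
A_head = rotation_z(-70.0)              # head turns back by the same amount

G_shaft = matmul(O_shaft, A_shaft)
G_head = matmul(G_shaft, matmul(O_head, A_head))

F_shaft = matmul(G_shaft, ibm_shaft)    # skinning matrix for the shaft
F_head = matmul(G_head, ibm_head)       # skinning matrix for the head

print("head offset along Z:", G_head[2][3])
print("head net rotation (cos term):", F_head[0][0])
```

Under these assumptions the head keeps its offset from the shaft and its net rotation cancels out, rather than doubling to 140 degrees or collapsing onto the shaft.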

Thanks for your help.

Suppose I wanted to take the animation matrix and remove the bone’s bind pose rotations, so that X is always left/right, Y is always up/down, and Z is always in/out. How would I achieve this?

I can see that the animation matrix is the transformation from the local space of the bone in the bind pose.

So how would I remove the local space translations of the bone?

I confess I’m not sure exactly what you want to do here.

Are you talking about, if we assume each Collada animation matrix is a joint global transform (G) and the Collada IBM is an inverse bind pose transform ($B^{-1}$), how would you extract just the animation transforms ($A_N$) from the Collada animation matrices?

I can see that the animation matrix is the transformation from the local space of the bone in the bind pose.

Ok, so it transforms “from” JOINT-SPACE. Do you know yet what space it transforms “to”?

Dark Photon, I have read something that has shed new light on the issue with Collada.

So I think with Collada the animation matrix is basically a new bind pose, rather than a translation from the bone space. I was wrong many times in this thread; apologies for the confusion.

So I have more of the facts right now, and I have got it much closer. I have a few issues left, which I am trying to work through now.

Related:

See knight666’s description, code, and procedure for drawing the joint skeleton given what COLLADA provides.

Also, I found his lead-in comment humorous and useful: