workaround in ARB_vp by MUL'ing by 1?

First off, I’m experiencing this on WinXP/Radeon9800SE/1.4.4145. The problem doesn’t appear on MacOSX/Radeon9800Pro. Don’t have access to NVIDIA boards.

Here’s the gist. The VP addresses xy-offsets for billboards with an ARL instruction. Initially it worked fine, but with minor changes to the VP, the xy-offset appears to be stuck at 0. But in the latter case, there appears to be a workaround whereby multiplying the xy-offset by 1 kicks the value back to what it should be. Now for some details…

In the code below, the 4 xy-offsets (one for each corner of a billboard) are passed via the 'spriteOffsets' param. Each vertex stores its "corner ID" in the 'corner.x' attrib (vertex.attrib[9]), which is used to define the address register 'arOffset' with an ARL instruction. The initial version of the VP (which worked fine) used spriteOffsets[arOffset.x] directly in an arithmetic instruction. Among other details, the salient change appears to be that the value of spriteOffsets[arOffset.x] is now copied (MOV) to a temp, 'xyoffset'.

In debugging the problem, I first noticed that multiplying xyoffset by a constant had no effect, but if I masked the dest register with '.xy', it all worked. So the following line seems to work around the problem:

MUL xyoffset.xy, xyoffset, 1;

Any thoughts, insights, or similar experiences?

Here’s the complete VP:

!!ARBvp1.0

PARAM animationAlpha = program.local[0];
PARAM zoffset = program.local[1];
PARAM spriteOffsets[4] = { program.local[2..5] };
PARAM defaultColor = program.local[6];
PARAM sizeModulation = program.local[7];
PARAM mvp[4] = { state.matrix.mvp };
PARAM zero = { 0, 0, 0, 1 };

ATTRIB corner = vertex.attrib[9];

TEMP centerPos, cornerPos, nodeSize, color, texcoord, xyoffset;
ADDRESS arOffset;

DP4 centerPos.x, mvp[0], vertex.position;
DP4 centerPos.y, mvp[1], vertex.position;
DP4 centerPos.z, mvp[2], vertex.position;
DP4 centerPos.w, mvp[3], vertex.position;

ARL arOffset.x, corner.x;
MOV xyoffset, spriteOffsets[arOffset.x];
MUL xyoffset.xy, xyoffset, 1; # workaround for bug on WinXP/Radeon9800

MOV nodeSize, 1;
MOV nodeSize.xy, vertex.attrib[10].x;

# Intended arithmetic for the instructions that follow:
#   nodeSize = sizeModulation * nodeSize + 1 - sizeModulation;
#   cornerPos = nodeSize * xyoffset + centerPos;
#   result.position = cornerPos + zoffset;

MAD nodeSize.xy, sizeModulation.x, nodeSize, 1;
ADD nodeSize.xy, nodeSize, -sizeModulation.x;
MAD cornerPos, nodeSize, xyoffset, centerPos;
ADD result.position, cornerPos, zoffset;

MUL color, defaultColor, vertex.color;
MUL color.w, color, animationAlpha;
MOV result.color, color;

MOV texcoord, zero;
MOV texcoord.xy, vertex.texcoord[0];
MOV result.texcoord[0], texcoord;

END

Are you setting all 4 components of “spriteOffsets” (i.e. x, y, z and w) for all 4 values?

If not, you could be getting some uninitialized data. (Also note that I think NVIDIA may initialize some of the variables to 0 if they are not set.)

I don’t know what you are doing, but your sprite code looks a little strange. Usually sprites like this are calculated:

  • Get Position into camera space

  • Add the x,y offsets and scaling factors ONLY to x and y of camera space position.

  • Do the perspective projection on camera position and write to the output position.

It looks like you are adding the offset after the perspective transform. That could be correct for your case; however, be careful, as it looks like you could be modifying the z and w components of the output position (maybe with bad data if they are uninitialized).
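As a rough, untested sketch of those three steps in ARB_vp: the names mv, proj, size, eyePos and offset are made up for illustration, and I am assuming the same spriteOffsets and attrib layout as in your program (corner ID in vertex.attrib[9].x, size in vertex.attrib[10].x).

```
!!ARBvp1.0

# Hypothetical sketch: offset the corner in eye space, then project.
PARAM spriteOffsets[4] = { program.local[2..5] };
PARAM mv[4]   = { state.matrix.modelview };
PARAM proj[4] = { state.matrix.projection };

ATTRIB corner = vertex.attrib[9];
ATTRIB size   = vertex.attrib[10];

TEMP eyePos, offset;
ADDRESS arOffset;

# 1. Get the position into camera (eye) space.
DP4 eyePos.x, mv[0], vertex.position;
DP4 eyePos.y, mv[1], vertex.position;
DP4 eyePos.z, mv[2], vertex.position;
DP4 eyePos.w, mv[3], vertex.position;

# 2. Add the scaled offset to x and y ONLY; z and w are left untouched.
ARL arOffset.x, corner.x;
MUL offset, spriteOffsets[arOffset.x], size.x;
ADD eyePos.xy, eyePos, offset;

# 3. Apply the perspective projection and write the output position.
DP4 result.position.x, proj[0], eyePos;
DP4 result.position.y, proj[1], eyePos;
DP4 result.position.z, proj[2], eyePos;
DP4 result.position.w, proj[3], eyePos;

MOV result.color, vertex.color;

END
```

Note that with this layout the offsets are in eye-space units rather than clip-space units, so the values loaded into program.local[2..5] would probably need to be rescaled.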

Yes, spriteOffsets[0…3] are initialized as (dx, dy, 0, 0).

I’ve seen examples that, as you suggest, perform the xy-offset in camera (pre-projection) space, then apply the projective transform to the offset vertices. But it wasn’t clear to me that there would be any difference between that and doing it in post-projection space (aside from the fact that the latter saves you from a second transform). It seems the rasterization interpolation may be different, but is that still true given that the sprites are parallel to the projection plane? Wouldn’t w be constant in such planes?
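To make that question concrete, here is the algebra as I understand it. I am assuming P is a standard glFrustum/gluPerspective-style projection matrix (the entries a, b, A, B, C, D are just my shorthand), v_eye is the sprite center in eye space, and the offset is d = (dx, dy, 0, 0):

```latex
P = \begin{pmatrix} a & 0 & A & 0 \\ 0 & b & B & 0 \\ 0 & 0 & C & D \\ 0 & 0 & -1 & 0 \end{pmatrix},
\qquad
d = \begin{pmatrix} d_x \\ d_y \\ 0 \\ 0 \end{pmatrix}

P\,(v_{eye} + d) = P\,v_{eye} + P\,d
                 = P\,v_{eye} + \begin{pmatrix} a\,d_x \\ b\,d_y \\ 0 \\ 0 \end{pmatrix}
```

If that is right, the two orderings differ only by a per-axis scale (a, b) on the offset, and the z and w of each corner equal those of the center. It would not hold for a projection matrix with other non-zero entries in its first two columns.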

I’ve fiddled around with the above VP, and have found that the “MUL by 1” workaround is no longer needed. Since spriteOffsets[arOffset.x] is only used once, there’s no need to store it in a temp (xyoffset). (An earlier version of the program used it twice, hence the temp to prevent a second load.)

So, when I change the line:

MAD cornerPos, nodeSize, xyoffset, centerPos;

to:

MAD cornerPos, nodeSize, spriteOffsets[arOffset.x], centerPos;

the problem goes away. I’m guessing that the temp (xyoffset) was inadvertently getting optimized away in the MOV instruction that defined it.