Could some ATI user try this?

Hi

I need someone with an ATI card who has an app that does multiple passes. All you have to do is add 5 lines and test whether your app still does the same thing as before. Shouldn't be much work.

After your app has done at least one of its multiple passes, you need to put this in between:

glMatrixMode (GL_PROJECTION);
glPushMatrix ();
glLoadIdentity ();
glPopMatrix ();
glMatrixMode (GL_MODELVIEW);

After that, at least one other pass has to be done.

If you now run your app, it should still look the same as always, because in effect nothing has been done: the projection matrix has been pushed, overwritten, and restored, so nothing should be affected.
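
To make the placement clear, here is a rough sketch (drawPassOne()/drawPassTwo() are just placeholders for your own rendering code):

#include <GL/gl.h>

void drawPassOne (void);   /* your first pass  -- placeholder */
void drawPassTwo (void);   /* your second pass -- placeholder */

void renderFrame (void)
{
    drawPassOne ();               /* at least one pass before the snippet */

    glMatrixMode (GL_PROJECTION); /* the 5 lines under test               */
    glPushMatrix ();
    glLoadIdentity ();
    glPopMatrix ();               /* should restore the matrix exactly    */
    glMatrixMode (GL_MODELVIEW);

    drawPassTwo ();               /* at least one more pass after it      */
}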

On my Geforce 4 this is not the case! After I put this code in, I got z-fighting in the subsequent passes, and there is nothing I can do about it. I wrote to nVidia, but they apparently don't think this is of any importance.

So, if this works without problems on ATI cards, my next gfx card won't be a Geforce.

Thanks in advance,
Jan.

Originally posted by Jan2000:
my next gfx card won't be a Geforce.

NV40?

Maybe it is what you do in the passes that’s causing the z-fighting?

No, I tested everything. In fact, if I remove the glLoadIdentity call, I don't get z-fighting.
In my engine everything works fine, but when I add these 5 lines, it doesn't work anymore. That's doubtless a driver or hardware issue. Therefore I'd like to know whether it causes problems on ATI cards, too. If it does, it might be something that is not so easy to solve (I suspect the projection matrix is stored at a different precision on the GPU, so maybe that precision gets lost when the driver tries to save the value).

NV40? Maybe. But nVidia will need a lot of good arguments to convince me of that. At the moment I am really curious to test an ATI card; the hardware seems to be much better.

I see no difference in the test I did… though my test situation may not be representative, as I just threw the code into the render loop of my lighting code (per-pixel lighting with volumetric shadows), which has no depth writes enabled and uses GL_EQUAL comparisons.

Actually, you should try this on your machine, Jan.

Before you enter that 5-line code segment, call glGetDoublev(GL_PROJECTION_MATRIX, …) and store the results off. After the segment, do the same call, but into a different location.

Print out the two matrices. If they aren't identical, then the driver is causing the problem. Send nVidia a bug report about it.
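
For example (a quick sketch; the function and variable names are mine):

#include <stdio.h>
#include <string.h>
#include <GL/gl.h>

/* Read the projection matrix before and after the push/load/pop and
   compare it bit-for-bit; memcmp() catches differences that printing
   rounded values would hide. */
void test_projection_roundtrip (void)
{
    GLdouble before[16], after[16];

    glMatrixMode (GL_PROJECTION);
    glGetDoublev (GL_PROJECTION_MATRIX, before);

    glPushMatrix ();
    glLoadIdentity ();
    glPopMatrix ();

    glGetDoublev (GL_PROJECTION_MATRIX, after);
    glMatrixMode (GL_MODELVIEW);

    if (memcmp (before, after, sizeof (before)) != 0)
        printf ("projection matrix was NOT restored bit-identically\n");
}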

BTW, a question about the GL spec: is glPopMatrix guaranteed to restore the matrix precisely as it was before (binary-identical)? I think it should be, but I don't know what the spec says about it.

I had the same problem some time ago: http://www.opengl.org/discussion_boards/ubb/Forum3/HTML/009260.html .

The spec can’t convince me 100% that what you’re seeing is really invalid behaviour. If possible, I would rewrite the code to draw the stuff that needs a different matrix last (that’s what I did).

– Tom

Originally posted by Jan2000:

glMatrixMode(GL_PROJECTION);
glPushMatrix();
glLoadIdentity();
glPopMatrix();
glMatrixMode(GL_MODELVIEW);


Some things worth checking.

- Matrix stack overflow or some GL error (putting in some glGetError() calls wouldn't hurt)
- In the next pass you may assume you are still in projection matrix mode while you are not…

Yeah, I know this is obvious stuff, but who knows…

Originally posted by Korval:
BTW, a question about the GL spec: is glPopMatrix guaranteed to restore the matrix precisely as it was before (binary-identical)? I think it should be, but I don't know what the spec says about it.

The spec says it uses IEEE floating point numbers, so if it didn't restore them binary-identical, it would violate the IEEE spec for floats'n'doubles.

@1234!: I checked GL errors; there are none.
I'm also not a fan of push/pop; this is the only place in my code where I use it, so a stack overflow is impossible.

@DopeFish: My app disables depth writes and uses GL_LEQUAL; that's absolutely valid. But do you have a Geforce or a Radeon?

@Korval: I will definitely check that tomorrow, good idea.

Jan.

The spec says it uses IEEE floating point numbers, so if it didn't restore them binary-identical, it would violate the IEEE spec for floats'n'doubles.

That's not the point. The question is: does the spec mandate that a push/modify/pop operation result in a matrix that is binary-identical to the original one, or only one that is within floating-point round-off error? In short, is a matrix stack implementation required to store each stack position, such that pops do no math, or can an implementation inverse-multiply to retrieve the old matrix on a pop (and accept the floating-point error you get)?

Well, at least this discussion has shown one thing: I should get around to building my own matrix stack soon and abandon the GL stacks. At least then I can guarantee particular behavior.

You absolutely need to be doing your own matrix math. I was going to suggest saving off a copy when calling glLoadMatrix() into the projection matrix, and then just glLoadMatrix()-ing the same data again in the future, but if you build the matrix using GL calls, that may be harder.
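
Something like this, roughly (saved_projection and the two helpers are just illustrations):

#include <GL/gl.h>

/* Keep your own copy of the projection matrix and re-upload it with
   glLoadMatrixd() instead of trusting glPushMatrix()/glPopMatrix(). */
static GLdouble saved_projection[16];

void save_projection (void)
{
    glGetDoublev (GL_PROJECTION_MATRIX, saved_projection);
}

void restore_projection (void)
{
    glMatrixMode (GL_PROJECTION);
    glLoadMatrixd (saved_projection);  /* exactly the bytes saved above */
    glMatrixMode (GL_MODELVIEW);
}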

Originally posted by Korval:
That's not the point. The question is: does the spec mandate that a push/modify/pop operation result in a matrix that is binary-identical to the original one, or only one that is within floating-point round-off error? In short, is a matrix stack implementation required to store each stack position, such that pops do no math, or can an implementation inverse-multiply to retrieve the old matrix on a pop (and accept the floating-point error you get)?

Huh? Inverse-multiply which matrix with what? Dude, no offense, but you are getting silly. It's called a STACK for a reason.

Besides, from the spec:

The glPushMatrix function pushes the current matrix stack down by one, duplicating the current matrix. That is, after a glPushMatrix call, the matrix on the top of the stack is identical to the one below it. The glPopMatrix function pops the current matrix stack, replacing the current matrix with the one below it on the stack.

I guess it's either a driver bug or a mistake on Jan's side.

I guess it's either a driver bug or a mistake on Jan's side.

Since Tom got the error too (in the other thread), I would suspect that it’s a driver bug.

Originally posted by 1234!:
Huh? Inverse-multiply which matrix with what? Dude, no offense, but you are getting silly. It's called a STACK for a reason.

Besides, from the spec:

The glPushMatrix function pushes the current matrix stack down by one, duplicating the current matrix. That is, after a glPushMatrix call, the matrix on the top of the stack is identical to the one below it. The glPopMatrix function pops the current matrix stack, replacing the current matrix with the one below it on the stack.

I guess it's either a driver bug or a mistake on Jan's side.

Imagine that the Gf4 is storing the matrix at position [0] as a double, and that when you push the stack, it copies [0] to [1], where [1] is stored as floats.
Then, when you pop, you no longer get the original matrix at [0].

That's what needs to be tested. Use glGetDoublev to make sure all is OK at [0].
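
A toy illustration of that failure mode (nothing to do with any real driver, just the double -> float -> double round trip):

#include <stdio.h>

int main (void)
{
    double stack0 = 1.0 / 3.0;        /* "matrix entry" at stack level [0] */
    float  stack1 = (float) stack0;   /* pushed copy, stored as a float    */
    double popped = (double) stack1;  /* what a lossy pop would give back  */

    printf ("original: %.17g\n", stack0);
    printf ("popped:   %.17g\n", popped);
    printf ("identical: %s\n", (stack0 == popped) ? "yes" : "no");
    return 0;
}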

Originally posted by V-man:
Imagine that the Gf4 is storing the matrix at position [0] as a double, and that when you push the stack, it copies [0] to [1], where [1] is stored as floats.

So you are telling me that somebody would implement a (hardware) stack where a stack entry could consist of floats or doubles? Sure thing!

While we are at it, why would a GF4/FX store matrices with double precision when it can only process float vertices?

You are also forgetting that the duplicate has to be identical, as per the OpenGL spec.

No dude, I am not convinced by your thoughts.

Originally posted by V-man:
Then, when you pop, you no longer get the original matrix at [0].

Why is that? If I pop, I only change the stack pointer, nothing more, nothing less. Even if I had a double/float mixed stack (which would violate the OpenGL spec), it wouldn't change a thing.

Even if the stack is done entirely on the CPU with double precision and only uploaded to the GPU with float precision, it always has to give you the exact same result.

A conversion from double to float always results in the exact same value, as defined by IEEE.

In short: the matrix stuff is either done entirely with floats (casting doubles to floats for e.g. glRotated() calls) or entirely with doubles (converting floats to doubles without loss of precision for e.g. glRotatef() calls).
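
You can check the widening claim yourself (a toy example; the round trip below is always exact per IEEE):

#include <stdio.h>

int main (void)
{
    float  f  = 0.1f;        /* e.g. an angle passed to glRotatef()    */
    double d  = (double) f;  /* widening conversion: no precision lost */
    float  f2 = (float) d;   /* narrowing back: bit-identical to f     */

    printf ("round-trip identical: %s\n", (f == f2) ? "yes" : "no");
    return 0;
}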

Originally posted by V-man:
That's what needs to be tested. Use glGetDoublev to make sure all is OK at [0].

I am sure that on current mainstream hardware/implementations it's all done with floats, and that glGetDoublev only performs a float-to-double conversion, which is rather pointless.

Using your own matrix logic is just a workaround.

After reading Tom's post I second Korval and declare it a driver bug.

Jan: Radeon 9800 Pro.
You asked for someone with a Radeon to test, so if I were using a GeForce it would have been pretty useless of me :P

OK, I checked the projection matrix with glGetDoublev(…).
The two matrices are absolutely identical. But I assume that all this is really a CPU <-> GPU issue, and that precision gets lost when the matrices are pushed/popped on the GPU.

So either nVidia has a software stack and ATI a hardware stack, or ATI has a software stack, too, but stores the matrices precisely down to the last bit.

My next gfx card will definitely be an ATI card; I'm fed up with nVidia.

Jan.

There may be issues with some part of your application or the GL driver changing the FPU precision, resulting in different rounding behaviour.

You can monitor this with the following snippet (assuming MSVC).

/* Returns the current x87 FPU control word (MSVC inline assembly). */
unsigned short get_fpu_control_word()
{
    unsigned short rv;
    __asm {
        FSTCW [rv]   /* store the FPU control word into rv */
    }
    return rv;
}

Note that there is a separate control word for SSE/SSE2 operations; this one will only catch the x87 FPU.
You can try disabling the usage of instruction set extensions first; IIRC there's an option for this in the control panel.

I have an Athlon 1.3 GHz (I think that's a 1500+ or so).

I doubt it has SSE or SSE2 support, does it?

Originally posted by Jan2000:
I have an Athlon 1.3 GHz (I think that's a 1500+ or so).

I doubt it has SSE or SSE2 support, does it?
No. SSE support on AMD started with the Athlon XP line (and Durons >= 1 GHz).
3DNow! doesn't have precision controls.

Can you test the snippet? Get the FPU control word on program startup and after your matrix problem section, and compare.
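
Roughly like this (a sketch reusing the function above; bits 8-9 of the control word are the x87 precision control: 00 = single, 10 = double, 11 = extended):

#include <stdio.h>

unsigned short get_fpu_control_word(void);  /* from the snippet above */

void check_fpu_precision (void)
{
    unsigned short cw_start = get_fpu_control_word ();

    /* ... run the multipass code with the push/load/pop here ... */

    unsigned short cw_after = get_fpu_control_word ();
    if (cw_start != cw_after)
        printf ("FPU control word changed: 0x%04x -> 0x%04x\n",
                cw_start, cw_after);
}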