Vertex cache in GeForceFX

Galstaff · January 14, 2004, 1:31am

Hi

I’m having trouble moving my code from an ATI Radeon 9000 card to my new GeForceFX 5600. I was unable to determine the size of the vertex cache; my simple test shows that there’s no post-TNL caching at all, but that’s absurd, isn’t it? As a result, vertex processing is amazingly slow: I can push only 11 Mtri/sec, against 30 Mtri/sec on my old Radeon. Has anyone had similar difficulties?

Jan · January 14, 2004, 1:37am

How exactly do you feed the vertices to the gfx card?
I’m assuming you use indexed vertices with glDrawRangeElements, maybe even with VBO. In that case everything should work fine.

Also, you have to be sure that you are transform-limited, else your test won’t make any sense.

Jan.

Galstaff · January 14, 2004, 2:35am

I’m using VBO with a single DIP call. And it’s definitely not fillrate-limited.

So you think the hardware is OK? I was thinking maybe NVIDIA removed hardware TCL from GeForce completely, since the 5600 is not a high-end video card.

[This message has been edited by Galstaff (edited 01-14-2004).]

Jan · January 14, 2004, 8:24am

A GeForce FX 5600 is a high-end card, if I am not totally wrong.

Removing the post-T&L cache would be a very, very stupid thing; therefore I am quite sure nVidia would never do that.

But what’s a DIP call? I don’t know what you mean by that “shortcut” (?).

Jan.

imported_jwatte · January 14, 2004, 8:39am

Hardware TNL is not gone from any GeForce products. It’s ATI that did that with the Radeon 7000 and 9100 IGP…

Maybe you’re not vertex transfer bound, but bound somewhere else? Maybe you’re off the fast path for some reason? For example, Index buffers should be in system memory for GeForce cards.

Try running VTune (if you’re on Intel CPU, else AMD’s profiler) and see where the hold-up is. Perhaps you’re spending time copying or converting data somewhere in the driver?

Galstaff · January 15, 2004, 12:32am

Ok guys, thanks for the help. So your guess is that my code is inadequate for GeForce.

Jan, DIP is an abbreviation of DrawIndexedPrimitive.

JWatte, VBO is not as low-level as VAR, so it’s the driver’s decision where it’s going to put my IB.

Ysaneya · January 15, 2004, 2:01am

Double-check:

size of your vertex format, strange combinations of vertex formats, using non-standard types (shorts, bytes?)
alignment issues
is the VBO static or dynamic? In the second case, are you mapping the buffer? Then be sure you don’t read from the mapped memory, only write to it, and only sequentially, without “holes”.
size of the vertex buffer, number of vertices/indices rendered per call? Maybe it’s too high, or too low…?
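
The vertex-format and alignment items in this checklist can be checked offline before touching the renderer. Here is a minimal sketch that computes attribute offsets and the stride of a tightly packed interleaved vertex and flags anything not on a 4-byte boundary. The layout, the helper names and the 4-byte boundary are all assumptions for illustration; the thread does not say which alignments the GeForce FX driver actually wants.

```python
import struct

# Hypothetical interleaved vertex layout: (name, struct type code, component count).
# "f" is a 32-bit float, "B" an unsigned byte, "h" a 16-bit short.
LAYOUT = [
    ("position", "f", 3),
    ("normal", "f", 3),
    ("color", "B", 4),
    ("texcoord", "f", 2),
]


def describe_layout(layout):
    """Return ([(name, byte offset, byte size), ...], stride) for a tightly packed vertex."""
    offset = 0
    report = []
    for name, code, count in layout:
        size = struct.calcsize("<" + code) * count  # "<" disables native padding
        report.append((name, offset, size))
        offset += size
    return report, offset


def misaligned(layout, boundary=4):
    """List attributes (and "stride") that do not fall on the assumed byte boundary."""
    report, stride = describe_layout(layout)
    bad = [name for name, offset, _ in report if offset % boundary]
    if stride % boundary:
        bad.append("stride")
    return bad
```

For the layout above, describe_layout(LAYOUT) reports a 36-byte stride and misaligned(LAYOUT) returns an empty list. A format built from three shorts followed by two floats would be flagged, because the floats start at byte 6 and the stride is 14.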

DIP is Direct3D-specific language; not everybody’s familiar with it on these forums.

Y.

[This message has been edited by Ysaneya (edited 01-15-2004).]

Korval · January 15, 2004, 2:30am

Originally posted by Galstaff:
I was unable to determine the size of vertex cache

I’m concerned about this. How is it that you are attempting to determine the size of the post-T&L cache? Perhaps you are using VBO’s in a fashion that worked fast on ATi cards, but does something odd on nVidia cards.
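
One way to reason about such a test is to simulate it first. The sketch below assumes a strict FIFO post-T&L cache, which real hardware may not follow exactly, and the function names are made up for illustration. It builds a probe index list in which every run of fresh vertices is immediately drawn a second time, then counts how many vertices the simulated cache would have to transform.

```python
from collections import deque


def count_transforms(indices, cache_size):
    """Count vertex transforms for an index stream under an assumed FIFO cache."""
    cache = deque(maxlen=cache_size)  # maxlen=0 models "no cache at all"
    misses = 0
    for idx in indices:
        if idx in cache:
            continue  # hit: a FIFO does not reorder entries on a hit
        misses += 1
        cache.append(idx)  # the oldest entry drops out when the cache is full
    return misses


def acmr(indices, cache_size):
    """Average cache miss ratio: transformed vertices per triangle of a triangle list."""
    return count_transforms(indices, cache_size) / (len(indices) // 3)


def make_probe(run_length, blocks):
    """Probe index list: each run of `run_length` fresh vertices is drawn twice."""
    if run_length % 3:
        raise ValueError("run_length must be a multiple of 3 for a triangle list")
    indices = []
    for b in range(blocks):
        run = list(range(b * run_length, (b + 1) * run_length))
        indices += run + run
    return indices
```

With a simulated 16-entry cache, a run length of 15 costs 15 transforms per block because the second pass hits entirely, while a run length of 18 costs 36 because the FIFO thrashes. On real hardware the same probe would be drawn and timed for increasing run lengths, and the run length at which throughput drops brackets the cache size. If throughput never changes, either there is no effective caching or the test is not transform-limited.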

Also, you are using the most current FX drivers, right?

Lastly, I’m not certain, but I don’t think nVidia’s VBO implementation is as fast as VAR yet. I seem to recall reading that somewhere on this forum, but I’m not entirely positive. If it turns out that VBOs currently aren’t as fast as VAR (which is as fast as the hardware can go), then you can ignore it and wait until nVidia irons out their VBO implementation.

jeremyz · January 15, 2004, 8:44am

There’s definitely a post-T&L cache on the GeForceFX 5600. If you post your code, maybe somebody will be able to help you out.

Obli · January 17, 2004, 8:57am

Originally posted by jwatte:
Index buffers should be in system memory for GeForce cards.

Ugh, I didn’t know that.
Does it make sense to write a piece of code to work out where to put index buffers? If yes, how could this be done?

imported_jwatte · January 17, 2004, 9:09pm

If you’re using vertex buffer objects (ARB_vertex_buffer_object) and specify that the buffer is for indices, by binding it to the GL_ELEMENT_ARRAY_BUFFER_ARB target, the driver will put it in the optimal place for each card.

system · January 18, 2004, 10:49am

What about VAR2?
You can put indices into AGP with that.