Optimization Confusion

jalway · May 23, 2006, 1:04pm

Hello,

I’m trying to optimize my code by using a relatively slow card, a GeForce2 MX. While my code zips along on a GeForce4 MX (70+ fps), it runs ridiculously slowly on the GeForce2, at perhaps 1 or 2 fps.

In the process of testing the time it takes for different operations I’m finding that certain OpenGL operations at times take about 1/3 of a second, and at other times take almost no time.

For instance, glGetFloatv(GL_CURRENT_COLOR,val), or glGetDoublev(GL_PROJECTION_MATRIX,…), or glCallLists().

What I find is that if those operations are in three different functions in series, the first one that gets called is the one that takes the most time. After that, the rest are very fast, but those functions are bottlenecks for some reason.

It’s as if some previous state is affecting the speed of a function, and then once a function is called, the state is back to a faster mode.

Can anyone give me insight into this problem?

Thanks for any feedback,
…John

imported_jwatte · May 23, 2006, 8:57pm

The GeForce2 and GeForce 4 MX are almost identical (mostly the memory interface is different).

I would look for other reasons why that particular GeForce2 is running slow – does another GF2 board run as slow, or is it a bad board? What does GL_RENDERER say about the renderer; is it in software mode perhaps? Is the difference measured on the same machine, or could there be chipset or AGP driver differences?

tamlin · May 23, 2006, 10:43pm

Have you performed a glFlush() before the first of those three calls (note: only to find the bottleneck)? If not, it could be that there are operations in the pipe that need to finish before getting current color can return.

Could it be as simple as the GF2 having less VRAM, so that e.g. texture data has to be swapped in and out for the operations waiting to be executed before getting current color can return?

Zulfiqar_Malik · May 23, 2006, 11:36pm

A glGet***() operation can easily “force” the driver to finish the queued operations before returning, and I believe that is what’s causing the slowdown. So I think that it is not a problem with that specific call. Maybe you are using a buggy driver or something that’s giving you the extra slowdown on the MX2 card. Have you tried using the same driver for both MX4 and MX2 cards, or some other reliable version of the MX2 driver that works well for other people? It’s an old card, so I think you might have to dig through the archives if it is indeed a driver problem.
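To separate the cost of a call from the cost of work queued before it, the measurement discussed above can be sketched roughly as follows. This is only an illustration: `time_after_drain`, `gl_step_fn` and the two stubs are hypothetical names, and the stubs stand in for real GL calls so the block compiles without a GL context. In a real program the drain step would normally be glFinish(), since glFinish() blocks until all earlier commands have completed while glFlush() does not wait.

```c
#include <time.h>

/* Signature shared by the queue-draining step and the operation under test. */
typedef void (*gl_step_fn)(void);

/*
 * Returns the time, in seconds, spent inside op() alone.
 * drain() runs first so that work queued earlier is not billed to op().
 * Note: clock() reports processor time on POSIX systems; a wall-clock
 * timer may be more appropriate there.
 */
static double time_after_drain(gl_step_fn drain, gl_step_fn op)
{
    clock_t start, end;

    if (drain)
        drain();            /* e.g. a wrapper around glFinish() */
    start = clock();
    op();                   /* e.g. a wrapper around glGetFloatv(...) */
    end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Stand-ins for real GL calls (assumptions, for illustration only). */
static int drain_calls = 0;
static int op_calls = 0;
static int op_saw_drain = 0;

static void stub_drain(void) { drain_calls++; }
static void stub_op(void)    { op_calls++; op_saw_drain = drain_calls; }
```

If the time reported for the first query drops sharply once the drain step is added, the query itself was probably not the expensive part; it was waiting for earlier commands to finish.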

Madoc · May 24, 2006, 5:24am

Unless strictly necessary (analysis or debugging tool?), avoid querying OpenGL for state; it’s generally good practice to keep track of the state yourself. If you can avoid those glGet calls in any way, then I recommend you do so. For the reasons Zulfiqar Malik explained, performance will likely be degraded even if not as drastically as you see on the 2MX. You might find it’s a bottleneck for the 4MX too.
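Keeping track of the state yourself could look something like the sketch below. It is a minimal illustration, not a complete state tracker: `ColorShadow` and the `shadow_*` functions are hypothetical names, and the actual GL calls are only mentioned in comments so the block stays self-contained. It assumes every color change goes through `shadow_set_color`; a display list that contains color commands changes the current color behind the application's back, so the copy is invalidated in that case.

```c
#include <string.h>

/* Application-side copy of the current color (hypothetical helper type). */
typedef struct {
    float rgba[4];
    int   valid;    /* 0 until the application has set the color itself */
} ColorShadow;

/* Record the color the application is about to set.
 * In a real program the matching glColor4f(r, g, b, a) call would go here. */
static void shadow_set_color(ColorShadow *s, float r, float g, float b, float a)
{
    s->rgba[0] = r;
    s->rgba[1] = g;
    s->rgba[2] = b;
    s->rgba[3] = a;
    s->valid = 1;
}

/* Mark the copy stale, e.g. after glCallLists() ran lists that may set colors. */
static void shadow_invalidate(ColorShadow *s)
{
    s->valid = 0;
}

/* Copy the tracked color into out and return 1, or return 0 if it is unknown
 * and the caller has to fall back to glGetFloatv(GL_CURRENT_COLOR, out). */
static int shadow_get_color(const ColorShadow *s, float out[4])
{
    if (!s->valid)
        return 0;
    memcpy(out, s->rgba, sizeof s->rgba);
    return 1;
}
```

With this arrangement the glGet fallback is only needed after an invalidation, rather than on every read.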

knackered · May 24, 2006, 6:26am

Reading back state values shouldn’t be a problem, as they’re almost certainly cached in the driver as part of the context structure.
It’s just the synchronous reading back of actual data (anything that drops out of the end of the pipeline) that’s a no-no, as far as I’m aware.

Madoc · May 24, 2006, 6:43am

hmm… Given that the current colour might be given by the last vertex in the most recently used VBO or DL (for example), well…

As for querying a matrix, fine (and I guess many people have GL make their matrices for them).

jalway · May 24, 2006, 12:52pm

I checked glGetString(GL_RENDERER), and for some reason it was using a “GDI GENERIC” driver. Why, I have no idea.
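A check like this can be done at startup so the software fallback is noticed right away. The sketch below is only an illustration: `is_gdi_generic` and `contains_ignore_case` are hypothetical helper names, and the comparison ignores case as a precaution since the exact capitalization reported may vary. It takes the renderer string as a plain argument so it can be compiled without GL headers.

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive substring test; returns 1 if needle occurs in haystack. */
static int contains_ignore_case(const char *haystack, const char *needle)
{
    size_t i, j;

    if (haystack == NULL || needle == NULL)
        return 0;
    for (i = 0; haystack[i] != '\0'; ++i) {
        /* The inner loop stops at the first mismatch, which includes
         * reaching the terminator of haystack, so it never reads past it. */
        for (j = 0; needle[j] != '\0'; ++j) {
            if (tolower((unsigned char)haystack[i + j]) !=
                tolower((unsigned char)needle[j]))
                break;
        }
        if (needle[j] == '\0')
            return 1;
    }
    return 0;
}

/* Returns 1 if the renderer string names Microsoft's software implementation. */
static int is_gdi_generic(const char *renderer)
{
    return contains_ignore_case(renderer, "GDI Generic");
}
```

In a program with a current GL context, the argument would be `(const char *)glGetString(GL_RENDERER)`; a result of 1 suggests that no hardware-accelerated driver is in use.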

I surfed to NVIDIA’s site and downloaded the latest driver. This brought the card up to proper speed.

Thanks for all of the advice and info.

…John

Zulfiqar_Malik · May 24, 2006, 10:56pm

GDI Generic, that means you probably hadn’t installed any official nVidia drivers on the machine. That might be, I believe, the default driver that Windows installs.

knackered · May 25, 2006, 2:52am

I hadn’t considered display lists…
Are display lists a driver-side entity, where it just holds vertex/index buffer IDs?