GeForce3 display list compilation slower ?

Eric · June 11, 2001, 11:26pm

Hi there !

I just received my ELSA Gladiac 920 and am testing it with Detonator 12.60 for Win2K.

It seems to me that display lists take a lot longer to compile than on my previous card (ELSA Erazor X2, GeForce DDR).

Is it a known issue with the drivers or is it just me ?

Regards.

Eric

cass · June 12, 2001, 1:13am

Eric,

I’m not aware of any issues with display list compilation. Are you comparing the same driver version with different hardware, or different drivers and different hardware? Matt may be able to offer some insight if you provide more specifics.

Thanks -
Cass

IsaackRasmussen · June 12, 2001, 2:07am

I’m in the same situation as Eric. Compiling display lists in my program takes seconds now.
Also with an ELSA Gladiac 920, 12.40.

I’m comparing it to:
Diamond FireGL1
3dfx Banshee

Both of them compiled displaylists so fast that you don’t notice it.

Update: I made some new comparisons and it now looks like this is typical behaviour of the Nvidia drivers rather than something specific to the GF3.
Same OS, same driver: testing with a GF2 GTS and a GF2 MX takes just about the same (long) time.

[This message has been edited by IsaackRasmussen (edited 06-12-2001).]

Eric · June 12, 2001, 2:53am

Cass, thanks for the answer: I thought I could e-mail Matt (or you) directly but I wanted to check if others had found the same behaviour…

I am actually comparing the ELSA Erazor X2 and the Gladiac 920 in the same environment (i.e. same MB, memory, …, and Detonator 12.60 under Win2K).

I don’t really have time to switch back to the X2 (which is already in another computer !) to measure the times but I am pretty sure DL compilation is slower on the new card…

I will try to see if it depends on what I am compiling (the thing happened to me in one of my CFD post-processors and it does 10 million things with OpenGL: I had better try to narrow the problem down !).

I’ll keep you posted.

Regards.

Eric

Eric · June 28, 2001, 1:16am

OK, here we are.

I am compiling a model that contains:

224848 vertices.
337272 faces.

All the faces are triangles and the code looks something like (actually, it does not look like that at all, but that is what it is doing !) :

float fPoints[224848][3];   /* vertex positions */
int iFaces[337272][3];      /* three vertex indices per triangle */
int i;

glBegin(GL_TRIANGLES);
for (i = 0; i < 337272; i++)
{
    glVertex3fv(fPoints[iFaces[i][0]]);
    glVertex3fv(fPoints[iFaces[i][1]]);
    glVertex3fv(fPoints[iFaces[i][2]]);
}
glEnd();

When I use the MS implementation, the display list compiles OK.

When I use nVidia’s implementation (GF3, Detonator 12.90, Win2K SP2), the compilation never seems to end (I mean, after 1 minute I kill the process…).

I will try to monitor the progression of the compilation (if possible) and post more results here.

Meanwhile, has anyone else noticed that display lists are slower to compile on a GF3 ?

Regards.

Eric

P.S.: I know that rendering a big model this way is not optimal at all…

Eric · June 28, 2001, 3:16am

I have done some more tests.

The compilation runs OK with the occasional freeze in the progression (I guess that happens when I have submitted too many vertices: the driver has to reallocate a larger chunk of memory and copy the batch it already had).
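To put a rough number on that guess, here is a minimal sketch that simulates a buffer which doubles its capacity whenever it fills up. Nothing in this thread confirms how the driver actually manages memory; the doubling policy, the initial capacity and the names `GrowthStats` and `simulate_doubling` are all assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical model of a growing vertex buffer, NOT the driver's real behaviour. */
typedef struct {
    long reallocations;    /* how many times the buffer had to grow */
    long elements_copied;  /* total elements copied into the larger buffers */
} GrowthStats;

/* Append `total` elements to a buffer that starts at `initial_capacity`
   and doubles each time it is full, counting the copies this causes. */
GrowthStats simulate_doubling(long total, long initial_capacity)
{
    GrowthStats s = {0, 0};
    long capacity = initial_capacity;
    long used;

    assert(total >= 0 && initial_capacity > 0);

    for (used = 0; used < total; used++) {
        if (used == capacity) {
            /* Buffer is full: everything stored so far moves to a new block. */
            s.elements_copied += used;
            s.reallocations++;
            capacity *= 2;
        }
    }
    return s;
}
```

Under these assumptions, `simulate_doubling(1011816, 1024)` (one element per glVertex3fv call in the loop above) reports 10 reallocations and 1047552 copied elements. The pauses would get rarer but longer as the list grows, which would at least be consistent with an occasional freeze.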

The thing that is taking time is glEndList, as I would expect: when it has received all the information, the driver can start optimizing my triangles…

The thing is, I do not know what the driver is trying to do but it takes forever to complete it…

I am going to try the program on the old card (which is on a newer machine…).

Regards.

Eric

Eric · June 28, 2001, 3:40am

The behaviour seems to be the same on a TNT2 Ultra… I am still waiting for the guy who now has my old GeForce to try my program out.

Could it mean that the Detonator drivers try to optimize batches of GL_TRIANGLES quite a lot ?

Regards.

Eric

Eric · June 28, 2001, 4:13am

OK, I have gone even further…

The thing I am trying to display is a structural model from STAAD (finite elements package).

These models basically give you a lot of lines that describe beams used in the structures.

What my program is doing is creating the actual beams: it takes the two points given in the model and creates a square-section beam using this axis.

So I am displaying N boxes where N is the number of elements.

Each box has got eight vertices and six faces (twelve triangles). Each face has its own normal.

Now, as I described above, I was trying to display everything with a large single GL_TRIANGLES call.

Just out of curiosity, I tried to group my triangles into strips: each face of each box can actually be seen as a strip containing two triangles…
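As a minimal sketch of what "a face as a two-triangle strip" means in terms of vertex order: a quad whose corners are listed around its perimeter (a, b, c, d) has to be submitted as a, b, d, c for GL_TRIANGLE_STRIP. The helper names `quad_to_strip` and `strip_to_triangles` are hypothetical and work on plain indices, so no OpenGL context is needed to try them.

```c
#include <assert.h>

/* Reorder a quad given in perimeter order (a, b, c, d) into the order
   GL_TRIANGLE_STRIP expects: a, b, d, c. */
void quad_to_strip(const int quad[4], int strip[4])
{
    assert(quad && strip);
    strip[0] = quad[0];
    strip[1] = quad[1];
    strip[2] = quad[3];
    strip[3] = quad[2];
}

/* Expand a 4-vertex strip into the two triangles OpenGL draws from it.
   The second triangle is listed as (s2, s1, s3) so that both triangles
   keep the same winding. */
void strip_to_triangles(const int strip[4], int tris[2][3])
{
    assert(strip && tris);
    tris[0][0] = strip[0]; tris[0][1] = strip[1]; tris[0][2] = strip[2];
    tris[1][0] = strip[2]; tris[1][1] = strip[1]; tris[1][2] = strip[3];
}
```

For a face with corner indices {10, 11, 12, 13}, the strip order comes out as 10, 11, 13, 12 and the two triangles are (10, 11, 13) and (13, 11, 12): the same two triangles a GL_TRIANGLES batch would need six vertices for.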

Well, when using GL_TRIANGLE_STRIP the compilation stays at human scale (not turtle… ).

So, is there something major that the driver is trying to do on GL_TRIANGLES that it does not try on GL_TRIANGLE_STRIP ? (is it trying to strip the mesh itself ???).

I must say I had a big performance boost by using strips, which I did not expect: after all I am just sending 2 fewer vertices per face this way (i.e. 4 instead of 6) but I call glBegin/glEnd for each face !
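For scale, the arithmetic behind "4 instead of 6" can be sketched as below. The box count is inferred rather than stated: 337272 triangles / 12 per box = 28106 boxes (which also matches 224848 vertices / 8), assuming the whole model consists of such boxes. `SubmitCost` and the two functions are hypothetical names.

```c
#include <assert.h>

typedef struct {
    long vertex_calls;     /* number of glVertex* calls */
    long begin_end_pairs;  /* number of glBegin/glEnd pairs */
} SubmitCost;

/* One big GL_TRIANGLES batch: 6 faces, 2 triangles per face, 3 vertices each. */
SubmitCost cost_as_triangles(long boxes)
{
    SubmitCost c;
    assert(boxes >= 0);
    c.vertex_calls = boxes * 6 * 2 * 3;
    c.begin_end_pairs = 1;
    return c;
}

/* One GL_TRIANGLE_STRIP per face: 4 vertices and one glBegin/glEnd pair per face. */
SubmitCost cost_as_strips(long boxes)
{
    SubmitCost c;
    assert(boxes >= 0);
    c.vertex_calls = boxes * 6 * 4;
    c.begin_end_pairs = boxes * 6;
    return c;
}
```

With 28106 boxes this gives 1011816 vertex calls in a single batch for GL_TRIANGLES, against 674544 vertex calls spread over 168636 glBegin/glEnd pairs for strips. That is a third fewer vertices, which on its own does not obviously explain a compile time that goes from "never ends" to acceptable.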

Has anyone got a clue on that ?

Regards.

Eric

harsman · June 28, 2001, 6:20am

You probably utilise the vertex cache better when you’re using strips.

Eric · June 28, 2001, 6:36am

Originally posted by harsman:
You probably utilise the vertex cache better when you’re using strips.

I don’t know. I thought the fact that my normals are different for each duplicated vertex would somehow flush the vertex cache.
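To make the duplication concrete, here is a small sketch that counts, for one box with a normal per face, how many distinct corner positions there are versus how many distinct (position, normal) combinations get submitted. It says nothing about how the hardware vertex cache actually behaves, which this thread does not establish; `count_box_vertices` and the bit encoding of corners are assumptions made for illustration.

```c
#include <assert.h>

/* Count distinct corner positions and distinct (position, normal) pairs
   for one box with one normal per face.
   A corner is encoded as 3 bits (x, y, z); a face is an (axis, side) pair. */
void count_box_vertices(int *positions, int *pairs)
{
    unsigned seen_pos = 0;  /* bit c is set once corner c has been used */
    int n_pos = 0;
    int n_pairs = 0;

    assert(positions && pairs);

    for (int axis = 0; axis < 3; axis++) {
        for (int side = 0; side < 2; side++) {
            /* This face holds the corners whose bit `axis` equals `side`. */
            for (int c = 0; c < 8; c++) {
                if (((c >> axis) & 1) != side)
                    continue;
                /* Each face has its own normal, so this pair is new. */
                n_pairs++;
                if (!(seen_pos & (1u << c))) {
                    seen_pos |= 1u << c;
                    n_pos++;
                }
            }
        }
    }
    *positions = n_pos;
    *pairs = n_pairs;
}
```

The function reports 8 positions but 24 (position, normal) pairs: every corner is shared by three faces and carries a different normal on each of them, so no complete vertex is ever repeated between faces of the same box.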

I will try to dig through nVidia documentation to see if there is some information about that…

Regards.

Eric