Six Million Vertex GL_LINE_STRIP

I’m a graduate CS student working on a bioinformatics project. The program takes a nucleotide sequence of bases (A, G, C, or T) and uses an algorithm to assign each base a 3D vertex coordinate. We then use GL_LINE_STRIP to plot all of the vertices in order, creating a 3D plot of the sequence.
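To give a feel for it, the coordinate assignment looks roughly like this (just a sketch; the per-base offsets and the sequence/verts/numBases names are placeholders for this post, and the real algorithm is more involved):

    /* Rough sketch of the coordinate assignment (the per-base offsets are
       placeholders; the real algorithm is more involved).  verts holds one
       x,y,z triple per base. */
    float x = 0.0f, y = 0.0f, z = 0.0f;

    for (long i = 0; i < numBases; i++) {
        switch (sequence[i]) {            /* 'A', 'G', 'C' or 'T' */
            case 'A': x += 1.0f; break;
            case 'G': x -= 1.0f; break;
            case 'C': y += 1.0f; break;
            case 'T': y -= 1.0f; break;
        }
        z += 0.01f;                       /* each vertex is a fixed step further down the z-axis */
        verts[i * 3 + 0] = x;
        verts[i * 3 + 1] = y;
        verts[i * 3 + 2] = z;
    }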

When rendering only 15-20 thousand vertices the program works perfectly. The problem is that when I attempt about 6 million vertices, I just can’t get smooth animation when moving the plot around. It is usable, but I figure I should be able to get smooth animation when all I’m drawing is vertices.

I have shading set to GL_FLAT. I have divided the sequence into display lists of 15000 calls to glVertex3f() apiece, and I’m using one call to glCallLists() to execute them. The lists are created with GL_COMPILE.
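And the list setup, roughly (simplified; CHUNK is the 15000 figure above, verts is the coordinate array from the previous sketch, and headers and error checking are left out):

    /* Sketch of how the lists are built and drawn. */
    #define CHUNK 15000
    int    numLists  = (numBases + CHUNK - 1) / CHUNK;
    GLuint firstList = glGenLists(numLists);

    for (int l = 0; l < numLists; l++) {
        /* start one vertex early so the strip stays connected across chunks */
        int start = (l == 0) ? 0 : l * CHUNK - 1;
        int end   = ((l + 1) * CHUNK < numBases) ? (l + 1) * CHUNK : numBases;

        glNewList(firstList + l, GL_COMPILE);
        glBegin(GL_LINE_STRIP);
        for (int v = start; v < end; v++)
            glVertex3f(verts[v * 3 + 0], verts[v * 3 + 1], verts[v * 3 + 2]);
        glEnd();
        glEndList();
    }

    /* at draw time the list names are consecutive, so one glCallLists() runs them all */
    GLuint *names = malloc(numLists * sizeof(GLuint));
    for (int l = 0; l < numLists; l++)
        names[l] = firstList + l;
    glCallLists(numLists, GL_UNSIGNED_INT, names);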

I am trying to optimize the program for the NVIDIA driver set and am currently working on a P4 machine with a GeForce 3 running Red Hat 7.1. I have properly installed the actual NVIDIA drivers and have true hardware acceleration.

I have a lot of CS experience but not as much with OpenGL or graphics. I am open to suggestions as to the optimum way to handle this. In particular, I think I need to make sure I’m not wasting time rendering parts of the sequence that aren’t in the viewing area. Given a 6 million vertex sequence, there may be 150,000 vertices visible onscreen at any one time. Does gluPerspective() handle this automatically, or do I need to cull manually? If so, what is the best way to cull when you’re using display lists? I’ve read that display lists are much faster than vertex arrays on NVIDIA hardware; is this true?

Any and all help the experts out there can give would be appreciated.

thanks,
J0ey4

OpenGL will clip the vertices, but only after they have been transformed. That means you have lots of vertices taking up bandwidth that never get drawn. You need to cull most of the unseen vertices yourself to speed things up. For NVIDIA hardware, look into NVIDIA’s VAR (vertex array range) extension and abandon display lists.


The GeForces are not that fast at rendering lines. I suggest you get a Wildcat 4210 or above, as they deal with lines in a better way (it’s just about all they do better than a GF3).
6 million verts, eh? I don’t think you’ll get really smooth frame rates with that.
Culling has to be done by the application; the GPU still transforms the vertices and clips to the viewport, even if those vertices end up outside it.
Anyway, you cannot touch (read or write) the contents of a display list once it’s created.
If you’re optimising for the GF3, use the vertex array range extension, which lets you upload your vertices to AGP or video memory and will give you better performance than display lists. But as I say, line drawing is slow on GeForces, so it probably won’t help you as much as you’d like.
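Off the top of my head, the VAR setup on linux looks something like this (just a sketch: it assumes you’ve already fetched the glXAllocateMemoryNV and glVertexArrayRangeNV entry points through glXGetProcAddressARB, verts/numBases stand in for your data, and there’s no error checking):

    /* Very rough VAR sketch.  A priority around 0.5 usually gives you AGP
       memory, closer to 1.0 video memory; with 6 million verts (~72MB of
       floats) you would probably have to allocate and draw in chunks. */
    GLsizei  bytes  = numBases * 3 * sizeof(GLfloat);
    GLfloat *varMem = glXAllocateMemoryNV(bytes, 0.0f, 0.0f, 0.5f);

    memcpy(varMem, verts, bytes);               /* copy the vertices into AGP memory once */

    glVertexArrayRangeNV(bytes, varMem);        /* tell GL which range the arrays live in */
    glEnableClientState(GL_VERTEX_ARRAY_RANGE_NV);

    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, varMem);
    glDrawArrays(GL_LINE_STRIP, 0, numBases);   /* the strip is pulled straight from AGP */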

The biggest difference between GeForce and Quadro hardware is their handling of line drawing. The Quadros at a similar level are much faster, I’m told.

I’d second the suggestion for trying 3dlabs hardware, too.

Maybe you guys are thinking of AA lines, where the AA is done in hardware on 3dlabs boards. I’m guessing the GF3 and GF4 still do it in software. Otherwise the algorithm (Blinn, is it?) should be similar.

Another thing… culling is for polygons, so glEnable(GL_CULL_FACE) will not help. If a line is outside the view, then it’s easy to cull. If it’s behind other lines, how will you cull that?

V-man

We had that a couple of weeks back I think. Consumer hardware renders lines as thin triangles which increases geometry load at least 6x and hogs you with all the triangle setup overhead. That’s regardless of anti-aliasing.

PS: It’s Bresenham’s line algo

Judging from the posts, I guess I have two options:

  1. Buy better hardware.
  2. Do the “culling” in the application.

Each vertex is a set increment, say 0.01f, further down the z-axis than the previous one. Therefore each set of 15000 vertices in a display list spans a uniform length along the z-axis.

Would it help to put a bounding box around each list’s 15000 vertices and do some frustum culling, so I only call the lists whose bounding boxes are at least partly visible on screen? That way the card wouldn’t have to send all the vertices through the pipeline and transform them all before deciding not to draw them, right?

I will try to do that tomorrow, and I will look into “NVIDIA’s VAR extension and abandon display lists”. Please let me know if the above sounds like it might help.
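Roughly what I’m picturing for the bounding volumes, to be computed while each chunk is compiled (just a sketch; the ListInfo struct and boundChunk() are names I’m making up for this post, and sqrtf() needs math.h):

    /* Per-chunk bounding sphere, built from the chunk's axis-aligned box.
       The list name gets filled in wherever the chunk is compiled. */
    typedef struct {
        GLuint list;             /* display list name for this chunk   */
        float  cx, cy, cz, r;    /* bounding sphere centre and radius  */
    } ListInfo;

    void boundChunk(const float *v, int count, ListInfo *info)
    {
        float minX = v[0], maxX = v[0];
        float minY = v[1], maxY = v[1];
        float minZ = v[2], maxZ = v[2];

        for (int i = 1; i < count; i++) {
            const float *p = v + i * 3;
            if (p[0] < minX) minX = p[0];   if (p[0] > maxX) maxX = p[0];
            if (p[1] < minY) minY = p[1];   if (p[1] > maxY) maxY = p[1];
            if (p[2] < minZ) minZ = p[2];   if (p[2] > maxZ) maxZ = p[2];
        }
        info->cx = 0.5f * (minX + maxX);
        info->cy = 0.5f * (minY + maxY);
        info->cz = 0.5f * (minZ + maxZ);

        /* radius = half the diagonal of the box, so the sphere always encloses it */
        float dx = maxX - info->cx, dy = maxY - info->cy, dz = maxZ - info->cz;
        info->r  = sqrtf(dx * dx + dy * dy + dz * dz);
    }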

This stuff helps a lot guys, thanks!

J0ey4

Originally posted by zeckensack:
Consumer hardware renders lines as thin triangles which increases geometry load at least 6x and hogs you with all the triangle setup overhead.

That’s not actually true; drawing lines as triangles is simply one method of doing so, and even then at worst you’ve turned 2 vertices into 4 – I don’t see how that creates 6x overhead.

Maybe such a claim could make sense in the framework of polygon mode, where one triangle (3 vertices) becomes 6 triangles (24 vertices), but then again naive polygon mode apps draw every line twice – a good reason to avoid polygon mode even if the HW supports it.

You should have no trouble getting non-AA lines to run at a decent speed on a GeForce3. Obviously, though, this sounds like a lot of lines.

If you want 60 fps with 6 million lines (all going through OGL, not culled in the app), you probably aren’t going to get it.

- Matt

Yes Joey, what you’re thinking of doing sounds correct.
Matt, I’m sure this has been covered in that “why are lines so slow on gf hardware” topic, but why is the Quadro so much faster than the GeForce at lines? And why does doing a bit of soldering on the geforce make its lines as fast as the quadro?

Originally posted by knackered:
And why does doing a bit of soldering on the geforce make its lines as fast as the quadro?

Probably because they are actually the exact same chip! The only difference being that on a Quadro-based card there is an additional wire to tell the chip to act as a Quadro instead of a GeForce!!! (might not be that, though…).

Regards.

Eric


Yes, Eric, I’m aware of that, I was just being provocative!

Originally posted by mcraighead:
That’s not actually true; drawing lines as triangles is simply one method of doing so, and even then at worst you’ve turned 2 vertices into 4 – I don’t see how that creates 6x overhead.

Sorry, must have been talking out of my ass again. I was probably thinking ‘one line segment -> two triangles -> six vertices’, which is of course bogus, as you pointed out.

I’ll think a bit more in the future

Just wanted to pop back in and say that the proposed frustum culling worked perfectly. I divided my line strip into display lists of 15-30k verts apiece and enclosed each one in a bounding sphere; each frame I test the spheres and only call the lists that are in the viewing volume. Camera movement is now perfectly smooth.
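In case anyone finds this thread later, the per-frame test I ended up with looks roughly like this (a sketch only; lists/numLists are my own bookkeeping arrays holding the bounding-sphere data from my earlier post):

    /* Extract the six frustum planes from the current projection and modelview
       matrices.  The planes come out in the same space as the coordinates fed
       to glVertex3f(), so the sphere centres can be tested directly. */
    static void extractFrustum(float planes[6][4])
    {
        float proj[16], modl[16], clip[16];
        glGetFloatv(GL_PROJECTION_MATRIX, proj);
        glGetFloatv(GL_MODELVIEW_MATRIX, modl);

        /* clip = projection * modelview (both column-major) */
        for (int c = 0; c < 4; c++)
            for (int r = 0; r < 4; r++)
                clip[c * 4 + r] = proj[0 * 4 + r] * modl[c * 4 + 0]
                                + proj[1 * 4 + r] * modl[c * 4 + 1]
                                + proj[2 * 4 + r] * modl[c * 4 + 2]
                                + proj[3 * 4 + r] * modl[c * 4 + 3];

        /* left/right, bottom/top, near/far = row 4 of clip +/- rows 1, 2, 3 */
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 4; j++) {
                planes[2 * i][j]     = clip[j * 4 + 3] + clip[j * 4 + i];
                planes[2 * i + 1][j] = clip[j * 4 + 3] - clip[j * 4 + i];
            }

        /* normalise so the plane distances are in the same units as the radii */
        for (int p = 0; p < 6; p++) {
            float len = sqrtf(planes[p][0] * planes[p][0] +
                              planes[p][1] * planes[p][1] +
                              planes[p][2] * planes[p][2]);
            for (int j = 0; j < 4; j++)
                planes[p][j] /= len;
        }
    }

    /* each frame, after the camera transform is on the modelview stack: */
    float planes[6][4];
    extractFrustum(planes);

    for (int l = 0; l < numLists; l++) {
        const ListInfo *li = &lists[l];
        int visible = 1;
        for (int p = 0; p < 6 && visible; p++)
            if (planes[p][0] * li->cx + planes[p][1] * li->cy +
                planes[p][2] * li->cz + planes[p][3] < -li->r)
                visible = 0;               /* sphere is entirely outside this plane */
        if (visible)
            glCallList(li->list);
    }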

Thanks for the help guys!!

J0ey4

Culling is a very good thing. Another thing that might help is increasing your AGP aperture size in the BIOS; this lets the graphics card use more system memory for direct display list storage.
