my program has to render a large number of particles (up to 40000), which are depth sorted (so the drawing order changes).

Which is the fastest way to draw them? VBO? At the moment I draw them in immediate mode as quads; will VBO be faster? I would simply like to know whether changing the program to use VBO is worth the effort.


What effort? It would have been quicker to just do it than to post your question here. No, I don’t work for Nike.

do you think so? the program at least runs fine with immediate-mode OpenGL calls, and I am not sure it is that easy because of the depth sorting. also, I would first have to learn how to use vbo.

I wonder if it’s really faster if the data has to be loaded to the graphics card memory for every redraw, which I guess has to be done, due to the changing drawing order.

  1. learning VBO is VERY easy

  2. updating vertices with VBO is quite fast (and there is an extra function to write directly to memory, so you need no copy of the data)

  3. if only the order changes (no positions), then you don’t need to update the vertex buffer at all; you only have to change the order in which glDrawElements gets the vertex indices

  4. if only the positions change, but no colors or texcoords, then those can be stored in VRAM and don’t have to be transferred to the gfx card each frame

  5. using immediate mode is always less efficient, due to a big function-call overhead (you are most likely fillrate limited, therefore this is not noticeable at the moment)

  6. VBO is cool
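
Point 3 can be sketched without any GL calls: the vertex data stays where it is and only the index array is rebuilt in back-to-front order. This is a minimal sketch, assuming each particle is a quad whose four vertices are stored consecutively and that one depth value per particle is already available. `buildSortedQuadIndices` and its parameter are hypothetical names, not part of OpenGL.

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <vector>

// Builds an index list that draws particle quads back to front.
// Assumptions (not from the thread): particle i owns vertices 4*i .. 4*i+3,
// and a larger depth value means farther away from the viewer.
std::vector<std::uint32_t> buildSortedQuadIndices(const std::vector<float>& depth) {
    // Start with particle numbers 0, 1, 2, ...
    std::vector<std::uint32_t> order(depth.size());
    std::iota(order.begin(), order.end(), 0u);

    // Sort particle numbers (not vertices) so the farthest particle comes first.
    std::stable_sort(order.begin(), order.end(),
                     [&depth](std::uint32_t a, std::uint32_t b) {
                         return depth[a] > depth[b];
                     });

    // Expand each particle number into the four indices of its quad.
    std::vector<std::uint32_t> indices;
    indices.reserve(order.size() * 4);
    for (std::uint32_t p : order) {
        for (std::uint32_t corner = 0; corner < 4; ++corner) {
            indices.push_back(p * 4 + corner);
        }
    }
    return indices;
}
```

The returned array is what would be handed to glDrawElements (or written into an index buffer) whenever the order changes. With 40000 quads there are 160000 vertices, which no longer fit into 16-bit indices, so the sketch uses 32-bit ones.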


thanks, especially 6. sounds good. my clouds are cool, too, so I guess it will fit.

when you say that there is a special function to write directly to memory, which memory? graphics card memory? and what function is it? any good vbo tutorials? I read one but that was so short that there are still some questions left.

I think I will use them…


I searched several places (google, gamedev, this site) but could not find a good tutorial (or rather, any tutorial) on VBOs… does anyone have a link?


Originally posted by JanHH:
I searched several places (google, gamedev, this site) but could not find a good tutorial (or rather, any tutorial) on VBOs… does anyone have a link?


At the end of the specs there are some examples which should help.
But here is a link:
I have not looked into it but most of the delphi3d stuff is good.


thanks… it seems the easiest way to approach vbo would be to first change the program to use standard vertex arrays, as they are explained in the Red Book, and in a second step change this to vbo. the effort to do this seems quite small!?


Depending on your particle features you might be able to use point sprites which would reduce the number of vertices to send by 75%.

thanks, but I cannot use them as the particles get very large sometimes (if the viewer is close to the particle).

what I don’t understand is, if you have a buffer for vertex/texcoord/color data, and you want to draw these with indices (which change from time to time, due to depth sorting), how is this done?

I guess it’s not a good idea to keep the index array outside of a vbo?

but if they are in a vbo, this has to be bound (glBindBuffer), but doesn’t this “unbind” the buffer bound before (with the vertex etc. data)? can two buffers be “bound” at the same time?

or would you rather store the index data in the same buffer as the other data?

and there are lots of particle clouds, surely it is not possible to have buffer objects for all of them at the same time. so I guess one would create and fill a vbo when a cloud comes into sight, use it while the cloud is visible, and delete it when the cloud leaves the viewport. or just leave it there, trusting the driver to swap it from graphics memory to agp memory and to reload it when it’s used again?

will it be fast enough to create and delete buffers dynamically?

a lot of questions… sorry



there’s a separate binding point for indices. glBindBufferARB with the GL_ARRAY_BUFFER_ARB target attaches/replaces the buffer used as the vertex data source.
For indices you have to use the GL_ELEMENT_ARRAY_BUFFER_ARB target. Problem solved.

As for the remaining questions, deleting something only makes sense if you know you won’t need it again.
If you’re soon going to need another buffer with the same vertex layout (though this isn’t even enforced by VBO), why not just reuse an old one?
Creating and deleting buffers is a lot of overhead, and worse, can potentially lead to bad memory fragmentation after a while.

Especially for a particle system where you can usually cap the number of particles alive at any one time without introducing noticeable errors, I’d really suggest creating one big VBO and reusing it indefinitely.

the purpose is rendering clouds: there are about 10,000 - 20,000 of them and each has about 1000-7000 particles. I don’t think they can be in graphics card memory all at once, so there has to be some sort of intelligent management. in fact, the number of clouds appears to be the biggest problem (culling has to be intelligent, texture management etc.). So I’m searching for a solution to draw them as fast as possible, and I really think it might be that immediate calls are fastest (because no memory swapping overhead appears). but I would really like to know what people who are more experienced with vbo than me think about it.
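
A rough back-of-the-envelope estimate makes the memory worry concrete. The sketch below assumes four vertices per particle and 24 bytes per vertex (three floats of position, two floats of texture coordinates, four bytes of color). That layout is an assumption for illustration, not something stated in the thread, and `vertexBytes` is a hypothetical helper.

```cpp
#include <cstdint>

// Assumed vertex layout (hypothetical): 3 floats position, 2 floats texcoord, 4 bytes RGBA.
constexpr std::uint64_t kBytesPerVertex = 3 * 4 + 2 * 4 + 4;  // 24 bytes
constexpr std::uint64_t kVerticesPerParticle = 4;             // one quad per particle

// Total vertex data in bytes for `clouds` clouds of `particlesPerCloud` particles each.
constexpr std::uint64_t vertexBytes(std::uint64_t clouds, std::uint64_t particlesPerCloud) {
    return clouds * particlesPerCloud * kVerticesPerParticle * kBytesPerVertex;
}

// Low end of the quoted range: 10,000 clouds of 1,000 particles each.
constexpr std::uint64_t kLowEstimate = vertexBytes(10000, 1000);   // 960,000,000 bytes
// High end of the quoted range: 20,000 clouds of 7,000 particles each.
constexpr std::uint64_t kHighEstimate = vertexBytes(20000, 7000);  // 13,440,000,000 bytes
```

Under these assumptions even the low end is close to a gigabyte of vertex data, so keeping every cloud resident at once is not realistic. Only the clouds that are currently drawn as real particles rather than impostors would need their vertices in a buffer.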

I don’t want to annoy anyone but I would really, really like to know any opinions about this… sorry

You say you’re worried about memory transfers. Don’t forget that immediate mode calls require the vertex data to be transferred as well. After all, the data’s final destination is always the same. Using VBO gives you the most efficient mechanism for performing the memory transfers, though, so you should definitely use it.

Also, with the amount of particles you’re talking about, I hope you’ve looked into higher-level optimizations as well? A lot of work has been done on cloud rendering, often based on impostoring techniques. I can dig up some links to relevant papers if you like. These kinds of approaches will do you a lot more good than VBO, since you’re very likely to end up severely fillrate-limited anyway.

– Tom

Thanks. But I did a lot of these optimizations, for example:

  • impostoring
  • quad tree for culling
  • very fast depth sorting (regarding the particles and the clouds)
  • cloud pre-processing so that they are hollow on the inside (which is invisible anyway) to reduce number of particles
  • not using several small impostor textures, but one large which gets split into tiles to reduce memory swapping

so the last thing that comes to (my) mind is vbo. Or do you have any further ideas?

If the number of clouds and their size (particle count) is not above a certain value, it runs very smoothly indeed, but above that, it slows down dramatically.

you might have a look as well:

another “problem” is that there is a lot of stuff to render besides the clouds, the program was nearly “too slow” even without them.

The numbers you posted above (10K-20K clouds, 1K-7K particles each), are those before or after these optimizations? If they’re after, optimize harder.

That said, I will repeat what Knackered said here – the four days that have passed since you started this thread should have more than sufficed to just go ahead and implement VBO. Rest assured that it’s most certainly not going to be slower than what you have now. If you’re fill-limited or whatever then it may not be faster either, but it’s certainly not going to hurt.

– Tom

thx, believe it or not but I also have other things to do… and I implemented some of the other things I mentioned in the last few days. My fear with vbo is that because you have to generate and fill a buffer every time a cloud comes into sight, this would eat up all the performance gain. And I think this is necessary, because if you keep the buffers there will be too many of them. You also cannot use one buffer for all clouds, as a cloud’s vertex data has to stay in the buffer; if it doesn’t (because the buffer is reused), you have to completely refill it for every impostor update of every cloud, which I think will be much slower than simply using immediate calls.

And my problem is that I don’t know how to approach this best. My concrete questions are:

  • the situation is: one of many clouds comes into sight. so for this cloud, a vbo is created and used while the cloud is visible, and deleted when the cloud is no longer visible. is this the right way? someone said that this approach of dynamically generating the buffers is too much overhead and would lead to bad performance.

  • but I guess I also cannot create one vbo for each cloud at the same time, as that would lead to about 10,000 VBOs, which seems too many to me.

  • and I guess I cannot use one very, very large vbo for all clouds together, as this would become too large

  • and I also guess that it makes no sense to create one “cloud sized” vbo and use that for ALL clouds visible, as the complete buffer content would change for each drawing of a cloud, making it pretty useless.

and so my conclusion (and question) is: either one of my assumptions is wrong, or it simply makes no sense to use vbo for this program. but which of these is the case? Am I right, or am I missing something? How would a more experienced OpenGL programmer than I am approach this?

Thanks again,

  1. Then don’t delete the VBOs. In our Drivers we trust!

  2. I wouldn’t say it’s too much until I’ve tried it. If it does turn out to be too much, how about creating one VBO per 10 clouds or something like that? 1000 VBOs should be okay – I’ve personally used 600+ without any problems.

  3. Indeed, if you make a very large one the drivers have to find a single contiguous memory block of that size somewhere and that might be difficult.

  4. Yes, but you could also create a few dozen buffers and cycle through them circularly or something.

– Tom
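
The idea in point 4 (a fixed set of buffers that gets recycled circularly) can be sketched as pure bookkeeping, with no GL calls. This is only an illustration under stated assumptions: `CloudBufferPool`, `acquire`, and the integer cloud ids are hypothetical names, each slot stands for one VBO created once at startup, and the slot count is assumed to be greater than zero.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

constexpr int kNoCloud = -1;  // marks a slot that holds no cloud yet

// Hypothetical pool: slot i stands for one pre-created VBO.
class CloudBufferPool {
public:
    struct Slot {
        std::size_t index;  // which pooled buffer to bind
        bool needsUpload;   // true if the cloud's vertices must be written into it
    };

    // slotCount must be greater than zero.
    explicit CloudBufferPool(std::size_t slotCount)
        : owner_(slotCount, kNoCloud), next_(0) {}

    // Returns the slot that holds cloudId. If the cloud is not resident,
    // the next slot in circular order is recycled for it.
    Slot acquire(int cloudId) {
        auto it = slotOfCloud_.find(cloudId);
        if (it != slotOfCloud_.end()) {
            return {it->second, false};  // data is still in the buffer
        }
        const std::size_t slot = next_;
        next_ = (next_ + 1) % owner_.size();
        if (owner_[slot] != kNoCloud) {
            slotOfCloud_.erase(owner_[slot]);  // evict the previous occupant
        }
        owner_[slot] = cloudId;
        slotOfCloud_[cloudId] = slot;
        return {slot, true};
    }

private:
    std::vector<int> owner_;                            // cloud id stored in each slot
    std::unordered_map<int, std::size_t> slotOfCloud_;  // reverse lookup
    std::size_t next_;                                  // next slot to recycle
};
```

A caller would bind the buffer for the returned slot and refill it only when `needsUpload` is true. Plain round-robin recycling can evict a cloud that is still in view, so replacing the oldest unused slot instead would be a possible refinement if that turns out to matter.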