I’m looking for the best way to render several million cubes. For example, I have 2 million cubes of different sizes, all aligned in the same way.
My solution for now is:
- Compute normal vectors for each face (front, back, top, etc…).
- Loading all data in 6 VAO/VBOs, one per face (one for the front faces of all cubes, one for the back faces, etc…). For each cube, I load 4 vertices (3 floats per vertex) in each VBO.
- Drawing each VBO with glDrawArrays(GL_QUADS, 0, 4*nbCubes). The normal for that face is set beforehand as a uniform in my vertex shader.
Is there a better way to do that?
I tried to use one VBO with only 8 vertices per cube (instead of 6*4=24 in my solution) and glDrawElements, but I can’t store indices for each cube. And millions of calls to glDrawElements don’t seem like a good idea.
I have already split the data across more VBOs so that each VBO is about 4 MB.
What techniques can I use if I can’t load everything in the GPU?
Use geometry instancing so that you have to store only a single cube’s geometry and just use another buffer to store the transformation matrices. See ARB_draw_instanced and ARB_instanced_arrays.
I wonder what purpose these “million cubes” and “million spheres” projects serve. These topics regularly spring up. A GL fetish maybe?
Million cubes often translates to “minecraft”. For minecraft like rendering: Don’t forget to skip all cubes which have opaque blocks on all sides. For typical maps this reduces millions to thousands.
i wrote this for just such an occasion
Thanks for the replies.
My context is not a “minecraft” like project. It’s for scientific visualization software. I don’t have a 3D grid, so I can’t skip cubes with opaque blocks on all sides. I just have a set of cuboids whose height, width and depth can all differ.
I am not sure if geometry instancing is usable in this case (different sizes).
Thanks zeoverlord for your link. A geometry shader seems to be a good idea. I tried a VBO that holds, for each cuboid, its center location and its size (h, w, d), drawn with GL_POINTS. The vertex shader just passes these data through and the geometry shader generates the associated triangle strip. I just have to figure out how to set the normals properly.
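On the normals question: since every cuboid is axis-aligned, each of the 6 faces has a constant normal along one axis, so the normal can be written once per face rather than derived per vertex. The following is only a CPU-side sketch of what such a geometry shader could emit, not the code from the linked post; `Vec3`, `Face` and `expandCuboid` are hypothetical names, and the size is assumed to be the full extent along each axis.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };

struct Face {
    Vec3 normal;                  // constant over the whole face
    std::array<Vec3, 4> corners;  // in triangle-strip order
};

// Expand one axis-aligned cuboid (center + full size) into 6 faces.
// Faces come in pairs per axis: +X, -X, +Y, -Y, +Z, -Z.
std::array<Face, 6> expandCuboid(Vec3 center, Vec3 size) {
    const float c[3] = {center.x, center.y, center.z};
    const float h[3] = {size.x * 0.5f, size.y * 0.5f, size.z * 0.5f};
    // Strip order over the two in-plane axes (u, v).
    const float su[4] = {-1.0f, 1.0f, -1.0f, 1.0f};
    const float sv[4] = {-1.0f, -1.0f, 1.0f, 1.0f};

    std::array<Face, 6> faces{};
    for (int f = 0; f < 6; ++f) {
        const int axis = f / 2;
        const float sign = (f % 2 == 0) ? 1.0f : -1.0f;
        const int u = (axis + 1) % 3;
        const int v = (axis + 2) % 3;

        float n[3] = {0.0f, 0.0f, 0.0f};
        n[axis] = sign;  // the normal points out of the cuboid
        faces[f].normal = {n[0], n[1], n[2]};

        for (int k = 0; k < 4; ++k) {
            float p[3];
            p[axis] = c[axis] + sign * h[axis];
            // Mirroring u on the negative faces keeps the first strip
            // triangle counter-clockwise when seen from outside.
            p[u] = c[u] + sign * su[k] * h[u];
            p[v] = c[v] + sv[k] * h[v];
            faces[f].corners[k] = {p[0], p[1], p[2]};
        }
    }
    return faces;
}
```

In a real geometry shader the same loop would write the normal as an output before each EmitVertex() and call EndPrimitive() after every fourth vertex.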
Geometry instancing is also usable if the height, width and depth differ per cuboid: you just have to include the scaling in the model transform matrix of each instance.
A geometry shader based solution can work as well, although it may be slightly slower due to the additional geometry shader stage. If you can use it, you can speed things up a bit by using an instanced geometry shader with 6 invocations, so that the sides of the cuboid are emitted in parallel.
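To illustrate the scaling idea, here is a minimal sketch of a per-instance model matrix for an axis-aligned cuboid, assuming the shared mesh is a unit cube centered at the origin (corners at ±0.5). `Mat4`, `instanceMatrix` and `transformPoint` are hypothetical helper names, not part of any GL API; the matrix is stored column-major, which is the layout OpenGL expects.

```cpp
#include <array>
#include <cassert>

using Mat4 = std::array<float, 16>;  // column-major 4x4

// Model matrix = translate(center) * scale(width, height, depth).
Mat4 instanceMatrix(float cx, float cy, float cz,
                    float w, float h, float d) {
    return Mat4{
        w,    0.0f, 0.0f, 0.0f,   // column 0: scaled X axis
        0.0f, h,    0.0f, 0.0f,   // column 1: scaled Y axis
        0.0f, 0.0f, d,    0.0f,   // column 2: scaled Z axis
        cx,   cy,   cz,   1.0f    // column 3: translation
    };
}

// What the vertex shader would do with a unit-cube vertex (w = 1).
std::array<float, 3> transformPoint(const Mat4& m,
                                    float x, float y, float z) {
    return {
        m[0] * x + m[4] * y + m[8]  * z + m[12],
        m[1] * x + m[5] * y + m[9]  * z + m[13],
        m[2] * x + m[6] * y + m[10] * z + m[14]
    };
}
```

Because the cuboids are never rotated, a more compact option would be to store only the center and size per instance (6 floats instead of 16) and apply them in the vertex shader as position * size + center.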
Thanks a lot agnuep!
I tried geometry instancing with uniform buffer objects and it’s a lot better!
I have to implement some culling techniques now. Is there a best solution?
And just to know, what do you mean by “instanced geometry shader with 6 invocations”? Any link about that?
What I suggest is to do view frustum culling with a geometry shader.
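Whether it runs in a geometry shader or on the CPU, the per-cuboid visibility test is the same. As a rough CPU-side sketch (the names `Plane`, `Box` and `boxVisible` are hypothetical, and the six planes are assumed to be given with normals pointing into the frustum), an axis-aligned box can be rejected as soon as it lies entirely behind one plane:

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Plane a*x + b*y + c*z + d = 0; points with a value >= 0 are inside.
struct Plane { float a, b, c, d; };

// Axis-aligned box given by its center and half extents.
struct Box { float cx, cy, cz, hx, hy, hz; };

// Conservative test: returns false only if the box is completely
// outside at least one plane. Boxes near frustum corners may be
// reported visible although they are not, which is harmless.
bool boxVisible(const Box& b, const std::array<Plane, 6>& frustum) {
    for (const Plane& p : frustum) {
        // Signed distance of the box center (scaled by |normal|).
        const float dist = p.a * b.cx + p.b * b.cy + p.c * b.cz + p.d;
        // Projection of the half extents onto the plane normal.
        const float radius = std::fabs(p.a) * b.hx
                           + std::fabs(p.b) * b.hy
                           + std::fabs(p.c) * b.hz;
        if (dist + radius < 0.0f) {
            return false;  // entirely behind this plane
        }
    }
    return true;
}
```

In a geometry shader version, a cuboid that fails this test would simply emit no vertices.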
About instanced geometry shaders, you may check out the GL4 spec or the ARB_gpu_shader5 extension.
And if someone was to render like 20 million cubes, but all of them are of identical size and the only difference would be colour (2 variants)? They would all be fitted in a 3D grid, side by side. The PC I will be working with might have an integrated graphics card. The display wouldn’t have to be refreshed too often. New cubes will appear rarely in the grid filling it very slowly and some of old, existing ones will be removed from time to time.
What would be the best solution for this? The first thing I thought about was display lists, but I have also heard about VBOs and FBOs.
I’m trying to render something like this: http://www.leokrut.com/leocrystal7.gif
Almost textbook case for geometry instancing, with ARB_instanced_arrays or ARB_draw_instanced.
If you need to change the colors, with the former, just specify a new color vertex attribute array and kick off another glDrawElementsInstanced().
If you never need to change any vertex attributes, then use geometry instancing and/or display lists. Display lists are only good if you don’t need to change the batch (vertex attributes, index list, etc.), as they take a while to compile, but they can accelerate rendering batches pretty much any way you want to specify them (immediate mode, client arrays, VBOs, etc.)
20 million cubes are tough without culling, even for new GPUs. As I see it, your case is more of a “minecraft” style scene.
What I would suggest is to organize your data into an octree (whatever octree variant you choose it will be okay) then do culling based on it.
For rendering, you can use either VBOs or display lists; both are fast (though the latter is considered deprecated in newer versions of OpenGL).
What I also want to note here is that you don’t necessarily need a separate item in your octree for every cube. For example, you could group the cubes lying in the same 50x50 block (that should be at most 2500*12=30000 triangles), put each group into one display list, or one draw call if you use VBOs, and perform the culling at this granularity.
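The grouping step could look roughly like the following sketch. It assumes each cube has integer grid coordinates and uses cubic chunks keyed by (x, y, z) chunk index, which is an assumption on top of the 50x50 figure above; `Cell`, `ChunkKey`, `floorDiv` and `groupByChunk` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

struct Cell { int x, y, z; };                // grid position of one cube
using ChunkKey = std::tuple<int, int, int>;  // chunk index per axis

// Floor division, so negative coordinates land in the correct chunk.
int floorDiv(int a, int b) {
    assert(b > 0);
    return (a >= 0) ? a / b : -((-a + b - 1) / b);
}

// Returns, for every non-empty chunk, the indices of the cubes in it.
// Each entry would become one display list or one draw call, and
// culling is then done per chunk instead of per cube.
std::map<ChunkKey, std::vector<std::size_t>>
groupByChunk(const std::vector<Cell>& cells, int chunkSize) {
    std::map<ChunkKey, std::vector<std::size_t>> chunks;
    for (std::size_t i = 0; i < cells.size(); ++i) {
        const Cell& c = cells[i];
        const ChunkKey key{floorDiv(c.x, chunkSize),
                           floorDiv(c.y, chunkSize),
                           floorDiv(c.z, chunkSize)};
        chunks[key].push_back(i);
    }
    return chunks;
}
```

Since cubes are added and removed only rarely in the scenario described, only the chunk that contains a changed cube would need to be rebuilt.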
You should also take tksuoran’s advice:
Don’t forget to skip all cubes which have opaque blocks on all sides. For typical maps this reduces millions to thousands.
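For a regular grid of identical, opaque cubes, the skip test can be sketched as below. This is an illustration under assumptions, not code from the thread: `Grid`, `isHidden` and `countVisible` are hypothetical names, occupancy is stored as one byte per cell, and cells outside the grid count as empty, so cubes on the boundary are always kept.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Grid {
    int nx, ny, nz;
    std::vector<unsigned char> filled;  // 1 = opaque cube present

    // Out-of-range cells are treated as empty.
    bool at(int x, int y, int z) const {
        if (x < 0 || y < 0 || z < 0 || x >= nx || y >= ny || z >= nz) {
            return false;
        }
        return filled[(static_cast<std::size_t>(z) * ny + y) * nx + x] != 0;
    }
};

// A cube is hidden when all 6 face neighbours are opaque cubes.
bool isHidden(const Grid& g, int x, int y, int z) {
    return g.at(x - 1, y, z) && g.at(x + 1, y, z) &&
           g.at(x, y - 1, z) && g.at(x, y + 1, z) &&
           g.at(x, y, z - 1) && g.at(x, y, z + 1);
}

// Number of cubes that actually have to be sent to the GPU.
std::size_t countVisible(const Grid& g) {
    std::size_t visible = 0;
    for (int z = 0; z < g.nz; ++z)
        for (int y = 0; y < g.ny; ++y)
            for (int x = 0; x < g.nx; ++x)
                if (g.at(x, y, z) && !isHidden(g, x, y, z)) ++visible;
    return visible;
}
```

For a solid 3x3x3 block only the center cube is hidden; for large, mostly filled grids the saving grows roughly with the ratio of volume to surface.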