[QUOTE=tokongs;1287306]Okay so I have spent I don’t even know how many hours now trying to read about and experiment with VAOs/VBOs and I just can’t wrap my head around it.
Are you supposed to have multiple VAOs/VBOs?
Say I want to load 5 different 3D models. Do I store all of vertices in the same VBO, or do I create a VBO for each of the objects I want to draw? Do I need a VAO for each VBO or do I use the same one throughout my program?[/QUOTE]
Good questions. You’ve already gotten some good responses, but it may help your understanding to talk about why each exists.
VBOs are of course the containers for batch data (vertex attribute and/or index lists). However, in unextended OpenGL, when you’re rendering batches (glDraw*) there’s a validation cost to switching buffer objects and accessing data in buffer objects.
One of the speed-ups driver implementers came up with for this was vertex array objects (VAOs). This allows the driver to cache information on reused vertex attribute and index list bindings, which can save some of that cost.
Another speed-up one driver implementer came up with for this was bindless. This allows the app to perform some of that per-draw-call validation work once, up front, so that your VBOs are hot and ready to render with every time you reference them. And because your VBOs are hot and ready to render with, you can provide GPU addresses for your vertex attribute and index lists to the driver directly, which bypasses needless lookup logic in the driver and saves further per-draw-call cost.
So to your question…
If you use bindless (available on NVIDIA drivers only) as a performance optimization, then for static vertex attribute and index list data, in practice it doesn’t make much difference to rendering performance whether you split your data up into a bunch of VBOs or merge your vertex attribute/index list data into a smaller set of VBOs (e.g. using a Streaming Buffer Object approach).
However, if you don’t use bindless but instead use VAOs as a performance optimization (or if you use neither!), then there is a measurable cost you pay for bouncing around to different buffer objects for your rendering. So you should consider reducing the total number of VBOs accessed in a frame (e.g. by merging VBOs) to minimize this cost.
Here I’m assuming a fixed set of batches (draw calls) in both of these cases. However, it’s worth pointing out that reducing your VBO count by storing more batches per VBO opens up the possibility for you to reduce the number of draw calls you make. Reducing the number of draw calls you make can improve your performance as well.
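To make the "more batches per VBO" idea concrete, here is a minimal CPU-side sketch of packing several models into one shared vertex array and one shared index array. The names `Mesh`, `DrawRange`, `PackedMeshes` and `packMeshes` are made up for illustration, and it assumes every model uses the same vertex layout (`floatsPerVertex` floats per vertex) and 32-bit indices. It only computes the offsets; it makes no GL calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU-side mesh: tightly packed floats per vertex plus 32-bit indices.
struct Mesh {
    std::vector<float> vertices;         // floatsPerVertex floats per vertex
    std::vector<std::uint32_t> indices;  // relative to this mesh's first vertex
};

// Where one mesh landed inside the shared arrays.
struct DrawRange {
    std::uint32_t indexCount;   // number of indices to draw
    std::size_t   indexOffset;  // byte offset into the shared index array
    std::int32_t  baseVertex;   // index of this mesh's first vertex in the shared vertex array
};

struct PackedMeshes {
    std::vector<float> vertices;         // contents for the single shared vertex VBO
    std::vector<std::uint32_t> indices;  // contents for the single shared index VBO
    std::vector<DrawRange> ranges;       // one entry per input mesh, in input order
};

// Appends every mesh to the shared arrays and records where each one starts.
// Assumption: all meshes share one vertex layout of floatsPerVertex floats.
PackedMeshes packMeshes(const std::vector<Mesh>& meshes, std::size_t floatsPerVertex) {
    assert(floatsPerVertex > 0);
    PackedMeshes out;
    for (const Mesh& m : meshes) {
        assert(m.vertices.size() % floatsPerVertex == 0);
        DrawRange r;
        r.indexCount  = static_cast<std::uint32_t>(m.indices.size());
        r.indexOffset = out.indices.size() * sizeof(std::uint32_t);
        r.baseVertex  = static_cast<std::int32_t>(out.vertices.size() / floatsPerVertex);
        out.vertices.insert(out.vertices.end(), m.vertices.begin(), m.vertices.end());
        out.indices.insert(out.indices.end(), m.indices.begin(), m.indices.end());
        out.ranges.push_back(r);
    }
    return out;
}
```

With the two arrays uploaded once into one vertex VBO and one index VBO, each model could then be drawn with something like `glDrawElementsBaseVertex(GL_TRIANGLES, r.indexCount, GL_UNSIGNED_INT, (const void*)r.indexOffset, r.baseVertex)` (OpenGL 3.2+), without rebinding buffers between models. The indices are left untouched because `baseVertex` does the rebasing at draw time.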
So what’s the take-away from this?
I’d encourage you to think about ways to combine your batch data into a small set of shared VBOs, rather than creating a VBO per batch. One option to consider, which works well for dynamic vertex attribute data (and static for that matter), is a Streaming Buffer Object approach. This effectively uses VBOs as a fast on-GPU cache of the subset of the vertex attribute data you’ve needed to render with lately. It also makes it trivial to bound the amount of GPU memory that you allocate and use for vertex attribute and index list data.
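The bookkeeping behind a streaming buffer is small. Below is a rough sketch of just the offset logic, assuming one fixed-size buffer that is filled front to back and "orphaned" when it runs out of room. `StreamCursor` and `StreamAlloc` are hypothetical names, and the GL calls are deliberately left out so the logic can be read (and tested) on its own.

```cpp
#include <cassert>
#include <cstddef>

// Result of one sub-allocation inside the streaming buffer.
struct StreamAlloc {
    std::size_t offset;  // byte offset at which to write the new data
    bool orphaned;       // true if the buffer was full and the write restarts at offset 0
};

// Tracks the write position in a fixed-capacity buffer that is filled front to back.
class StreamCursor {
public:
    explicit StreamCursor(std::size_t capacityBytes)
        : capacity_(capacityBytes), head_(0) {}

    // Reserves sizeBytes at the next offset that is a multiple of alignment.
    // If the request does not fit in the remaining space, restart at offset 0
    // and report that the old storage should be discarded before writing.
    StreamAlloc allocate(std::size_t sizeBytes, std::size_t alignment) {
        assert(alignment > 0 && sizeBytes <= capacity_);
        std::size_t start = (head_ + alignment - 1) / alignment * alignment;
        bool orphaned = false;
        if (start + sizeBytes > capacity_) {
            start = 0;
            orphaned = true;
        }
        head_ = start + sizeBytes;
        return StreamAlloc{start, orphaned};
    }

private:
    std::size_t capacity_;  // total size of the VBO in bytes
    std::size_t head_;      // first byte not yet handed out
};
```

In a real renderer, when `orphaned` comes back true you would typically re-specify the buffer storage (e.g. `glBufferData` with a null data pointer, or `glMapBufferRange` with `GL_MAP_INVALIDATE_BUFFER_BIT`) before writing, and otherwise map the new range with `GL_MAP_UNSYNCHRONIZED_BIT`. Anything uploaded before the wrap is gone after that, so batches that are still needed have to be re-uploaded. The fixed capacity is what bounds the GPU memory used.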
And if you’re not using bindless to accelerate your draw calls, use VAOs where it makes sense.