Don’t map/unmap frequently
Adding to this, don’t unmap at all. Use persistent mapped buffers and employ proper buffer object streaming techniques.
the objects change dynamically and very frequently, it’s hard to predict a vertex buffer that fits all objects, so I sometimes need to render the object in multiple iterations to fit the buffer size with additional map/unmap’s.
For this kind of rendering scenario, I would suggest imposing slightly on the outside world here, meaning the code that calls your API.
The big problem with glBegin/End is that this API doesn’t provide enough information for you to know how much storage the vertex data in the glBegin/End pair will actually need. The format of vertex data is not provided, and there is no indication of how many vertices they will provide. Both of these pieces of information are critical.
The thing is, most people using glBegin/End already know how many vertices they’re about to send (or at least, it’s easy for them to figure it out). And their vertex data format is hard-coded into which glVertex/TexCoord/etc functions they call. So in both cases, the information is known by the caller.
So simply require that the caller provide that information to you.
Your equivalent to glBegin ought to take a vertex format descriptor (of some kind) and a vertex count. That way, you can compute how many bytes of data you will need.
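To make the byte computation concrete, here is a minimal sketch of what such a descriptor could look like. Everything in it (ComponentType, VertexAttribute, VertexFormat, vertexStride, bytesNeeded) is a hypothetical name made up for illustration, not part of OpenGL, and it assumes tightly packed, interleaved attributes with no padding.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical descriptor types; none of these names come from OpenGL.
enum class ComponentType { Float32, UInt8 };

struct VertexAttribute {
    ComponentType type;
    unsigned componentCount;  // e.g. 3 for a position, 2 for a texcoord
};

struct VertexFormat {
    std::vector<VertexAttribute> attributes;
};

// Size in bytes of one component of the given type.
inline std::size_t componentSize(ComponentType t) {
    switch (t) {
        case ComponentType::Float32: return 4;
        case ComponentType::UInt8:   return 1;
    }
    return 0;
}

// Bytes per vertex, assuming interleaved attributes with no padding.
inline std::size_t vertexStride(const VertexFormat& fmt) {
    std::size_t stride = 0;
    for (const VertexAttribute& a : fmt.attributes)
        stride += componentSize(a.type) * a.componentCount;
    return stride;
}

// Total storage the caller's begin/end pair will need.
inline std::size_t bytesNeeded(const VertexFormat& fmt, std::size_t vertexCount) {
    return vertexStride(fmt) * vertexCount;
}
```

Your begin function would then take a VertexFormat and a count, and call something like bytesNeeded before deciding where the data goes. A real implementation may also need to respect alignment requirements, which this sketch ignores.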
Given that information, what you need next is a ring of buffer objects of known allocation size. If the number of bytes exceeds the space remaining in the buffer you currently have “open”, then you “close” that buffer (if you’re not using coherent mapping, this is where you flush the buffer), pull the next one out of the ring, “open” it up and start writing to it. If the next buffer is still in use from previous rendering commands (verified via a fence placed when the buffer is “closed”)… then you’re out of luck and have to stall the CPU until the GPU is done with that buffer.
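The open/close logic can be sketched roughly like this. It is only the bookkeeping: the actual GL work is hidden behind two hypothetical callbacks (close and waitUntilFree in a made-up RingHooks struct), which in a real renderer would presumably wrap something like glFlushMappedBufferRange plus glFenceSync, and glClientWaitSync, respectively. BufferRing and Allocation are likewise names invented for this example.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <utility>

// Hypothetical hooks standing in for the real GL calls.
struct RingHooks {
    // "Close" buffer i: flush it if the mapping is not coherent, then place a fence.
    std::function<void(std::size_t)> close;
    // Block until the fence for buffer i has signaled (returns at once if it already has).
    std::function<void(std::size_t)> waitUntilFree;
};

struct Allocation {
    std::size_t bufferIndex;  // which buffer in the ring to write into
    std::size_t offset;       // byte offset inside that buffer
};

class BufferRing {
public:
    BufferRing(std::size_t bufferCount, std::size_t bufferSize, RingHooks hooks)
        : bufferCount_(bufferCount), bufferSize_(bufferSize), hooks_(std::move(hooks)) {}

    // Reserve `bytes` in the currently open buffer, moving to the next one if it won't fit.
    Allocation allocate(std::size_t bytes) {
        if (bytes > bufferSize_)
            throw std::length_error("request larger than one ring buffer; caller must split it");

        if (writeOffset_ + bytes > bufferSize_) {
            hooks_.close(current_);                    // flush + fence the full buffer
            current_ = (current_ + 1) % bufferCount_;  // pull the next one out of the ring
            hooks_.waitUntilFree(current_);            // may stall the CPU if the GPU is behind
            writeOffset_ = 0;                          // "open" it and start from the top
        }

        Allocation result{current_, writeOffset_};
        writeOffset_ += bytes;
        return result;
    }

private:
    std::size_t bufferCount_;
    std::size_t bufferSize_;
    RingHooks hooks_;
    std::size_t current_ = 0;      // index of the currently open buffer
    std::size_t writeOffset_ = 0;  // next free byte in the open buffer
};
```

The caller writes its vertex data at the returned offset in the persistently mapped pointer for that buffer and issues the draw. Throwing on an oversized request is just one possible policy; splitting the draw into several batches would be another.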
How many buffers do you need in the ring, and what sizes should they be? That’s up to you. You may even have multiple rings for different use cases. GUI rendering might only need 2x 8MB buffers, while serious vertex rendering could use 4x 128MB buffers.