Need help / information on how memory can be allocated or deallocated at runtime. The case here is creating or destroying objects or players at runtime.
The vertex and index buffer can be ‘prepared’ at initialization with the vertex and index data, and as such will not change at all during my game.
The transform data, however, would be dynamic, depending on how many objects / players are spawned in the scene. So I have currently used a dynamic uniform buffer with transform data for four objects, known at startup time. How can I extend this to incorporate dynamic transform data?
Is this possible? I would ideally like to avoid allocating a huge block of memory without knowing how much of it would actually be used. And even if I did allocate, e.g. 1GB, and during runtime the game required one transform block more than what I allocated, how would I handle that effectively?
But this is basically what you should do. Figure out up-front how many of these things you want around (and therefore how much space they should take up), allocate that amount of space (probably double buffered to avoid unneeded synchronization), and use that space as needed.
Really, the double-buffering is the most important part of this: you never want the CPU to have to wait for the GPU to finish executing the previous frame before it can write the next frame's data.
Double buffering, in this context, is about the fact that the GPU is rendering asynchronously relative to the CPU. That is, when the CPU is about to begin writing the commands for frame 2, the GPU has not yet finished executing the commands from frame 1. This is a good thing, but it has side effects that have to be taken into account.
Consider the memory buffers that store per-object state that potentially changes every frame, such as the model-to-camera matrix for each object. When you go to build commands for frame 2, you're also writing the model-to-camera data to the appropriate place in the buffer. Well, since the GPU is still reading some of that data for frame 1, you have to do your writing in a way that doesn't overwrite anything outstanding from frame 1.
So you double-buffer this data. That is, you have two blocks of device memory (or one block of device memory twice as big). You use one block on one frame, then use the second block on the second frame, and on the third frame you go back to block one (you use vkWaitForFences to block the CPU on frame 1's fence before trying to write to block 1; the GPU ought to be done with that frame by that point, so the wait should be quick).
This has nothing to do with any staging that you might need in order to perform an upload to a particular memory object.
That depends entirely on the needs of your application.