Multi-Threading Vertex Buffers?

I have managed to multithread everything in my voxel engine rasterizer besides two vkCmdCopyBuffer() calls that constantly update a single massive vertex buffer and index buffer using an array of thousands of VkBufferCopy regions, and unfortunately those two calls are very expensive. I cannot draw from these buffers and manipulate them at the same time (right?), so my idea is to have two sets of buffers to switch between, so I can manipulate one on another thread while the other is being drawn on the main thread.

This halves the number of vertices I can render for the terrain at once, but I’m not too worried about that. So is it an ok idea? And is there a better one, or maybe something about the API I’m missing? Thank you!
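The ping-pong scheme described above can be sketched roughly as follows. This is a minimal illustration with placeholder types (the `TerrainBuffers` struct and all names are invented for the example, not Vulkan API); in a real engine the members would be `VkBuffer` handles and the swap would happen only after the in-flight frame's fence has signaled:

```cpp
#include <array>
#include <cstddef>

// Placeholder standing in for a VkBuffer pair (vertex + index).
struct TerrainBuffers {
    int vertexBuffer;  // would be VkBuffer in real code
    int indexBuffer;   // would be VkBuffer in real code
};

// Two buffer sets: while set [frame] is being drawn on the main thread,
// a worker thread is free to rewrite set [frame ^ 1].
class DoubleBufferedTerrain {
public:
    TerrainBuffers& drawSet()   { return sets_[frame_]; }
    TerrainBuffers& updateSet() { return sets_[frame_ ^ 1]; }

    // Called once per frame, after the update thread has finished and the
    // GPU is no longer reading the set that is about to become writable.
    void swap() { frame_ ^= 1; }

private:
    std::array<TerrainBuffers, 2> sets_{};
    std::size_t frame_ = 0;
};
```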

I’m surprised no one wanted to give me advice on this. Yes, it’s an ok idea and I can confirm it works.

It’s important to recognize that Vulkan is a tool. It doesn’t decide for you what you ought to do. It presents hardware capabilities and offers you the ability to define solutions that fit within those capabilities.

For any particular use case, there is a solution. But that solution will vary depending on the details in question.

Fully double-buffering an entire terrain map of voxels might be reasonable. Alternatively, you could achieve a similar effect by grouping your voxels into fixed-size blocks and allocating storage for more blocks than you could use at any one time, thus allowing you to modify an unused block and swap it in the next time you render.
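The block-based alternative can be sketched as a simple pool of spare block indices (this is a hypothetical illustration, not code from the post; in practice each index would map to a fixed-size region of the big vertex/index buffer):

```cpp
#include <cstdint>
#include <vector>

// Illustrative block pool: the terrain is split into fixed-size blocks, and
// the pool holds more blocks than the terrain ever uses at once, so a
// modified copy can be built in a spare block and swapped in next frame.
class BlockPool {
public:
    explicit BlockPool(std::uint32_t blockCount) {
        for (std::uint32_t i = 0; i < blockCount; ++i) free_.push_back(i);
    }

    // Grab a spare block index to rewrite on a worker thread.
    std::uint32_t acquire() {
        std::uint32_t b = free_.back();
        free_.pop_back();
        return b;
    }

    // The replaced block becomes spare once the frame that drew it retires.
    void release(std::uint32_t block) { free_.push_back(block); }

private:
    std::vector<std::uint32_t> free_;
};
```

Swapping a block in then just means pointing the draw data (e.g. an indirect draw entry or per-block offset) at the newly written index.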

But these are appropriate for different use cases. The block-based one is very memory efficient… if the number of blocks that get changed each frame is small. If you were modeling waves with your voxels, you’d effectively need to do double-buffering, since all the blocks are changing each frame.

But also, if you’re changing blocks from different threads, the block-based solution is less thread-friendly. This is because your block allocator (the code that keeps track of which blocks are in use and which are not) is a global resource. If different threads need to allocate a block, then that block allocation needs to be guarded by a mutex (note: clever programming might get around this; I haven’t investigated it significantly). And while there are good mutexes for cases of low contention, it’s still a cost that’s higher than bumping an atomic counter to get the offset for a double-buffer scenario.
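The cost difference described above looks roughly like this (illustrative names, not from the post; the mutex-guarded allocator is the straightforward version, not the cleverest possible one):

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <vector>

// Double-buffer case: each writer thread claims a disjoint region of the
// update buffer with a single atomic add -- cheap even under contention.
std::atomic<std::uint64_t> gWriteOffset{0};

std::uint64_t claimRegion(std::uint64_t bytes) {
    return gWriteOffset.fetch_add(bytes);  // start of this writer's region
}

// Block case: the free list is shared mutable state, so the simple version
// takes a mutex around every allocate/free.
class LockedBlockAllocator {
public:
    explicit LockedBlockAllocator(std::uint32_t count) {
        for (std::uint32_t i = 0; i < count; ++i) free_.push_back(i);
    }

    std::uint32_t allocate() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::uint32_t b = free_.back();
        free_.pop_back();
        return b;
    }

    void free(std::uint32_t block) {
        std::lock_guard<std::mutex> lock(mutex_);
        free_.push_back(block);
    }

private:
    std::mutex mutex_;
    std::vector<std::uint32_t> free_;
};
```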

Basically, only you know enough details about your problem to know how to solve it efficiently.
