Memory-barriers for copying between buffers

LeonFretter · September 5, 2021, 6:23am

Let’s say I have two buffers. One is a vertex-buffer, one is a shared staging-buffer (i.e. not only used for copying to the vertex-buffer).
My first approach to a copy was to insert one pipeline-barrier before the copy-commands and one after. But now I am thinking: it is probably not valid to coalesce the access- and stage-masks for the staging-buffer and the vertex-buffer, since the combined masks could pair an access flag with a stage in which that access never happens. Am I correct that in this scenario I need two pipeline-barriers before the copy and two after?
Like this (pseudo-code):
vkCmdPipelineBarrier sync_staging_before;
vkCmdPipelineBarrier sync_vertex_before;
vkCmdCopyBuffer copy;
vkCmdPipelineBarrier sync_vertex_after;
vkCmdPipelineBarrier sync_staging_after;
Does this make sense?

edit: Taking a closer look at the spec, I see that for each flag in an access mask, the corresponding stage mask only has to include at least one stage that supports that access. So it would be valid to coalesce it all into one pipeline-barrier. Which version would you prefer?
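
The rule from the spec can be sketched as a small check. This is only an illustration: `masks_are_compatible` and the `AccessSupport` table are hypothetical helpers, the table covers just three access flags out of the spec's table of supported access types, and the `VK_*` constants are local stand-ins that mirror the names and values in `vulkan_core.h` so the snippet compiles without the SDK.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring vulkan_core.h; with the real SDK, remove these
 * and include <vulkan/vulkan.h> instead. */
typedef uint32_t VkAccessFlags;
typedef uint32_t VkPipelineStageFlags;
#define VK_PIPELINE_STAGE_VERTEX_INPUT_BIT  0x00000004u
#define VK_PIPELINE_STAGE_TRANSFER_BIT      0x00001000u
#define VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT 0x00000004u
#define VK_ACCESS_TRANSFER_READ_BIT         0x00000800u
#define VK_ACCESS_TRANSFER_WRITE_BIT        0x00001000u

/* Hypothetical, deliberately incomplete table: which stages support an access. */
typedef struct AccessSupport {
    VkAccessFlags        access;
    VkPipelineStageFlags stages;
} AccessSupport;

static const AccessSupport kSupport[] = {
    { VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT, VK_PIPELINE_STAGE_VERTEX_INPUT_BIT },
    { VK_ACCESS_TRANSFER_READ_BIT,         VK_PIPELINE_STAGE_TRANSFER_BIT },
    { VK_ACCESS_TRANSFER_WRITE_BIT,        VK_PIPELINE_STAGE_TRANSFER_BIT },
};

/* Returns true if every access flag listed in kSupport that is present in
 * `access` has at least one supporting stage in `stages`. Access flags that
 * are not in the table are ignored by this sketch. */
static bool masks_are_compatible(VkPipelineStageFlags stages, VkAccessFlags access)
{
    for (size_t i = 0; i < sizeof kSupport / sizeof kSupport[0]; ++i) {
        bool access_used     = (access & kSupport[i].access) != 0;
        bool stage_available = (stages & kSupport[i].stages) != 0;
        if (access_used && !stage_available) {
            return false;
        }
    }
    return true;
}
```

Under these assumptions, a coalesced mask pair such as `VERTEX_INPUT | TRANSFER` with `VERTEX_ATTRIBUTE_READ | TRANSFER_WRITE` passes, even though vertex attribute reads never happen in the transfer stage; only an access flag with no supporting stage at all is rejected.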

johannesugb · September 5, 2021, 10:21am

Are you using buffer memory barriers, i.e. barriers tied to specific buffers via bufferMemoryBarrierCount > 0 and pBufferMemoryBarriers != nullptr in vkCmdPipelineBarrier, or global memory barriers?

Two barriers might make sense only in the former case. In the latter case, you should issue a single barrier and express the desired synchronization by OR-ing the appropriate stage flags and access masks.
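
A minimal sketch of the single-barrier variant, for illustration only. It assumes that earlier draws read the vertex buffer as vertex attributes and that an earlier transfer wrote the staging buffer; the real masks depend on how the buffers were actually used. `GlobalBarrier`, `barrier_before_copy` and `barrier_after_copy` are hypothetical helpers, and the `Vk*` types and constants are local stand-ins mirroring `vulkan_core.h` so the snippet compiles without the SDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring vulkan_core.h; with the real SDK, remove these
 * and include <vulkan/vulkan.h> instead. */
typedef uint32_t VkAccessFlags;
typedef uint32_t VkPipelineStageFlags;
typedef enum VkStructureType { VK_STRUCTURE_TYPE_MEMORY_BARRIER = 46 } VkStructureType;
#define VK_PIPELINE_STAGE_VERTEX_INPUT_BIT  0x00000004u
#define VK_PIPELINE_STAGE_TRANSFER_BIT      0x00001000u
#define VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT 0x00000004u
#define VK_ACCESS_TRANSFER_READ_BIT         0x00000800u
#define VK_ACCESS_TRANSFER_WRITE_BIT        0x00001000u
typedef struct VkMemoryBarrier {
    VkStructureType sType;
    const void*     pNext;
    VkAccessFlags   srcAccessMask;
    VkAccessFlags   dstAccessMask;
} VkMemoryBarrier;

/* Hypothetical bundle: one global memory barrier plus the two stage masks
 * that vkCmdPipelineBarrier takes as separate arguments. */
typedef struct GlobalBarrier {
    VkPipelineStageFlags srcStageMask;
    VkPipelineStageFlags dstStageMask;
    VkMemoryBarrier      memory;
} GlobalBarrier;

/* Before the copy: wait for prior vertex fetches (vertex buffer) and prior
 * transfer writes (staging buffer, assumed), OR-ed into one set of masks. */
static GlobalBarrier barrier_before_copy(void)
{
    GlobalBarrier b;
    b.srcStageMask = VK_PIPELINE_STAGE_VERTEX_INPUT_BIT | VK_PIPELINE_STAGE_TRANSFER_BIT;
    b.dstStageMask = VK_PIPELINE_STAGE_TRANSFER_BIT;
    b.memory.sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
    b.memory.pNext = NULL;
    b.memory.srcAccessMask = VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT | VK_ACCESS_TRANSFER_WRITE_BIT;
    b.memory.dstAccessMask = VK_ACCESS_TRANSFER_READ_BIT | VK_ACCESS_TRANSFER_WRITE_BIT;
    return b;
}

/* After the copy: make the transfer write visible to later vertex fetches,
 * and order later transfer writes to the staging buffer (assumed) after it. */
static GlobalBarrier barrier_after_copy(void)
{
    GlobalBarrier b;
    b.srcStageMask = VK_PIPELINE_STAGE_TRANSFER_BIT;
    b.dstStageMask = VK_PIPELINE_STAGE_VERTEX_INPUT_BIT | VK_PIPELINE_STAGE_TRANSFER_BIT;
    b.memory.sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
    b.memory.pNext = NULL;
    b.memory.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
    b.memory.dstAccessMask = VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT | VK_ACCESS_TRANSFER_WRITE_BIT;
    return b;
}

/* Intended use with the real API (cmd, staging, vertex, region are placeholders):
 *   GlobalBarrier pre = barrier_before_copy(), post = barrier_after_copy();
 *   vkCmdPipelineBarrier(cmd, pre.srcStageMask, pre.dstStageMask, 0,
 *                        1, &pre.memory, 0, NULL, 0, NULL);
 *   vkCmdCopyBuffer(cmd, staging, vertex, 1, &region);
 *   vkCmdPipelineBarrier(cmd, post.srcStageMask, post.dstStageMask, 0,
 *                        1, &post.memory, 0, NULL, 0, NULL);
 */
```

Because a global memory barrier is not tied to a buffer, the requirements of both buffers simply end up in the same two pairs of masks, which gives one barrier before the copy and one after.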

krOoze · September 5, 2021, 11:14am

The access mask does not have to match all the stage masks.

But if it bothers you, you could always use synchronization2.

LeonFretter · September 5, 2021, 11:18am

Thanks, that makes sense: the global memory-barrier can “join” these requirements because it applies to all memory, whereas a buffer-memory-barrier applies only to a specific buffer, so several are required. For the moment I am making my life easier by using only global memory-barriers, partly motivated by TheMaister’s synchronization blog-post, but also because I am still far from the stage where I’ll begin tuning for maximum performance.

LeonFretter · September 5, 2021, 11:20am

Thanks for that remark. This is actually the first time I have heard of synchronization2. “Simplifying synchronization” sounds good to me; I’ll have to dig into that!

krOoze · September 5, 2021, 12:18pm

The claim of synchronization simplification is greatly exaggerated. Nevertheless, the extension does exactly what you ask for here: its stage flags live inside the individual memory barriers, so it is more natural to see which buffer is synced against which access.
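
A rough sketch of what that looks like, under the same assumed prior usage as discussed in the thread (an earlier transfer wrote the staging buffer, earlier draws fetched from the vertex buffer). `barriers_before_copy` is a hypothetical helper, and the `Vk*` types and constants are local stand-ins mirroring `vulkan_core.h` so the snippet compiles without the SDK. It uses the Vulkan 1.3 core names; the VK_KHR_synchronization2 extension spells the same things with a `KHR` suffix.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring vulkan_core.h; with the real SDK, remove these
 * and include <vulkan/vulkan.h> instead. */
typedef uint64_t VkPipelineStageFlags2;
typedef uint64_t VkAccessFlags2;
typedef uint64_t VkDeviceSize;
typedef struct VkBuffer_T* VkBuffer;
typedef enum VkStructureType {
    VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER_2 = 1000314001
} VkStructureType;
#define VK_PIPELINE_STAGE_2_VERTEX_INPUT_BIT  0x00000004ULL
#define VK_PIPELINE_STAGE_2_TRANSFER_BIT      0x00001000ULL
#define VK_ACCESS_2_VERTEX_ATTRIBUTE_READ_BIT 0x00000004ULL
#define VK_ACCESS_2_TRANSFER_READ_BIT         0x00000800ULL
#define VK_ACCESS_2_TRANSFER_WRITE_BIT        0x00001000ULL
#define VK_QUEUE_FAMILY_IGNORED               (~0U)
#define VK_WHOLE_SIZE                         (~0ULL)
typedef struct VkBufferMemoryBarrier2 {
    VkStructureType       sType;
    const void*           pNext;
    VkPipelineStageFlags2 srcStageMask;
    VkAccessFlags2        srcAccessMask;
    VkPipelineStageFlags2 dstStageMask;
    VkAccessFlags2        dstAccessMask;
    uint32_t              srcQueueFamilyIndex;
    uint32_t              dstQueueFamilyIndex;
    VkBuffer              buffer;
    VkDeviceSize          offset;
    VkDeviceSize          size;
} VkBufferMemoryBarrier2;

/* Fills out[0] for the staging buffer and out[1] for the vertex buffer.
 * Each barrier carries its own stage masks, so nothing has to be OR-ed
 * across buffers. */
static void barriers_before_copy(VkBuffer staging, VkBuffer vertex,
                                 VkBufferMemoryBarrier2 out[2])
{
    VkBufferMemoryBarrier2 base = {0};
    base.sType               = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER_2;
    base.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED; /* no ownership transfer */
    base.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    base.offset              = 0;
    base.size                = VK_WHOLE_SIZE;

    /* Staging buffer: earlier transfer write (assumed) -> this copy's read. */
    out[0] = base;
    out[0].buffer        = staging;
    out[0].srcStageMask  = VK_PIPELINE_STAGE_2_TRANSFER_BIT;
    out[0].srcAccessMask = VK_ACCESS_2_TRANSFER_WRITE_BIT;
    out[0].dstStageMask  = VK_PIPELINE_STAGE_2_TRANSFER_BIT;
    out[0].dstAccessMask = VK_ACCESS_2_TRANSFER_READ_BIT;

    /* Vertex buffer: earlier vertex fetch -> this copy's write. */
    out[1] = base;
    out[1].buffer        = vertex;
    out[1].srcStageMask  = VK_PIPELINE_STAGE_2_VERTEX_INPUT_BIT;
    out[1].srcAccessMask = VK_ACCESS_2_VERTEX_ATTRIBUTE_READ_BIT;
    out[1].dstStageMask  = VK_PIPELINE_STAGE_2_TRANSFER_BIT;
    out[1].dstAccessMask = VK_ACCESS_2_TRANSFER_WRITE_BIT;
}
```

With the real API, both barriers would go into a single `VkDependencyInfo` (`bufferMemoryBarrierCount = 2`, `pBufferMemoryBarriers = out`) passed to one `vkCmdPipelineBarrier2` call, so it is still one barrier command before the copy.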