What happens if I synchronize with a pipeline stage that doesn't exist? And what about BOTTOM_OF_PIPE?

What if I specify an unused pipeline stage in a barrier/external subpass dependency?

Let’s assume I synchronize two shaders (either via a pipeline barrier or an external subpass dependency, which should not make a difference for this example) using the following stage masks:

srcStageMask = VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT,
dstStageMask = VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,

What happens if the pipeline used in the first command does NOT include tessellation shaders (but only vertex+fragment shaders)? Would no synchronization happen at all because there are no active commands in the pipeline that have tessellation shader stages?

Wouldn’t it also be a viable strategy to execute all previous commands up to stages “less than or equal to tessellation evaluation shader stages” before establishing the barrier?

If I’m not mistaken, I would have to specify

srcStageMask = VK_PIPELINE_STAGE_VERTEX_SHADER_BIT | VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

for my example to work (assuming that the commands before the barrier could use vert+frag pipelines as well as vert+tesc+tese+frag pipelines), right? Or are there other options as well (besides VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT or VK_PIPELINE_STAGE_ALL_COMMANDS_BIT)?

What about BOTTOM_OF_PIPE?

Furthermore, I am a bit confused about VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT. The specification states that it can not be used with memory access flags because there is no memory access associated with these stages.

When defining a memory dependency, using only VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT or VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT would never make any accesses available and/or visible because these stages do not access memory.

But what about execution dependencies? Could I synchronize only the execution of my example from above with the following stage masks?

srcStageMask = VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
dstStageMask = VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,

The specification states:

VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT specifies the final stage in the pipeline where operations generated by all commands complete execution.

So that sounds like it could be used for an execution barrier. However, I am a bit confused about that “all commands”. Doesn’t it refer to each specific command’s pipeline run-through, i.e. until each command has completely finished its work?

In general, I’d like to know: is there any meaningful way VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT can be used in the srcStageMask? I’ve often seen it used in the dstStageMask, meaning that nothing shall be synchronized, but what about the srcStageMask?

Stage masks implicitly include logically earlier pipeline stages. That means an execution dependency with the tessellation flag still covers the vertex stage. The exception is memory dependencies, which require the stage to be listed explicitly. For example, storage buffer writes in the vertex shader might not be made visible to your dstStageMask if you omit the vertex shader stage flag.
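To make the difference between the two scopes concrete, here is a minimal sketch in plain C. It is only a toy model: the STAGE_* indices, STAGE_BIT, src_execution_scope and src_access_scope are hypothetical names made up for this illustration, and the values are not Vulkan's real VK_PIPELINE_STAGE_* bits.

```c
#include <stdint.h>

/* Illustrative stage indices in logical graphics-pipeline order.
   These are NOT Vulkan's real VK_PIPELINE_STAGE_* bit values. */
enum {
    STAGE_VERTEX_SHADER,
    STAGE_TESS_CONTROL,
    STAGE_TESS_EVAL,
    STAGE_GEOMETRY,
    STAGE_FRAGMENT_SHADER,
    STAGE_COUNT
};

#define STAGE_BIT(s) (1u << (s))

/* Execution scope of a source stage mask: every listed stage
   plus all logically earlier stages. */
static uint32_t src_execution_scope(uint32_t src_mask)
{
    uint32_t scope = 0;
    for (int s = 0; s < STAGE_COUNT; ++s) {
        if (src_mask & STAGE_BIT(s)) {
            /* Set bit s and every bit below it. */
            scope |= (STAGE_BIT(s) << 1) - 1u;
        }
    }
    return scope;
}

/* Access scope of a source stage mask: only the stages that were
   named explicitly, with no implicit expansion. */
static uint32_t src_access_scope(uint32_t src_mask)
{
    return src_mask;
}
```

In this model, a source mask containing only STAGE_TESS_EVAL has the vertex shader in its execution scope but not in its access scope, which is why vertex shader writes would need the vertex shader flag listed explicitly.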

Yea, this works.

But the best strategy is to use stage flags that are as tight as possible. Who is supposed to know what the previous work has done, if not you?

What about *_OF_PIPE?

Those are just helper stages. They are only there so that “after everything” and “before everything” are expressible. These stages do nothing and are only needed for the API to be able to express certain situations, because of how the srcStage and dstStage parameters are defined.

srcStage = TOP_OF_PIPE means there is nothing in the source synchronization scope. Typically used when that half of the dependency will be handled by a Semaphore wait instead.
srcStage = BOTTOM_OF_PIPE means something like srcStage = ALL_COMMANDS. Prefer the latter unless you specifically need BOTTOM_OF_PIPE for execution dependency chaining.
dstStage = TOP_OF_PIPE similarly means something like dstStage = ALL_COMMANDS, and should equally be avoided.
dstStage = BOTTOM_OF_PIPE means there is nothing in the destination sync scope. Typically used when that half of the dependency will be handled by a Semaphore signal instead.
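The four cases above fall out of the implicit-expansion rule if you treat TOP_OF_PIPE and BOTTOM_OF_PIPE as empty stages at the two ends of the pipeline. A rough sketch in plain C, assuming a toy pipeline with just two real stages (all names here, such as STAGE_*, REAL_STAGES, src_scope and dst_scope, are hypothetical and not part of the Vulkan API):

```c
#include <stdint.h>

/* Toy pipeline in logical order; NOT Vulkan's real bit values. */
enum {
    STAGE_TOP_OF_PIPE,     /* pseudo-stage: before everything */
    STAGE_VERTEX_SHADER,
    STAGE_FRAGMENT_SHADER,
    STAGE_BOTTOM_OF_PIPE,  /* pseudo-stage: after everything */
    STAGE_COUNT
};

#define STAGE_BIT(s) (1u << (s))

/* Stages that actually execute work. */
#define REAL_STAGES \
    (STAGE_BIT(STAGE_VERTEX_SHADER) | STAGE_BIT(STAGE_FRAGMENT_SHADER))

/* Real stages waited on by a srcStage: the stage itself plus
   everything logically earlier. */
static uint32_t src_scope(int stage)
{
    return ((STAGE_BIT(stage) << 1) - 1u) & REAL_STAGES;
}

/* Real stages blocked by a dstStage: the stage itself plus
   everything logically later. */
static uint32_t dst_scope(int stage)
{
    return ~(STAGE_BIT(stage) - 1u) & REAL_STAGES;
}
```

In this model src_scope(STAGE_TOP_OF_PIPE) and dst_scope(STAGE_BOTTOM_OF_PIPE) are empty, while src_scope(STAGE_BOTTOM_OF_PIPE) and dst_scope(STAGE_TOP_OF_PIPE) cover every real stage. This only models execution scopes; access scopes are not expanded this way.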

Yes, you can form an execution dependency chain with it. Might be useful in some cases. Especially if you want to coerce some unwieldy engine to do what you want. E.g.

vkCmdPipelineBarrier(srcStage=WHATEVER, dstStage=BOTTOM);
vkCmdPipelineBarrier(srcStage=BOTTOM, dstStage=COOL_STAGE);

is equivalent to vkCmdPipelineBarrier(srcStage=WHATEVER, dstStage=COOL_STAGE);

Wow, that answer is super-helpful and makes many things clear. Thank you!
Just some small follow-up questions:

Is this stated anywhere in the specifications?

Of course, in theory, I always should exactly know what comes before. Sometimes, there is the need for some sort of re-usable components where correctness is more important than efficiency. But sure, in an optimal world, those components would be configurable so that they had as tight synchronization as possible.

The stages “do nothing” in the sense that no actual GPU-operations are performed — is that what you mean by “do nothing”?

Because in the sense of an execution barrier (leaving memory dependency aside), those stages can be effectively used to do some synchronization, right? That means,

vkCmdPipelineBarrier(srcStage=TOP_OF_PIPE, dstStage=BOTTOM_OF_PIPE);

is equivalent to

vkCmdPipelineBarrier(srcStage=ALL_COMMANDS, dstStage=ALL_COMMANDS);

in terms of an execution dependency. Is that correct?

Could I also use

vkCmdPipelineBarrier(srcStage=WHATEVER, dstStage=TOP_OF_PIPE);
vkCmdPipelineBarrier(srcStage=TOP_OF_PIPE, dstStage=COOL_STAGE);

to get the same effect? Or is there any reason why BOTTOM_OF_PIPE should be preferred?

And last question (which somewhat completes the circle to my initial question): What happens if I use a stage which doesn’t exist here? Like follows:

vkCmdPipelineBarrier(srcStage=WHATEVER, dstStage=TESSELLATION_EVALUATION);
vkCmdPipelineBarrier(srcStage=TESSELLATION_EVALUATION, dstStage=COOL_STAGE);

(assuming that there is no tessellation shader used between the two barriers)
Would that still be equivalent to vkCmdPipelineBarrier(srcStage=WHATEVER, dstStage=COOL_STAGE); or would it not synchronize anything because there is no command in the queue that uses the TESSELLATION_EVALUATION stage?

Spec 6.1.2. Pipeline Stages:

Pipeline stages that execute as a result of a command logically complete execution in a specific order, such that completion of a logically later pipeline stage must not happen-before completion of a logically earlier stage. This means that including any stage in the source stage mask for a particular synchronization command also implies that any logically earlier stages are included in AS for that command.

Similarly, initiation of a logically earlier pipeline stage must not happen-after initiation of a logically later pipeline stage. Including any given stage in the destination stage mask for a particular synchronization command also implies that any logically later stages are included in BS for that command.

Earlier in the chapter in a Note:

Note

Including a particular pipeline stage in the first synchronization scope of a command implicitly includes logically earlier pipeline stages in the synchronization scope. Similarly, the second synchronization scope includes logically later pipeline stages.

The caveat follows:

However, note that access scopes are not affected in this way - only the precise stages specified are considered part of each access scope.

Yes, they are only formal stages. They are not real. They execute nothing. They have no memory accesses. Their only reason for existence is so you can express in the API “before anything starts” and “after everything finishes”. They are like the black circle and encircled black circle in the UML Activity diagram.

They can be used for dependency chaining as I demonstrated before.

No, vkCmdPipelineBarrier(srcStage=TOP_OF_PIPE, dstStage=BOTTOM_OF_PIPE) means “synchronize nothing with nothing”.

vkCmdPipelineBarrier(srcStage=BOTTOM_OF_PIPE, dstStage=TOP_OF_PIPE); is equivalent to using ALL_COMMANDS for both masks. (As we have established, the src stage implicitly includes every logically earlier stage, and every stage is logically earlier than BOTTOM_OF_PIPE. Likewise, the dst stage implicitly includes every logically later stage, and every stage is logically later than TOP_OF_PIPE.)

Still, it is harder to read, so you should prefer ALL_COMMANDS. And it is not exactly equivalent: as we have established, memory accesses need the stages stated explicitly.

Yes. The chaining stage chosen should not matter as long as the barriers are back-to-back like this. But if you later added some unrelated work between those two barriers, dstStage=BOTTOM_OF_PIPE would be better because it would allow that work to proceed in the meantime. So, since it otherwise does not matter, it is better to just have BOTTOM_OF_PIPE there from the start.

Then again, one should avoid doing something like this at all. Two barriers are probably worse than one. And drivers in Vulkan are discouraged from being too smart, so you cannot expect this to be optimized away.

It forms an execution dependency chain. It does not matter at all whether a stage “exists” or whether there are queue operations performing relevant work in between. A dependency chain is always formed as long as the dstStage of the first dependency is a subset of the srcStage of the second dependency.
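As a rough illustration of that condition, here is a small sketch in plain C. It is a toy model of the subset rule as stated in this reply, not the specification's formal wording, and every name in it (STAGE_*, expand_src, forms_chain) is hypothetical. Note that forms_chain only looks at the two masks; it takes no information about which commands or pipelines were recorded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stage indices in logical order; NOT Vulkan's real
   VK_PIPELINE_STAGE_* bit values. */
enum {
    STAGE_VERTEX_SHADER,
    STAGE_TESS_CONTROL,
    STAGE_TESS_EVAL,
    STAGE_FRAGMENT_SHADER,
    STAGE_COUNT
};

#define STAGE_BIT(s) (1u << (s))

/* A source mask implicitly includes all logically earlier stages. */
static uint32_t expand_src(uint32_t mask)
{
    uint32_t scope = 0;
    for (int s = 0; s < STAGE_COUNT; ++s) {
        if (mask & STAGE_BIT(s)) {
            scope |= (STAGE_BIT(s) << 1) - 1u;
        }
    }
    return scope;
}

/* True if every stage in the first barrier's dst mask is covered by
   the (expanded) src mask of the second barrier. */
static bool forms_chain(uint32_t first_dst_mask, uint32_t second_src_mask)
{
    return (first_dst_mask & ~expand_src(second_src_mask)) == 0;
}
```

With this model, chaining through STAGE_TESS_EVAL on both sides forms a chain even though nothing says tessellation is ever used, whereas a first barrier with dst = STAGE_FRAGMENT_SHADER followed by a second with src = STAGE_VERTEX_SHADER does not.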

Ah, sorry for the confusion! vkCmdPipelineBarrier(srcStage=BOTTOM_OF_PIPE, dstStage=TOP_OF_PIPE); is what I wanted to ask about, not what I actually asked. But you answered my intended question anyway, so thanks a lot! :) This is really helpful. I think everything is clear now.
