Issue using the result of one RenderPass in another RenderPass

Multiple render passes are pretty traditional and I thought I’d play around with that in Vulkan given the extra setup related to the VkRenderPass object etc. I am running into an issue which looks to be synchronization related.

The artificial scenario:

  • Render Pass A: Render the scene and just write a solid color to each fragment
  • Render Pass B: Read result of A (using a sampler) and multiply with texture for final present output

Here’s the result I get:

[Screenshot: two images of the rendered output]

The first image shows the case where both render passes set the clear color to black. In the second image, RenderPass_A uses a green clear color and writes cyan to the fragments, while RenderPass_B uses a red clear color. This suggests that RenderPass_A hasn’t finished writing before B reads the result.

Both render passes use the same vertex and index buffer.

The command buffer recording looks like (semi-pseudo for brevity):

void BuildCommandBuffer()
{
    vkBeginCommandBuffer(cmdBuf);

    // RenderPass_A
    vkCmdBeginRenderPass(cmdBuf, &RenderPass_A, VK_SUBPASS_CONTENTS_INLINE);
    {
        vkCmdBindPipeline(cmdBuf, ..., RenderPass_A.Pipeline);
        vkCmdBindVertexBuffers( ... );
        vkCmdBindIndexBuffer( ... );
        vkCmdBindDescriptorSets( ... , RenderPass_A.PipelineLayout, ..., RenderPass_A.DescriptorSet);
        vkCmdPushConstants( ... );
        vkCmdDrawIndexed( ... );
    }
    vkCmdEndRenderPass(cmdBuf);

    // RenderPass_B
    vkCmdBeginRenderPass( ..., RenderPass_B, ...);
    {
        vkCmdBindPipeline(... RenderPass_B.Pipeline);
        vkCmdBindVertexBuffers( ... );
        vkCmdBindIndexBuffer( ... );
        vkCmdBindDescriptorSets( ... );
        vkCmdPushConstants( ... );
        vkCmdDrawIndexed( ... );
    }
    vkCmdEndRenderPass( ... );
 
    vkEndCommandBuffer(cmdBuf);
}

RenderPass_A’s VkRenderPass:

VkAttachmentDescription colorAttachment = {};
colorAttachment.format = Format;
colorAttachment.samples = VK_SAMPLE_COUNT_1_BIT;
colorAttachment.loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
colorAttachment.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
colorAttachment.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
colorAttachment.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
colorAttachment.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
colorAttachment.finalLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;

VkSubpassDependency subpassDependency = {};
subpassDependency.srcSubpass = VK_SUBPASS_EXTERNAL;
subpassDependency.dstSubpass = 0;
subpassDependency.srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
subpassDependency.dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
subpassDependency.srcAccessMask = 0;
subpassDependency.dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;

RenderPass_B’s VkRenderPass:

VkAttachmentDescription colorAttachment = {};
colorAttachment.format = Format;
colorAttachment.samples = VK_SAMPLE_COUNT_1_BIT;
colorAttachment.loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
colorAttachment.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
colorAttachment.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
colorAttachment.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
colorAttachment.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
colorAttachment.finalLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR;

VkSubpassDependency subpassDependency = {};
subpassDependency.srcSubpass = VK_SUBPASS_EXTERNAL;
subpassDependency.dstSubpass = 0;
subpassDependency.srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
subpassDependency.dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
subpassDependency.srcAccessMask = 0;
subpassDependency.dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;

The VkImage written by A and read by B is created with:

initialLayout = VK_IMAGE_LAYOUT_UNDEFINED
usage = VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT;

My guess is that I haven’t gotten the subpass dependencies right. I have tried a “variety” of different stage masks and access masks, including FRAGMENT_BIT etc. I have also tried an explicit vkCmdPipelineBarrier between the two render passes, but I always get the result above.

There are no validation warnings or errors.

Any pointers? :)

Do, or do not. There is no try. Anyway let’s see…

A’s VkSubpassDependency is irrelevant to our problem here: it has srcSubpass = VK_SUBPASS_EXTERNAL, so it only covers what happens before the render pass instance and says nothing about what happens after it.

B’s VkSubpassDependency has an empty srcAccessMask, so it does practically nothing to cover A’s writes. Also, you say you are sampling the image, but the dst* flags you use are wrong for that. For a sampled image, they should be dstStageMask = the stage of whichever shader samples it, and dstAccessMask = VK_ACCESS_SHADER_READ_BIT.

Hey Kr0oze,

Thanks for your input! Yes, those were some of the dependencies for B that I tried.
For instance:

Dependency on A’s color attachment output, B sample in fragment shader.

subpassDependency.srcSubpass = VK_SUBPASS_EXTERNAL;
subpassDependency.srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
subpassDependency.srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
    
subpassDependency.dstSubpass = 0;
subpassDependency.dstStageMask = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;
subpassDependency.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;

or, dependency on A’s fragment write, sample in B’s fragment shader.

subpassDependency.srcSubpass = VK_SUBPASS_EXTERNAL;
subpassDependency.srcStageMask = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;
subpassDependency.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
    
subpassDependency.dstSubpass = 0;
subpassDependency.dstStageMask = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;
subpassDependency.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;

Basically they all give the same result. Which may indicate that the issue is not (only) the dependencies. Hm.

First, I should make the obligatory rant. Trial and error is a pretty bad approach to software engineering, and especially to low-level code and synchronization. Be warned that even if it looks like it works, that does not mean your code is correct. While it is the “harder” way, you should try to learn by understanding the system, not by brute-forcing the driver until it looks like it yields to your intent.

</rant>


From your description, it is not clear enough to me what happens and what is supposed to happen. How many images are there? How is each of them used by A and by B? And what result do you expect to see?

For debugging purposes you can largely “opt out” of the synchronization system by copiously using vkDeviceWaitIdle and “brutal” barriers like this one:

void brutal_barrier( const VkCommandBuffer  command_buffer ){
	const VkPipelineStageFlags src_stage = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT;
	const VkPipelineStageFlags dst_stage = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT;
	const VkMemoryBarrier mem_barrier = {
		.sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER,
		.pNext = nullptr,
		.srcAccessMask = VK_ACCESS_MEMORY_READ_BIT | VK_ACCESS_MEMORY_WRITE_BIT,
		.dstAccessMask = VK_ACCESS_MEMORY_READ_BIT | VK_ACCESS_MEMORY_WRITE_BIT,
	};
	vkCmdPipelineBarrier( command_buffer, src_stage,  dst_stage, 0, 1, &mem_barrier, 0, nullptr, 0, nullptr );
}

You could also give me the whole VK_LAYER_LUNARG_api_dump output, and I should be able to spot the problem.