Question about queue families

Hi guys, hope you’re doing well. I am new to Vulkan and I wanted to confirm something regarding queue families.

I understand that a physical device will have 1 or more queue families. Each family can handle specific operations. My device’s queue family at index 0 has the following flags:

  • Graphics
  • Transfer
  • Compute
  • Sparse Binding

Does this mean that the queues I create within this particular family can handle all sorts of operations and not just graphics operations? Also, does it make sense to create multiple queues within the same family in an attempt to improve performance?

That is the classic all-purpose queue family, which will accept any kind of operation. IIRC, a GPU has to have at least one Graphics+Compute queue family. Graphics also always implies that the queue can do Transfer (even if the Transfer flag is not advertised). So you are always guaranteed to be offered this kind of queue (with or without Sparse), unless it is a compute-only device.
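To make the "implied Transfer" point concrete, here is a minimal sketch of how one might test a family's flags. So that it compiles without the Vulkan SDK, the bit values are copied from `VkQueueFlagBits` and the helper names (`effectiveFlags`, `isAllPurpose`, `demo`) are made up for illustration; in a real program the flags would come from `VkQueueFamilyProperties::queueFlags` as filled in by `vkGetPhysicalDeviceQueueFamilyProperties`.

```cpp
#include <cassert>
#include <cstdint>

// Bit values copied from VkQueueFlagBits (vulkan_core.h).
constexpr std::uint32_t kGraphics = 0x1; // VK_QUEUE_GRAPHICS_BIT
constexpr std::uint32_t kCompute  = 0x2; // VK_QUEUE_COMPUTE_BIT
constexpr std::uint32_t kTransfer = 0x4; // VK_QUEUE_TRANSFER_BIT
constexpr std::uint32_t kSparse   = 0x8; // VK_QUEUE_SPARSE_BINDING_BIT

// Hypothetical helper: a family with Graphics or Compute can also do
// Transfer, even when the Transfer bit itself is not reported.
constexpr std::uint32_t effectiveFlags(std::uint32_t queueFlags) {
    return (queueFlags & (kGraphics | kCompute)) ? (queueFlags | kTransfer)
                                                 : queueFlags;
}

// Hypothetical helper: true for an all-purpose family
// (Graphics + Compute + Transfer; Sparse is optional).
constexpr bool isAllPurpose(std::uint32_t queueFlags) {
    return (effectiveFlags(queueFlags) & (kGraphics | kCompute | kTransfer)) ==
           (kGraphics | kCompute | kTransfer);
}

// Usage example: the family from the question, and one that omits Transfer.
inline void demo() {
    assert(isAllPurpose(kGraphics | kCompute | kTransfer | kSparse));
    assert(isAllPurpose(kGraphics | kCompute)); // Transfer is implied
    assert(!isAllPurpose(kTransfer | kSparse)); // transfer-only family
}
```

Checking `effectiveFlags` rather than the raw flags avoids wrongly rejecting a family that simply chose not to report the Transfer bit.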

I don’t think it currently makes sense to create multiple queues of the same family for performance reasons. Nor are you often even offered more than one, except on NVIDIA.

What does make sense is to use a dedicated async Transfer queue (i.e. one whose family has only the Transfer flag, and maybe Sparse, but no Graphics or Compute flags) for CPU⇔GPU transfers (but not for GPU⇔GPU copies).

Additionally, it is possible to squeeze some free performance out of running compute alongside certain graphics jobs on a Compute-only queue. Search for “async compute”.
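Both suggestions (a dedicated Transfer family and a Compute-only family) boil down to a "has these bits, lacks those bits" search over the family list. A rough sketch follows, again with the bit values copied from `VkQueueFlagBits` so it builds without the SDK. The helper names are hypothetical, and the list of flags is assumed to have been collected from `VkQueueFamilyProperties::queueFlags` beforehand.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bit values copied from VkQueueFlagBits (vulkan_core.h).
constexpr std::uint32_t kGraphics = 0x1; // VK_QUEUE_GRAPHICS_BIT
constexpr std::uint32_t kCompute  = 0x2; // VK_QUEUE_COMPUTE_BIT
constexpr std::uint32_t kTransfer = 0x4; // VK_QUEUE_TRANSFER_BIT

// Hypothetical helper: index of the first family that has every bit in
// `required` and none of the bits in `forbidden`, or -1 if there is none.
int findFamily(const std::vector<std::uint32_t>& familyFlags,
               std::uint32_t required, std::uint32_t forbidden) {
    for (std::size_t i = 0; i < familyFlags.size(); ++i) {
        const std::uint32_t flags = familyFlags[i];
        if ((flags & required) == required && (flags & forbidden) == 0) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Dedicated transfer: Transfer set, neither Graphics nor Compute.
int findDedicatedTransfer(const std::vector<std::uint32_t>& familyFlags) {
    return findFamily(familyFlags, kTransfer, kGraphics | kCompute);
}

// Async compute candidate: Compute set, Graphics not set.
int findComputeOnly(const std::vector<std::uint32_t>& familyFlags) {
    return findFamily(familyFlags, kCompute, kGraphics);
}
```

If either search returns -1, the device does not expose such a family and the all-purpose family is the fallback. Sparse is deliberately left out of both masks, so a family matches whether or not it reports it.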

AMD GPUs offer 2 transfer queues in their single transfer queue family. Since they advertise their GPUs as having “dual transfer” capability, I assume the two queues represent separate pieces of hardware.

I am aware that they have two DMA cores. I would welcome any information on it though. I kind of assume that one queue can saturate them both (same as with the graphics/compute cores). This comes down to shortcomings in the design of the Vulkan queue system: KhronosGroup/Vulkan-Docs#569. I guess it requires some interpretation by the vendors, which is not always easily available. And even when it is, it is annoying that some of it is not available programmatically. In my previous comment I shared most of what I know. I largely refer to this AMD material.

Thank you for your response, it makes a lot more sense now!