Occlusion Query Viability - How much CPU is necessary

Hi,

I would like to cull geometry.

Some time ago, when draw calls had to be issued from the CPU, using occlusion queries was an obvious choice.
The same went for frustum culling: a CPU decision.

Modern GL allows keeping things on the GPU: frustum culling can be done with a compute shader and an indirect buffer, by setting instanceCount=0.
Occlusion queries are useless there, as they are “bound to”/“embrace” the draw call. Hi-Z culling is rather heavyweight but purely GPU.

Now comes Vulkan.
Indirect commands work; the buffer lives on the GPU and should be modifiable via compute (I very much hope so).
Now, it appears that occlusion queries still exist, and I have seen a sample that uses them.
But can I use them to do GPU culling, i.e. can I bind the “query buffer” to compute?

Or must I download it to the CPU and modify the indirect buffer on the host (mapped or with an upload)?

There is no such thing as a query buffer. Or a compute buffer. Or an indirect buffer. Or any other kind of buffer. There are just buffers.

A buffer is just a region of memory that you designate to be part of this particular buffer. That’s it. Memory does not care what operation you used it for. So long as you perform the appropriate steps to synchronize activities, a buffer is a buffer is a buffer.

That being said, the result of an occlusion query is not limited to 0 or 1. It’s the number of samples that pass the per-fragment tests. If VK_QUERY_CONTROL_PRECISE_BIT isn’t set, then that number can be any non-zero value if any samples pass, but it isn’t restricted to outputting 1. As such, copying the query result directly into an indirect rendering command isn’t terribly helpful.
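
To illustrate the point about the raw count, here is a minimal sketch in plain C of the mapping a compute pass would have to apply before the value is usable as an instance count. It is not Vulkan code: `DrawCmd` is a stand-in with the same field layout as VkDrawIndexedIndirectCommand, and `apply_query_result` and `full_instance_count` are hypothetical names chosen for this example. One call stands in for one compute invocation per draw.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in with the same field layout as VkDrawIndexedIndirectCommand. */
typedef struct DrawCmd {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
    uint32_t firstInstance;
} DrawCmd;

/* Hypothetical helper: turn a raw occlusion result into an instance count.
   Any non-zero result means "visible", so the draw gets its full instance
   count back. Zero means no samples passed, so the draw is culled by
   setting instanceCount to 0. */
void apply_query_result(DrawCmd *cmd, uint32_t query_result,
                        uint32_t full_instance_count)
{
    assert(cmd != NULL);
    cmd->instanceCount = (query_result != 0u) ? full_instance_count : 0u;
}
```

Writing a result such as 5123 straight into instanceCount would draw 5123 instances, which is why the result needs this kind of clamping step rather than a plain copy.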

OK, I was too inaccurate.

What I meant is that I want to bind the indirect command buffer to a compute shader to modify e.g. the instance count.

I know query objects from GL (not terribly well, but with Google's help I can use them well enough).

I would like to have a buffer of query results that are directly modified/written to by fragment queries and bind it to a compute shader together with the indirect command buffer, cull via instance count, thus obviating the CPU down/upload.

Sorry if I was too helter-skelter.

And the only reason there could be a problem with that is if you believe that an “indirect command buffer” is a thing that exists, which is separate from (presumably) a “shader storage buffer” and a “buffer of query results”. So again, my main point stands: it’s all just memory.

So long as you synchronize things properly, memory is memory.

You are right, I am still thinking in GL target names. When I say “indirect command buffer” I mean a buffer containing structs of


typedef struct VkDrawIndexedIndirectCommand {
    uint32_t    indexCount;
    uint32_t    instanceCount;
    uint32_t    firstIndex;
    int32_t     vertexOffset;
    uint32_t    firstInstance;
} VkDrawIndexedIndirectCommand;

So, in light of your statement that I can use any buffer provided it is set up properly, my actual question is: can I get a buffer of query results bound for reading by a compute shader, and more importantly, should I?

Can you? Yes, with vkCmdCopyQueryPoolResults, which will copy the results to an arbitrary buffer. You will need to synchronize that write with the subsequent compute shader using a transfer->compute barrier.

Should you? That depends on how slow it will be. This type of computation is something that I would put into a separate queue and only run every so often, setting things up so that there are no false negatives.
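
One possible ingredient of the "no false negatives" setup, sketched in plain C rather than as a real compute shader: a draw is only culled when a result is actually available and says zero, and anything without a result stays visible. This assumes the results were copied with vkCmdCopyQueryPoolResults using VK_QUERY_RESULT_WITH_AVAILABILITY_BIT and 32-bit results, so each query occupies two 32-bit words (the result, then a non-zero word once it is available). `QuerySlot`, `conservative_instance_count` and `update_instance_counts` are hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed per-query layout when 32-bit results are copied together with
   the availability word. */
typedef struct QuerySlot {
    uint32_t samples_passed; /* occlusion result                        */
    uint32_t available;      /* non-zero once the result has been written */
} QuerySlot;

/* Only cull when we know the result and it is zero. A missing result is
   treated as "visible", so the draw is never dropped on a guess. */
uint32_t conservative_instance_count(QuerySlot slot,
                                     uint32_t full_instance_count)
{
    if (slot.available == 0u) {
        return full_instance_count;
    }
    return (slot.samples_passed != 0u) ? full_instance_count : 0u;
}

/* Stand-in for the compute dispatch: one iteration per draw.
   Returns how many draws were culled. */
uint32_t update_instance_counts(const QuerySlot *slots,
                                const uint32_t *full_counts,
                                uint32_t *instance_counts,
                                uint32_t draw_count)
{
    assert(slots != NULL && full_counts != NULL && instance_counts != NULL);
    uint32_t culled = 0u;
    for (uint32_t i = 0u; i < draw_count; ++i) {
        instance_counts[i] =
            conservative_instance_count(slots[i], full_counts[i]);
        if (instance_counts[i] == 0u && full_counts[i] != 0u) {
            ++culled;
        }
    }
    return culled;
}
```

This only covers missing results. A result from an earlier frame can still be out of date for the current one, so how often the queries are refreshed and what geometry they are issued against remain separate design decisions.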