I'm looking for a fast way to check a value that lives on the GPU.
Since CPU-GPU round trips are expensive, I was wondering what the fastest way is for the CPU-side application to check a value that is computed within a shader.
I have a shader that computes a value. I would like the next cycle of the same shader to run only if that value is above a certain threshold. But the check (whether to run or not) happens on the CPU side, i.e. in the application that runs the shader program.
Any thoughts on what’s the fastest way to do this?
“Hang 'em all and let God sort 'em out”.
In this case “hang 'em all” = “run it anyway”, and “God” = “the GPU”. Thinking about “how will I do this on the CPU?” is the wrong kind of thinking; the value is on the GPU so keep it there and let the GPU do the work instead.
So, you can set up an alpha blend, an alpha test or a stencil test (or even a logic op; which is best depends on the data type and range of your value, to be honest) so that only fragments above your threshold are accepted, then just run it normally. In a core profile, where the fixed-function alpha test no longer exists, a discard in the fragment shader does the same job.
Yes, it’s more operations (including some state changes) but my money is still on it being cheaper than any kind of readback.
It’s worth noting that this is exactly the kind of thinking that’s behind most shadow implementations, where areas that are in shadow are skipped based on a condition that’s evaluated on the GPU, but are otherwise drawn normally.
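To make the "run it anyway" idea concrete, here is a minimal CPU-side model in C of what a GL_GREATER alpha test does to a pass. It is only a sketch: the Fragment struct and run_pass_with_alpha_test are hypothetical names made up for illustration, and the assumption is that your computed value ends up in the alpha channel. On real hardware the loop body is the GPU's job, set up with something like glEnable(GL_ALPHA_TEST) and glAlphaFunc(GL_GREATER, ref) in legacy GL, or a discard in the fragment shader.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fragment: the computed value is assumed to sit in alpha. */
typedef struct { float r, g, b, a; } Fragment;

/* Models a GL_GREATER alpha test on the CPU: a fragment whose alpha is not
 * above `ref` is dropped and leaves the destination untouched.
 * Returns the number of fragments that were written. */
size_t run_pass_with_alpha_test(const Fragment *src, Fragment *dst,
                                size_t n, float ref)
{
    assert(src != NULL && dst != NULL);
    size_t written = 0;
    for (size_t i = 0; i < n; ++i) {
        if (src[i].a > ref) {   /* passes the test: result is stored */
            dst[i] = src[i];
            ++written;
        }                       /* fails the test: dst[i] keeps its old value */
    }
    return written;
}
```

The point is that the pass is always issued; the threshold only decides which results land in the target, so the CPU never has to look at the value.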
Alternatively, it may be possible to (ab)use transform feedback with glDrawTransformFeedback, occlusion queries with glBeginConditionalRender, or glDrawElementsIndirect (or another *Indirect() draw call) for this. As far as I understand it, these are all designed to give some control over what gets rendered based on values stored on the GPU.
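For the indirect-draw route, a rough sketch of the idea: the draw parameters live in a buffer on the GPU, so a shader can zero out the instance count when the value is below the threshold, and the draw becomes a no-op without the CPU ever reading anything back. The C below only models that write on the CPU. The struct layout follows the one glDrawArraysIndirect reads from the buffer bound to GL_DRAW_INDIRECT_BUFFER; write_next_pass_command is a hypothetical helper name, and in a real program the equivalent write would be done by a shader storing into that buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout glDrawArraysIndirect reads from the GL_DRAW_INDIRECT_BUFFER
 * binding: four tightly packed 32-bit unsigned integers. */
typedef struct {
    uint32_t count;         /* vertices per instance */
    uint32_t instanceCount; /* 0 means nothing is drawn */
    uint32_t first;         /* index of the first vertex */
    uint32_t baseInstance;  /* base instance; reserved (0) before GL 4.2 */
} DrawArraysIndirectCommand;

/* Hypothetical helper: CPU model of the write a shader would perform at the
 * end of a pass to decide whether the next pass does any work. */
void write_next_pass_command(DrawArraysIndirectCommand *cmd,
                             float value, float threshold,
                             uint32_t vertex_count)
{
    assert(cmd != NULL);
    cmd->count = vertex_count;
    /* Above the threshold: draw one instance. Otherwise: draw nothing. */
    cmd->instanceCount = (value > threshold) ? 1u : 0u;
    cmd->first = 0u;
    cmd->baseInstance = 0u;
}
```

The application then issues glDrawArraysIndirect(GL_TRIANGLES, 0) for every cycle regardless; whether anything actually runs is decided by what is in the buffer.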