Hey guys!
Sooo, a new day, a new problem. :neutral:
I’ve been porting an OpenCL kernel to Vulkan. The kernel ray-traces a huge data set (between 1 and 2 GB) and renders the result to the screen. Nothing else.
This is how I set up the Vulkan port:
[ul]
[li]Render 2 triangles as a fullscreen quad[/li]
[li]Do all the ray-tracing in the fragment shader (a very large fragment shader, >5000 instructions in the compiled version). The shader is pretty much a 1-to-1 port of the working OpenCL kernel[/li]
[li]Use a large storage buffer to pass all the data to the fragment shader[/li][/ul]
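To make the first point concrete, here is a minimal sketch of the kind of quad I mean: two triangles that together cover the whole viewport in normalized device coordinates. The names `Vec2`, `kFullscreenQuad` and `twiceSignedArea` are made up for this post and are not taken from my actual code; the vertex order is just one possible choice.

```cpp
#include <cstddef>

// Hypothetical 2D position type, only for illustration.
struct Vec2 {
    float x;
    float y;
};

// Two triangles covering the full [-1, 1] x [-1, 1] NDC square.
// Both triangles use the same winding, so one culling setting treats them alike.
constexpr Vec2 kFullscreenQuad[6] = {
    {-1.0f, -1.0f}, { 1.0f, -1.0f}, { 1.0f,  1.0f},  // first triangle
    {-1.0f, -1.0f}, { 1.0f,  1.0f}, {-1.0f,  1.0f},  // second triangle
};

constexpr std::size_t kFullscreenQuadVertexCount =
    sizeof(kFullscreenQuad) / sizeof(kFullscreenQuad[0]);

// Twice the signed area of triangle (a, b, c); the sign encodes the winding.
constexpr float twiceSignedArea(Vec2 a, Vec2 b, Vec2 c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}
```

Each triangle covers half of the 2 x 2 square, so every pixel on screen gets exactly one fragment shader invocation, and that invocation does the whole ray-trace for its pixel.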
Functionally, the Vulkan version works, but it is far slower.
The OpenCL version runs at 20 FPS, whereas the Vulkan version runs at <1 FPS, so it is more than 20 times slower.
I have a few rough ideas I would like to try, but I don’t know how to go about them:
[ol]
[li]Can I make the storage buffer read-only somehow? I have a feeling that the slowdown might come from the GPU inserting barriers in case of a write. But I only need to read anyway.[/li]
[li]Can I manually control how the fragment shader work is grouped and how many units work on it? I know this can be done with compute shaders and in OpenCL, so why not here? I tried splitting the rendering area into more triangles, but that seemed to slow things down even further… I might be doing it wrong, though.[/li]
[li]I get a VK_ERROR_INVALID_SHADER_NV error from vkCreateGraphicsPipelines when my shader code has too many branches (at least, that is what my testing showed). Is there any way around this? The same code works fine in OpenCL.[/li][/ol]
My last resort would be to use a compute shader instead of the fragment shader, but I would like to avoid that unless it is absolutely necessary. (Plus, I don’t even know whether the same problems would show up there as well.)
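For reference, this is the kind of explicit grouping I mean in point 2. With a compute shader (or in OpenCL) I pick the work-group size myself and the host computes how many groups are needed to cover the image. A minimal sketch, assuming a hypothetical 8 x 8 local size and a 1920 x 1080 target; `GroupCount`, `ceilDiv` and `groupsFor` are names I made up for this post:

```cpp
#include <cstdint>

// Hypothetical result type: number of work groups along each image axis.
struct GroupCount {
    std::uint32_t x;
    std::uint32_t y;
};

// Round-up integer division: how many groups of size `local` (must be > 0)
// are needed to cover `total` items.
constexpr std::uint32_t ceilDiv(std::uint32_t total, std::uint32_t local) {
    return (total + local - 1u) / local;
}

// Work-group counts for a width x height image and a localX x localY group size.
constexpr GroupCount groupsFor(std::uint32_t width, std::uint32_t height,
                               std::uint32_t localX, std::uint32_t localY) {
    return GroupCount{ceilDiv(width, localX), ceilDiv(height, localY)};
}
```

With those assumed numbers, `groupsFor(1920, 1080, 8, 8)` gives 240 x 135 groups, which would be the X and Y group counts passed to `vkCmdDispatch`. When the image size is not a multiple of the local size, the count rounds up, so the shader would need a bounds check for the extra invocations. As far as I can tell there is no equivalent knob for fragment shaders, which is why I am asking.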
Thanks again!