SYCL and Vulkan Compute Shaders

Clothoid · December 29, 2019, 10:04pm

Now that SYCL is becoming more widely supported, I took some time and ported our project, which is currently based on CUDA, to SYCL/oneAPI. I wanted to use pointers on the device side, so I used the USM (unified shared memory) extension from oneAPI instead of plain SYCL.

I have to admit that SYCL is a great step forward for the whole industry, since there is finally an open standard that allows heterogeneous programming in a clean way, based on modern C++. I could never really understand why Khronos focused on C instead of C++ for so long, which was in my opinion one of the main reasons why CUDA has become so strong in the industry.

What immediately came to my mind when working with SYCL was: this is exactly how I would also like to write my compute shaders in Vulkan. Since modern GPUs are able to run SYCL/CUDA/ROCm, which are based on C++, there doesn’t seem to be a hardware reason why this should not be possible. So the only reason seems to be that Khronos has decided not to allow this right now, and to stay with C-style languages for Vulkan shaders.
In my opinion it is a “no-brainer” to allow SYCL-like C++ support in Vulkan compute shaders.
From what I have understood, the following two points are missing:

1. Vulkan would need an extension that supports the compute SPIR-V model that SYCL/OpenCL uses internally.
2. Vulkan would then have to be able to run SPIR-V produced by a C++ frontend.

It seems that clspv goes somewhat in that direction (but it seems to focus on C only). But an open source project alone does not seem reliable enough to build a large code base that would depend on such a technology in the long term. In my opinion Khronos should enhance Vulkan so that C++ compute shaders, similar to how it is done with SYCL, can be used.
This would be the strongest incentive I can currently imagine for us to use Vulkan for large projects that combine compute and graphics.

A related question is whether SYCL should be enhanced in some way to work with Vulkan, so that resources like buffers and textures could be shared without overhead in the kind of projects that combine compute and graphics and that want to benefit from all the great advancements going on in the C++ language.

rask · February 13, 2020, 4:32am

I second support for this direction

MathiasMagnus · March 12, 2020, 11:48am

Hi Clothoid,

thanks for sharing your thoughts, and apologies for the belated reply (I needed to recover my account, plus there were internal WG discussions around the topic). Although I am a SYCL working group member, my answer is purely personal and is not a SYCL working group statement. What you describe would certainly benefit Vulkan compute and the Khronos ecosystem in general. Before I reflect on your proposal, keep in mind that I personally fully agree with you (and would take the idea even further, allowing graphics shaders to be formulated in a SYCL-like fashion), but there are a few flaming hoops along the way.

On a technical level, what you describe is doable. On an organizational level it requires close coordination between the graphics and compute departments within companies, which are usually different entities (with their own, sometimes conflicting, agendas). SPIR-V for graphics and SPIR-V for compute share the IR representation but have different memory models. Different optimizers are required for the two (which is a significant, non-zero part of the compiler and of the resource investment). The current state of SPIR/SPIR-V support in OpenCL, compared to OpenCL C online compilation, is an indicator of vendors’ stance on having compilers for compute IRs and optimizing them to their respective binaries. (Let’s just say adoption is not universal.)

To your related question (which misses a question mark), SYCL currently does not define a direct way to interop with any graphics layer. A viable (and actually working) way is SYCL-OpenCL-OpenGL interop. (I’ve tried this and love that it’s possible; it’s awesome for scientific visualization, but I would really like to avoid manipulating the OpenGL state machine and have both the host and the shader parts look like SYCL.) SYCL-Next will make OpenCL optional and aims to allow a cleaner mechanism for interoperating with other APIs, trivially the one running under it (if there is one). That way you can interop with Vulkan through the same SYCL-X-Vulkan mechanism, given that X supports Vulkan interop. If a SYCL implementation were to run atop Vulkan, you wouldn’t need to go further.

The value proposition of having SPIR-V for compute run on a Vulkan runtime would need to be very persuasive to make it happen. (Once again, because it’s a non-zero investment, but much of that could be reused elsewhere, say OpenCL. This extension in itself would be something that benefits the value proposition: enabling other parts of the Khronos ecosystem to achieve more.)

Regards,
Máté

Clothoid · March 14, 2020, 4:55pm

Hi Máté,

Thank you for taking the time to reply in detail to my proposal!

In short, what I would like to have is a way to write my code in pure modern C++, while being able to access GPU texture hardware and to display results on a device without a round trip (readback from GPU to CPU, then sending the data back to the GPU graphics API).
CUDA already allows me to do this in a very nice way, but at the cost of vendor lock-in.

It seems that there are two paths for Khronos to make this possible with an open standard:

1. Enhance Vulkan compute (or, even better, any shader) to allow C++ shaders.
2. Enhance SYCL to have much better texture support and, additionally, interop with a graphics layer, e.g. Vulkan.
   Here it seems to me that SYCL-Next is going in the right direction, from what you have described!

If both options were available in a perfect world, I think I would most likely prefer approach 2.
Also, for us it is not always necessary to do the compute part on a GPU; in various cases a modern CPU with vector extensions like AVX-512 or SVE can handle the compute fast enough.

However, CUDA right now has a clear advantage over the current SYCL standard!
It has excellent texture support, with bindless textures that do not require image accessors, and CUDA also provides a lot of texture functionality that modern GPUs support but that is missing in SYCL.
It also allows interop with many graphics APIs.

What I noticed when looking into the oneAPI beta was that it fixes a lot of the issues that, in my opinion, plain SYCL currently suffers from.
Most important is unified shared memory, which avoids buffer accessors and allows plain pointers, similar to what CUDA offers.
However, even with oneAPI I was missing improvements that close the gap to CUDA in texture and interop functionality.

Also great to hear that SYCL-Next plans to make OpenCL optional. This is what I was hoping for in a future SYCL release, since OpenCL feels outdated today, IMHO!

regards,
Robert
