How we program parallel processors is not a settled question, but a global research project. CUDA/OpenCL was a good direction, but we’re certainly not there yet! Efforts such as HSA and OpenCL SPIR are exploring areas closer to the hardware than OpenCL. Even around OpenGL/Direct3D a lot is going on, if you look at the discussions around Mantle. And where do we put OpenGL compute?
It is no longer clear what can be defined in terms of what (it is not just about the level of abstraction anymore). LLVM has made a lot possible, but can everything be defined in LLVM IR? Or must some parts be implemented in something else: the compute part of OpenGL in OpenCL, say, and the rest in LLVM IR only? Can CUDA be implemented in OpenCL SPIR? How low-level is Mantle? Can it implement both compute and graphics APIs, and does it sit above or below Nvidia PTX/AMD IL? As the topic is quite extensive (my list is certainly not complete), I’d like to hear what you have found out.
The result of this discussion will be an image with a lot of arrows, which I’ll share on the StreamComputing blog with credits to you. The reason is that I see too many comparisons between apples and pears (not sure if that’s a Dutch saying only :)).
As far as I can tell, the situation is as follows:
OpenGL and Direct3D have a minor compute part, and their shader compilers are quite different because they have to deal with derivatives etc. I’m pretty sure that all OpenGL/Direct3D drivers compile straight down to the hardware ISA, so there isn’t much “implement one in terms of the others” here. What I do believe, though, is that Mantle sits more or less one level below OpenGL/Direct3D, so in theory it might be feasible to implement them on Mantle. In practice, though, Direct3D is also tied to the operating system itself, so there’s not much to be gained here. All of these APIs also contain a huge amount of code to program the fixed-function units. My personal theory is that the IHVs have some common implementation layer on which they build their OpenGL/Direct3D APIs, which could be similar to Mantle, but I haven’t seen any proof of this yet (in fact, it seems that both AMD and NVIDIA started their OpenGL and Direct3D drivers separately aeons ago and never merged them, and may never do so because of specific optimizations).
On the compute side, it’s easier. LLVM IR alone is not enough, as it doesn’t contain anything to access texture units etc. OpenCL and CUDA are at the same level, and both have their own IL (SPIR and PTX respectively), which exposes more of the graphics hardware. All compute APIs could be implemented using SPIR/PTX. That said, SPIR/PTX is still not suited for “normal” graphics shaders as far as I can tell, because it lacks graphics-specific stuff like interpolator setup, derivatives, pixel killing etc. I don’t see a reason why CUDA couldn’t compile down to SPIR (as far as I know, both OpenCL and CUDA compile to PTX on NVIDIA, and PTX isn’t much different from SPIR in terms of expressiveness).
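To make that argument concrete, here is a rough sketch that treats each IL as a set of capabilities and checks whether it covers what a workload needs. The feature labels and the two workload profiles are my own shorthand for the points above, not terms from any specification, and the sets are deliberately incomplete.

```python
# Capability sets as described in this post. The labels are illustrative
# shorthand (assumptions), not names taken from the SPIR, PTX or LLVM specs.
IL_FEATURES = {
    "LLVM IR": {"general_compute"},
    "SPIR": {"general_compute", "texture_access"},
    "PTX": {"general_compute", "texture_access"},
}

# What each kind of workload needs, again following the post's reasoning.
WORKLOAD_NEEDS = {
    "compute kernel": {"general_compute", "texture_access"},
    "graphics shader": {
        "general_compute",
        "texture_access",
        "interpolator_setup",
        "derivatives",
        "pixel_kill",
    },
}


def missing_features(il, workload):
    """Return the features the workload needs that the IL does not offer."""
    return WORKLOAD_NEEDS[workload] - IL_FEATURES[il]


def can_host(il, workload):
    """True if the IL covers everything the workload needs."""
    return not missing_features(il, workload)


if __name__ == "__main__":
    for il in IL_FEATURES:
        for workload in WORKLOAD_NEEDS:
            gap = sorted(missing_features(il, workload))
            status = "ok" if not gap else "missing " + ", ".join(gap)
            print(f"{il:8s} / {workload:15s}: {status}")
```

Under these assumptions SPIR and PTX can host compute kernels but not graphics shaders, and plain LLVM IR falls short for both because of the texture access gap.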
HSA is clearly one level below OpenCL; likewise, BRIG is one level below SPIR. It might be feasible to implement OpenCL, OpenGL compute shaders, and Direct3D compute shaders on HSA, and potentially CUDA as well. As far as I can tell from the publicly available documentation, everything in HSA is exposed through OpenCL 2.0 as well, so they seem like a perfect fit. At this point though, you should also consider whether it wouldn’t be easier to translate PTX/NVVM to SPIR/BRIG, instead of having to deal with CUDA itself …
Simplistically, I would say that the ordering is somewhat like this:
OpenGL/DirectX -> Mantle -
OpenCL --------------------> SPIR -> LLVM IR -> HSA
That is, you should be able to encapsulate OpenGL/DirectX in Mantle, Mantle should be expressible in LLVM IR, and from there it maps onto HSA and beyond.
In general, the further right you go in the list, the more features you have (HSA, for instance, gives you a crazy amount of control that you don’t get in OpenCL).
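For the "image with a lot of arrows", the ordering above can be written down as a small directed graph and queried. This is only a sketch of my reading of the diagram: the edge from Mantle to LLVM IR comes from the explanation rather than the drawing itself, and the names `LOWERS_TO`, `reachable` and `can_be_built_on` are made up for this example.

```python
from collections import deque

# Each edge reads "can be expressed in terms of". This mirrors the simplistic
# ordering above and is an assumption, not a statement about real drivers.
LOWERS_TO = {
    "OpenGL/DirectX": ["Mantle"],
    "Mantle": ["LLVM IR"],
    "OpenCL": ["SPIR"],
    "SPIR": ["LLVM IR"],
    "LLVM IR": ["HSA"],
    "HSA": [],
}


def reachable(start):
    """Return every layer that `start` could be lowered to, directly or not."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in LOWERS_TO.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def can_be_built_on(upper, lower):
    """True if `upper` can be implemented on `lower` according to the graph."""
    return lower in reachable(upper)


if __name__ == "__main__":
    print(can_be_built_on("OpenGL/DirectX", "HSA"))
    print(can_be_built_on("OpenCL", "Mantle"))
```

Adding or removing an edge (for example CUDA to PTX, or OpenCL to PTX on NVIDIA) is a one-line change, which makes it easy to test competing views of the layering against each other.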