Vulkan feedback (my experiences with the API and ecosystem)

Sorry if this is long, but I want to provide feedback on my experience with the new API and the ecosystem around it from an outside (not part of the ARB / Khronos) perspective.

A short background: I’ve been using OpenGL in some capacity for around a decade. I’ve worked on OpenGL software projects, and I’ve been involved with driver development and bug fixing. “Playing” with OpenGL has been a hobby of mine for some time. I’m no guru, but I’m somewhat competent.

The Vulkan API itself is not easy to dive into. That’s expected, and I don’t consider that a problem. I spent a week or so of my off-work time writing a “hello, world” triangle program from scratch so that I could walk through the API and see how it feels. I have since moved from that to a very simple tessellated surface renderer (a few dozen quads with on-GPU Perlin noise perturbation of the evaluated vertices). I don’t mind the complexity of the API - it makes me think more about what I’m doing and why, and (I hope) it provides some insight into how to feed the hardware optimally.

The ecosystem strikes me as pretty raw - meaning: not refined.

The Day 1 availability - specs, drivers (NVIDIA, in my case), loader/SDK - worked fairly well (other than the spec PDF missing a chunk of the WSI extensions due to a typo - rectified within a few days - and my unfamiliarity with how to find the missing text in the GitHub repo). Since then, it’s become a little less ideal.

The new drivers that released this week don’t work correctly for me - meaning: the tessellated surface that rendered okay for me with NV’s 356.39 Vulkan Beta drivers does something weird with the 364.xx drivers (the tessellated surface that was the “ground” is now rendered vertically, like the Y/Z axes changed - but the validation layers don’t report anything differently, and changing back to the older drivers without changing my code fixes the problem).

Rebuilding the Vulkan loader was painful, and I’m not convinced it was correctly compiled, since I can’t get the lxml python library to install (I get error messages from pip about how the library’s “wheel” is not supported, or something - I’m not a Python guy). As far as I can tell, it completed compiling, but MSVC threw a lot of warnings on the command line during the execution of the ‘update external sources’ batch file - it looks like it was building the library. Recompiling with MSVS 2013 didn’t generate those warnings, so I don’t know if I did it right. However, using the current 1.0.5 version of the API headers, I can’t create a VkInstance with the NV drivers due to a version error. If I replace the 1.0.5 vulkan.h with 1.0.4 (rebuilding the loader, of course), the drivers play along, but I still get the screwy problem with everything being rendered 90 degrees off.

So, with NV 356.39, I can create a Vulkan 1.0.3 instance, but neither 1.0.4 nor 1.0.5. With NV 364.xx, I can create Vk 1.0.3 or 1.0.4, but not 1.0.5. Maybe I misunderstand the spec where it says of the patch version, “Differences in this version number should not affect either full compatibility or backwards compatibility between two versions, or add additional interfaces to the API”?
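To make the patch-number question concrete, here is a minimal sketch of how the version value is packed (major in the top 10 bits, minor in the next 10, patch in the low 12), with a hypothetical strip_patch helper that zeroes the patch field before the value is used as a requested API version. The macros mirror the VK_MAKE_VERSION / VK_VERSION_* macros from vulkan.h but are redefined locally so the sketch compiles without the SDK. Whether a given driver accepts the patch-free value is an assumption, not something tested here.

```c
#include <stdint.h>

/* Local copies of the bit layout used by vulkan.h's VK_MAKE_VERSION and
   VK_VERSION_MAJOR/MINOR/PATCH, redefined so no SDK header is required. */
#define MAKE_VERSION(major, minor, patch) \
    ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | ((uint32_t)(patch)))
#define VERSION_MAJOR(v) ((uint32_t)(v) >> 22)
#define VERSION_MINOR(v) (((uint32_t)(v) >> 12) & 0x3ffu)
#define VERSION_PATCH(v) ((uint32_t)(v) & 0xfffu)

/* Hypothetical helper (not part of the Vulkan API): clear the 12-bit patch
   field so that only major.minor remains in the requested version. */
static uint32_t strip_patch(uint32_t version)
{
    return version & ~(uint32_t)0xfffu;
}
```

With this, strip_patch(MAKE_VERSION(1, 0, 5)) and strip_patch(MAKE_VERSION(1, 0, 3)) both yield the same value as MAKE_VERSION(1, 0, 0), which is the sort of value one could try in the apiVersion field instead of the header's full version.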

Having to regenerate the specs document myself is another pain point (more tools installed to do this). Since GitHub has a reasonable release versioning process, it seems like Khronos could take the extra time to generate the new PDF file and make it part of the GitHub release for “official” releases where a version number gets bumped, instead of forcing end-users to download various tools and generate the docs. Maybe, since I’m approaching this as an independent developer, and not part of a large team, I’m not a targeted end-user for the API (assuming the larger teams have the toolchain support to generate the docs themselves). I’d hope that’s not the intent, however.

Maybe these are all expected, part of the teething problem of a new API being thrown into the world and subjected to myriad end-users. I can accept that explanation, but I want to make sure that the folks on the ARB are aware that some people outside the ARB are finding some pain points adapting to the new API because of the ecosystem surrounding the API, with the hope that some of these points can be addressed to keep the API approachable.

It seems to me that most of your problems are with NVIDIA’s beta drivers. So named because they’re beta.

Since GitHub has a reasonable release versioning process, it seems like Khronos could take the extra time to generate the new PDF file and make it part of the GitHub release for “official” releases where a version number gets bumped, instead of forcing end-users to download various tools and generate the docs.

Are you saying that the PDF hasn’t been updated with changes to the specification?

Within the context of the rendering issue, perhaps. The difficulty I am having there is that the beta driver (356.39) works as I expect it to. The 364.xx drivers (which are regular release drivers whose release notes indicate “Support for Vulkan API”) behave differently - in particular, as I noted unclearly, it’s like an extra rotation around the camera’s Z axis is applied before everything else, despite no code changes. I’ve since checked the demo cube program, and it works, so there’s got to be something I did incorrectly that the beta drivers let me get away with doing.

The PDF was updated for 1.0.3 core + WSI during the initial week, with the missing chapter / partial chapter restored, on the Khronos website. I don’t know if it’s been updated since then (there isn’t a full version number available in the PDF file due to a problem documented on the API documentation GitHub repository, the version isn’t part of the published file name, and I don’t recall seeing any indication last time I looked that the PDF on the Khronos site was updated). What I was attempting to say is that it would be nice to include the PDF spec, already built, with the GitHub release snapshots. Currently, the releases are snapshots of source used to generate the documents. What I propose is attaching the PDF to these snapshots as a separate download, instead of requiring the end-user to go through the hoops of setting up tools and generating the documents themselves.

I am/was kind of in the same situation as you, so… some Q&A (in the order the points come up):

  1. The ecosystem is raw. I think the release target was pretty much the first (reasonably) working drivers, loader, and set of layers. Paradoxically, that’s more than you get with each new version of OGL. So things are all right and will only get better.

  2. Well, beta drivers are beta. My AMD driver does ugly things too (e.g. crashing when it gets a null pApplicationInfo, which should be allowed).

  3. At the moment the HTML version of the spec is better (it is regularly updated, as stated in the document header, and has nice, working hyperlinks). If you need an offline version, “save page” in a decent browser simply works.

  4. I managed to successfully build both the spec and the loader on Windows without too many hurdles. So I can give you exact instructions, if interested.
    Hurdles: The spec (and extensions) is best built in Cygwin (!!x64 version!! with the specified dependencies and versions installed). The loader must be built in the VS Developer Command Prompt and not in the regular cmd.exe. Also, the repository must be git cloned (simply downloading the zip from GitHub didn’t work for me).

  5. There are pretty much two or three kinds of versions. There is the vulkan.h version (which I think is not mentioned in the spec) and the loader version. And there is the driver version. My understanding is that it is the developer’s responsibility to match the header and loader versions (if they are using the supplied header and loader). The 1.0.X versions should be bug fixes, so they should be more or less compatible. Of course, there could be errors in the header or in the driver, or the driver could expect an error in the header and comply with it. My current strategy is to use the header with the highest X and whatever I get from the driver, as long as the major and minor versions match.
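
The strategy in point 5 can be sketched roughly as follows. This is only an illustration under stated assumptions: versions_compatible is a hypothetical helper, the macros are local copies of the bit layout that vulkan.h uses for packed versions, and in a real program the driver value would presumably come from something like the apiVersion field of VkPhysicalDeviceProperties.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the version bit layout from vulkan.h (major: top 10 bits,
   minor: next 10 bits, patch: low 12 bits), so no SDK header is required. */
#define MAKE_VERSION(major, minor, patch) \
    ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | ((uint32_t)(patch)))
#define VERSION_MAJOR(v) ((uint32_t)(v) >> 22)
#define VERSION_MINOR(v) (((uint32_t)(v) >> 12) & 0x3ffu)

/* Hypothetical helper: treat header and driver as compatible when their
   major and minor numbers are equal; the patch number is ignored. */
static bool versions_compatible(uint32_t header_version, uint32_t driver_version)
{
    return VERSION_MAJOR(header_version) == VERSION_MAJOR(driver_version) &&
           VERSION_MINOR(header_version) == VERSION_MINOR(driver_version);
}
```

Under this check a 1.0.5 header and a 1.0.3 driver count as compatible, while a 1.1.0 header and a 1.0.3 driver do not.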