Feedback: New SPIR-V common intermediate language used by both OpenCL 2.1 and Vulkan

Alfonse_Reinheart · March 16, 2015, 3:24pm

I think you missed my point. As shown when you contradicted yourself:

me: What if they have a function that uses a pointer already?

you: That function should be changed to return a struct and would then get inlined

me: What if they just do their job and write their translator to handle it?

you: w00t one guy did his job correctly, now for all the other implementations

That could just as easily be aimed at the previous case: everyone whose back-end works a certain way must change it. Thus putting “pressure on the optimizers”, rather than anyone else.

My overall point is that the current state of things only “puts pressure on the optimizers” for systems whose back-ends work a certain way. If the back-end happens to work SPIR-V’s way, then there’s no pressure at all. Therefore, your suggestion is only reasonable if you have knowledge that back-ends are more likely to work your way than SPIR-V’s way.

So please present said evidence.

Equally importantly, this “pressure on the optimizers” must already exist. Why? Because any decently implemented back-end must accept the very real possibility that the user will want to pass return values back via pointers rather than through returning structs. Maybe it’s an in/out value. Or whatever. It’s the user’s prerogative, and therefore, it’s the optimizer’s job to optimize that.

And if it can handle it in the function case, there’s no reason why it can’t in the opcode case. Since this work must already be done… there’s no point in changing this here.

ratchet_freak · March 16, 2015, 3:49pm

The only reason modf even uses the out parameter is that the C function it copies does. That function was created in ye olden days, when memory was faster than a flop and caches were a pipe dream; back then, pushing that result to memory for later retrieval still made sense.

mbentrup · March 17, 2015, 12:17am

IMHO the big issue with returning a struct is that the operators are overloaded, but SPIR-V does not provide overloaded structs. So the spec would have to be something like modf returns a TwoDoubleStruct if the argument is of type double or a TwoFloatStruct if the argument is of type float, etc.

I think the logical way to get rid of that pointer is to allow operators to define multiple result ids.

OTOH I’m pretty sure that the presence of a “pointer” in the IR doesn’t mean that the value will be stored in external memory, the GPU usually has a large register set and unlike x86, it can be accessed indirectly (so you could pass a reference to a register on the GPU).

ratchet_freak · March 17, 2015, 2:07am

[QUOTE=mbentrup;31186]IMHO the big issue with returning a struct is that the operators are overloaded, but SPIR-V does not provide overloaded structs. So the spec would have to be something like modf returns a TwoDoubleStruct if the argument is of type double or a TwoFloatStruct if the argument is of type float, etc.

I think the logical way to get rid of that pointer is to allow operators to define multiple result ids.

OTOH I’m pretty sure that the presence of a “pointer” in the IR doesn’t mean that the value will be stored in external memory, the GPU usually has a large register set and unlike x86, it can be accessed indirectly (so you could pass a reference to a register on the GPU).[/QUOTE]

Or have the spec contain something like

“ResultType must be a structure type with 2 members, both members must be the same type as x. Member 0 contains the fractional part, member 1 the integral part.”

mbentrup · March 17, 2015, 5:24am

Yes, but why add all this complexity? The convention SPIR-V uses for single return parameters is to define a return id in the operator, so returning multiple values should naturally be encoded by multiple return ids.

ratchet_freak · March 17, 2015, 6:05am

I don’t see them adding the option to have multiple result IDs, especially for only some extension opcodes.

Remember that the opcodes we are talking about are in the extensions and use OpExtInst to execute, which only provides a single return ID and its return type.

That return type can then be a structure type to indicate a multivalued return, and OpCompositeExtract can be used to pull out the values.
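To make the idea concrete, here is a rough Python sketch (not anything taken from the spec). `ext_modf` and `composite_extract` are hypothetical stand-ins for a modf extended instruction invoked through OpExtInst and for OpCompositeExtract; the member order follows the wording suggested earlier in the thread (member 0 fractional, member 1 integral).

```python
import math
from typing import NamedTuple


class ModfResult(NamedTuple):
    """Hypothetical two-member result; both members share the operand's type."""
    fract: float  # member 0: fractional part
    whole: float  # member 1: integral part


def ext_modf(x: float) -> ModfResult:
    """Stand-in for a modf extended instruction with a single result ID."""
    fract, whole = math.modf(x)  # math.modf returns (fractional, integral)
    return ModfResult(fract, whole)


def composite_extract(composite: tuple, index: int):
    """Stand-in for OpCompositeExtract: pull one member out by literal index."""
    return composite[index]


result = ext_modf(3.25)
fract = composite_extract(result, 0)
whole = composite_extract(result, 1)
print(fract, whole)  # 0.25 3.0
```

No pointer appears anywhere: the single result carries both values, and each consumer extracts the member it needs.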

jimteeuwen · March 21, 2015, 9:04am

This feedback forum is not immediately obvious from the SPIR-V registry on the Khronos website. Instead it links to the Khronos bug tracker for feedback. As such, a few bug reports have been made there with suggestions and observations.

Could someone please check those out?

Thanks in advance.

ratchet_freak · April 3, 2015, 1:37am

A new revision of the provisional spec has been posted, alongside header files for the constants.

Approach · April 3, 2015, 2:03pm

Hi, very much looking forward to future SPIR-V releases. A few comments and features I’d be interested in are listed below.

I’ve been very disappointed in OpenGL’s 2D support for years, and increasingly so on embedded devices. On all major OSes I can use the platform headers to get the front buffer directly, but you can’t really operate on even the back buffers directly with OpenGL/GLSL. Give us command support for this in the graphic operations. If people don’t know how to handle it, they don’t have to. It’s no problem for drivers. I don’t want to always set up a whole dang 3D shader pipeline just to spit out pixels to a texture and blit it to a draw anymore…and I’m not alone.

Similarly, GPU-side draw call. For photo and video editing I can’t even explain how useful this would be. Possibly allow us to set a duration for draw timing. Or maybe we could tell the driver: OK, I’ll get 4 frames pre-rendered into these buffers, just keep drawing on rotation unless this register I set up says DEVICE_DRAW_STATE_INVALID or some such, then check with the CPU callback. Various approaches possible, any would be lovely.

In the presentation there was some question about the usefulness of compatibility with LLVM’s representation. I think this is very useful indeed because many IDE tools are capable of taking advantage of LLVM’s AST already in fairly advanced ways. This could prove especially helpful in developing new tools with similar functionality targeting SPIR-V.
Even though there’s no goto / jump statements, I think it would be useful to be able to label individual lines or perhaps have comments. Really I think comments would be useful for adding further meta-data while avoiding the need to create our own file formats. Generated SPIR-V could be spiced with IDE cues without actually affecting the translated machine code. Perhaps you have a mechanism for this and I just glossed over it.
Developer-defined memory models. With the ability to control how memory is laid out and filled with data, or even being able to constrain a pre-existing model in certain ways, developers could essentially have free garbage collection in C and C++ with some work.
Even though none of them are all that hard to emulate, I agree that thread_local storage, exceptions and basic reflection would be super nice, plausible and worth it.

ratchet_freak · April 3, 2015, 2:46pm

[QUOTE=Approach;31341]Hi, very much looking forward to future SPIR-V releases. A few comments and features I’d be interested in are listed below.

I’ve been very disappointed in OpenGL’s 2D support for years, and increasingly so on embedded devices. On all major OSes I can use the platform headers to get the front buffer directly, but you can’t really operate on even the back buffers directly with OpenGL/GLSL. Give us command support for this in the graphic operations. If people don’t know how to handle it, they don’t have to. It’s no problem for drivers. I don’t want to always set up a whole dang 3D shader pipeline just to spit out pixels to a texture and blit it to a draw anymore…and I’m not alone.

Similarly, GPU-side draw call. For photo and video editing I can’t even explain how useful this would be. Possibly allow us to set a duration for draw timing. Or maybe we could tell the driver: OK, I’ll get 4 frames pre-rendered into these buffers, just keep drawing on rotation unless this register I set up says DEVICE_DRAW_STATE_INVALID or some such, then check with the CPU callback. Various approaches possible, any would be lovely.
[/QUOTE]

That belongs on the Vulkan side of things; with its GPU-side command buffers, a “wait until vBlank” should be possible. Also, 4 frames of latency is a lot for an action-packed game.

I’ve been told that SPIR-V to LLVM is very easy to implement.

That’s already possible with the OpLine opcode: it sets the line number and column number in a “file”, and that file could just be a list of the comments being referred to.

[QUOTE=Approach;31341]

Developer-defined memory models. With the ability to control how memory is laid out and filled with data, or even being able to constrain a pre-existing model in certain ways, developers could essentially have free garbage collection in C and C++ with some work.
Even though none of them are all that hard to emulate, I agree that thread_local storage, exceptions and basic reflection would be super nice, plausible and worth it.[/QUOTE]

Remember that SPIR-V has to run on today’s OpenGL ES 3.1 compatible GPUs and as such is constrained by the lowest common denominator; they don’t necessarily have room for the metadata needed to manage all that.

Alfonse_Reinheart · April 6, 2015, 12:08pm

I’d like to thank the SPIR-V folks for not only updating the SPIR-V preliminary specification, but also for keeping the bug database up-to-date. It’s always good to see a bug database where bugs are quickly seen, assigned, acted on, and closed, where appropriate.

ratchet_freak · April 29, 2015, 4:24pm

I believe I found a way to declare subroutines:

subroutine uniforms are imported function declarations

subroutines are exported functions of the same function type.

Then in the API, a SPIR-V module with an imported function is not fully linked, but you can specify which exported function it imports.

In OpenGL this can be transparent using the existing subroutine API. In Vulkan a rendering pipeline will need a fully linked shader; failing to explicitly map the imported functions to an exported function should cause a failure.

In fact a way to map the exported to imported functions even if the strings don’t match will be helpful in general.
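A minimal Python sketch of that linking step, purely for illustration: `link_subroutines`, the dictionary layout, and the function-type strings are all hypothetical and do not correspond to any real OpenGL or Vulkan call. Imports resolve by name unless an explicit mapping overrides it, and an unresolved or type-mismatched import is an error.

```python
def link_subroutines(imports, exports, mapping=None):
    """Resolve each imported function to an exported one.

    imports: dict of import name -> function type
    exports: dict of export name -> function type
    mapping: optional explicit import name -> export name overrides
    """
    mapping = mapping or {}
    resolved = {}
    for name, fn_type in imports.items():
        target = mapping.get(name, name)  # fall back to matching by name
        if target not in exports:
            raise LookupError(f"unresolved import {name!r}")
        if exports[target] != fn_type:
            raise TypeError(f"{name!r} and {target!r} differ in function type")
        resolved[name] = target
    return resolved


# Hypothetical module: one subroutine uniform, two candidate subroutines.
imports = {"shade": "vec4(vec3)"}
exports = {"shade_phong": "vec4(vec3)", "shade_flat": "vec4(vec3)"}
print(link_subroutines(imports, exports, {"shade": "shade_flat"}))
```

Without the explicit mapping, `shade` matches no export by name and the link fails, which is the failure case described for a Vulkan pipeline.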

Alfonse_Reinheart · April 29, 2015, 6:35pm

Shader subroutines ultimately represent (I think) some kind of specific hardware within the system. As such, while your suggested implementation would be functional, I don’t think it’s in the spirit of the construct.

It would also represent lots of recompiling of shaders. And I rather suspect that the point of subroutines is that it’s (much) cheaper than recompiling shaders.

ratchet_freak · April 29, 2015, 11:22pm

[QUOTE=Alfonse Reinheart;31485]Shader subroutines ultimately represent (I think) some kind of specific hardware within the system. As such, while your suggested implementation would be functional, I don’t think it’s in the spirit of the construct.

It would also represent lots of recompiling of shaders. And I rather suspect that the point of subroutines is that it’s (much) cheaper than recompiling shaders.[/QUOTE]

OpenGL drivers recompile at the drop of a hat anyway (some GPUs need a new program when you change blending state, some need a recompile when you change vertex layout, etc.). And drivers for GPUs that have subroutine capability should be able to use it anyway.

In Vulkan this recompilation will only happen during pipeline/command buffer setup, a known expensive operation where it also optimizes the command buffer. Why not let it take some time to inline the subroutine function calls?

jmh530 · July 10, 2015, 6:29pm

I recently started learning OpenCL. I thought I would make some effort at SPIR-V, but I find that a lot of information is more advanced than I can understand.

One thing in particular I was confused about was the choice to use a binary representation (for instructions, I think). My sense of what I’ve read is that this choice was that the binary could provide better protection (IP, etc). Beyond this reason, is there a significant advantage compared to the original SPIR? For instance, is there a performance improvement? Further, would there be any benefit to 64bit when that is supported?

I had another question, but it’s not really specific to the specification, so feel free to ignore. I have read that there is work being done on a SPIR-V to LLVM compiler. Would it be possible to do something equivalent to building something on top of SPIR-V that is less portable but implements the system/io-related stuff that might be in LLVM but not SPIR-V?

Alfonse_Reinheart · July 10, 2015, 9:27pm

One thing in particular I was confused about was the choice to use a binary representation (for instructions, I think). My sense of what I’ve read is that this choice was that the binary could provide better protection (IP, etc).

That’s not really the point. SPIR-V as a language is easily processable, so users will be no less able to deprocess SPIR-V than any text-based intermediate representation. Text IRs don’t necessarily use human-readable names for objects.

The main reason for imposing a binary format is compilation performance, particularly the conversion from the input data to some form of AST or call graph or however one wants to internally represent the code. If you use text, you have to do a bunch of textual parsing stuff, comparing characters and strings and so forth. With binary, it’s a switch statement.

The binary format also makes it easy to jump over irrelevant data. For example, suppose you have some SPIR-V that contains text names for things (so that you can have decent debugging). If that SPIR-V were in a text format, the parser would need to recognize that: it’d have to store identifiers and recognize them later. With binary SPIR-V, you can still label things, but the compiler can easily ignore such opcodes. Thus, in release builds where debugging information is not necessary, it’s easy for such a compiler to just ignore all OpName, OpString, OpLine, etc. opcodes. That’s a lot less easy in text.
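As a rough sketch of why skipping is cheap, here is a small Python walker over a SPIR-V word stream. It assumes the layout of the released SPIR-V 1.0 spec (five-word header, word count in the high 16 bits of each instruction's first word, opcode in the low 16 bits) and the 1.0 opcode numbers; provisional revisions may number opcodes differently. The `instructions` helper and the tiny hand-built module are hypothetical.

```python
MAGIC = 0x07230203          # SPIR-V magic number
HEADER_WORDS = 5            # magic, version, generator, bound, schema
# Opcode values from the released SPIR-V 1.0 spec (assumption: provisional
# revisions may differ).
OP_NAME, OP_STRING, OP_LINE, OP_TYPE_VOID = 5, 7, 8, 19
DEBUG_OPCODES = {OP_NAME, OP_STRING, OP_LINE}


def instructions(words, skip_debug=False):
    """Yield (opcode, operands) for each instruction in a SPIR-V word stream."""
    if len(words) < HEADER_WORDS or words[0] != MAGIC:
        raise ValueError("not a SPIR-V module")
    i = HEADER_WORDS
    while i < len(words):
        word_count = words[i] >> 16   # high 16 bits: instruction length
        opcode = words[i] & 0xFFFF    # low 16 bits: opcode
        if word_count == 0 or i + word_count > len(words):
            raise ValueError("malformed instruction")
        if not (skip_debug and opcode in DEBUG_OPCODES):
            yield opcode, words[i + 1 : i + word_count]
        i += word_count  # jump straight to the next instruction


# Hand-built module: header, OpName %1 "a", OpTypeVoid %2.
module = [
    MAGIC, 0x00010000, 0, 3, 0,
    (3 << 16) | OP_NAME, 1, 0x61,
    (2 << 16) | OP_TYPE_VOID, 2,
]
print(list(instructions(module, skip_debug=True)))  # [(19, [2])]
```

The debug instruction is never decoded; the walker reads its length and moves on, which has no direct counterpart when the names are woven into a text format.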

Further, would there be any benefit to 64bit when that is supported?

SPIR-V already has 64-bit addressing. So I don’t see how a text format would facilitate that at all.

would it be possible to do something equivalent to building something on top of SPIR-V that is less portable but implements the system/io-related stuff that might be in LLVM but not SPIR-V?

That rather depends on what kind of “system/io-related stuff” you’re talking about. Remember: the OpenCL execution model is limited compared to general systems level programming.

Alfonse_Reinheart · August 12, 2015, 12:44pm

A new preliminary spec is out. I’d guessed that 1.0 would have been done by now, but apparently not.

Really good things that happened:

You gave us an XML file defining the various integers associated with things, as well as several pre-generated headers/modules for a very wide variety of languages (though did the Lua one have to make the table global?). This will make automated generation so much easier.
No more texture variables. Textures are just a specialized type of “images”. This makes the language so much cleaner, and probably works better for OpenCL.
Actual linkage. It’s very good that SPIR-V has linkage as a first-class feature. And it’s also good that it’s not restricted to Kernel, so that Vulkan could use it.

Things I’m less sanguine about:

Multiple entrypoints can be declared in the same module. I’m kinda “meh” on that one, because it strongly implies that Vulkan will require you to shove all of your shaders for a pipeline into one SPIR-V binary. And while SPIR-V is a fine IR, it is not easily composable.

It sounds suspiciously like rewinding things back to the bad-old-days before we had ARB_separate_shader_objects…

You still haven’t fixed bug 1330. Granted, there’s a TBD on defining “uniform control flow”, but “dynamically uniform” is still not strictly defined.

ratchet_freak · August 13, 2015, 3:02am

And to find which input/output belong to which entrypoint you need to expand the static callgraph and see where they are used.
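Roughly, that lookup amounts to a reachability walk. The Python sketch below is only an illustration: `interface_of` and the `calls`/`uses` tables are hypothetical, standing in for information a tool would first have to gather from OpFunctionCall and variable-access instructions.

```python
def interface_of(entry, calls, uses):
    """Collect variables reachable from `entry` through the static call graph.

    calls: dict of function -> iterable of functions it calls
    uses:  dict of function -> iterable of input/output variables it touches
    """
    seen, stack, found = set(), [entry], set()
    while stack:
        fn = stack.pop()
        if fn in seen:
            continue  # already expanded this function
        seen.add(fn)
        found.update(uses.get(fn, ()))
        stack.extend(calls.get(fn, ()))
    return found


# Hypothetical module with two entrypoints sharing one binary.
calls = {"vs_main": ["transform"], "fs_main": ["shade"], "shade": ["fog"]}
uses = {
    "vs_main": ["in_pos"],
    "transform": ["out_pos"],
    "shade": ["in_color"],
    "fog": ["out_color"],
}
print(sorted(interface_of("vs_main", calls, uses)))  # ['in_pos', 'out_pos']
print(sorted(interface_of("fs_main", calls, uses)))  # ['in_color', 'out_color']
```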

It’s also light on the details of what happens when 2 stages use the same input/output variables.

(that’s also my point in my other thread especially now that multiple entry points are allowed)

Though to me it sounds more like “let’s copy HLSL, where a single file can contain the entire shader pipeline”.

Alfonse_Reinheart · August 13, 2015, 5:39am

To me, it seems more like they were trying to support the ability to have multiple kernels (a common OpenCL practice), but never really gave much thought to the different needs of shader entrypoints.