Hi, I’m a Blender user, and I have a problem: if I have a complicated material whose GLSL has already been compiled by OpenGL to machine code, adding a single node to Blender’s shader graph forces OpenGL to recompile the entire GLSL to machine code again, instead of just modifying the existing machine code, which is practically identical apart from one extra bit of GLSL. Shader compilation is taking around 30 seconds per material. I initially thought the slowness was due to Blender not re-using GLSL, but the devs have assured me it’s an OpenGL limitation:
Is there anything that can be done to get OpenGL to amend the existing machine code rather than recompiling it all from scratch? Effectively only updating the bits that are different, and removing bits that are no longer represented in the GLSL which Blender passes through.
I’ve seen very large GLSL shader programs (all stages) with a lot of unused bloat in them that take ~300–500 ms to compile+link. But 30 seconds is outrageous.
You should dig into this and see what Blender’s doing for all that time.
How long is its shader graph -to- GLSL conversion taking?
How long is it taking to compile the GLSL source code for the shader stages in a single program?
How long is it taking to link the compiled shader objects into that single shader program?
Is it generating+compiling+linking a bunch of shader programs rather than just one? How many?
Is it sending a lot of unused bloat in the GLSL source code (**)?
What CPU and GL driver is this on?
If I were you, I’d want to know.
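To answer the compile-vs-link timing questions above, you can bracket each GL call with a timer. Here’s a minimal sketch of the idea; it assumes a current OpenGL context and a function loader (e.g. glad or GLEW), so it won’t run stand-alone, and the hypothetical `build_program` helper is just for illustration:

```c
/* Sketch: timing the GLSL compile and link stages separately.
 * Assumes a current OpenGL context + loaded GL entry points. */
#include <stdio.h>
#include <time.h>

static double now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1.0e6;
}

GLuint build_program(const char *vs_src, const char *fs_src) {
    double t0 = now_ms();

    GLuint vs = glCreateShader(GL_VERTEX_SHADER);
    glShaderSource(vs, 1, &vs_src, NULL);
    glCompileShader(vs);

    GLuint fs = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(fs, 1, &fs_src, NULL);
    glCompileShader(fs);

    double t1 = now_ms();

    GLuint prog = glCreateProgram();
    glAttachShader(prog, vs);
    glAttachShader(prog, fs);
    glLinkProgram(prog);

    /* Querying link status forces the driver to finish the link here,
     * rather than deferring the work to the first draw call. */
    GLint ok;
    glGetProgramiv(prog, GL_LINK_STATUS, &ok);

    double t2 = now_ms();
    printf("compile: %.1f ms, link: %.1f ms\n", t1 - t0, t2 - t1);
    return prog;
}
```

If the total across all programs built per material approaches 30 seconds, counting how many times this gets called would answer the “how many programs?” question too.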
** Re bloat. I’m talking about lots of comments and uncalled/unexecuted code in the GLSL which adds needless content that the GLSL compiler has to wade through, parse, and ultimately just chuck into the bitbucket.
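For the avoidance of doubt, this is the shape of bloat I mean — purely illustrative GLSL, not Blender’s actual output. Node-based generators often emit a big library of functions whether or not the graph uses them:

```glsl
// Illustrative only: a large emitted "node library", mostly unreachable.
vec4 node_noise_tex(vec3 co)   { /* ... lots of code ... */ return vec4(0.0); }
vec4 node_voronoi_tex(vec3 co) { /* ... lots of code ... */ return vec4(0.0); }
// ...potentially hundreds more, all parsed, then discarded as dead code...

out vec4 fragColor;
in vec3 worldPos;

void main() {
    // Only one of them is actually reached:
    fragColor = node_noise_tex(worldPos);
}
```

The compiler will eliminate the unreachable functions, but it still has to lex, parse, and analyze all of them first, and with enough of it that front-end work alone can dominate compile time.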
If you can, get hold of the GLSL source for the shader stages in that program (probably just vertex and fragment) and post it. I’d like to see this.
More correctly, adding a single node results in Blender demanding that OpenGL recompile the entire GLSL shader to machine code again, from scratch.
Is Blender using separate shader objects? Or shader subroutines? Or assembly shaders (including SPIR-V)?
Or is it re-providing new GLSL source code for the entire shader program to OpenGL to compile and link at runtime, where the graphics driver has no choice but to do exactly what the application demands?
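For context on the separate shader objects option: with GL_ARB_separate_shader_objects (core since GL 4.1), each stage is its own program, so only the changed stage needs recompiling. A rough sketch, assuming a current context and `vs_src`/`fs_src` holding the stage sources:

```c
/* Sketch: separate shader objects via a program pipeline.
 * Each stage compiles+links independently and can be swapped alone. */
GLuint vs_prog = glCreateShaderProgramv(GL_VERTEX_SHADER, 1, &vs_src);
GLuint fs_prog = glCreateShaderProgramv(GL_FRAGMENT_SHADER, 1, &fs_src);

GLuint pipe;
glGenProgramPipelines(1, &pipe);
glUseProgramStages(pipe, GL_VERTEX_SHADER_BIT, vs_prog);
glUseProgramStages(pipe, GL_FRAGMENT_SHADER_BIT, fs_prog);
glBindProgramPipeline(pipe);

/* If the shader graph change only affects the fragment logic, rebuild
 * just fs_prog and re-bind it; vs_prog's machine code is untouched. */
```

Whether this helps depends on where the time goes — it avoids recompiling unchanged stages, but not recompiling the one stage whose GLSL actually changed.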
Also, re the comments in that Blender thread you linked to about a “shader cache”: Blender could read back compiled+linked shader binaries and save them off, forming its own cache, stored someplace such as in the .blend file or an associated file (e.g. see ARB_get_program_binary). But if I infer correctly from that thread, they’re implicitly saying Blender does not do this, and instead hopes/expects that the OpenGL driver implements some kind of compiled GLSL shader program cache like this. Some do.
It’s premature at this point to jump to this as a potential solution (because we don’t really know what the problem is yet). But just FYI…
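…here’s roughly what such an application-side cache looks like with ARB_get_program_binary (core since GL 4.1). This is a sketch with error handling and file I/O trimmed, assuming a current context:

```c
/* Sketch: save/restore a linked program binary (ARB_get_program_binary). */

/* Before linking, ask the driver to keep a retrievable binary: */
glProgramParameteri(prog, GL_PROGRAM_BINARY_RETRIEVABLE_HINT, GL_TRUE);
glLinkProgram(prog);

/* Save: */
GLint len = 0;
glGetProgramiv(prog, GL_PROGRAM_BINARY_LENGTH, &len);
void *blob = malloc(len);
GLenum binary_format;
GLsizei written;
glGetProgramBinary(prog, len, &written, &binary_format, blob);
/* ...write binary_format + blob to the .blend file or a cache dir... */

/* Restore on a later run: */
GLuint cached = glCreateProgram();
glProgramBinary(cached, binary_format, blob, written);
GLint ok;
glGetProgramiv(cached, GL_LINK_STATUS, &ok);
if (!ok) {
    /* Driver or GPU changed since the blob was saved: the binary is
     * rejected, so fall back to compiling the GLSL source as usual. */
}
```

Note the binaries are driver-specific and can be invalidated by a driver update, so the GLSL fallback path is mandatory, not optional.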
The GPU vendors stopped supporting ARB assembly shaders years ago. NVIDIA held out until just a few years back with their vendor-specific (NV) assembly shaders, but recent GPU features are no longer supportable through that assembly language; they’ve switched to SPIR-V. So you won’t be patching those assembly shaders anymore if you want cross-GPU support or support for the latest features.
SPIR-V is another option GL supports for loading shaders (bypassing the need to compile GLSL source in the GL driver). But it doesn’t support some important GL features like bindless texture, so that makes it a non-option for some GL users.
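For reference, loading a precompiled SPIR-V module instead of GLSL source looks like this (GL_ARB_gl_spirv, core since GL 4.6). The `spirv_blob`/`spirv_size` inputs are assumed to come from an offline compile, e.g. with glslangValidator:

```c
/* Sketch: feeding GL a SPIR-V binary instead of GLSL source. */
GLuint fs = glCreateShader(GL_FRAGMENT_SHADER);
glShaderBinary(1, &fs, GL_SHADER_BINARY_FORMAT_SPIR_V,
               spirv_blob, spirv_size);
/* Select the entry point; no specialization constants here. */
glSpecializeShader(fs, "main", 0, NULL, NULL);

GLint ok;
glGetShaderiv(fs, GL_COMPILE_STATUS, &ok);
```

This skips the driver’s GLSL front end, but the driver still has to lower SPIR-V to machine code, so it doesn’t eliminate back-end optimization time.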
Without the ability to patch shaders at the assembly level, you make the most of the support you’ve got through the GLSL high-level shading language. Let’s see how effective Blender’s use of it is.
I don’t know, but I wouldn’t expect there to be a noticeable difference. A portable binary format such as SPIR-V doesn’t eliminate much beyond parsing, and a modern CPU can parse at many megabytes per second. And GLSL programs aren’t measured in megabytes. For compiled languages, it’s optimisation that takes most of the time (in C++, template matching and instantiation is an issue, but there’s nothing similar in GLSL).
What Vulkan gets you is the ability to generate object code during pipeline creation, whereas OpenGL has to wait until you issue a draw call to obtain enough context to optimise it. But that doesn’t help an application which is generating shaders on the fly.
Thanks, some great points. I’m not knowledgeable enough in Blender’s open source code. I’ve asked a couple of the devs if they’re able to answer. The chief programmer, Brecht van Lommel, suggests that the entire compilation wait is OpenGL converting the GLSL which Blender presents it into machine code. Not sure if you’re a seasoned developer, but here’s the source if your skill level enables you to pinpoint the actual issue:
But it’s not clear from that discussion if Blender is only compiling a single shader program or multiple such programs all at once. That is, Blender could be building lots of OpenGL shaders based on a single “Blender shader”.