glsl "offline" compilation

Originally posted by Korval:
Give up the ridiculous idea that your shader IP is safe. It’s not, in any way, shape, or form. If someone wants your “precious” shader logic, it’s going to be theirs. Trying to fight it is futile.

You can’t do anything against a pro hacker, that’s for sure. BUT you can keep some curious eyes from looking inside. And again, guess what my prev ps3.0 shader does… I bet you 0.1 cents you won’t figure it out. On the other hand I can tell you what your GLSL code does very easily!

Originally posted by Korval:
As for an effect system, you can pretty much forget about it. There’s COLLADA FX, but it will never be something that gets incorporated into OpenGL itself.

Well, glFX is in theory the new GL effect system, inspired by the Microsoft one. But yes, you are right, it is not incorporated into the OGL “core” (which I think is good, in the same manner that D3DX is not D3D). What nobody knows is whether it incorporates code reflection, encryption, obfuscation, fragment linking, etc… and that’s what I was hoping for.

Originally posted by Korval:

Two, if you honestly have enough shaders that it takes hours to compile… you’ve got problems. If it takes longer to compile your shaders than it does your game’s source code from scratch, that’s an issue I have no sympathy for you over. You clearly have too many shaders and should look to that.

I have no control over this. We are using a popular 3rd party engine that generates shaders based on artist configs. I don’t want to say any more for fear of breaking an NDA.

Originally posted by santyhamer:
And again, guess what does my prev ps3.0… I bet you 0.1cents you won’t discover it.
Just to give it a try: I think it does the following (you are free to correct me if I am wrong). Some selections/factors/colors stored in constants are not mentioned.

  • First it optionally (b0 &amp; b1) applies basic parallax mapping.
  • Then it applies a possibly animated alpha kill (e.g. animated disintegration of a dead character).
  • Selects (b2) between the vertex normal and a per-pixel normal. The per-pixel normal map can be tangent space or (c11.x) object (world?) space.
  • Optionally (b3) applies some form of outlining. Color is interpolated between the ordinary base texture color and a color combined from two different textures (a cube map indexed by the reflected eye vector, e.g. an environment map, and a texture mapped on the object), based either on (1 - cosine of the angle between the normal and the direction towards the eye)^5 or (c12.x) on the alpha from an object texture.
  • Calculates diffuse and specular lighting. Additional toggle/factor/color constants are involved.
  • Optionally (b4) applies a shadowmap. Six averaged, offset comparisons are used as a multiplier. There is a seventh sample whose depth (0-1) is multiplied with that average and added to the result of multiplying the average with the lit color. Probably some distance-based shadow fadeout modification, although I would expect it to be added to the average before the multiplication.
  • Adds an ambient constant color and an ambient cube.

I don’t see why multiple solutions can’t be offered. The developer chooses whatever he prefers.
I have a lot of shaders, something like 50, but compilation is quick and it’s obvious the driver isn’t caching. What happened to the idea of the driver making a blob and keeping it on the HDD?

If it is a problem for some people, then why not offer these alternative solutions?

Personally, I would prefer a precompiled thing even if I had 1 shader. I don’t like text-based solutions.

Originally posted by V-man:
I don’t see why multiple solutions can’t be offered. The developer chooses whatever he prefers.

Korval fears, and I think he is right in that, that by providing two ways to do this we will end up with two ways that are of lower quality (e.g. more bugs, slower), because IHVs would need to split their efforts between two places instead of one. This is similar to the current state of OGL, where there are, for example, many ways to specify vertex data instead of one that would get all the optimization effort.

This is similar to the current state of OGL, where there are, for example, many ways to specify vertex data instead of one that would get all the optimization effort.
It’s not the same thing.

It will compile the GLSL to an intermediate format (text asm).
The intermediate format can be binarized.
The binarized form turns into a GPU-specific binary.

So there are 4 levels. It is just a chain of events.

It’s not like glVertex vs glBindBuffer vs glInterleavedArrays vs display lists, because in this case some paths are simply not used by most people.
Those 4 stages I mentioned will be used by most of us.
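To make that chain concrete, here is a minimal sketch with invented function names (nothing here is a real GL or tool API; the stubs only show what data would flow between the four levels):

// Hypothetical illustration of the four representations in the chain above.
// The function names are invented; the stubs only show the data flow.
#include <string>
#include <vector>

typedef std::vector<unsigned char> Blob;

// Level 1 -> 2: GLSL source to a text "asm" intermediate (done offline).
std::string compileToTextAsm(const std::string& glslSource) {
    return "# text asm for: " + glslSource;        // stub
}

// Level 2 -> 3: text asm to a portable binary token stream (done offline).
Blob binarize(const std::string& textAsm) {
    return Blob(textAsm.begin(), textAsm.end());   // stub
}

// Level 3 -> 4: the driver turns the portable binary into a GPU-specific
// binary; this is the only step that has to happen at runtime.
Blob driverTranslate(const Blob& portableBinary) {
    return portableBinary;                         // stub
}

int main() {
    std::string glsl = "void main() { gl_FragColor = vec4(1.0); }";
    Blob gpuBinary = driverTranslate(binarize(compileToTextAsm(glsl)));
    return gpuBinary.empty();
}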

Well, sqrt[-1] is in deep deep… trouble

Originally posted by V-man:
It will compile the GLSL to a intermediate format (text asm).
The intermediate format can be binarized.
The binarized turns into a GPU specific binary.

So there are 4 levels. It is just a chain of events.
GLSL operates in a way that allows the compiler full access to all the information contained in the original source code, so it has a chance to take advantage of any hw feature. Whether current implementations take full advantage of that is a different matter.

Introduction of any independent intermediate format would prevent this, because for such a format to be fast to load, many hw-dependent decisions (e.g. what to inline, where to use a real jump, what to unroll) must already have been made during its generation.
Even Nvidia, which uses an intermediate format during GLSL compilation, uses a HW-specific one.
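To give an invented example of such a decision: whether the loop below should be kept as a real loop or fully unrolled depends on the target’s branch cost and instruction limits, so a hardware-neutral intermediate format would already have to commit to one or the other.

// GLSL fragment (held in a C++ string) whose best compilation differs per
// GPU: hardware with cheap dynamic branching can keep the loop, while older
// hardware wants it fully unrolled. The shader itself is just an example.
static const char* kBlurFragment =
    "uniform sampler2D tex;\n"
    "uniform vec2 offsets[8];\n"
    "void main() {\n"
    "    vec4 sum = vec4(0.0);\n"
    "    for (int i = 0; i < 8; ++i)\n"
    "        sum += texture2D(tex, gl_TexCoord[0].xy + offsets[i]);\n"
    "    gl_FragColor = sum / 8.0;\n"
    "}\n";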

Maybe a bunch of vendor- and hw-dependent formats, similar to the DX compile targets or the NV_fragment_program* extensions, might be the practical solution, although it is theoretically not as pretty as the current GLSL one. It also has the problem that the number of intermediate formats the driver needs to care about grows as new hardware appears, if it is to maintain backwards compatibility with old applications.

Of course, with such an approach your application will not be able (unlike with current GLSL) to take full advantage of possible future hw changes; however, it is likely that future hw will have such horsepower that it will run current applications without problems even if the shaders are a bad match for its architecture.

We are using a popular 3rd party engine that generates shaders based on artist configs.
Then your artists have clearly run amok and need to be reined in. It sounds like your artists are making a separate, unique shader per object, which is never an acceptable idea.

Furthermore, if the artists are not at fault (i.e., they’re not making tens of thousands of shaders), then the fault clearly lies with the “popular 3rd party engine.” Contact them and ask why it’s creating so many unique shaders.

Originally posted by Korval:
Then your artists have clearly run amok and need to be reined in. It sounds like your artists are making a separate, unique shader per object, which is never an acceptable idea.

It is more likely that the artists create a reasonable number of combinations and the engine then generates many shaders for all possible lighting combinations (e.g. two spot lights &amp; one point light, three point lights, various qualities of shadow sampling, various options and special effects, and so on) applied to those artist-created variants.

If the engine was created with DX in mind then it was probably built on the assumption that the cost of a single draw command is high (so it is good if a single draw applies as many lights as possible) and that most of the compilation cost can be paid offline (so there is no runtime problem with compiling a huge number of shaders). As long as only a reasonable number of shaders from that huge set is visible at any single time, this is a perfectly reasonable design for a DX engine.
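To make the combinatorial growth concrete, here is a rough sketch of how an engine like that might expand a few options into many compiled variants, assuming an ubershader driven by #defines (the option names and counts are invented):

// Sketch only: each combination of #defines becomes its own compiled shader,
// so a handful of orthogonal options multiplies into dozens of variants.
#include <cstdio>
#include <vector>
#include <GL/glew.h>   // assumes a GL 2.0 context and an extension loader

GLuint compileVariant(const char* ubershaderSource,
                      int numPointLights, int numSpotLights, bool shadows)
{
    char header[128];
    sprintf(header, "#define NUM_POINT_LIGHTS %d\n#define NUM_SPOT_LIGHTS %d\n%s",
            numPointLights, numSpotLights, shadows ? "#define USE_SHADOWS\n" : "");

    const char* strings[2] = { header, ubershaderSource };
    GLuint shader = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(shader, 2, strings, 0);   // option block prepended to the ubershader
    glCompileShader(shader);
    return shader;
}

// 4 point-light counts * 4 spot-light counts * 2 shadow modes = 32 compiles
// from just three options; a real engine has far more axes than this.
void compileAllVariants(const char* ubershaderSource, std::vector<GLuint>& out)
{
    for (int p = 0; p < 4; ++p)
        for (int s = 0; s < 4; ++s)
            for (int sh = 0; sh < 2; ++sh)
                out.push_back(compileVariant(ubershaderSource, p, s, sh != 0));
}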

Originally posted by Komat:
GLSL operates in a way that allows the compiler full access to all the information contained in the original source code, so it has a chance to take advantage of any hw feature. Whether current implementations take full advantage of that is a different matter.

Introduction of any independent intermediate format would prevent this, because for such a format to be fast to load, many hw-dependent decisions (e.g. what to inline, where to use a real jump, what to unroll) must already have been made during its generation.
Even Nvidia, which uses an intermediate format during GLSL compilation, uses a HW-specific one.

Maybe a bunch of vendor- and hw-dependent formats, similar to the DX compile targets or the NV_fragment_program* extensions, might be the practical solution, although it is theoretically not as pretty as the current GLSL one. It also has the problem that the number of intermediate formats the driver needs to care about grows as new hardware appears, if it is to maintain backwards compatibility with old applications.

Of course, with such an approach your application will not be able (unlike with current GLSL) to take full advantage of possible future hw changes; however, it is likely that future hw will have such horsepower that it will run current applications without problems even if the shaders are a bad match for its architecture.
This argument is only valid if GL code runs faster than D3D.

The 4-layer method that I mentioned covers all the bases. If you want GLSL, it is available to you.

Backwards compatibility for intermediate formats is not an issue. nVidia is always developing newer low-level shader support and supports everything all the way back to register_combiners on the 8800. Who is asking them to support that?

This argument is only valid if GL code runs faster than D3D.
The argument is valid in the theoretical domain. The fact that there are poor glslang compilers does not mean that there can’t be good glslang compilers. And those good compilers would be able to beat D3D compilers in shader performance.

The 4 layer method that I mention covers all the bases.
Except performance and complexity, since an implementation is required to implement all 4 layers.

Who is asking them to support that?
Nobody. Which is what makes nVidia’s “infinite support of features people stopped caring about years ago” kinda the problem.

Rather than getting their glslang compiler issues worked out, they’re wasting time and effort supporting outdated nonsense. That’s probably why they won’t bother making their glslang go straight to the metal; it’d take time and effort away from supporting stuff nobody cares about.

You don’t see ATi wasting their time supporting ATI_fragment_shader and other such things.

OpenGL ES supports this kind of thing. It creates portability issues unless you define a common intermediate representation, and there’s no guarantee that register mapping/linking, retargeting &amp; optimizing this would be significantly faster than a compile on some platforms (it will vary from design to design).

I still think an intermediate representation is a good idea even if the compile times are the same (which I highly doubt).

  • Easy to see when the compiler does “dumb things”.
  • Don’t have to worry about code parse bugs. (These should not happen, but do)
  • Dead code/Code folding optimizations can take as much time as needed.
  • Don’t have to worry about spec violations from different vendors (or even changing between driver versions)
  • Easier for driver writers to support. (probably)

I know a lot of these problems do not exist in “theory”, but in practice I believe compiling at runtime adds a huge surface area for failures.

As things currently stand, I would personally not be using GLSL in a commercial app.

Perhaps the Long Peaks people could tell us what their plans are?

Perhaps the Long Peaks people could tell us what their plans are?
Maybe you missed the part where I said “the ship has sailed.”

These arguments were made years ago and rejected. It is done.

There will be no intermediate language. End of story.

Originally posted by Korval:
Perhaps the Long Peaks people could tell us what their plans are?
Maybe you missed the part where I said “the ship has sailed.”

Sorry, I had no idea you were on the Long Peaks design board. (I have no idea what most of the people do here)

I was assuming that Long Peaks was going to clean up the OpenGL legacy interfaces and fix up past mistakes. (as they can break as much backward compatibility as they want) And with the current state of GLSL drivers, I would consider it a mistake.

I will have to evaluate Long Peaks when it comes out to see how much time it is worth investing in…

Sorry, I had no idea you were on the Long Peaks design board.
I’m not. However, there has been no mention of any significant revision of glslang. And considering the amount of information that’s already out there with regard to LP, I seriously doubt they would have just forgotten to include it.

So we have no reason to expect LP to significantly change the nature of glslang. You can expect that it will remain a high-level C-style language, and that there will be no intermediate language.

Page 54 of “OpenGL Shading Language” says that 3DLabs released an open-source front-end for the GLSL compiler that produces a “tokenised” form of GLSL.
IHVs were supposed to include this in their drivers and then write their own back-end compiler to produce machine code for their own hardware.
If this is how they have actually implemented it then it should be possible to partly pre-compile to tokenised GLSL.
This would be of limited use for IP protection, but it would also reduce the size of the stored shader and be (slightly) faster to compile.

Personally, I am more interested in compiling to machine code during program installation, as I am already doing this for my main program to take advantage of the MMX, SSE and 3DNow features of the system I install to.
We just need a glGetObjectParameteriv( ShaderObject, GL_OBJECT_SHADER_CODE_LENGTH, X ) call to get the size of the blob, and a glGetShaderCode( ShaderObject, *Buffer ) call to copy it to a buffer ready to write to a file.
The header could have a 4-character vendor ID (NV, ATI, SGIS, HP, APPL), a 4-character hardware model number (8800), a compiler version number, and a blob-length field.
Then just use something like glShaderCode( ShaderObject, Length, *Buffer ) to load it.
This function would return a boolean result to indicate whether it was compatible, so if we get false we just do a normal glShaderSource/glCompileShader.
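In use it could look something like the sketch below. The token GL_OBJECT_SHADER_CODE_LENGTH and the glGetShaderCode/glShaderCode entry points are exactly the suggestion above and do not exist in any OpenGL version, so this cannot actually link; only the fallback path uses real API (glShaderSource/glCompileShader).

#include <GL/glew.h>   // real entry points used in the fallback path
#include <cstdio>
#include <cstdlib>

// The three hypothetical pieces proposed above; none of them exist in any
// version of OpenGL. Declared here only so the sketch reads as real code.
#define GL_OBJECT_SHADER_CODE_LENGTH 0x9999   /* made-up token */
extern "C" void      glGetObjectParameteriv(GLuint obj, GLenum pname, GLint* params); /* hypothetical */
extern "C" void      glGetShaderCode(GLuint shader, void* buffer);                    /* hypothetical */
extern "C" GLboolean glShaderCode(GLuint shader, GLsizei length, const void* buffer); /* hypothetical */

// Save the driver-compiled blob to disk (vendor/model/version header omitted).
void saveCompiledShader(GLuint shaderObject, const char* path)
{
    GLint length = 0;
    glGetObjectParameteriv(shaderObject, GL_OBJECT_SHADER_CODE_LENGTH, &length);

    char* blob = (char*)malloc(length);
    glGetShaderCode(shaderObject, blob);

    FILE* f = fopen(path, "wb");
    fwrite(blob, 1, length, f);
    fclose(f);
    free(blob);
}

// Try to load the blob; fall back to a normal compile if the driver reports
// that it was built for different hardware or a different compiler version.
void loadCompiledShader(GLuint shaderObject, const char* blob, GLint length,
                        const char* source)
{
    if (!glShaderCode(shaderObject, length, blob)) {
        glShaderSource(shaderObject, 1, &source, 0);
        glCompileShader(shaderObject);
    }
}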

Actually, I think the most important reason for having this is that it encourages the manufacturers to make a compiler that takes longer to compile but produces a better optimised shader.
Without this they will be under pressure to reduce the compilation time, so they won’t include any time-consuming optimisations.

With regards to the “hundreds of shaders” problem, has anyone used the linking function to compile ‘bits’ of shader and then link them together in different combinations to perform different functions?
This would have to be much faster than compiling lots of separate shaders, especially if there are only small differences between them.
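For what it’s worth, the existing API already allows this: compile each piece once as its own shader object, then attach different combinations to different program objects before linking. A minimal sketch (the computeLighting() split is just an example):

// Compile reusable pieces once, then link them in different combinations.
// GLSL lets a program be built from several shader objects of the same stage
// as long as exactly one of them defines main().
#include <GL/glew.h>   // assumes a GL 2.0 context and an extension loader

GLuint compilePiece(const char* src)
{
    GLuint s = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(s, 1, &src, 0);
    glCompileShader(s);
    return s;
}

GLuint linkCombination(GLuint mainPiece, GLuint lightingPiece)
{
    GLuint p = glCreateProgram();
    glAttachShader(p, mainPiece);       // defines main(), declares computeLighting()
    glAttachShader(p, lightingPiece);   // one of several computeLighting() definitions
    glLinkProgram(p);
    return p;
}

// Usage: the pieces are compiled once and reused across many programs.
//   GLuint mainPart = compilePiece(mainSource);
//   GLuint phong    = compilePiece(phongLightingSource);
//   GLuint lambert  = compilePiece(lambertLightingSource);
//   GLuint progA    = linkCombination(mainPart, phong);
//   GLuint progB    = linkCombination(mainPart, lambert);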

“IHVs were supposed to include this in their drivers”

Not necessarily.

Anyway, I agree with Korval.

Secondly, someone said the number of shaders can explode. Someone said it takes them hours to compile their shaders. Very comedic stuff.

How very dismissive. Yes, the number of shaders explodes. No, linking compiled modules is not faster. Honestly, anyone would think that you hadn’t used shaders in an actual product yet.