GLSL shader management - splitting and merging

I have a cross-renderer supporting D3D9 and OGL, and I’ve been using Cg for shaders. The only problem I had was that shaders with GLSL profiles didn’t want to compile, so I decided to use the NV40 profile for NVidia GPUs and ARB for everything else. I didn’t like that because ARB doesn’t support SM3.0. Fortunately, I found the solution to make GLSL shaders run - I need to merge vertex and pixel shaders (which are separate programs in my shader management system) into one shader. And here is the problem: I have a few dependent shaders. For instance, I have a group of 3 vertex shaders which must be used with 4 pixel shaders. Summed up, that gives 7 independent shaders, yet GLSL forces me to build every combination, which is 12 in this case!
What would you advise now - merge the shaders and handle more of them, or keep handling vertex and pixel shaders separately and only merge them into one program at run time, in the final stage before rendering? The latter would be quite a convenient solution, but I’m not sure what the performance penalty of sending quite a few shaders to the GPU per frame would be. Any advice?

IMO, only shaders of one type can be merged. You cannot merge a fragment shader with a vertex shader, as they refer to different stages of the graphics pipeline.

You can, however, attach and link them to a common program object.

12 combinations => each vertex shader is independent of the others (i.e. the 3 vertex shaders don’t depend on each other), and the same goes for the fragment shaders.

But if there is a dependency, then only one shader per type can have a main function. In that case you can attach two dependent vertex shaders and one or more dependent fragment shaders to a common program. This way the number of combinations can be lower.

The other way is to merge all vertex shaders into one and all fragment shaders into one. Then, depending on the case in your program, select the functions from each shader type that are required.
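Something like this, roughly (just a sketch; it assumes a GL 2.0 context with the entry points loaded via GLEW, and compileShader/linkProgram are made-up helper names):

    // Minimal sketch: compile one shader object per stage, then attach
    // both to a common program object and link them together.
    #include <GL/glew.h>

    GLuint compileShader(GLenum type, const char* src)
    {
        GLuint sh = glCreateShader(type);
        glShaderSource(sh, 1, &src, 0);
        glCompileShader(sh);

        GLint ok = 0;
        glGetShaderiv(sh, GL_COMPILE_STATUS, &ok);
        if (!ok) { /* fetch and print the shader info log here */ }
        return sh;
    }

    GLuint linkProgram(GLuint vs, GLuint fs)
    {
        GLuint prog = glCreateProgram();
        glAttachShader(prog, vs);   // only one main() per stage
        glAttachShader(prog, fs);
        glLinkProgram(prog);

        GLint ok = 0;
        glGetProgramiv(prog, GL_LINK_STATUS, &ok);
        if (!ok) { /* fetch and print the program info log here */ }
        return prog;
    }

    // Usage: glUseProgram(linkProgram(vs, fs)); before drawing.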

Hope this helps.

Every shader is independent and has its own main function. So I can only merge one vertex shader with one fragment shader (by merging I mean linking).
And that “12” is the number I want to avoid. If vertex/pixel shaders are treated separately, as is the case with D3D (and OGL with the NV40 profile), then I don’t need to do any merging and I have only 7 shaders.

I get you a bit now. But first, I have no experience with D3D and haven’t used the NV40 profile.

In your case you can compile your vertex and fragment shader objects (i.e. 7 in total) and attach them to your program object as required; then the next time around, just detach the shader object you won’t be using and attach the one that will.

You can attach/detach shader objects to a program object at any time. The only restriction is that everything must be done before the program object is actually used.
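For example (rough sketch; the handle names are made up, and the program has to be relinked before it is used again for the change to take effect):

    #include <GL/glew.h>

    // Swap one fragment shader object for another on an existing
    // program object, then relink before the program is used again.
    void swapFragmentShader(GLuint prog, GLuint oldFs, GLuint newFs)
    {
        glDetachShader(prog, oldFs);   // drop the stage we no longer want
        glAttachShader(prog, newFs);   // attach the replacement
        glLinkProgram(prog);           // relink so the new stage takes effect

        GLint ok = 0;
        glGetProgramiv(prog, GL_LINK_STATUS, &ok);
        if (!ok) { /* check the program info log */ }
    }

    // Later, before rendering:
    //   swapFragmentShader(prog, fsA, fsB);
    //   glUseProgram(prog);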

For more, refer to the GLSL spec. It mentions some points worth reading.

Optimization and a lot of compilation work happen when the shaders are linked. So doing the linking while your app should be pushing 60 fps is not recommended.
That’s just the way OpenGL is; it’s been protested against a lot, and there’s only the tiny hope that separation will be possible in the very distant future.
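If you do go the GLSL route, one workaround is to link every vertex/fragment pair you will actually need once at load time and cache the resulting program objects, so nothing gets linked inside the frame loop. A rough sketch (programCache/getProgram are made-up names; the shader objects are assumed to be compiled already):

    #include <GL/glew.h>
    #include <map>
    #include <utility>

    // Linked programs keyed by (vertex shader, fragment shader) pair.
    // Fill this at load time; only look it up at draw time.
    std::map<std::pair<GLuint, GLuint>, GLuint> programCache;

    GLuint getProgram(GLuint vs, GLuint fs)
    {
        std::map<std::pair<GLuint, GLuint>, GLuint>::iterator it =
            programCache.find(std::make_pair(vs, fs));
        if (it != programCache.end())
            return it->second;

        GLuint prog = glCreateProgram();
        glAttachShader(prog, vs);
        glAttachShader(prog, fs);
        glLinkProgram(prog);           // expensive - keep it out of the frame loop
        programCache[std::make_pair(vs, fs)] = prog;
        return prog;
    }

    // At load time, call getProgram() for each vertex/fragment pair you
    // actually use (12 in the worst case here); at draw time it is just
    // a map lookup followed by glUseProgram().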

Btw, iirc with some drivers it’s a problem to reuse a shader object (the handle returned by glCreateShader and fed to glShaderSource/glCompileShader) in more than one program (glCreateProgram/glAttachShader/glLinkProgram). It could already be fixed, or it might only affect nVidia drivers (where you’ll be using NV40 asm instead anyway).

In your case you can compile your vertex and fragment shader objects (i.e. 7 in total) and attach them to your program object as required; then the next time around…

And I’d be linking the programs all the time… That’s what I considered, as I stated before, but I guess Ilian Dinev is right and it’s not a good idea.
I think I’ll try a mixed approach: I’ll merge for GLSL profiles only, while still keeping my shaders separate for the NV40 and D3D renderers. But I got really annoyed with this, which is made worse by the fact that my simple GLSL demo works great on my NVidia (who would have doubted it?) while on the other machine with a Radeon it simply doesn’t. Maybe it’s because of half-year-old drivers, whereas I’m using the newest Cg 2.2. Gotta check it out with new drivers.
In the meantime, thanks for the responses.