Along with vertex, fragment and geometry programs, programmers would also have the ability to write a blend program, with two defined input color objects and one output color.
Then, change glBlendFunc() to let the programmer specify a blend program handle, instead of one of the enumerated predefined blend functions.
Or… change the restrictions on the fragment program, so that gl_FragColor is assumed to be populated with the existing fragment’s color data and is readable inside a fragment program.
This is a limitation of current hw. Eventually, they might add “blend shader” or you would be able to use gl_FragColor and make some extension available. I’m guessing they will add another programmable stage that can override glBlendFunc.
I think reading from and writing to the same memory location is exposed in CUDA, so it would be possible on the G80. I expect that AMD’s GPGPU solution exposes a similar method - I can’t find any useful FireStream documentation on their website.
IMHO, the problem is API compatibility. If blending is in the fragment shader, then there has to be a way to enable fragment shader blending. How does that interact with normal blending? What happens if you load a shader, then disable fragment shader blending? Having a separate blend shader is cleaner from an API standpoint.
Either method should work on Mt Evans-class hardware, as those GPUs support full gather/scatter.
Your point reminds me of something. Ever since I started fragment programming, I have always felt that it was wrong for things like the stencil test, alpha blending and color blend functions to be events outside the realm of the fragment program.
It just feels wrong.
If I had my druthers, I would let the fragment program do just about everything except early-z trivial rejection.
If I want alpha values to represent degrees of transparency, then I would write my fragment program that way.
If I want to short-circuit the fragment program because of the stencil value, I would write my fragment program that way.
If I wanted colors to add together when applied, I would write my fragment program that way.
Rather than set some state flags outside the scope and knowledge of the fragment program.
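A hypothetical GLSL sketch of what that could look like if gl_FragColor were readable on shader entry (no such extension exists; this is speculation, and the stencil short-circuit is faked with an alpha threshold since stencil values aren’t shader-visible either):

```glsl
// Hypothetical: assumes gl_FragColor arrives pre-loaded with the
// existing framebuffer color and may be read before being written.
uniform sampler2D tex;
varying vec2 uv;

void main()
{
    vec4 src = texture2D(tex, uv);

    // Short-circuit decided inside the program, rather than by
    // fixed-function state set outside it:
    if (src.a < 0.01)
        discard;

    // The "blend function" written inline: classic over-compositing.
    gl_FragColor = src * src.a + gl_FragColor * (1.0 - src.a);
}
```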
They were events outside the realm of the fragment program because the chips were not able to do them programmatically.
While the current state reduces the set of operations you can apply, it also has many advantages.
[ul][li] The GPU can reject fragments early, before they consume shading resources, and at faster rates.[/li][li] It can optimize the memory accesses done during the blending operation based on the blend function and input values. While this might be theoretically doable in the shader, it would use more instructions and the shader writer would have to write the code to do it.[/li][li] It works with multisampled antialiasing, in which case it allows features like the alpha-to-coverage mask.[/li][/ul]
This is an important point. If I were to make a guess where we’re going to end up, I’d guess we’ll see a blend shader. The reason is that a blend shader can run at a per-sample rate, like blending does now. This would not be possible with a shader-readable variable - or at the very least, you’d need an array containing all the samples.