With the pipelined nature of video cards, you cannot expect to read from and write to the same surface with both performance and correctness.
There have been discussions about possible “blend shaders” that would let you put custom code at the blending stage, but I guess the hardware is not ready for this yet.
In the meantime you might use render-to-texture (if you only have a few dozen overlapping objects): render each object (sorted by distance or similar) to a different FBO, then do the ‘blending’ in a single stage, sampling from as many of those FBO textures as you can in a given pass.
Can you be more precise about the context for this custom blending?
The colors are DOT3 normals, with each XYZ vector encoded into RGB values:
R = (X / 2.0) + 0.5;
G = (Y / 2.0) + 0.5;
B = (Z / 2.0) + 0.5;
The blend function would need to decode both sets of input colors into 3D vectors, add the vectors together into a single vector, then encode that vector back into RGB values in the output fragment.
For something as simple as that, you can use an FP16 RGBA render target (FBO) and just use additive blending.
For really custom blending, maybe use two FBOs (in whatever format) and do ping-pong computation, copying only the updated parts by drawing flat textured triangles: one FBO serves as a texture while you fill the other, then they swap roles.
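Stripped of the GL calls, the ping-pong idea is just two buffers that swap roles each pass: one is the read-only “texture”, the other is the “render target”. A minimal sketch with plain arrays instead of FBOs (the doubling step is an arbitrary stand-in for whatever custom blend you run in the shader):

```c
#define N 4

/* Ping-pong between two buffers: each pass reads from one (the
   "texture") and writes to the other (the "render target"). */
void ping_pong(float a[N], float b[N], int passes)
{
    float *read = a, *write = b;
    for (int p = 0; p < passes; ++p) {
        for (int i = 0; i < N; ++i)
            write[i] = read[i] * 2.0f;  /* stand-in for the custom blend */

        /* Swap roles for the next pass. */
        float *tmp = read;
        read = write;
        write = tmp;
    }
    /* After an odd number of passes the result is in b,
       after an even number it is back in a. */
}
```

With real FBOs the swap is just rebinding which one is the draw target and which one is the bound texture; the data never leaves the card.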
Blending on an FP16 render target is supported on the GeForce 6xxx series and above.
On the Radeon 9550 and other cards of that generation, FP16 blending is not supported, but you can still create an FP16 render target and render to it.
I don’t know which ATI cards support FP16 blending.