Fast accumulation?

I’m rendering a large number of GL_POINTS with various 1-channel colors. I’d like to end up with pixel (i,j) containing the sum of the colors of all the GL_POINTS which had position (i,j).
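
To pin down the result being asked for, here is a minimal CPU reference sketch in C. It is not the GPU method under discussion, only a statement of the expected output. The `Point` struct and `accumulate_points` function are hypothetical names introduced for illustration, and the sketch assumes integer pixel coordinates and a row-major, pre-zeroed image.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical point record: integer pixel position plus a 1-channel color. */
typedef struct {
    int x, y;     /* pixel column and row the point lands on */
    float value;  /* 1-channel color to accumulate */
} Point;

/* Adds each point's value into the pixel it lands on.
 * image is a row-major buffer of width * height floats that the caller
 * has already zeroed. Points outside the image are skipped. */
static void accumulate_points(float *image, int width, int height,
                              const Point *pts, size_t count)
{
    assert(image != NULL && pts != NULL);
    for (size_t i = 0; i < count; ++i) {
        const Point *p = &pts[i];
        if (p->x < 0 || p->x >= width || p->y < 0 || p->y >= height)
            continue;
        image[(size_t)p->y * (size_t)width + (size_t)p->x] += p->value;
    }
}
```

Whatever hardware path is used should produce the same sums as this loop, up to the precision of the render target format.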

glBlendFunc(GL_ONE,GL_ONE) is far too slow for my purposes, and glAccum is rather inefficient for my needs.

Is there a hardware-accelerated way to do this? It seems like there must be…

My vertex program moves points around within rows, but not between rows. There may be a way to exploit this.

glBlendFunc(GL_ONE, GL_ONE) is the proper way to do this.
Are you rendering to a floating-point texture?
If so, make sure your hardware supports blending to that texture format.
GeForce 6 and Radeon X1k support FLOAT16 texture blending.

Hmm, GL_RGBA16F_ARB does indeed seem to be fast enough. I was hoping to use GL_FLOAT_R16_NV since I only need one channel, but apparently hardware blending isn’t supported for that.

AFAIK you can render only to RGB/RGBA textures on GeForce 6/7.

Well, you can definitely render to GL_FLOAT_R16_NV; it’s called out as one of the only color-renderable 1-channel formats. It just doesn’t support hardware blending.