Enslaving the graphics processor

Hi there, do you suppose it is possible to use the graphics processor as a slave to the CPU for computing FFTs via OpenGL? The background is that I want to squeeze every bit of performance out of existing systems and thought this would be a nice way to do it. But first, I don’t know whether it is possible at all, and second, I don’t have a clue about OpenGL.

Thanks for your patience



Sorry, FFT = Fast Fourier Transform, basically a method for extracting the frequency content of a series of numbers sampled over time. That part has basically nothing to do with graphics, but essentially it can be performed just by adding and multiplying complex (or real) numbers.
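To make the "just adding and multiplying complex numbers" point concrete, here is a minimal sketch of a radix-2 FFT in Python. It is only an illustration, not an optimized routine: the function name `fft` and the 8-sample cosine test signal are my own choices, and it assumes the input length is a power of two.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT sketch; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of the even-indexed samples
    odd = fft(x[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # One complex multiply (the twiddle factor), then one add and one subtract.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

# Hypothetical test signal: one full cosine cycle across 8 samples.
samples = [math.cos(2 * math.pi * i / 8) for i in range(8)]
spectrum = fft(samples)
print([round(abs(v), 6) for v in spectrum])
```

For this cosine, the magnitude ends up concentrated in bins 1 and 7, which is the "frequency part" of the series.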

Thought about using MMX for it?

I presume you need the results for something else… have you also considered that you might saturate the bus, moving all that data?

I remember a guy on the OpenGL mailing list proposing the use of the graphics adapter for sound manipulation. A sound idea, but it wouldn’t work in practice for several reasons.

MMX is already considered; this is just to increase the effectiveness further. A few days ago I read that the GeForce3 might have a peak performance of several GFlops, so we thought it could be quite nice to use this as well. Especially for multiple FFTs of the same data with different lengths, the bus shouldn’t be a big problem. But anyway, thanks a lot.
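A rough way to put numbers on the bus question is to compare arithmetic against bytes moved. The sketch below is a back-of-the-envelope estimate under stated assumptions, not a measurement: it uses the common 5·N·log2(N) flop-count convention for a complex FFT of length N, assumes 8 bytes per sample (single-precision complex), and assumes the input is uploaded once while every result is read back. The function name `flops_per_byte` is made up for illustration.

```python
import math

def flops_per_byte(lengths, bytes_per_sample=8):
    """Estimate flops per transferred byte for several FFTs of one data set.

    Assumptions (not from any hardware spec): 5*N*log2(N) flops per FFT,
    one upload of the longest input, one download per result.
    """
    flops = sum(5 * n * math.log2(n) for n in lengths)
    uploaded = max(lengths) * bytes_per_sample
    downloaded = sum(lengths) * bytes_per_sample
    return flops / (uploaded + downloaded)

single = flops_per_byte([1024])
reused = flops_per_byte([1024, 512, 256])
print(single, reused)
```

Under these assumptions, reusing the uploaded data for several lengths raises the ratio only modestly, because every result still has to come back over the bus.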

Even though the GeForce3 can reach a few GFlops, they’re not the same kind of GFlops as on a standard CPU. The GeForce is a GPU, which is highly optimized for graphics and not intended to be a math unit the user can use like that.

You might be able to write a vertex program to do an FFT, and then read the data back in feedback mode.

This would work on any GeForce, but I think that feedback mode is always software, which uses the CPU instead of the card.

Not to mention that I don’t know if a vertex program could actually do an FFT.

Best to just optimize for your CPU as much as possible.


Additionally, a vertex program (or anything done on the graphics card) is probably going to have too limited precision for what you need to do.
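To get a feel for what limited precision costs, here is a small sketch that sums one DFT bin twice: once in ordinary double precision and once with every intermediate value rounded to a 32-bit float. This is only a software simulation under an assumption (that the card’s arithmetic is no better than 32-bit floats); it says nothing about what any particular card really does. The helper names `to_f32`, `round_c` and `dft_bin` are made up for illustration.

```python
import cmath
import math
import struct

def to_f32(v):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", v))[0]

def round_c(z):
    """Round both parts of a complex number to 32-bit floats."""
    return complex(to_f32(z.real), to_f32(z.imag))

def dft_bin(x, k, rnd=lambda z: z):
    """Directly sum DFT bin k, applying rnd after every operation."""
    n = len(x)
    acc = 0j
    for i, s in enumerate(x):
        w = rnd(cmath.exp(-2j * cmath.pi * k * i / n))
        acc = rnd(acc + rnd(rnd(complex(s)) * w))
    return acc

n = 4096
signal = [math.cos(2 * math.pi * 3 * i / n) for i in range(n)]
exact = dft_bin(signal, 3)             # double precision throughout
reduced = dft_bin(signal, 3, round_c)  # rounded to 32 bits at each step
error = abs(reduced - exact)
print(error)
```

Whether an error of that size matters depends on the application, and anything coarser than 32-bit floats would be worse.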

For a fast FFT, you can read some of the articles on Intel’s website about getting the most out of Intel processors. Intel also gives away for free the fastest ever library for loading JPEGs. If your application uses JPEGs (presumably it does), don’t hesitate to try it.

I’d also like to point out that a 3DNow! enhanced FFT routine (actually a 3DNow! enhanced DSP library in general) is also available from AMD.

Thanks a lot for your opinions, which shed some light on my missing knowledge (therefore a void ;-)).

We are already optimizing for Intel, AMD, Sun and Alpha, so this was just some additional thought.

There are also Intel-optimized libraries for FFT and practically any other computation you’d like to perform. I believe their latest incarnation is called something like “Intel Performance Primitives”.

You also might want to check out the Fastest Fourier Transform in the West: