Extreme use of GPU...

Hello

I want to apologise in advance if this post sounds stupid, although I think it’s an interesting question…

We all know that the graphics processor can take a lot of work off the main processor, so we can achieve much better performance with hardware-accelerated rendering. But that is only about graphics, since the GPU was optimized for graphics operations, right?

So my question is this: can we use the GPU for other kinds of data processing? For example, not graphics but sound. It’s a matter of taking control of the GPU to make everything run faster. I know I can’t just access the GPU the way I can the main processor. But what if I could do some tricks…

Let’s say I have some sound data and I want to apply a filter to it. Now don’t laugh at me… What if I pass this data to the GPU as (for example) a texture, apply some filters or operations, then get it back and use it as sound again? Is that possible?

Is there anything in OpenGL that can do such a trick?

Orzech

PS. I am not joking. I have been thinking about this for a long time. Please be honest.

The GPU is optimized to work with normalized float4s, and the GPU’s clock frequency is a lot lower than the CPU’s. I remember someone mentioning that a vertex program running in software mode on a P4 actually executes faster if you have free CPU time left over from physics and such (500 MHz vs. 3.06 GHz).

Hi,

You can definitely use the GPU for whatever purpose you want. For example, Prof. John Hart at UIUC has already used it for several purposes. Here is a link to his papers; the following paper in particular explains how they used the GPU for matrix multiplication (which you could use for filtering).
http://basalt.cs.uiuc.edu/~jch/papers/

Cache and Bandwidth Aware Matrix Multiplication on the GPU
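To make the filtering connection concrete (this is just my illustration, not code from that paper): an FIR filter is a linear operation, so filtering a block of samples is a matrix-vector product y = H x, where each row of H holds the filter taps shifted along by one sample. On the CPU that mapping looks like:

#include <cstddef>
#include <vector>

// Illustration only (not from the paper above): FIR filtering written so the
// loops mirror the matrix-vector product y = H * x, with H[n][n-k] = h[k].
std::vector<float> firFilter(const std::vector<float>& h,   // filter taps
                             const std::vector<float>& x)   // input samples
{
    std::vector<float> y(x.size(), 0.0f);
    for (std::size_t n = 0; n < x.size(); ++n)               // row n of H
        for (std::size_t k = 0; k < h.size() && k <= n; ++k)
            y[n] += h[k] * x[n - k];
    return y;
}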

-alark

Originally posted by M/\dm/:
The GPU is optimized to work with normalized float4s, and the GPU’s clock frequency is a lot lower than the CPU’s. I remember someone mentioning that a vertex program running in software mode on a P4 actually executes faster if you have free CPU time left over from physics and such (500 MHz vs. 3.06 GHz).

Yes, but you can use the floating-point fragment side on the newer cards for similar purposes; the main CPU would have a hard time competing with that. The problem is that there would be a ‘granularity’ to specifying varying parameters (akin to csound’s control (k) rate), dependent on the triangle/primitive density.

I’ve been thinking about experimenting with the GPU for processing audio for a while now too. I think it’s feasible, but you certainly won’t have the same flexibility as a general-purpose CPU - it will be faster at some things, though…

mtm

If I understood the specs right, the upcoming superbuffers extension will allow you to send arbitrary data to the GPU, have it transformed (e.g. by a vertex shader), and then store the result in another (or the same) array. That array can of course be read back by the application, which is exactly what you need.
But until that extension gets implemented, you may have to do it another way.
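One such way is the texture route from the original question: upload the samples as a texture, run a fragment program over them, and read the result back. A rough sketch of just the upload step (my own illustration; the filter program and the readback are omitted):

#include <GL/gl.h>

// Illustration only.  The 1-D sample stream is packed into a width x height
// 2-D texture, since the GPU prefers two-dimensional data.  Note that with a
// plain GL_LUMINANCE internal format the driver will quantize to 8 bits; a
// float-texture extension (where available) would be needed to keep full
// precision.
GLuint uploadAudioAsTexture(const float* samples, int width, int height)
{
    GLuint tex = 0;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, width, height, 0,
                 GL_LUMINANCE, GL_FLOAT, samples);
    return tex;
}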

Jan.

Originally posted by tweakoz:
I’ve been thinking about experimenting with the GPU for processing audio for a while now too. I think it’s feasible, but you certainly won’t have the same flexibility as a general-purpose CPU - it will be faster at some things, though…
mtm

This isn’t as stupid as it sounds. I have a demo called GPUsynth that does simple sound synthesis on the GPU. It’s easy to do additive/FM synthesis using sin/cos in a fragment program, and you can also do wavetable/granular synthesis using texture lookups. Implementing filters is a little trickier. The GPU is really designed to process 2D data, whereas sound is inherently 1D. Also, most DSP algorithms are iterative, where the output depends on previous results.
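For a flavour of the additive case, here is a minimal sketch (not the actual GPUsynth code; written in GLSL for readability, whereas a demo from that era would more likely use Cg or ARB_fragment_program, and sampleRate/baseFreq are made-up uniform names). Each fragment computes one sample, with its x coordinate mapped to time:

// Minimal sketch, not the GPUsynth source: an additive-synthesis fragment
// shader stored as a string for glShaderSource().  Eight sine partials with
// 1/n amplitudes approximate a sawtooth; a second texture coordinate would be
// needed to handle the 2-D buffer layout mentioned above.
const char* additiveSynthFS =
    "uniform float sampleRate;                                         \n"
    "uniform float baseFreq;                                           \n"
    "void main()                                                       \n"
    "{                                                                 \n"
    "    float t = gl_FragCoord.x / sampleRate;   // pixel -> time     \n"
    "    float s = 0.0;                                                \n"
    "    for (int n = 1; n <= 8; ++n)             // 8 harmonics       \n"
    "        s += sin(6.2831853 * baseFreq * float(n) * t) / float(n); \n"
    "    gl_FragColor = vec4(s, s, s, 1.0);       // float buffer out  \n"
    "}                                                                 \n";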

At the moment my demo renders to a float buffer and then does a glReadPixels to copy back to a DirectSound buffer, but in the future it may be possible to render directly to a hardware sound buffer (kind of like render to vertex array).
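And the readback side, roughly (again just a sketch of what the description above implies, assuming a float render target is current and identity matrices; not the demo’s actual code):

#include <GL/gl.h>

// Sketch only.  Draw one full-screen quad so the fragment program runs once
// per sample, then pull the float results back to system memory.  The caller
// would copy 'samples' into a DirectSound (or other) buffer afterwards.
void renderAndReadBack(int width, int height, float* samples /* width*height floats */)
{
    glViewport(0, 0, width, height);

    glBegin(GL_QUADS);                       // assumes identity matrices
        glVertex2f(-1.0f, -1.0f);
        glVertex2f( 1.0f, -1.0f);
        glVertex2f( 1.0f,  1.0f);
        glVertex2f(-1.0f,  1.0f);
    glEnd();

    // This stalls until the GPU has finished - the latency cost rather than
    // the bandwidth cost.
    glReadPixels(0, 0, width, height, GL_RED, GL_FLOAT, samples);
}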

Originally posted by simongreen:

At the moment my demo renders to a float buffer and then does a glReadPixels to copy back to a DirectSound buffer, but in the future it may be possible to render directly to a hardware sound buffer (kind of like render to vertex array).

Nice. I am glad I am not the only one thinking about such “strange” uses of the GPU. You said you use glReadPixels to copy the converted data back. Is that the only way to get the data back? I am asking because I know how slow buffer-transfer commands can be. I was thinking of using the GPU to help out when the CPU isn’t fast enough. There are some extreme sound applications that take almost all of the CPU’s power. For example, have you ever seen guitar effects run through a computer? You plug your guitar into the soundcard, choose a suitable effect, the CPU applies various filters to the incoming sound, and you hear what you are playing through the speakers (all in real time, of course). In that case even a small delay is a tragedy.

Tweakoz says that GPU calculations won’t give results as precise as the CPU’s. That’s surely true, but in this case the GPU could be used just for some pre-calculation, and overall performance should still increase.

Orzech

Won’t the new PCI spec allow both upstream and downstream bus traffic without hazards?

The whole reason AGP was invented was that it’s faster for hardware to ignore caching/bus-snooping hazards. I doubt the new graphics standard will suddenly force snooping on every bus peer for a high-performance graphics bus.

Can a GPU be used for “other things”? Yes. Is it a good idea? Not very often.

Originally posted by Orzech:
You said you use glReadPixels to copy the converted data back. Is that the only way to get the data back? I am asking because I know how slow buffer-transfer commands can be.

If there’s one thing that isn’t a problem with doing audio on the GPU, it’s readback speed. With a 44.1 kHz sample rate and stereo float data, it only adds up to about 352 Kbytes/sec - even PCI can handle that. Latency is more of a problem, since the GPU likes to handle data in big chunks.
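(That is 44,100 samples/s × 2 channels × 4 bytes per float sample = 352,800 bytes/s.)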


6.1 is where it’s at these days.

I checked the link alark posted and found something about scientific calculations using the GPU. I think it’s enough to start doing something along these lines…

As a frequent reader, I have seen this question (and its answers) very often.
To answer it in a general way, one has to understand the pipelined structure of GPUs.
Yes, the GPU can perform many operations much faster than the CPU.
The bad news:
No, you cannot get the results of these operations back quickly.
I.e.: a+b is faaaast; c=a+b, where you want c back on the CPU, is slooooow.
Matt from NVIDIA described the problem once (his words, modified by me):
Imagine a wood upriver where trees are cut and thrown into the river for transport.
Along the river there are many factories that pick up what comes down the river, process it, and send it further downstream.
At the end of the river the trees have turned into beautiful pieces of furniture.
Now, the owner of the wood wants the furniture sent back to him the same way (upriver).
You have (at least) two problems:
! You need power to drive the furniture against the river’s flow.
!!! To make sure the items coming down the river do not damage your beautiful furniture going upstream, you have to stop anything from going downstream.
Also, you have to make sure that all the parts have arrived at the final factory and the furniture you ordered is complete before you ask them to send back whatever they have by now.
In OpenGL terms, this means flushing the pipeline before any glReadPixels or similar call.
It has been suggested that a second (upstream) pipeline be provided just for getting the GPU’s results back.
I consider this an attempt to turn a clock into a time-travelling machine.
A GPU is meant to transform a formal description of a picture into a visible picture.
Anything else built into a GPU will reduce its ability to do the one job it was designed for, given the same amount of hardware.
The good news:
YES!!! You can use the GPU to perform many operations much faster than the CPU can, and YES!!! you can access the results as fast as they are produced (many millions per second).
The trick is not to send the results back upriver but to extend the river.
In the case of audio processing, connect a loudspeaker to the video output of your graphics card, so to speak.
I am a hardware engineer specializing in designing interfaces between pieces of equipment that have never spoken to each other before.
I know you cannot connect a loudspeaker directly to a graphics card, but with my little interface boxes, you can.
The audio interface is, of course, only one example.
Another is to connect the output of the graphics card to the input of a fast A/D converter and then do whatever you want with the data (again, using one of my little interface boxes).
I HEAR THEM SHOUT:
“This man must be an idiot, because he does not realize that the output of a graphics card is only 3 × 8 bits, and a reasonable resolution for audio is at least 16 bits per channel.”
My answer to this is: supersampling and filtering.
On a modern graphics card, all calculations can be done at floating-point accuracy.
Only the very final result is scaled and clamped to the range and resolution of the output device (e.g. the D/A converter).
The output rate of a modern graphics card can easily exceed 100 MHz.
In the case of audio, even 96 kHz is more than enough for CD-quality sound.
Many methods are known for reducing the sampling rate while increasing the accuracy.
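The simplest one, just to illustrate the idea (my own sketch, with a plain block average standing in for a real decimation filter):

// Illustration only: trade output rate for resolution by averaging a block of
// oversampled 8-bit DAC values into one audio-rate sample.  With a dithered
// signal, averaging N values gains roughly 0.5*log2(N) bits of resolution.
float decimateBlock(const unsigned char* raw, int oversample)
{
    long sum = 0;
    for (int i = 0; i < oversample; ++i)
        sum += raw[i];
    return static_cast<float>(sum) / (static_cast<float>(oversample) * 255.0f); // 0..1 range
}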
If you are in need of some very special interface hardware not yet invented, feel free to contact me.
(Not cheap, not fast, only a prototype, but GOOD.)
P.S. Excuse my bad English; it’s not my native language.

Great post, Hofi! Thanks

In the case of audio processing, connect a loudspeaker to the video output of your graphics card, so to speak.

I love this idea - it’s really original. But the question is: does it really make sense? Would it be any faster to let the GPU do all the work rather than the CPU? I was thinking about helping the CPU, but in this situation it would be “unemployed”.

Another is to connect the output of the graphics card to the input of a fast A/D converter and then do whatever you want with the data (again, using one of my little interface boxes).

That also sounds interesting…

If you are in need of some very special interface hardware not yet invented, feel free to contact me.

I’ll remember. You have an interesting job, Hofi!

Bye


With PCI Express, the data-transfer problem will be solved in about a year anyway… then we can start abusing the GPU as a parallel processor alongside the CPU… finally

Originally posted by davepermen:
With PCI Express, the data-transfer problem will be solved in about a year anyway…
The pipelining and latency issues will most likely persist.
Also note that the PCI-Express implementations proposed for graphics cards offer less raw bandwidth than AGP8x, if I’m not mistaken.

then we can start abusing the GPU as a parallel processor alongside the CPU… finally
Let’s wait and see

Try this site
http://wwwx.cs.unc.edu/~harrism/gpgpu/index.shtml

That link is out of date. The new location of GPGPU is www.gpgpu.org

Just one question:

Isn’t the soundcard the part that is supposed to offer programmable, hardware-accelerated DSP processing?

Originally posted by cschueler:

Isn’t the soundcard the part that is supposed to offer programmable, hardware-accelerated DSP processing?

I guess that’s right, but it would be much more fun to do it on the GPU, wouldn’t it?