General-purpose programs executed on the GPU

Just an idea, and it’s probably up to the hardware more than OpenGL, but since the hardware can already execute all kinds of vertex and fragment programs, why not simple general-purpose programs as well? Of course with a limited instruction set and size, but still…

The point is that several important algorithms require reading back the frame, depth, or stencil buffer. Such algorithms include, for example, selecting the resolution for adaptive shadow maps and Brabec’s single sample soft shadows. The readback takes so long that even the simplest algorithms depending on it are unsuitable for real-time use. But if those programs could be executed on the GPU, they’d have direct access to the framebuffer, so only a minimal amount of data would have to move across the bus.
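
To put rough numbers on the readback argument, here is a back-of-envelope sketch. Every figure in it (buffer size, bytes per pixel, the effective readback rate) is an assumed placeholder, not a measurement of any real card or bus; the only point is the ratio between moving the whole buffer and moving a handful of values computed on the GPU.

```python
# Back-of-envelope comparison; every number here is an assumed placeholder.
WIDTH, HEIGHT = 1024, 768          # assumed framebuffer size
BYTES_PER_PIXEL = 4                # assumed 32-bit depth or RGBA values
READBACK_BYTES_PER_SEC = 100e6     # hypothetical effective readback rate
FRAME_BUDGET_SEC = 1.0 / 60.0      # time available per frame at 60 Hz


def transfer_seconds(num_bytes, bytes_per_sec=READBACK_BYTES_PER_SEC):
    """Time to move num_bytes across the bus at the assumed rate."""
    return num_bytes / bytes_per_sec


# Case 1: the whole buffer is read back so the CPU can analyse it.
full_readback = WIDTH * HEIGHT * BYTES_PER_PIXEL
# Case 2: the analysis runs on the GPU and only a few values come back.
reduced_result = 4 * BYTES_PER_PIXEL

if __name__ == "__main__":
    for label, size in (("full readback", full_readback),
                        ("reduced result", reduced_result)):
        t = transfer_seconds(size)
        share = t / FRAME_BUDGET_SEC
        print(f"{label}: {size} bytes, {t * 1e3:.4f} ms, "
              f"{share:.2%} of a 60 Hz frame")
```

Under these assumed numbers the full readback alone takes longer than one 60 Hz frame, while the reduced result is negligible. Change the constants to match whatever hardware you actually have.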

Comments?

-Ilkka

I really like the idea of putting your own programs on the GPU.

It’s the ultimate OpenGL dream for me: being able to write your own software rasterizer that uses the hardware power of the graphics card, without writing a specific driver for each card!

OK, so Brabec’s shadows can be done in an advanced fragment shader. But I’ve got a new one: pre-warping for relief textures. It’s really a very simple algorithm, but the texture upload time takes away the advantage. Doing it on the GPU would solve that problem.

-Ilkka

Put it in a fragment program. Rather than doing the texture access, just dynamically compute the texture color at the given texture coordinate.
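
As a rough illustration of computing the color instead of fetching it, here is a CPU stand-in written in Python rather than in a real fragment-program language. The function names and the checkerboard pattern are made up for this sketch; a real fragment program would do the equivalent arithmetic per fragment on the GPU.

```python
def checker_color(u, v, squares=8):
    """Return an RGB color computed from texture coordinates (u, v) in [0, 1).

    No texture is stored or uploaded: the color is a pure function of (u, v).
    """
    # Which square of the checkerboard does this coordinate fall in?
    iu = int(u * squares)
    iv = int(v * squares)
    # Alternate white and black squares.
    if (iu + iv) % 2 == 0:
        return (1.0, 1.0, 1.0)
    return (0.0, 0.0, 0.0)


def shade(width, height):
    """CPU stand-in for running the 'fragment program' once per pixel."""
    return [
        [checker_color((x + 0.5) / width, (y + 0.5) / height)
         for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    # Print a small 8x8 image as text: '#' for white, '.' for black.
    for row in shade(8, 8):
        print("".join("#" if c[0] > 0.5 else "." for c in row))
```

Since each pixel depends only on its own coordinates, nothing has to be uploaded per frame.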

You mean the relief textures? If I’ve understood correctly, doing them in a mere fragment program would involve a search for each fragment in order to get the parallax right. I think the pre-warp technique was created to get rid of this burden, and it would work substantially faster than the inverse warp method if only you could get rid of the texture upload time.
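
To show what I mean by a per-fragment search, here is a simplified 1D sketch in Python. It is a plain linear march along the view ray, used only for illustration; it is not the algorithm from the relief texture work, and the names (`height_at`, `inverse_warp_search`, `du_per_depth`) and the depth convention are my own assumptions.

```python
def height_at(heights, u):
    """Nearest-sample lookup into a 1D depth profile, with u clamped to [0, 1].

    Assumed convention: each value is the surface depth below the top plane,
    0.0 = at the top plane, 1.0 = deepest.
    """
    u = min(max(u, 0.0), 1.0)
    index = min(int(u * len(heights)), len(heights) - 1)
    return heights[index]


def inverse_warp_search(heights, u_entry, du_per_depth, steps=32):
    """March a view ray from depth 0 to depth 1 and return the texture
    coordinate where it first reaches the surface.

    u_entry      -- coordinate where the ray enters the top plane
    du_per_depth -- how far the ray slides in u per unit of depth
    """
    for i in range(steps + 1):
        depth = i / steps
        u = u_entry + du_per_depth * depth
        # The ray has hit once it is at or below the stored surface depth.
        if depth >= height_at(heights, u):
            return u
    # No hit found (only possible if stored depths exceed 1): use exit point.
    return u_entry + du_per_depth


if __name__ == "__main__":
    profile = [0.5] * 8   # flat surface halfway down
    print(inverse_warp_search(profile, u_entry=0.2, du_per_depth=0.4))
```

Every fragment pays for up to `steps + 1` lookups here, which is the burden I mean.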

-Ilkka

A general rule of thumb that complicates mapping general-purpose computations onto the GPU:

  1. Transistors are cheap
  2. Communication is expensive

(Kurt Akeley’s words, not mine.)

Any time you require explicit communication from one part of the rendering path to another (with reading back the framebuffer across the AGP bus being a particularly nasty case), performance is lost.

A general-purpose computation can be made efficient only if it explicitly takes advantage of the parallelism provided by the GPU. One example is Mark Harris’s physically-based visual simulation techniques on the GPU (look for his home page at UNC).
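
As a small sketch of what "taking advantage of the parallelism" can look like, here is a pairwise reduction emulated on the CPU in Python. This is a generic pattern chosen for illustration, not Mark Harris’s technique, and the function names are made up. Within a pass, each output depends on only two inputs, so on a GPU all outputs of one pass could be computed at the same time, and only the final value would need to leave the card.

```python
def reduce_pass(values, combine):
    """One pass: each output element combines two neighboring inputs.

    Outputs are independent of each other, so a GPU could compute them
    all in parallel. Here they are simply computed in a loop.
    """
    out = []
    for i in range(0, len(values), 2):
        if i + 1 < len(values):
            out.append(combine(values[i], values[i + 1]))
        else:
            out.append(values[i])   # odd element carries over unchanged
    return out


def parallel_reduce(values, combine=max):
    """Repeat reduce_pass until one value is left.

    Returns (result, number_of_passes).
    """
    if not values:
        raise ValueError("cannot reduce an empty sequence")
    values = list(values)
    passes = 0
    while len(values) > 1:
        values = reduce_pass(values, combine)
        passes += 1
    return values[0], passes


if __name__ == "__main__":
    depth_samples = [3, 9, 2, 7, 5]
    print(parallel_reduce(depth_samples))
```

The number of passes grows with the logarithm of the input size, so 1024 values need 10 passes rather than 1023 sequential steps.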

Eric

I guess a VPU (like the Wildcat VP) should support this feature.

Originally posted by JustHanging:
Just an idea, and it’s probably up to hardware more than OpenGL, but since the hardware already can execute all kinds of vertex and fragment programs, why not simple general purpose programs as well?

See
http://www.cs.washington.edu/homes/oskin/thompson-micro2002.pdf

for an interesting step in this direction.

(Thompson/Hahn/Oskin, Using Modern Graphics Architectures for General-Purpose Computing: A Framework and Analysis, 2002)