HW Raytracing with OpenGL?

Hello coders…

I was just wondering if anyone has an idea of how complicated it would be to implement raytracing algorithms with hardware acceleration? Perhaps this should become something like an OpenGLR or OGL3 standard in the future?

Since GL2.0 isn't even finished yet, I think talking about 3.0 is a bit premature. Also, ray tracing is not very easy to implement in hardware.

Is it time for the monthly “OpenGL should support raytracing” suggestion already? My, how time flies.

I am not so sure it is premature. Technology is advancing so fast that a hardware raytracer deserves serious consideration. First of all, we already simulate certain aspects of raytracing when producing shadows and when doing projective texture mapping.

A few years ago I was trying to make a fake raytracing engine for a Doom-like game, and it worked fine for X and Z. If it were hardware accelerated, we could introduce the Y axis and speed up the process.

I mean, it doesn't have to be a high-end raytracer like the ones professional 3D applications use, but something sophisticated enough to be applied to games.

Originally posted by Korval:
Since GL2.0 isn't even finished yet, I think talking about 3.0 is a bit premature. Also, ray tracing is not very easy to implement in hardware.

The reason that scan conversion is the method of choice for hardware is that it is simple to implement efficiently. Ray tracing can be implemented in hardware, but the simple way of doing so doesn’t really result in any real speed-up compared to software. Trying to make ray tracing efficient in hardware is the problem, and it hasn’t been solved yet.

Oh, and, since you've actually done some real ray tracing, you'll know that one of its primary advantages is the ability to use any arbitrary primitive, not just triangles. Any reasonable hardware ray tracing solution has to have programmable intersection and primitives, which doesn't help efficiency.
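For illustration only, here is the sort of per-primitive routine a programmable ray tracer would plug in, something a triangle rasterizer has no slot for. The function name and layout are made up for this sketch, not taken from any existing API:

    #include <math.h>

    /* Sketch: analytic ray/sphere intersection, the kind of per-primitive
       test a programmable ray tracer could support directly.
       Returns 1 and writes the nearest hit distance to *t; dir must be
       normalized. */
    int raySphere(const float org[3], const float dir[3],
                  const float center[3], float radius, float *t)
    {
        float oc[3] = { org[0] - center[0], org[1] - center[1], org[2] - center[2] };
        float b    = oc[0]*dir[0] + oc[1]*dir[1] + oc[2]*dir[2];
        float c    = oc[0]*oc[0] + oc[1]*oc[1] + oc[2]*oc[2] - radius*radius;
        float disc = b*b - c;

        if (disc < 0.0f)
            return 0;                           /* ray misses the sphere            */

        *t = -b - (float)sqrt(disc);            /* nearer root                      */
        if (*t < 0.0f)
            *t = -b + (float)sqrt(disc);        /* origin inside: take the far root */
        return *t >= 0.0f;
    }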

Oh, but I wouldn’t agree.

Take tangential calculations, for example: they have to be computed for each texel, then a derivative taken to check the reflection, etc… So the very same mechanism that speeds up triangle rendering in hardware will be able to speed up tangential calculations as well.

Also, the primitives can remain the same as in standard OpenGL. I mean, you have to describe the world to the renderer as simply as possible. Primitives get their spatial coordinates recalculated and their projection adjusted, then you TRACE-BACK to the viewpoint to define angle/radius, and then TRACE-FORWARD toward the light points or other objects.
Primitives can be designated as shiny, reflective, etc., via a binary switch with an amount threshold, so the renderer will take only those primitives into account when tracing.

The maximum number of reflections and refractions could be set up with something like glSetRayRate(refl, refr);
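Purely as a sketch of the kind of interface I mean (none of these tokens or entry points exist in any real OpenGL version or extension), usage might look something like:

    /* Hypothetical extension: GL_RAY_TRACE and glSetRayRate are invented
       for this sketch and are not part of any real OpenGL specification. */
    glEnable(GL_RAY_TRACE);                      /* switch the traced pass on          */
    glSetRayRate(2 /* refl */, 1 /* refr */);    /* cap reflection/refraction depth    */

    glMaterialf(GL_FRONT, GL_SHININESS, 64.0f);  /* ordinary GL material state         */
    drawScene();                                 /* primitives stay plain GL triangles;
                                                    drawScene() is just a placeholder  */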

But I am suggesting a ray tracer that is not quite like those big professional raytracers, but rather a simplified version that could easily get a hardware implementation, and that will improve realism with shadows, lights, transparencies, reflections, etc…

What if the user decides to send a million triangles to be rendered in one scene? What if they decide to send a billion? Or a trillion?

Because ray tracing requires the entire scene to be defined before beginning the rendering process, a hardware implementation would simply fail at some point because it has a finite amount of memory to store everything in. So then limits must be set as to how complex a scene could be. These limits would vary by graphics card, and there isn’t any way to get around them.

So imagine that you are playing a game on your RayForce7, when your buddy walks around the corner. All of a sudden, the game stops and politely tells you that it's out of video card memory because there are too many triangles in the scene, then quits.

OpenGL is at heart an immediate mode API (in other words, rendering of a scene is not dependent on having the entire scene in memory at once), and it should stay as such. Ray tracing is a completely different beast to implement, thus the interface would be very different. If there ever is an API for hardware ray tracing, it shouldn’t and hopefully wouldn’t be a version of OpenGL.

j

Raytracing is very overrated. Most if not all CG movies (Final Fantasy, Shrek, Toy Story, Monsters Inc.) are not raytraced. Before v11 (which came out this month), the Pixar PRMan renderer (used in most of these movies) did not support raytracing or global illumination. There are two cases where you need raytracing: global illumination and realistic reflections and refractions. GI can be faked by placing lights around the scene. Reflections/refractions can be faked with environment maps. Not perfect, but I doubt these two features can justify the much higher complexity and the huge speed hit of a raytracing solution. There are some promising implementations of GI using spherical harmonics that work on current hardware.
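For instance, the environment-map fake is only a few lines of fixed-function GL. This assumes a cube map texture has already been created and bound; drawReflectiveObject() is just a stand-in for whatever issues the geometry:

    /* Fake reflections with a cube environment map (GL 1.3 / ARB_texture_cube_map).
       Assumes a cube map has already been uploaded and bound on this unit. */
    glEnable(GL_TEXTURE_CUBE_MAP);
    glTexGeni(GL_S, GL_TEXTURE_GEN_MODE, GL_REFLECTION_MAP);
    glTexGeni(GL_T, GL_TEXTURE_GEN_MODE, GL_REFLECTION_MAP);
    glTexGeni(GL_R, GL_TEXTURE_GEN_MODE, GL_REFLECTION_MAP);
    glEnable(GL_TEXTURE_GEN_S);
    glEnable(GL_TEXTURE_GEN_T);
    glEnable(GL_TEXTURE_GEN_R);

    drawReflectiveObject();      /* placeholder for the actual geometry code */

    glDisable(GL_TEXTURE_GEN_S);
    glDisable(GL_TEXTURE_GEN_T);
    glDisable(GL_TEXTURE_GEN_R);
    glDisable(GL_TEXTURE_CUBE_MAP);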

Take tangential calculations, for example: they have to be computed for each texel, then a derivative taken to check the reflection, etc… So the very same mechanism that speeds up triangle rendering in hardware will be able to speed up tangential calculations as well.

Tangential calculations? What are those?

When I say that ray tracing is not easy to implement efficiently in hardware, I'm not speaking from just thinking about the problem myself. I'm speaking from reading about various attempts at hardware ray tracers and about the relative inefficiencies in them. So far, no one has found a way to implement hardware ray tracing efficiently.

Also, the primitives can remain the same as in standard OpenGL.

And therefore defeats one of the primary advantages of a ray tracer.

Primitives can be designated as shiny, reflective, etc., via a binary switch with an amount threshold, so the renderer will take only those primitives into account when tracing.

That’s not much of a ray tracer, if only the marked objects are going to be reflective/refractive.

that will improve realism with shadows, lights, transparencies, reflections

The only way to really improve those over what you can get with current graphics cards is to use “big professional raytracers”. Your hack ray tracer isn’t going to get the job done; all it will do is provide the same image a scan converter could do with lower speed.

GI can be faked by placing lights around the scene. Reflections/refractions can be faked with environment maps. Not perfect, but I doubt these two features can justify the much higher complexity and the huge speed hit of a raytracing solution.

If you’re looking for perfect photorealism, then they do justify the greater complexity and time cost. Scan conversion is just a progressive series of hacks; successive approximation applied to photorealism. It is never going to be “right”, only merely “close enough”.

There was a paper from Stanford that dealt with raytracing in programmable hardware (read NV30) located here: http://graphics.stanford.edu/papers/rtongfx/

I was up for an hour last night with an idea for a rendering workstation. Current limitations in chipset design allow only 1 AGP card per computer, but by designing a new workstation around the AMD Opteron server chip we could use up to 18 AGP cards with current hardware. I chose the Opteron (4-8 way) instead of the Athlon 64 (1-2 way) chip (formerly Clawhammer) for several reasons:

  1. Double the memory bandwidth
  2. Double the HyperTransport bandwidth

All of its I/O is handled by 3 HyperTransport links, each at up to 6.4 GB/s. Without using any custom chips we could install 3 AMD-8151™ HyperTransport™ AGP3.0 Graphics Tunnel chips (1 on each channel). Each 8151 chip controls one AGP card. The only reason I used just one 8151 chip per channel is that the upstream connection (for the 8151) is 16 bits (6.4 GB/s) while the downstream connection is only 8 bits (1.6 GB/s). We could use a second 8151 per channel with a slight loss of performance (for a total of 6 AGP cards). Throw in a HyperTransport switch (which breaks a 32-bit HyperTransport channel into multiple smaller channels) and this can blow way past 18 cards. All of this (except the motherboard design itself) can be built with hardware available in the first half of 2003.
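Rough arithmetic behind those link figures, assuming the 800 MHz double-pumped link clock of the first Opterons (the clock value is my assumption, not something guaranteed above):

    /* Back-of-the-envelope HyperTransport link bandwidth.
       Assumes an 800 MHz, double-data-rate link clock (HT 1.x era). */
    double htLinkGBps(int widthBits, double clockMHz)
    {
        double transfersPerSec = clockMHz * 1e6 * 2.0;     /* DDR: 2 transfers/clock */
        return transfersPerSec * (widthBits / 8.0) / 1e9;  /* bytes/sec -> GB/s      */
    }

    /* htLinkGBps(16, 800) ~ 3.2 GB/s per direction (6.4 GB/s both ways)
       htLinkGBps( 8, 800) ~ 1.6 GB/s per direction                       */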

Understand that HyperTransport channels are arranged in a daisy-chain fashion and, for my idea, are not limited in length (the actual limit is 31 devices). One of the channels would have an AMD-8111™ HyperTransport™ I/O Hub on the end to handle all I/O functions.

Now if someone like SGI would design/build a new HyperTransport™ AGP3.0 Graphics Tunnel that would open up the upstream and downstream limits to the max value allowed by the Opteron chip, which is 32 BITS!!!

I realize that this is over simplified and there are issues to deal with in software, but the hardware is in place.

Eric Teague
Spudmanwp@yahoo.com

[This message has been edited by Spudman (edited 12-04-2002).]

Ray tracing strikes me as radically better than the array of tricks you are forced to play (environment maps, cube maps, shadow maps, stencil shadow approaches, etc.) with today's hardware. Clearly ray tracing is not possible with today's hardware, but when it is, it will dominate.

Also, all the animated titles below would look better had they been ray traced. Just because Pixar hadn't used ray tracing doesn't mean it isn't better. It's hardly overrated.

Originally posted by GeLeTo:
Raytracing is very overrated. Most if not all CG movies (Final Fantasy, Shrek, Toy Story, Monsters Inc.) are not raytraced. Before v11 (which came out this month), the Pixar PRMan renderer (used in most of these movies) did not support raytracing or global illumination. There are two cases where you need raytracing: global illumination and realistic reflections and refractions. GI can be faked by placing lights around the scene. Reflections/refractions can be faked with environment maps. Not perfect, but I doubt these two features can justify the much higher complexity and the huge speed hit of a raytracing solution. There are some promising implementations of GI using spherical harmonics that work on current hardware.

Ray tracing strikes me as radically better than the array of tricks you are forced to play (environment maps, cube maps, shadow maps, stencil shadow approaches, etc.) with today's hardware. Clearly ray tracing is not possible with today's hardware, but when it is, it will dominate.

As I've pointed out before, ray tracing is not conducive to a fast hardware implementation. Ray tracing is, fundamentally, an asynchronous process. Even multiprocessed hardware implementations of ray tracers have either been fairly bereft of features or not much faster than modern processors (and still, in most cases, lacking a ray tracer's versatility).

Guess you’ve all seen this?
http://www.saarcor.de/

Originally posted by nutball:

Guess you’ve all seen this?
http://www.saarcor.de/

Quite interesting. But there are a few glitches:

  1. You need an axis-aligned BSP of the whole scene. When something changes, you have to rebuild the affected BSP nodes. This is very slow (especially when the changed geometry spans a root node) and cannot be done efficiently in hardware.

  2. The architecture requires good ray coherence, e.g. lots of rays traverse the same BSP nodes and intersect the same triangles (hence the advertised low bandwidth requirements). Not good if you have lots of pixel-sized triangles, which will happen a lot when games start to use HOS + displacement mapping or just denser geometry. Also not good for things like reflections/refractions from bumpy surfaces.

  3. I don't see how this architecture can be less complex than the classic hardware implementation: they replace the very simple triangle rasterization + depth compare unit with a bunch of parallel raytracing units that perform BSP traversal and triangle intersections (roughly the loop sketched below this list).
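To make #1 and #3 concrete, here is roughly the per-ray work such a traversal unit performs, written in software; the node layout and names are my own sketch, not taken from the SaarCOR paper:

    /* Simplified axis-aligned BSP (kd-tree) node and front-to-back ray traversal,
       the kind of loop each traversal unit runs per ray.
       intersectLeaf() is stubbed out; a real tracer tests the leaf's triangles. */
    typedef struct KDNode {
        int            axis;       /* 0=x, 1=y, 2=z; -1 marks a leaf        */
        float          split;      /* splitting plane position along 'axis' */
        struct KDNode *child[2];   /* below/above the plane                 */
        int           *tris;       /* triangle indices, used only in leaves */
        int            numTris;
    } KDNode;

    static int intersectLeaf(const KDNode *leaf, const float org[3],
                             const float dir[3], float tmin, float tmax)
    {
        (void)leaf; (void)org; (void)dir; (void)tmin; (void)tmax;
        return 0;                  /* stub: ray/triangle tests would go here */
    }

    /* Returns nonzero on a hit; ignores dir[axis] == 0 for brevity. */
    int traverse(const KDNode *n, const float org[3], const float dir[3],
                 float tmin, float tmax)
    {
        if (n->axis < 0)
            return intersectLeaf(n, org, dir, tmin, tmax);

        float t    = (n->split - org[n->axis]) / dir[n->axis];
        int   near = (org[n->axis] < n->split) ? 0 : 1;

        if (t >= tmax || t < 0.0f)                 /* only the near child matters */
            return traverse(n->child[near], org, dir, tmin, tmax);
        if (t <= tmin)                             /* only the far child matters  */
            return traverse(n->child[1 - near], org, dir, tmin, tmax);

        if (traverse(n->child[near], org, dir, tmin, t))
            return 1;                              /* hit in near child: done     */
        return traverse(n->child[1 - near], org, dir, t, tmax);
    }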

This approach has a few advantages: the capability to calculate reflections/refractions and shadows, and zero overdraw when calculating the shading. But because of glitches #1 and #2 it is unsuitable for rendering anything other than static and relatively (by 2003 standards) simple scenes.

All the other disadvantages of standard raytracing still apply: lots of context switching, needing the whole scene in memory at once, etc.

[This message has been edited by GeLeTo (edited 02-06-2003).]

Instead of using triangles, they should have gone with some form of spline patch (best case: NURBS). That way, it's a naturally curved surface that automatically has lots of spatial locality and, therefore, better cache performance. Unless you have lots of wrinkles (which is what bump maps are for), you can model highly complex characters with far fewer spline patches.

[This message has been edited by Korval (edited 02-06-2003).]