when will software renderers be viable?

Originally posted by gltester:
That’s only 77 pixels per troop

Don’t forget to factor in the mandatory 8 subpixel bits of precision (equiv of 256x FSAA with ATI’s crappy implementation) or more, once we finally ditch 24 bit framebuffers.

Also I think a lot of people in this thread are concentrating on only the APPEARANCE of the imagery and not the BEHAVIOR. It’s one thing to render 100,000 troops. It’s an entirely different thing to simulate the physics of them marching and fighting.

Given that accurate physics simulation can be just as computationally intensive as rendering (think: particle physics) wouldn’t you rather have the CPU doing that while the GPU concentrates on tasks it is suited for? Or perhaps you’d prefer buying a dual-proc machine, or waiting an extra 18 months?
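
In other words, something like this division of labour per frame (a minimal sketch with invented types and numbers, not anyone’s actual engine):

[code]
#include <cstdio>
#include <vector>

struct Vec3  { float x, y, z; };
struct Troop { Vec3 pos, vel; };

// CPU side: one crude Euler step per troop per frame. A real game would
// add steering, collision, formation-keeping, combat resolution, ...
void simulate(std::vector<Troop>& army, float dt) {
    for (Troop& t : army) {
        t.vel.y += -9.8f * dt;        // stand-in for the real forces
        t.pos.x += t.vel.x * dt;
        t.pos.y += t.vel.y * dt;
        t.pos.z += t.vel.z * dt;
    }
}

// Stand-in for handing the updated transforms to the graphics card.
void drawTroops(const std::vector<Troop>& army) {
    std::printf("submitting %zu troops to the GPU\n", army.size());
}

int main() {
    std::vector<Troop> army(100000, Troop{ {0.0f, 10.0f, 0.0f}, {1.0f, 0.0f, 0.0f} });
    for (int frame = 0; frame < 3; ++frame) {
        simulate(army, 1.0f / 60.0f);  // CPU busy with physics...
        drawTroops(army);              // ...while the GPU draws
    }
}
[/code]

Even that trivial inner loop is real work for 100,000 troops, before a single triangle is drawn.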

Specialized hardware isn’t going away anytime soon.

Think of it from a memory bandwidth perspective.

The Pentium 4 has about 6.0 GB/s of memory bandwidth. The best graphics cards claim to get close to 30.0 GB/s. And when the graphics card has that bandwidth to itself, the CPU is free for animation, AI and physics.
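
To put rough numbers behind that (the resolution, overdraw and per-fragment traffic below are assumptions, not measurements):

[code]
// Back-of-envelope fill-rate bandwidth. Every input is an assumption.
#include <cstdio>

int main() {
    const double pixels       = 1600.0 * 1200.0; // display resolution
    const double overdraw     = 4.0;             // fragments drawn per pixel
    const double bytesPerFrag = 32.0;            // colour + Z read/write + a few texel fetches
    const double fps          = 60.0;

    const double gbPerSecond = pixels * overdraw * bytesPerFrag * fps / 1.0e9;
    std::printf("%.1f GB/s of framebuffer/texture traffic\n", gbPerSecond); // ~14.7
    // Well beyond a ~6 GB/s CPU bus, comfortable on a ~30 GB/s card.
}
[/code]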

Personally, I think the next big step will come in production values, rather than glitz. Making sure humanoids move like humans, not like robots from a 40s SciFi movie; deriving subtle facial expressions from context; that kind of thing. Making sure that all art is following a common style, is similarly proportioned/dense, etc.

equiv of 256x FSAA with ATI’s crappy implementation

Huh? I thought it was generally accepted that ATi had very good antialiasing. They have that shifted-grid sampling thing going on. And the gamma-correct thing.

Also I think a lot of people in this thread are concentrating on only the APPEARANCE of the imagery and not the BEHAVIOR.

That’s a really good point. It takes movie animators weeks to get a 1-minute section of film together, for realistic movement. A game has exactly 1 minute, doled out in increments of roughly 33 ms or less, to figure out how to animate everything, collide it, etc.

Better to split up the tasks and parallelize.

Personally, I think the next big step will come in production values, rather than glitz.

I would say that time has always been here. I’ve found that games with good art look much better than games with <insert effect here>. More important than caustics and so forth is basic consistency and beauty.

gltester, I don’t disagree with you when you say eventually the CPU will be fast enough, but the fact is, the GPU can get there first and may take over the CPU’s functions.

I don’t find Doom3 satisfactory in terms of graphics. There is still much room for improvement there. Not to mention AI and physics.

One thing is for sure. When I look at CGI, I can tell it is CGI. Even those fancy movies aren’t photorealistic, except for certain shots.

When you can have a simulated physical world, and in it incidentally wander up to a fire engine, grab a hose and start blasting dirt apart stone by stone and walls brick by brick, meanwhile enjoying the entirely incidental rainbow caused by light refracting in the extraneous spray as you slip around in the mud trying to keep your footing, we’ll have gone a small way towards simulating physical reality in sufficient detail.

I think you underestimate the point at which people will say things are good enough; it is an old mistake. Even when things look and behave real, it won’t be good enough for all applications.

[This message has been edited by dorbie (edited 02-05-2004).]

Originally posted by Korval:
I thought it was generally accepted that ATi had very good antialiasing.

Off-topic for this thread, but not in my experience. Maybe it depends on your definition of “good”… compared with Quartz, or with raytracing where you can shoot an arbitrary number of rays through every pixel, it totally sucks.

>> you underestimate the point at which people will say things are good enough, it is an old mistake.

Completely true. I remember seeing screenshots and videos of the original Ultima Underworld, back in 1992, and saying to myself “Wow, there is no need to have better realtime graphics than THIS!” … I was incredibly wrong … And more than 10 years later, see this: http://www.pocket.at/pocketpc/ultimaunderworld.htm

Originally posted by Won:
CPU pipes are not really comparable to GPU pipes, so your exercise in multiplication doesn’t actually mean anything. For example, the CPU would have to serialize the various OpenGL stages (think: vertex v. primitive v. fragment &c) that execute in parallel on a GPU, so the GPU has a pretty big advantage there.

And who says even Intel would be able to make a GPU run at the speeds they get their CPUs to run? And are you counting the number of pins this package would require? Power? Cooling? Cost? Yield? Lots of issues to be addressed here.

You CAN design a processor that is parallel and pipelined enough (like the chaining on the Cray-1) to perform graphics computation efficiently. Vector or stream models come to mind. I’m guessing the IBM/Sony/Toshiba Cell processor is going to support something like this; not only are the individual processor elements pipelined, but you can build a pipeline out of the processor elements.

There are various things (such as depth buffer compression) that are very useful on GPUs but still might not be easy to implement on such an architecture. Also, a big part unique to GPU design is how the FIFOs/caches are tuned to the behavior of graphics processing. Unless your caches have some kind of programmable associativity or eviction policy, you’re basically SOL in that regard.

-Won
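
(To picture Won’s serialization point concretely: on a single CPU core, the stages a GPU overlaps have to drain one after another for every batch. A rough sketch, with made-up types and placeholder stage bodies:)

[code]
#include <vector>

struct Vertex   { float x, y, z, w; };
struct Triangle { Vertex a, b, c; };
struct Fragment { int x, y; float depth; };

// Placeholder stages: the bodies don't matter here, only the ordering does.
Vertex transformVertex(const Vertex& v) { return v; }                 // "vertex stage"
std::vector<Triangle> assemble(const std::vector<Vertex>& v) {        // "primitive stage"
    std::vector<Triangle> out;
    for (size_t i = 0; i + 2 < v.size(); i += 3)
        out.push_back(Triangle{ v[i], v[i + 1], v[i + 2] });
    return out;
}
std::vector<Fragment> rasterize(const Triangle&) { return {}; }       // "rasterizer"
void shadeAndWrite(const Fragment&) {}                                // "fragment stage"

void renderBatchInSoftware(const std::vector<Vertex>& batch) {
    std::vector<Vertex> transformed;
    transformed.reserve(batch.size());
    for (const Vertex& v : batch)                       // stage 1 must finish...
        transformed.push_back(transformVertex(v));

    for (const Triangle& tri : assemble(transformed))   // ...before stage 2...
        for (const Fragment& frag : rasterize(tri))     // ...before stage 3...
            shadeAndWrite(frag);                        // ...before stage 4.
    // A GPU keeps all of these stages busy on different data at the same time.
}

int main() {
    renderBatchInSoftware(std::vector<Vertex>(3000));
}
[/code]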

Integrating the CPU and GPU on the same die would be a huge step indeed, but a new design for GPUs that would allow them to run at higher frequencies might come soon. ATI just licensed from Intrinsity a technology that should allow them to quadruple the frequency at which the processing units run. The performance of a 4-pipe GPU running at 1 GHz would be better than that of an 8-pipe GPU running at 500 MHz, because the wider design is harder to keep fully loaded. So it might be possible to see a GPU derived from the Radeon 9600 (but with 256-bit memory) that runs at 1 - 1.5 GHz and performs better than the R420.

In terms of die size, the 2 MB of L3 cache in the P4EE requires some 120M transistors. The R300 uses about 110M and the NV35 about 130M. I do believe that once the 90 nm process matures, it will be possible to integrate a 4-8 pipe GPU on the same die as the CPU. Having the CPU read/write directly to the graphics memory, and the GPU to the main memory, would cut a lot of the driver and AGP latencies (which are becoming increasingly important; read the BatchBatchBatch.pdf from GDC2003). It would also mean that one could write assembly code that runs directly on the GPU (bypassing the driver).

If you think only about the downsides of integrating the CPU and GPU, then there’s no point in thoroughly studying the issue. But if, in the next 2-3 years, this becomes feasible, and someone actually does it, then that someone will hit the jackpot!

I wrote a photon mapping raytracer shortly after reading Jensen’s original paper. It could produce very nifty effects – my favorite was having two glass spheres with the caustics from one being focussed by the other. I also had decent dispersion going, although it was difficult to eliminate “speckling” in the prism’s spectrum. Anyway, to see the real power of RT you need to look at the work of Gilles Tran: 3D Images 1993-2003.

The good news is that all the programmability being added to GPUs will help get us to limited hardware-assisted RT. We need a more mechanistic approach to RT to get there, though, since the nifty things in RT usually involve mathematical techniques (isosurfaces, etc.). That’s why I said we need an OpenRT spec, something HW manufacturers could shoot for!

For example, with a HW interval arithmetic unit (IAU), isosurfaces could be done on the GPU. Your app sends the function to a GPU compiler, and the HW calls it repeatedly to do the convergence.
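
A minimal CPU-side sketch of that idea (the Interval type, the naive operators, and the sphere standing in for the user-supplied isosurface function are all invented here, not any vendor or OpenRT API): interval-evaluate f over a span of the ray parameter, throw away spans whose interval cannot contain zero, and bisect the rest until they are small enough.

[code]
#include <algorithm>
#include <cstdio>

struct Interval { double lo, hi; };

Interval sub(Interval a, double s) { return { a.lo - s, a.hi - s }; }
Interval mul(Interval a, Interval b) {            // naive (loose) interval product
    double p[4] = { a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi };
    return { *std::min_element(p, p + 4), *std::max_element(p, p + 4) };
}

// Interval extension of f(ray(t)) for a unit sphere placed so that
// f(t) = (t - 3)^2 - 1, with zeros at t = 2 and t = 4.
Interval evalSurfaceOverRay(Interval t) {
    Interval z = sub(t, 3.0);
    return sub(mul(z, z), 1.0);
}

// Recursively find the nearest span of t where the surface can be crossed.
bool firstHit(Interval t, double eps, double& tHit) {
    Interval f = evalSurfaceOverRay(t);
    if (f.lo > 0.0 || f.hi < 0.0) return false;   // interval excludes zero: prune
    if (t.hi - t.lo < eps) { tHit = t.lo; return true; }
    double mid = 0.5 * (t.lo + t.hi);
    return firstHit({ t.lo, mid }, eps, tHit) || firstHit({ mid, t.hi }, eps, tHit);
}

int main() {
    double tHit = 0.0;
    if (firstHit({ 0.0, 10.0 }, 1e-6, tHit))
        std::printf("isosurface crossed near t = %f\n", tHit);   // expect ~2.0
}
[/code]

The evalSurfaceOverRay() part is what the app would hand to the GPU compiler; the bisection loop is what the hypothetical IAU hardware would iterate.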

Originally posted by gltester:
Thought maybe a concrete example would help since I don’t seem to have explained myself in a way that you can at all respect:
http://graphics.ucsd.edu/~henrik/images/metalring.jpg

This is basically a small ray-traced picture with some caustics. You and I look at this copper ring with its reflected light pattern and think “coooooool”, but for most of the public (the mass market), they probably wouldn’t even notice if the light pattern was missing. Even without using any ray-tracing, Doom 3 looks almost as good as this picture does.

[This message has been edited by mikegi (edited 02-06-2004).]

Carmack pointed out (I think sometime during the development of Quake3) that although he worked on first-order visual effects in the original Doom, he now focuses on second- and third-order effects, since advances in hardware have solved the first-order problems with finality. And as we’ve all experienced in watching the evolution of games over the years, the effects are getting more subtle and more detailed, allowing increased complexity, and we’re taking more and more of those things for granted.

HOWEVER…

The human visual system is perhaps the most developed and most complex of all known biological systems. It’s evolved over millions of years to be able to pick out the most subtle details of the most complex scenes. We can pat ourselves on the back and marvel at our technological advances in generating life-like images on a computer screen (believe me, I’m as excited as anyone about this stuff!), but even the most impressive screen shots of DOOM 3 have fewer than a dozen characters on screen. Gollum in LotR is a single character, and he took hours to render for a single frame. Now imagine rendering human beings with the quality and resolution of Gollum; not one, but thousands as they walk and bump into each other on the bustling streets of midtown Manhattan during rush hour, all at 16Kx9K, 60 fps. 216000 times today’s processing power? Try billions.
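
For what it’s worth, the “billions” survives a napkin check (the per-frame render time and crowd size below are guesses, not LotR production figures):

[code]
// Napkin arithmetic for the "try billions" claim. Every number is assumed.
#include <cstdio>

int main() {
    const double offlineFrameSeconds = 2.0 * 3600.0; // guessed hours per Gollum-quality frame
    const double realtimeBudget      = 1.0 / 60.0;   // seconds per frame at 60 fps
    const double characters          = 5000.0;       // a guessed rush-hour crowd

    const double perCharacter = offlineFrameSeconds / realtimeBudget; // ~432,000x
    const double total        = perCharacter * characters;            // ~2.2e9x
    std::printf("~%.1e times the throughput, before any resolution increase\n", total);
}
[/code]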

The “graphics innovations are almost dead” crowd has been reiterating that we’re nearing the point where hardware is running out of interesting things to do, and CPUs will eventually catch up. The fatal flaw in that argument is the assumption that because we’ve come such a long way and advances are starting to plateau, innovation and interest in further advancement will cease. True, advances seem to be plateauing, but it’s not a plateau, it’s an elbow in the curve. What it really means is that we’re just now getting to the hard part. The easy problems have been solved, and what’s left is the long trudge ahead that, step by step, will get us ever closer to visual realism. But that path will likely not end in our lifetimes, if ever.

I have to laugh at subjective phrases like “first order effects” vs “second order effects” when used like this. I think it’s a useful observation in some ways; however, I would say that reasonably accurate shadows everywhere with correct dynamic lighting are indisputably a first order effect, and Carmack has never delivered them in a published title, so to present this statement retrospectively as it applies to earlier titles shows how dated it is now.

What is defined to be first order vs second order vs third order also tends to be determined by the technology and the problems a particular individual is focused on. Depending on your approach second order effects can become first order effects and underpin or even replace your original first order effects.

[This message has been edited by dorbie (edited 02-06-2004).]

I still think you guys are looking at the problem backwards!

I keep seeing different arguments based on the fact that current processors are eons away from being powerful enough to do all of the effects we wish we could do. Now look at it from the reverse (more correct) point of view. We don’t generally need our pictures to look better than reality, so there’s our ultimate goal. If our rendered animations on-screen look as good as a photograph/movie displayed on screen, we are done.

How close are we to that limit now?

Another way of putting that is: If you could list all the possible features of an image that make an image look good, you’d have a finite list. At the top of the list, the most important things would be “perspective”, “frame rate at least 75 fps”, “24-bit color”, “reasonable-looking physics”, “reasonable lighting” etc. etc. Somewhere in the middle are “shadows” and “curved surfaces”. At the bottom end of the list, the least important to general image quality are things like “specular highlights” and “perfect lighting” and “caustics”. Now to rephrase the above question: how much of this list do we already have available?

I haven’t tried to write out this list, but I’d say, with Doom 3, our image quality is 75% as good as the ultimate image quality we can imagine. In two years, I predict we will be at 90%.

Images looking 90% as good as reality are PRETTY DAMN GOOD. Yes, to get that last 10% we will still need processing power to go up by a factor of like a million, and you guys can spend the rest of your lives maybe designing clever ways to tweak the last little bit of quality out of the available hardware power. But who will care? People will already have PRETTY DAMN GOOD images on their screens, and they won’t want to pay very much to get that last 10% of quality that you all seem so concerned about.

Someone mentioned being able to pick up the fire hose from a truck and turn it on and spray it around everywhere, seeing rainbows. Seriously, how many video games is that going to be an important feature for? Your only input method is arrow keys and a mouse!!! It’s not like you can shuffle a deck of cards in-game with control over every last card. How are you going to select every feature in a game world, and why would you even want to? People don’t care about that stuff very much! Even if there was some adventure/Myst-type game where you really needed to have complete fire-hose access, is it really going to hurt sales that much if the rainbows aren’t there in the sprayed water? It’s probably a tiny little obscure feature buried somewhere in a game.

I could see being this interested in every last image details if we were talking about games using full-sensory VR gear, or even if we were talking about generating full-size Hollywood movies in real-time for some reason (maybe a rentable video-game-playing theater?), but we are talking about PC video games, on a roughly 20-inch MONITOR, with a 15-year-old keyboard/mouse design for input. We are near the limit of this hardware, folks, no matter how fast the GPU may get.

Of course my fairly subjective guesses of 75% and 90% might be somewhat off, but I’m not that far off. And I probably agree with the mass market’s perceptions more than with yours; they are the ones ultimately funding video game development.

It’s not the end of the world. People will still need good video game programmers even after the graphics are good enough.

maxuser –
The human visual system is actually quite interesting. Biologically, it appears to be quite limited, but the visual part of your brain does quite a bit to make you see what you see.

gltester –
Your thesis seems to be based on the idea that there is such a thing as “good enough” AND that we are most of the way there. Man, I thought the only double whopper I was going to have today was lunch! You are entitled to your heretic’s opinion, but it shouldn’t come as a surprise to you that lots of people have lots of reasonable objections.

You’re right about the display hardware. For example, view resolution has been basically CONSTANT throughout the life of computer graphics. But that doesn’t mean that future display technologies won’t require additional work, or that future consumers won’t be more sensitive to various levels of realism.

Tzupy –

Things to think about:

Do you really believe you can do an apples-to-apples comparison between transistors in cache (regular SRAM cells) and transistors in a GPU (all computation)?

Are you suggesting that the GPU and CPU share a single memory bus? Perhaps just for communication, like fast AGP memory? Would the CPU and GPU still have their dedicated memory busses? If so, how many pins would this package need? How much would it cost in terms of packaging and system integration? If not, what limitations would the bus contention cause?

What are the differences between the CPU business and the GPU business? What are the margins on CPUs vs. GPUs? Does it make business sense for a CPU manufacturer to integrate GPUs? The other way around? What market would this hybrid G/CPU serve?

-Won

Originally posted by gltester:

Someone mentioned being able to pick up the fire hose from a truck and turn it on and spray it around everywhere seeing rainbows. Seriously how many video games is that going to be an important feature for?

Speak for yourself. If the image on the display in front of me is not pixel-for-pixel as good as the view through the window beside me, then we are not there yet. The gold standard is reality.

And the 90% estimate that you talk about is an exaggeration.

Just sit in front of your window for 30 minutes and just watch. Or even better, go outside. Take a serious look at all the subtle beauty of the world.

You say we are 90% there. I say we are more than 90% away.

gltester, the firehose was an example of the complexity of the world. Pick any other suitable example; it’s not a game pitch, it’s an illustration, and a valid one. The point is that there is immeasurable physical and visual complexity in the real world, and developers will pursue that. You’re just wrong about this on so many levels.

We’re not almost there; we’re not even close. You still haven’t addressed the fact that movies are, and must be, filmed. There is a good reason for that. We’re not even close in offline rendering, and that’s just for prescripted, non-interactive stuff.

gltester, just take a simple game which has a realistic scenario, and then try to get the best graphics possible for it in realtime… and then compare it to what the real world gives.

yes, we’re possibly at 50% (or, as you say, 90%) of fully realistic realtime rendering of indoor buildings with walls, and… floors… and… all that. but there is at least one whole planet of other scenarios you could play in (and then there is the whole remaining space of imagination). in just about all other scenarios we are not even close.

to get a good example game, take zelda - the ocarina of time.

first: you’re in the woods. go into a real wood, rebuild that small region, without even the deku tree itself, and then try to give me a realtime application (with a huge pc network if you want) that gives me the illusion of looking natural. i’m not even talking about realistic.

second: link himself, and all the other characters. make them move, interact, and look natural. gollum is the minimum requirement to look realistic.

third: places like the water castle, impossible to make look realistic with today’s gpus.
not to mention actually simulating the water so it behaves naturally… urgh. have fun with a 3d fft on high-res 3d grids

same for the ice palace. there, you need at least photon mapping and raytracing with more than just rgb to make it look believable (there, refraction and prism-like spectral scattering could be used immensely to make it look beautiful, magical, and natural at the same time).

you can pick any other place in zelda; there is hardly one that is even close to being realistically implementable today.

graphics are very far away from what is called realistic. doom3 doesn’t look any better than zelda to me. actually worse, but that’s a personal opinion.

An off-topic note:

Off-topic for this thread, but not in my experience. Maybe it depends on your definition of “good”… compared with Quartz, or raytracing where you can shoot an arbitrary number of rays through every pixel, it totally sucks.

Sure, compared to raytracing which isn’t designed for real-time, yes. But, for a triangle rasterizer, ATi’s R300 anti-aliasing is the best around.

Now, back on-topic.

but a new design for GPUs, that would allow them to run at higher frequencies, might come soon. ATI just licensed from Intrinsity a technology that should allow them to quadruple the frequency at which the processing units run. The performance of a 4-pipe GPU running at 1 GHz would be better than that of an 8-pipe GPU running at 500 MHz, because the wider design is harder to keep fully loaded.

While it is true that fewer pipes at a higher clock speed is better, you have to realize that

Having the CPU read/write directly to the graphics memory and the GPU to the main memory would cut a lot of the driver and AGP latencies (which are becoming increasingly important; read the BatchBatchBatch.pdf from GDC2003).

First, this .pdf is directed at D3D only. Secondly, the reason it is directed at D3D only is that the IHV portion of D3D isn’t allowed to do marshaling of calls itself; the D3D runtime does it (and not in a particularly thoughtful way). This has nothing to do with bandwidth between the card and the CPU.
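
That said, the general point of the presentation, that per-draw-call CPU overhead dwarfs the cost of the triangles in a small batch, is worth illustrating. A rough GL-flavoured sketch of what “batching” means in practice (the Mesh type and the merge strategy are made up; for static geometry you would merge once up front rather than every frame, and this assumes a current GL context):

[code]
#include <GL/gl.h>
#include <vector>

struct Mesh {
    std::vector<float>  xyz;      // 3 floats per vertex
    std::vector<GLuint> indices;
};

// Slow path: one draw call per tiny mesh -> thousands of calls per frame,
// each paying driver (or D3D runtime) overhead.
void drawEach(const std::vector<Mesh>& meshes) {
    glEnableClientState(GL_VERTEX_ARRAY);
    for (const Mesh& m : meshes) {
        glVertexPointer(3, GL_FLOAT, 0, m.xyz.data());
        glDrawElements(GL_TRIANGLES, (GLsizei)m.indices.size(),
                       GL_UNSIGNED_INT, m.indices.data());
    }
}

// Batched path: concatenate compatible meshes, pay the per-call cost once.
void drawMerged(const std::vector<Mesh>& meshes) {
    std::vector<float>  xyz;
    std::vector<GLuint> indices;
    for (const Mesh& m : meshes) {
        const GLuint base = (GLuint)(xyz.size() / 3);
        xyz.insert(xyz.end(), m.xyz.begin(), m.xyz.end());
        for (GLuint i : m.indices)
            indices.push_back(base + i);
    }
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, xyz.data());
    glDrawElements(GL_TRIANGLES, (GLsizei)indices.size(),
                   GL_UNSIGNED_INT, indices.data());
}
[/code]

Merging only works for geometry that shares the same state and material, which is why batching ends up being a content-pipeline problem as much as an API one.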

It would also mean that one could write assembly code that runs directly on the GPU (bypassing the driver).

Neither OpenGL nor D3D is ever going to provide an API for that. Nor should they.

That’s why I said we need an OpenRT spec, something HW manufacturers could shoot for!

Why would they want to? Remember, even high-end movie FX houses don’t use ray tracing that frequently. So why should hardware makers provide that ability in consumer-level cards? If doing without ray tracing is good enough for Pixar, isn’t it good enough for everybody else?

The “graphics innovations are almost dead” crowd has been reiterating that we’re nearing the point where hardware is running out of interesting things to do, and CPUs will eventually catch up. The fatal flaw in that argument is the assumption that because we’ve come such a long way and advances are starting to plateau, innovation and interest in further advancement will cease. True, advances seem to be plateauing, but it’s not a plateau, it’s an elbow in the curve. What it really means is that we’re just now getting to the hard part. The easy problems have been solved, and what’s left is the long trudge ahead that, step by step, will get us ever closer to visual realism. But that path will likely not end in our lifetimes, if ever.

In general, my belief is that the only differences between various cards in 3-5 years will be performance. And that will be the only impetus to upgrade as well. Graphics cards will be feature-complete.

I haven’t tried to write out this list, but I’d say, with Doom 3, our image quality is 75% as good as the ultimate image quality we can imagine.

Lol!

We aren’t even 25% of the way there. What we have done is, as someone said, “picked the low-hanging fruit.” We’ve done the easy stuff: texturing to add detail, surface detail interacting with lights (bump mapping), reasonably correct shadows. Now comes all the really hard, but really important, stuff.

Take shadows, for instance. Neither shadow maps nor shadow volumes provide for easy soft shadows. But soft shadows are vital for photorealism. Doing soft shadows is hard. We did the easy part: hard shadows. Now comes the difficult, yet vital, part.
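
To make “hard is easy, soft is hard” concrete, here is a tiny software sketch of the cheapest fake, percentage-closer filtering: one depth comparison per pixel becomes dozens, and even then it only blurs the edge rather than modelling a real area light (the ShadowMap type and all numbers are invented):

[code]
#include <algorithm>
#include <cstdio>
#include <vector>

struct ShadowMap {
    int width, height;
    std::vector<float> depth;                    // light-space depth per texel
    float at(int x, int y) const {
        x = std::max(0, std::min(x, width - 1)); // clamp to the map edge
        y = std::max(0, std::min(y, height - 1));
        return depth[y * width + x];
    }
};

// Hard shadow: a single comparison, fully lit or fully dark.
float hardShadow(const ShadowMap& sm, int x, int y, float fragDepth) {
    return fragDepth <= sm.at(x, y) ? 1.0f : 0.0f;
}

// "Soft" shadow via 5x5 PCF: 25 comparisons, averaged into a penumbra.
float pcfShadow(const ShadowMap& sm, int x, int y, float fragDepth) {
    float lit = 0.0f;
    for (int dy = -2; dy <= 2; ++dy)
        for (int dx = -2; dx <= 2; ++dx)
            lit += fragDepth <= sm.at(x + dx, y + dy) ? 1.0f : 0.0f;
    return lit / 25.0f;
}

int main() {
    // 4x4 map: left half near the light (0.2), right half far (0.8).
    ShadowMap sm{ 4, 4, { 0.2f, 0.2f, 0.8f, 0.8f,  0.2f, 0.2f, 0.8f, 0.8f,
                          0.2f, 0.2f, 0.8f, 0.8f,  0.2f, 0.2f, 0.8f, 0.8f } };
    std::printf("hard: %.2f  pcf: %.2f\n",
                hardShadow(sm, 1, 1, 0.5f), pcfShadow(sm, 1, 1, 0.5f));
}
[/code]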

These kinds of subtle things are what separates “that’s pretty decent CGI” from “that’s CGI?!” Without these subtle interplays of light, you aren’t getting the job done.

Besides “good enough” being in the eye of the beholder, the future is uncertain. Someone might decide that 1k x 1k or even 4k x 4k is not enough and want to project real 3D images in mid-air. How advanced are those displays now? Not very, it seems.

This discussion is very narrow-minded, I think.

Let me give an example. IBM was researching a kind of display that would host billions of microscopic pixels, in an effort to duplicate paper.

<sarcastic time>
we’re gonna need CPUs that can execute 40 trillion instructions per nanosecond someday!
</sarcastic time>

If you are talking about real-time photorealism and software rendering, let’s have a look at the screenshots here: http://www.openrt.de/Gallery/IGI2/.

It is based on a distributed ray tracer (running on up to 24 dual Athlon MPs :-)). They obtain 2 fps on a very complex scene (50M triangles!) with global illumination. It is very interesting, but it is still so far from reality…

What a stupid thread.

Custom hardware will always be better at a specialised task, therefore there will always be GPUs for graphics, regardless of whether it’s scanline rasterization or raytracing.

A P4 might be able to come close to a GPU on certain operations. I recall the very same comparison of vertex shaders on a P4 and a GPU. The point is, the P4 was probably at 100% usage to match the GPU, leaving no time for sound, AI, collision, input handling, simulation, networking, etc…