When will software renderers be viable?

How soon will general-purpose CPUs be fast enough at 3D math that we won’t require graphics cards anymore? This will be great because we will be able to implement software renderers and have absolute control over what we want to do - we won’t be limited by the gfx card’s capabilities or bad driver support, for example.

I’m sorry this isn’t specifically an OpenGL question, but it is an interesting one - don’t flame me for being such a visionary

I’d say before 1996 or so. At that point, the release of the upcoming 3Dfx Voodoo will obliterate any form of CPU-based rendering in terms of speed (at comparable technology levels).

From then on, GPUs are poised to increase in rendering speed faster than CPUs, meaning CPUs will likely never catch up(*). GPUs are also starting to become more programmable with time. Notice the inclusion of basic combiners in the upcoming TNT2, which will revolutionize graphics. This is only the first step: future products will increase programmability until eventually whole programming languages specifically geared towards graphics will need to be introduced.

(*) At least until “Good Enough” is reached, but that’s not likely to happen any time soon.

[This message has been edited by al_bob (edited 02-04-2004).]

Not for the foreseeable future. I read an article very recently which said GPUs are capable of calculating at over 300x the rate of the CPU in certain special cases. Add, of course, the need for high-speed memory with low latency (rare for the CPU). I simply do not see it occurring.

Answer: never … at least for comparable generations of hardware. Way back in the early 1990s, I had a similar discussion with coworkers. We did drivers for PC graphics hardware. Many thought that the Pentium - and certainly the Pentium II - when combined with the new PCI bus would eliminate the need for dedicated graphics chips.

My reply was that graphics chip companies’ very lives depended on making their systems faster than a CPU drawing on a DIB. We were mainly concerned with 2D rendering. With 3D graphics the GPU utterly blows away a general-purpose CPU, and will for as far as I can see. There are too many advantages to a dedicated 3D graphics subsystem: memory access, parallel processing, targeted caches, etc.

If you want to be a “visionary”, come up with a good, hardware friendly OpenRT (Open RayTracer) spec. Something that has most of the flexibility of a purely CPU implementation but can be accelerated with hardware.

Originally posted by imr1984:
[b]How soon will general-purpose CPUs be fast enough at 3D math that we won’t require graphics cards anymore? This will be great because we will be able to implement software renderers and have absolute control over what we want to do - we won’t be limited by the gfx card’s capabilities or bad driver support, for example.

I’m sorry this isn’t specifically an OpenGL question, but it is an interesting one - don’t flame me for being such a visionary [/b]

I’ll just add as evidence this page: http://www.gpgpu.org/ which is all about General-Purpose computation on GPUs. So yeah, for the time being, the trend is moving computation onto the GPU, not off of it. Basically, the GPU allows for parallelism that CPUs don’t. For example, you may have a SIMD processor on your desktop, but it’s not your CPU.

There’s 35 Gflops* in a Radeon 9700 Pro (at its 315 MHz default clock speed), and that’s only counting shaders. There’s yet more processing power in the rasterizer, the various interpolators, blend units etc.

I don’t see general-purpose processors anywhere near that. What’s the peak throughput of a P4 at 3.2 GHz? 6.4 Gflops?

* vec4 counts as four operands, obviously => 8*4*3 per clock for the fragment shader (MAD counts as two, plus a MUL in the mini-ALU); 4*4 for the vertex shader.
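As a rough check of those figures, here is a sketch of the arithmetic. It assumes the footnote reads as 8 pipes x 4 components x 3 flops per clock for the fragment shader and 4 units x 4 components for the vertex shader, and that the P4 figure implies 2 flops per clock. These are peak numbers only, not measurements.

```python
# Back-of-the-envelope peak throughput, using only figures quoted in the thread.
# Assumption: footnote read as 8*4*3 (fragment) and 4*4 (vertex) flops per clock.

GPU_CLOCK_HZ = 315e6                      # Radeon 9700 Pro default clock
fragment_flops_per_clock = 8 * 4 * 3      # 8 pipes, vec4, MAD (2) + MUL (1)
vertex_flops_per_clock = 4 * 4            # 4 vertex units, vec4

gpu_gflops = (fragment_flops_per_clock + vertex_flops_per_clock) * GPU_CLOCK_HZ / 1e9

CPU_CLOCK_HZ = 3.2e9                      # P4 at 3.2 GHz
cpu_flops_per_clock = 2                   # assumption implied by "6.4 Gflops"
cpu_gflops = cpu_flops_per_clock * CPU_CLOCK_HZ / 1e9

print(f"GPU shaders: {gpu_gflops:.1f} Gflops")       # 35.3
print(f"CPU:         {cpu_gflops:.1f} Gflops")       # 6.4
print(f"ratio:       {gpu_gflops / cpu_gflops:.1f}x")  # 5.5x
```

So on paper the shaders alone come out at roughly five to six times the CPU’s peak, before counting the fixed-function units.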

GPUs are better only as rasterizers. It doesn’t matter how fast GPUs are, they will never match a raytracer on the CPU in either quality or speed. Raytracers are the future. CPUs also have much better precision, which is a very serious matter in raytracers.

daveperman, GPUs can use full 32-bit floats, can they not? Granted, mapping the raytracing problem into the graphics card domain isn’t trivial, but there is an enormous amount of parallelism available on a GPU that you don’t have on the CPU. Just for starters, it would be easy to initialize the rays for a raytracer on the GPU. But if you think raytracing is needed for good graphics, talk to Pixar. I don’t believe RenderMan is a raytracer.

I’d also say never, at least as long as we have a separate, dedicated and highly specialized graphics processor. You can always say “today” that in 5 years, CPUs will be fast enough to do what we can do now on our graphics cards. But in the meantime, the standard for graphics will have evolved too. It’s possible on today’s CPUs to achieve Voodoo2/TNT levels of graphics quality and performance, but who’d like to play a game with the look of Quake 1 / 2 when Far Cry, Half-Life 2 and Doom 3 are around the corner?


This is a pointless way of looking at the problem. The question never changes and has been asked for years. In some ways software has been viable for years (maybe even more viable in the past, relatively speaking); in other ways software never will be viable.

The first 3D engines on PCs were software only and were viable for Doom & Quake style graphics.

Dedicated hardware will always outperform general-purpose hardware, so software-only rendering will always be at a disadvantage compared to the GPU. For some this means it isn’t viable, because what they mean by viable is determined by the best currently available hardware. GPUs outperform CPUs even with SIMD instructions in the CPU and very programmable GPUs. That’s not going to change, because the hardware is designed with different priorities.

Software today can outperform some of the hardware of yesterday. Does that mean yesterday’s hardware wasn’t viable? No. It means that for some, software will never be viable, because determining viability requires a definition, and that definition changes with graphics hardware and the evolving graphics applications it enables.

So, it’s a pointless question.

I think the interesting question is:
“When will the CPU’s 3D graphics be so good that a dedicated card has nothing extra of importance to offer us?”

My feeling is that this will occur within a few years. Have you seen the Doom 3 screenshots?

In a couple years a CPU will be able to do that without a GPU. And not long after that, it will be able to do photorealistic rendering. And then the cost of a dedicated GPU will no longer be justified for most people.

The CPU is a general-purpose processor. It is able to do everything.
However, it is designed to be able to do everything, not to do a specific thing. Therefore it will ALWAYS be slower than something which is designed to do a specific thing.
This is not only the case in computing, but in everything. Evolution is the best example. The human being is able to do everything (that other animals are capable of), but usually it takes a LOT more energy, brainpower etc. to accomplish it.

Therefore there will always be special technologies to do the same thing with less power and usually for less money. A CPU fast enough to do what today’s combination of a CPU and a gfx-card is capable of would cost a lot more, and that will never change.

However, gfx-cards might change a lot. Maybe we will have chips able to do raytracing in a few years, we’ll see. That’s a big step, but it is a logical one.

And take a look at modern PCs. There is not only the CPU and the GPU. There is the CPU, the GPU, the soundcard (with its own, SPECIALIZED processor), there is the network-card (with its own SPECIALIZED processor/chip), there is a north- and a south-bridge, etc. etc. etc.

EVERYTHING in a PC is specialized. Even the CPU is specialized on being a general purpose CPU and FPU.

Having ONE processor which handles everything just doesn’t make sense. Having a specialized gfx-card which is fully programmable makes sense. It is faster and cheaper.


> Having ONE processor which handles
> everything just doesn’t make sense.

Sure it does. We got by in single processor mode for decades. We did have the FPU on a separate chip for a while, but look where that ended up!

The GPU will need its own chip only as long as there is still active research & development going on in the area of 3D graphics.

Once everything settles down and people are satisfied with the level of graphics quality, Intel and AMD will be dying to move the GPU onto the CPU, too.

gltester, you should read some of the posts in this thread (again?). When CPUs can do what Doom 3 does, GPUs will do more than that and Doom 3 graphics won’t be enough. Extensions like MMX, SSE and SSE2 have tried to close the gap a bit, but they just can’t get you there: a GPU has very domain-specific, pipelined hardware optimizations dedicated to graphics performance.

Perfectly obvious and the same argument that’s been doing the rounds for years. It’s a bit like saying you’ll never need more than 64k. Equally shortsighted and assuredly wrong, it won’t happen anytime soon.

Interestingly people are also asking when GPUs will become programmable enough that they replace the CPU for a lot of compute intensive stuff and put Intel out of business.

About this point in the discussion my head hurts as I picture two snakes swallowing each other tail first.

Originally posted by gltester:
Sure it does. We got by in single processor mode for decades.

Um. . . yeah, but if you compare all those decades to the past 5 years, you’ll have to note that there’s a freaking massive difference in what we’re expecting computers to do.

[This message has been edited by Ostsol (edited 02-04-2004).]

Just curious. How fast can a P4 3.0 GHz (or equivalent AMD, G5, …) render a single-texture, non-lit cube in software mode?

Can it do a minimum of 60 FPS?

I’m thinking of Mesa here, but I guess Mesa has little optimization.
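One way to frame the 60 FPS question is as a cycle budget per pixel. The sketch below assumes a 640x480 framebuffer with every pixel touched exactly once per frame; neither figure comes from the thread, and a single cube would normally cover fewer pixels, so treat it as a pessimistic estimate rather than a benchmark. The function name is made up for illustration.

```python
def cycles_per_pixel(clock_hz, width, height, fps, overdraw=1.0):
    """CPU cycles available per shaded pixel at a target frame rate.

    overdraw is the average number of times each pixel is written per frame
    (1.0 = every pixel written exactly once; an assumption, not a measurement).
    """
    pixels_per_second = width * height * overdraw * fps
    return clock_hz / pixels_per_second

# 3.0 GHz CPU, assumed 640x480 full-screen coverage, 60 FPS target.
budget = cycles_per_pixel(3.0e9, 640, 480, 60)
print(f"{budget:.0f} cycles per pixel")   # 163 cycles per pixel
```

Everything (transform, setup, texture fetch, framebuffer write) has to fit inside that budget, and it shrinks quickly at higher resolutions or with filtering.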

I too see the GPU taking over the CPU’s functions, but I think the current PC design will last a very long time.

The latest CPUs still come up quite short of a TNT2 in terms of rendering speed, especially when you throw in some texture filtering or multitexturing (as far as software DirectX or Mesa allows us to judge it).

And given what we have seen during the last year or so, CPUs may well have left the Moore’s Law curve. I reckon the gap between CPUs and GPUs for 3D rendering will only get worse, as GPUs are moving up faster in terms of clock speed, parallelism and features.

I am in the process of writing a software rasterizer, keeping it as close as possible to the OpenGL API so that apps are easy to recompile, so I have a little bit of experience with this topic. I sincerely don’t think that software rendering will ever be able to match the speed of dedicated hardware, simply because dedicated hardware is meant for that purpose and nothing else. Even 256-bit wide buses attached to high-speed memory won’t put our CPUs on the same level as GPUs; their design is fundamentally different, as is their purpose. But software rendering has its advantages and is vastly superior on some fronts. Scalability is one of them: carefully optimized software rasterizers are not limited by the amount of data you push at them. I can easily increase the number of polygons I send to my rasterizer by a factor of ten and incur only a minor performance hit (mainly due to the polygon setup routines).
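To make the split between per-polygon setup and per-pixel work concrete, here is a minimal sketch of an edge-function triangle rasterizer. It is not the rasterizer described above, just a common textbook approach with illustrative names. It does no texturing or depth testing, and it has no fill rule, so pixels exactly on a shared edge would be drawn by both triangles.

```python
import math

def edge(ax, ay, bx, by, px, py):
    """Edge function: the sign tells which side of edge a -> b the point p is on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def rasterize_triangle(v0, v1, v2, width, height):
    """Return the (x, y) pixels whose centres fall inside triangle v0 v1 v2."""
    # Per-triangle setup: done once, however many pixels end up covered.
    if edge(*v0, *v1, *v2) == 0:
        return []  # degenerate (zero-area) triangle
    min_x = max(math.floor(min(v0[0], v1[0], v2[0])), 0)
    max_x = min(math.ceil(max(v0[0], v1[0], v2[0])), width - 1)
    min_y = max(math.floor(min(v0[1], v1[1], v2[1])), 0)
    max_y = min(math.ceil(max(v0[1], v1[1], v2[1])), height - 1)

    # Per-pixel work: test each pixel centre in the clipped bounding box.
    covered = []
    for y in range(min_y, max_y + 1):
        for x in range(min_x, max_x + 1):
            px, py = x + 0.5, y + 0.5
            w0 = edge(*v1, *v2, px, py)
            w1 = edge(*v2, *v0, px, py)
            w2 = edge(*v0, *v1, px, py)
            # Accept either winding order: all three signs must agree.
            if (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0):
                covered.append((x, y))
    return covered

pixels = rasterize_triangle((0, 0), (4, 0), (0, 4), 8, 8)
print(len(pixels))   # 10
```

The setup cost is fixed per triangle while the inner loop scales with covered area, so in a scene of many small polygons the setup routines are where extra polygons show up.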

The x86 CPUs with 2-3 pipes are obviously not a challenge to the 8-pipe GPUs. But an interesting approach is to integrate the GPU together with the CPU (it’s already happening with the chipsets, but at low speed and with unified memory, so performance is not great).
The advantage of the CPU versus the GPU is raw clock speed:
3.4 GHz versus 500 MHz.
What the CPU lacks is not so much instructions per second (a fine assembly program can do marvels) as memory bandwidth:
6.4 GB/s versus 32 GB/s.
Instead of increasing the L2 cache size of the CPU (or adding a 2 MB L3 cache), it would be possible to integrate a programmable GPU with 2-4 pipes in the same die area.
Now, 2 pipes running at 3.4 GHz (or 4 pipes running at 1.7 GHz) would be faster than 8 pipes running at 500 MHz, IF the integrated CPU + GPU had a 256-bit graphics memory interface besides the 128-bit main memory one. Of course, a new motherboard design would be required to make it work (there would be a lot more pins and connections).
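The multiplication behind that claim can be written out as a quick sketch. It treats every pipe as doing one unit of work per clock, which is a simplifying assumption and not something the numbers above establish; the helper name is made up for illustration.

```python
def pipe_ghz(pipes, clock_ghz):
    """Naive throughput: pipes times clock, in 'pipe-GHz' (one op per pipe per clock)."""
    return pipes * clock_ghz

integrated_2 = pipe_ghz(2, 3.4)   # 2 pipes at CPU clock
integrated_4 = pipe_ghz(4, 1.7)   # 4 pipes at half CPU clock
discrete_8 = pipe_ghz(8, 0.5)     # 8 pipes at 500 MHz

print(f"{integrated_2:.1f} vs {integrated_4:.1f} vs {discrete_8:.1f} pipe-GHz")  # 6.8 vs 6.8 vs 4.0

# Bandwidth available per pipe-clock (GB/s divided by pipe-GHz = bytes per pipe per clock).
gpu_bytes_per_pipe_clock = 32 / discrete_8      # 8.0 with dedicated graphics memory
cpu_bytes_per_pipe_clock = 6.4 / integrated_2   # about 0.94 on main memory alone
print(f"{gpu_bytes_per_pipe_clock:.2f} vs {cpu_bytes_per_pipe_clock:.2f} bytes per pipe-clock")
```

On main memory alone the integrated pipes would get under one byte per clock each, against eight on the discrete card, which is why the whole idea hangs on that 256-bit interface.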

CPU pipes are not really comparable to GPU pipes, so your exercise in multiplication doesn’t actually mean anything. For example, the CPU would have to serialize the various OpenGL stages (think: vertex v. primitive v. fragment &c) that execute in parallel on a GPU, so the GPU has a pretty big advantage there.
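A toy model of that difference, with made-up stage costs purely for illustration: if every item has to pass through each stage in turn on one processor, the costs add up, whereas stages that overlap are limited by the slowest one.

```python
# Hypothetical per-item stage costs in arbitrary time units (not measured values).
stage_cost = {"vertex": 2.0, "primitive": 1.0, "fragment": 5.0}

def serialized_time(n_items, costs):
    """One processor runs every stage in turn for every item."""
    return n_items * sum(costs.values())

def pipelined_time(n_items, costs):
    """Stages overlap: fill the pipeline once, then the slowest stage sets the pace."""
    return sum(costs.values()) + (n_items - 1) * max(costs.values())

n = 1000
print(serialized_time(n, stage_cost))   # 8000.0
print(pipelined_time(n, stage_cost))    # 5003.0
```

Real hardware also replicates the slow stages (several fragment pipes, for instance), which this model ignores, so the actual gap is larger.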

And who says even Intel would be able to make a GPU run at the speeds they get their CPUs to run? And are you counting the number of pins this package would require? Power? Cooling? Cost? Yield? Lots of issues to be addressed here.

You CAN design a processor that is parallel and pipelined enough (like the chaining on the Cray-1) to perform graphics computation efficiently. Vector or stream models come to mind. I’m guessing the IBM/Sony/Toshiba Cell processor is going to support something like this; not only are the individual processor elements pipelined, but you can build a pipeline out of the processor elements.

There are various things (such as depth buffer compression) that are very useful on GPUs that still might not be easy to implement on such an architecture. Also, a big part unique to GPU design is how they tune the FIFOs/caches to the behavior of graphics processing. Unless your caches have some kind of programmable associativity or eviction policies then you’re basically SOL in that regard.