GeForce FX vs Radeon 9700 or higher

The image quality is actually better on the R300, and 24-bit floats for pixel shaders are more than enough. Shrek was done with 16-bit floats (according to NVIDIA).

Check www.anandtech.com.

And I think the card is not really worth the money: for not being much faster after 6 months (NVIDIA claims hardware gets twice as fast every 6 months…), and for needing so much more advanced technology to be fast (DDR2, 0.13 micron), these new features only help it catch up and get a tiny bit beyond what ATI presented 6 months ago. And all of that only with extra overclocking and ultra-strong cooling. That thing blows 60°C air out the back and reaches 140°C on top, some centimeters away from the processor. And 70 dB is not very quiet either… (well, there is currently talk of a 7 dB version; we’ll see).
And the power consumption is enormous, too…

I hope NVIDIA can get much more out of these technically very powerful features. ATI is currently working on using the same technology to speed up their, well… “old” chip, which looks algorithmically far superior…

Anyway, I’m open to some fun duels during this year… At least all of these GPUs support the full DX9 feature set, which is a great base… (I’ve heard rumors that the NV31 and NV34 don’t support full DX9 in hardware… that sounds crappy… I hope it’s not true.)

we’ll see…

But currently, a Radeon 9500 with 128 MB RAM is the best buy. Mod it to a Radeon 9700 Pro and you’re done.

Well, I’m looking to buy a video card, but it doesn’t have to be right now. I was actually aiming for sometime around Christmas. I read somewhere that ATI is attempting to release the R400 in the fall. I don’t know (or care) how much faster the R400 is than the R350 or R300, but it should have support for DirectX 9’s ps/vs 3.0. I’ll probably end up getting to that level of programming by Christmas anyway.

I’ve also heard next to nothing about the NV31 and NV34. After reading a few more articles I found one line saying that they were the real focus of NVIDIA and that the NV30 is just an intermediate ‘keep up with the competition’ card.

  • Halcyon

Of course the FX is a better and faster card than the current 9700 PRO. Just look at the specs of the two…

If you want a card for development, choose the FX. It offers more features than any other card available (it surpasses DX9, and only GL exposes the entire hardware). The reason for 32-bit color is simple: more precision. And it isn’t clamped to [0,1]. The only place where you can have a 24-bit float buffer without losing precision is the z-buffer, because it is meant to work that way, only between [0,1]. That’s why you can have a 24-bit z-buffer. During a fragment program the values can go beyond 1 for the next calculation. If the values were always clamped, you could use integer math instead of floating point, because you would only need to worry about the fractional part. The FX offers 32-bit full-precision math, instead of the 24 bits offered by the 9700. I don’t know whether that is useless or not.
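
For example (just a sketch in ARB_fragment_program syntax, a made-up program rather than anything from a shipping app), this only gives the right result because the intermediate value is allowed to go well above 1 before the final write:

[code]
/* Hypothetical fragment program: the intermediate "bright" value goes far
 * beyond 1.0 and is only brought back into displayable range at the end.
 * A clamped or integer pipeline would throw the overbright detail away. */
const char *overbright_fp =
    "!!ARBfp1.0\n"
    "TEMP color, bright;\n"
    "TEX color, fragment.texcoord[0], texture[0], 2D;\n"
    "MUL bright, color, {8.0, 8.0, 8.0, 1.0};\n"            /* intermediate > 1 */
    "MUL result.color, bright, {0.25, 0.25, 0.25, 1.0};\n"  /* scale back down  */
    "END\n";
[/code]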

As for speed… the card is a monster. I wouldn’t like to have to write drivers for it; its complexity is enormous. The drivers can’t be mature yet, and it will be a long time until they are. But look at some of the tests made with it. For example, in the high polygon count test in 3DMark2001, the FX almost doubles the performance of the 9700. Doubles?! How can the card be so good at one thing and “suck” at others? If you look at the test, it is a simple one: throw a huge amount of triangles at the card with some hardware lights. The test is simple, so it may be that the drivers work fine for that part while the rest is still being worked on. If not, how can you explain the huge difference? And another example: Carmack says the card is faster when using its own features (NV30 code path) rather than the ARB2 code path. Why? The ARB2 features are less powerful than the NV30 features, so if it runs fast on the NV30 code path it should run at least as fast on the ARB2 code path. It’s just drivers…

[speculation]
And finally, I believe this card is very well suited for GL2. I believe NVIDIA looked a lot at the GL2 specs when building this card. I don’t mean it is fully in accordance with the current specs, but it’s close. I think it was Matt who said, in a post here at the beginning of the year, something like: start loving GL2…
[/speculation]

Wow, I never wrote such a huge post before…
How is my English?!

Originally posted by KRONOS:
Of course the FX is a better and faster card than the current 9700 PRO. Just look at the specs of the two…

Specs don’t make a card fast. A P4 was always faster than any Athlon when looking at the specs; only tests showed the reality. At the same clock speed, the P4 was much slower. Today it wins because of raw clock speed.
Check www.tomshardware.com and www.anandtech.com; then you know how much faster the FX actually is.

The 8 extra bits in the floating-point unit are surely not useless, but they’re not proven to be useful either. It’s like using doubles in your code instead of floats… Cinematic movies are done with 16-bit floats according to NVIDIA, so it at least looks like 32-bit floats are quite useless.

About the speed: it’s not a monster. It’s not much more advanced than the R300. More pixel shader instructions don’t make a chip more advanced. It has some new features, yes, but they should not make it such a monster to handle.

It’s not just drivers. It’s not even much faster in some tests where it is already at its own technical maximum.

[speculation]
At least ATI is working on compilers to compile the GL2 shading language down to ARB_fragment_program and ARB_vertex_program today.
[/speculation]

Your English is nice.

LOL! The NV30 is faster using the NV30 path because that way it isn’t necessarily forced to use its 32 bit precision all the time. The developer can decide whether or not to use 32 bit precision for an instruction, or only use 16 bit.

I also heard that the NV30 has a dedicated T&L unit, rather than emulating T&L via vertex shaders. For that reason, it performs much closer to what it should in 3dMark’s high polygon tests. In contrast, the R300 does not have a dedicated T&L unit, so performance in that test is comparatively worse.

Specs don’t make a card fast. A P4 was always faster than any Athlon when looking at the specs; only tests showed the reality. At the same clock speed, the P4 was much slower. Today it wins because of raw clock speed.

Of course they do. If it isn’t faster on paper, it won’t be faster on the chip either. But being faster on paper doesn’t mean full speed on the chip; the silicon must be good…

Cinematic movies are done with 16-bit floats according to NVIDIA, so it at least looks like 32-bit floats are quite useless.

So this means that they (NVIDIA) know it is useless and did this on purpose so the card would be more expensive?! What other useless things did they put in the card?

About the speed: it’s not a monster. It’s not much more advanced than the R300. More pixel shader instructions don’t make a chip more advanced. It has some new features, yes, but they should not make it such a monster to handle.

But when the R300 came out, its drivers were worse than the FX’s are now; it took them almost 6 months to have a good set of drivers. And that had a huge impact on performance. The drivers are as important as the hardware: if the drivers aren’t fast, the card can’t be fast.

LOL! The NV30 is faster using the NV30 path because that way it isn’t necessarily forced to use its 32 bit precision all the time. The developer can decide whether or not to use 32 bit precision for an instruction, or only use 16 bit.

I guess the test was run at the same precision; otherwise it wouldn’t be possible to compare…

Originally posted by KRONOS:
I guess the test was run at the same precision; otherwise it wouldn’t be possible to compare…

It wasn’t. The R300 uses 24 bit precision only, for fragment programs. The NV30 supports 12 bit (fixed point), 16 bit (floating point), and 32 bit (floating point), but not 24 bit. Thus, it is impossible for the NV30 and R300 to be compared to each other using the same precision.
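
Roughly (I’m writing this from memory, so check the NV_fragment_program spec for the exact syntax), the per-instruction choice looks like this; the R/H/X suffixes select 32-bit float, 16-bit float, and 12-bit fixed point:

[code]
/* Hypothetical NV_fragment_program snippet, written from memory -- the point
 * is only the precision suffixes, not the particular math. */
const char *nv_fp =
    "!!FP1.0\n"
    "TEX H0, f[TEX0], TEX0, 2D;\n"    /* sample into a 16-bit half register  */
    "MULH H0, H0, f[COL0];\n"         /* modulate at 16-bit precision        */
    "DP3R R0, f[TEX1], f[TEX1];\n"    /* do the math that needs 32-bit here  */
    "MULR o[COLR], H0, R0.x;\n"       /* combine and write the final color   */
    "END\n";
[/code]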

Ostsol: you’re right… Maybe the FX used 32 bit precision all the time?


Don’t get religious about it, guys – they’re just video cards.

Get a Radeon 9500 or 9700 because:

  • It’s out now, and your patience is running out
  • It’s cheaper than the GeForceFX
  • It has a more elegant cooling solution
  • It will make Davepermen like you

Get a GeForceFX because:

  • It has some extra flexibility in the shaders (precision, flow control, instruction count)
  • NVIDIA has better OpenGL drivers than ATI
  • You can get the Gainward card which will allegedly have a quiet cooler
  • You can boast that your video card is more expensive than Davepermen’s

How’s that for a comparison?

– Tom

P.S.: No offense, Dave

Originally posted by KRONOS:
Maybe the FX used 32 bit precision all the time?

For Doom 3’s ARB2 path, that is confirmed by Carmack. There is a precision “hint” provided in ARB_fragment_program that can allow a lower precision to be used (if the card supports it), but it is applied globally to the fragment program, rather than per instruction as NV_fragment_program allows. I’m guessing that with that kind of control the NV30 would certainly be at least as fast as the R300, using that path.
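
To make that concrete (a made-up program, not code from Doom 3): in ARB_fragment_program the precision hint is a single OPTION that covers the whole program, so there is no way to mix precisions per instruction the way NV_fragment_program allows.

[code]
/* Hypothetical ARB_fragment_program: one global precision hint for everything. */
const char *arb_fp =
    "!!ARBfp1.0\n"
    "OPTION ARB_precision_hint_fastest;\n"  /* driver may use lower precision */
    "TEMP diffuse;\n"
    "TEX diffuse, fragment.texcoord[0], texture[0], 2D;\n"
    "MUL result.color, diffuse, fragment.color;\n"
    "END\n";
/* Every instruction above falls under the same hint; you cannot say
 * "run this MUL at 16 bit and that TEX at 32 bit" as NV_fragment_program can. */
[/code]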

I thought the hint was to allow a higher precision: that the card was set to 16-bit by default and, if you wanted, you could bump it up to 32-bit in the shader program. I probably got it backwards.

I think Dave has a point. I mean, 24-bit is probably enough. The low-precision picture in the “The Dawn of Cinematic Computing” article by NVIDIA could be either 16-bit or 12-bit. Since NVIDIA is generally an honest company, we’ll assume 16-bit. Sure, the difference between the 16-bit and 32-bit fragment shaders is clearly visible in each of the pictures. But I don’t see any comparison between 24-bit and 32-bit. There is no way to tell how much the extra byte is worth. I’m sure it makes a difference, but how much of a difference?

I also don’t want a video card taking up two expansion slots. And 70 dB is a bit much for a cooling fan inside a computer. However, I read somewhere (it might even be in this thread) that NVIDIA might be making a 7 dB fan for the FX later. It is probably just a rumor, but if they did that, the FX would be a very, very nice card.

I’m leaning towards the R300 right now, but I don’t want to rush a purchase. I mean, there isn’t a lot of info on the R350, and it may be an incredible card or just a sucky one. Either way, it’ll probably bring down the price of the 9700 or the 9500 by a substantial amount.

Does anyone know the approximate maximum instruction count of the R300 chips? The NV30’s is 65,536 (I got that from the PDF file in one of my previous posts in this thread). The FX is a HUGE improvement over the GF4 Ti in the shader department. And I have a GF3 Ti right now, so you can imagine how much of a difference it or the R300 would make.

  • Halcyon

Originally posted by Tom Nuydens:
Don’t get religious about it, guys – they’re just video cards.

Get a Radeon 9500 or 9700 because:

  • It’s out now, and your patience is running out
  • It’s cheaper than the GeForceFX
  • It has a more elegant cooling solution
  • It will make Davepermen like you

Get a GeForceFX because:

  • It has some extra flexibility in the shaders (precision, flow control, instruction count)
  • NVIDIA has better OpenGL drivers than ATI
  • You can get the Gainward card which will allegedly have a quiet cooler
  • You can boast that your video card is more expensive than Davepermen’s

How’s that for a comparison?

– Tom

P.S.: No offense, Dave

If expensive counts for you, then yes, buy it. I prefer to smoke my money.

And about the drivers… that’s an old one. Old, but wrong today.

I have a few questions for all you guys who do shader programming… I’m still waaay back at lighting. Is the introduction of floating-point fragment shaders very recent? I mean, DX9 is boasting it as one of its new features. I’m guessing GL had it accessible through its extension mechanism. Were all the shaders before in 12-bit (fixed point) format?

Also, is a fragment shader in OpenGL the same as a pixel shader in DX? I mean, a fragment is what a pixel is while all the calculations are still being done on it.

  • Halcyon
  • For the Radeon: you can get a passively cooled one. And while 7 dB wouldn’t actually be audible, silent is even better.

Another thing, about the “advancedness” of the GeForce FX: if I overclocked my Radeon, I would get a roughly linear speed increase (it can be clocked to 420 MHz with normal drivers…).

At 500 MHz the card would be about 1.54 times faster (500/325). With the possible exception of one test, it would then beat the GeForce FX in every test, by up to roughly 54 percent, and by even more in the tests where the GeForce FX is already slower today.

So what actually makes a GeForce FX as fast as a Radeon, or a bit faster, is clock speed. And that very clock speed is what produces the enormous heat, even on the smaller process.

An R300 chip on a 0.15 micron process at 325 MHz can nearly catch an NV30 chip on 0.13 micron at 500 MHz. And the NV30 only wins because it is pushed to its limits; the fat old R300 isn’t even at its limits.

Let’s not even start talking about a 0.13 micron R300 and the clock speeds that would be possible… blah.

THAT is what makes me sad. The GeForce FX does perform well, but it should do much more than it currently does with all the extra boosts they gave it.

And as far as I can see, quite a few other DX9-compliant cards (with ps2.0 and vs2.0 support only) are coming. Then there will be ps3.0 and vs3.0… but I don’t see many ps2.x and vs2.x cards coming. I don’t think the additional features will get that much support: they are too little for 3.0, but too much for 2.0. Nothing really useful… proprietary, as usual.

And I think a dev team currently cannot ignore the R300 as a target audience. So it’s easier to develop for the R300 and the GeForce FX with one path than with two special paths. The R300 path will be supported by just about every hardware vendor in the future… I don’t know about the GeForce FX path…

Anyway, enough ranting. I really want to see NVIDIA catch up. The GeForce FX is very promising hardware. What went wrong in this first round, what went wrong?…

Originally posted by HalcyonBlaze:
I have a few questions for all you guys who do shader programming… I’m still waaay back at lighting. Is the introduction of floating-point fragment shaders very recent? I mean, DX9 is boasting it as one of its new features. I’m guessing GL had it accessible through its extension mechanism. Were all the shaders before in 12-bit (fixed point) format?

Also, is a fragment shader in OpenGL the same as a pixel shader in DX? I mean, a fragment is what a pixel is while all the calculations are still being done on it.

  • Halcyon

Yep, very new. DX9 introduced it; the first implementation is in the Radeon 9500+ cards and the second in the GeForce FX cards. No cards before had it. 12-bit? Actually most had 8 bits per component; the GeForces got 9 bits (an additional sign bit, so their range is not 0…1 but -1…1)… I think the Radeon 8500 had even higher precision, but fixed point anyway.

And floating-point, unlimited-range pixel shaders are a real advantage. They’re amazingly easy to work with, and very powerful… no need for texcoords or any vertex shader to send in precomputed lighting information; just calculate it in the fragment program… Sure, calculating everything in the fragment program is not the most efficient way, but it shows how powerful this is. It wasn’t possible at all before…
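
Something along these lines (a made-up ARB_fragment_program sketch: the interpolators only carry a raw normal and position, and all the lighting math happens per fragment):

[code]
/* Hypothetical per-pixel diffuse lighting done entirely in the fragment
 * program: normal in texcoord 0, surface position in texcoord 1,
 * light position in local parameter 0. */
const char *perpixel_fp =
    "!!ARBfp1.0\n"
    "PARAM lightpos = program.local[0];\n"
    "TEMP n, l, nl;\n"
    "DP3 n.w, fragment.texcoord[0], fragment.texcoord[0];\n"  /* renormalize  */
    "RSQ n.w, n.w;\n"                                         /* the normal   */
    "MUL n.xyz, fragment.texcoord[0], n.w;\n"
    "SUB l, lightpos, fragment.texcoord[1];\n"                /* light vector */
    "DP3 l.w, l, l;\n"
    "RSQ l.w, l.w;\n"
    "MUL l.xyz, l, l.w;\n"
    "DP3_SAT nl, n, l;\n"                                     /* clamped N.L  */
    "MUL result.color, nl, fragment.color;\n"
    "END\n";
[/code]

Not the fastest way to light a pixel, but there is no per-vertex lighting setup at all.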

Yeah, I think it was all fixed-point calculation before. Not sure of the precision, though. I’m sticking with the R300 for the moment, since the FX mostly adds support for the vs/ps 2.0 extensions. OK, that’s a lot, at least when you’re doing very high-level effects. I mean, the instruction count on the R300 is 512 (with loops, correct me if I’m wrong) and 256 with no looping (again, correct me if…), and 1024 on the NV30 (65,536 with looping); that’s a huge difference. But those kinds of instruction counts are only reached by really complex programs.

But when the R300 came out, its drivers were worse than the FX’s are now; it took them almost 6 months to have a good set of drivers. And that had a huge impact on performance. The drivers are as important as the hardware: if the drivers aren’t fast, the card can’t be fast.

That’s not true at all.

The 9700’s shipping drivers were good enough to beat a GeForce4 Ti 4600 by 50% on average (check the original Anandtech benchmarks).

Up until now, nVidia’s sudden driver performance upgrades were strategically timed to hurt ATi. It was certainly no coincidence that, in the week the 8500s were released to benchmark sites, nVidia drivers suddenly delivered a 20% performance boost. And nVidia knew months in advance that the 8500 was coming, so they could prepare for it.

nVidia has known, for the past 5 months, that the 9700 was a performance beast with excellent, fast drivers and very fast antialiasing and anisotropic filtering. Given that knowledge, they should have thrown every day of those 5 months into perfecting their drivers. If nVidia did, and this is the best they could do, the FX is an overhyped POS. If nVidia didn’t, then they don’t deserve to beat the 9700, and their FX line of cards will be crushed by the R350 core (which will likely be mostly a performance upgrade, along with some modest improvements to instruction count and functionality).

Certainly, in the mid-range market, dominated by the all-powerful DX9-capable Radeon 9500 Pro, nVidia will have its work cut out for it. The 9500 Pro is a very fast card, capable of equalling the performance of a Ti4600.

However, if you’re an OpenGL programmer, you can get greater benefit from the FX. First, NV_vertex_program2 supports looping/flow control, which ARB_vertex_program does not. Secondly, NV_fragment_program has more features than ARB_fragment_program or PS2.0.
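
In practice you just pick a render path at startup depending on what the driver exposes. A rough sketch (the path names are made up for illustration):

[code]
#include <string.h>
#include <GL/gl.h>

/* Returns 1 if the extension name appears as a whole token in the GL
 * extension string (needs a current GL context). */
static int has_extension(const char *name)
{
    const char *ext = (const char *)glGetString(GL_EXTENSIONS);
    size_t len = strlen(name);
    while (ext && (ext = strstr(ext, name)) != NULL) {
        if (ext[len] == ' ' || ext[len] == '\0')
            return 1;   /* matched a whole token, not a prefix of a longer name */
        ext += len;
    }
    return 0;
}

const char *pick_render_path(void)
{
    if (has_extension("GL_NV_fragment_program") &&
        has_extension("GL_NV_vertex_program2"))
        return "NV30 path";      /* per-instruction precision, vertex loops */
    if (has_extension("GL_ARB_fragment_program") &&
        has_extension("GL_ARB_vertex_program"))
        return "ARB2 path";      /* runs on both the R300 and the NV30 */
    return "legacy path";
}
[/code]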

Originally posted by PixelDuck:
Yeah, I think it was all fixed-point calculation before. Not sure of the precision, though. I’m sticking with the R300 for the moment, since the FX mostly adds support for the vs/ps 2.0 extensions. OK, that’s a lot, at least when you’re doing very high-level effects. I mean, the instruction count on the R300 is 512 (with loops, correct me if I’m wrong) and 256 with no looping (again, correct me if…), and 1024 on the NV30 (65,536 with looping); that’s a huge difference. But those kinds of instruction counts are only reached by really complex programs.

In the vertex shader the R9700 can use 256 instructions, and with loops it can execute a total of 65,026 (255*255 + 1) if I remember right. In the fragment shader it can do 64 ALU instructions. The GFFX can do 1024.
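
If you’d rather see what your own driver reports than trust the marketing numbers, something like this should do it (a minimal sketch; it assumes a current GL context and that the ARB program entry points have been resolved, e.g. via wglGetProcAddress or glXGetProcAddress):

[code]
#define GL_GLEXT_PROTOTYPES
#include <stdio.h>
#include <GL/gl.h>
#include <GL/glext.h>

/* Query the native instruction limits exposed by ARB_vertex_program and
 * ARB_fragment_program on the current context. */
void print_program_limits(void)
{
    GLint vp_instr = 0, fp_alu = 0, fp_tex = 0;

    glGetProgramivARB(GL_VERTEX_PROGRAM_ARB,
                      GL_MAX_PROGRAM_NATIVE_INSTRUCTIONS_ARB, &vp_instr);
    glGetProgramivARB(GL_FRAGMENT_PROGRAM_ARB,
                      GL_MAX_PROGRAM_NATIVE_ALU_INSTRUCTIONS_ARB, &fp_alu);
    glGetProgramivARB(GL_FRAGMENT_PROGRAM_ARB,
                      GL_MAX_PROGRAM_NATIVE_TEX_INSTRUCTIONS_ARB, &fp_tex);

    printf("vertex program instructions: %d\n", vp_instr);
    printf("fragment ALU instructions:   %d\n", fp_alu);
    printf("fragment TEX instructions:   %d\n", fp_tex);
}
[/code]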

Man… where do you guys learn all this stuff? I mean, if someone wanted info on any of these cards, they’d find pretty much all of it in this single thread!

Speaking of information, does anyone know much about the R350? All I know is that it is supposed to use less power than the GF FX and has support for VS/PS 2.0.

  • Halcyon

Originally posted by Humus:
In the fragment shader it can do 64 ALU instructions. The GFFX can do 1024.

I thought it was 160 for R300 (?).