OT - Nvidia CineFX Architecture (R300 response?)

http://developer.nvidia.com/docs/IO/3121/ATT/cinefx_whitepaper.pdf

Nvidia perhaps trying to take the wind out of the R300’s sails?

A glorious piece of engineering nonetheless. drool

Mmm, very nice.

Just one thing caught my eye: vertex shader temporary registers are 16, up from 12 (page 8). That seems a bit low, especially since every other number has jumped massively (1024 fragment program instructions?!).

That’s the first time NVIDIA has disclosed anything related to future products. Well, they have to exceed the Radeon 9700. ATI will apparently launch the Radeon 10000 around the same time as the NV30.

Do the R300 + NV30 have floating-point frame buffers? A pbuffer with render_to_texture might work if not (since they have 128-bit FP textures).

EDIT: They do support 128-bit FP textures, right?

[This message has been edited by PH (edited 07-22-2002).]

According to this article,
http://www.extremetech.com/article2/0,3973,388801,00.asp

128-bit data can be written to scratch memory in the frame buffer. Of course, my interest in this is for multipass shaders.
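If a driver does expose that scratch memory through the normal pbuffer path, setting one up might look roughly like this. The WGL_ARB_pixel_format / WGL_ARB_pbuffer calls exist today; whether a 32-bit-per-channel format actually shows up in the format list (and under which tokens) is pure guesswork until the hardware and drivers ship.

```c
/* Sketch: asking for a high-precision pbuffer to use as off-screen scratch
 * memory for multipass shading. WGL_ARB_pixel_format / WGL_ARB_pbuffer are
 * real extensions; the 32-bits-per-channel request is an assumption about
 * what the new parts might expose. */
#include <windows.h>
#include <GL/gl.h>
#include "wglext.h"   /* tokens and function-pointer typedefs */

static PFNWGLCHOOSEPIXELFORMATARBPROC wglChoosePixelFormatARB;
static PFNWGLCREATEPBUFFERARBPROC     wglCreatePbufferARB;
static PFNWGLGETPBUFFERDCARBPROC      wglGetPbufferDCARB;

HDC CreateScratchPbuffer(HDC hdc, int width, int height)
{
    wglChoosePixelFormatARB = (PFNWGLCHOOSEPIXELFORMATARBPROC)
        wglGetProcAddress("wglChoosePixelFormatARB");
    wglCreatePbufferARB = (PFNWGLCREATEPBUFFERARBPROC)
        wglGetProcAddress("wglCreatePbufferARB");
    wglGetPbufferDCARB = (PFNWGLGETPBUFFERDCARBPROC)
        wglGetProcAddress("wglGetPbufferDCARB");

    const int fmt_attribs[] = {
        WGL_DRAW_TO_PBUFFER_ARB, GL_TRUE,
        WGL_SUPPORT_OPENGL_ARB,  GL_TRUE,
        WGL_RED_BITS_ARB,   32,   /* assumption: 32 bits per channel offered */
        WGL_GREEN_BITS_ARB, 32,
        WGL_BLUE_BITS_ARB,  32,
        WGL_ALPHA_BITS_ARB, 32,
        0
    };
    const int pb_attribs[] = { 0 };
    int  format;
    UINT count = 0;

    if (!wglChoosePixelFormatARB(hdc, fmt_attribs, NULL, 1, &format, &count) || count == 0)
        return NULL;   /* no high-precision format exposed */

    HPBUFFERARB pbuffer = wglCreatePbufferARB(hdc, format, width, height, pb_attribs);
    return pbuffer ? wglGetPbufferDCARB(pbuffer) : NULL;
}
```

From there it would be wglMakeCurrent as usual: render the intermediate pass into the pbuffer, then bind the result back via render_to_texture for the next pass.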

Is there any next-gen card that will support OpenGL 2.0, or at least the current specs of it? 3Dlabs, I think, was claiming to, but I just looked at the spec again and it requires float color. I thought NVIDIA would do it with theirs, but they say no loops in fragment shaders. Not that I’m not excited about this new generation coming; I just thought I could see OpenGL 2.0 soon.

If you had read the documents and the other topics and posts, you would know that the next generation will not yet provide full GL2 power, but DX9 power. DX9 is near GL2 in feature list, except for branching in pixel shaders. But they have to have floating-point colors, and they do have them in. There will hopefully be as many GL_GL2 extensions as possible in the upcoming hardware generation, and, as ATI stated, this R300 will not be GL2, but the next one most likely will be…

I don’t know about the Wildcat VP card (regarding floating-point color), but it seems likely, since 3Dlabs designed the GL2.0 proposal.

Looping can be implemented without an explicit looping instruction (by inlining), so it should be possible. With high-precision floating-point frame buffers and dependent texture reads, a multipass shader could implement all of the GL2.0 HLSL (I think).
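To make the inlining point concrete: a fixed-count loop can be unrolled at program-generation time into straight-line code, and 1024 fragment instructions leave plenty of room for that. A rough sketch of a generator doing this (the mnemonics are DX8/ARB-assembly-style placeholders, not an announced instruction set):

```c
#include <stdio.h>

/* Sketch: "looping" by inlining. Instead of a loop instruction, the host
 * program emits N copies of the loop body with constant-register indices
 * baked in. Mnemonics and register layout are placeholders for illustration;
 * the buffer is assumed large enough for the unrolled result. */
void emit_unrolled_lighting(char *program, size_t size, int num_lights)
{
    size_t used = 0;
    int i;

    used += snprintf(program + used, size - used, "MOV r0, c[0];\n");
    for (i = 0; i < num_lights; ++i) {
        /* the loop body, inlined once per light */
        used += snprintf(program + used, size - used,
                         "DP3 r1, t0, c[%d];\n"       /* N . L for light i   */
                         "MAX r1, r1, c[31];\n"       /* clamp below at zero */
                         "MAD r0, r1, c[%d], r0;\n",  /* accumulate color    */
                         8 + i, 16 + i);
    }
}
```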

It’s kinda sad that nVidia has been forced to react to the R300 like this. This “CineFX” merely describes the capabilities of the R300, but with higher instruction counts. That is the best advantage nVidia can come up with when their card launches 4 months after ATi’s. And they didn’t even increase the number of constants all that much over NV2X (256; still not enough to store all the matrices for high-quality skinning).
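For rough numbers: a 4×3 skinning matrix packs into 3 four-component constant registers, so 256 constants tops out around 85 bones, and that’s before reserving constants for the projection matrix, light parameters, and everything else the shader needs.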

It appears that Nvidia will be targeting the render-farm business, as they acquired ExLuna today.

That’s where those extra instructions will come in handy.

16 texture units?! Start writing some macros for enabling/disabling texstages

What I want to know is, can we use 16 textures at full speed? Or is it that anything more than 8 runs at half performance?

The majority of the time on the GF4, it’s pointless to use 4 textures, as it’s faster to multipass, provided you can fit your algorithm into the constraints of multi-pass.

Should be sweet tho…

Nutty

Originally posted by kon:
16 texture units?! Start writing some macros for enabling/disabling texstages
Oh yes, and don’t forget about managing 6 texture targets for each unit:
1D, 2D, 3D, cube, rectangle_NV, rectangle_EXT.
Great, isn’t it?
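A minimal sketch of what such macros might look like with plain ARB_multitexture and today’s targets (the rectangle targets would slot in the same way once their tokens land in glext.h; glActiveTextureARB is assumed to have been fetched through the usual extension-loading mechanism):

```c
#include <GL/gl.h>
#include "glext.h"   /* GL_TEXTURE0_ARB, GL_TEXTURE_CUBE_MAP_ARB, prototypes */

/* Enable exactly one texture target on a given unit, disabling the others.
 * Extra targets (the rectangle ones) would be added to both macros the same
 * way once their tokens are available. */
#define ENABLE_TEX_TARGET(unit, target)               \
    do {                                              \
        glActiveTextureARB(GL_TEXTURE0_ARB + (unit)); \
        glDisable(GL_TEXTURE_1D);                     \
        glDisable(GL_TEXTURE_2D);                     \
        glDisable(GL_TEXTURE_3D);                     \
        glDisable(GL_TEXTURE_CUBE_MAP_ARB);           \
        glEnable(target);                             \
    } while (0)

#define DISABLE_TEX_UNIT(unit)                        \
    do {                                              \
        glActiveTextureARB(GL_TEXTURE0_ARB + (unit)); \
        glDisable(GL_TEXTURE_1D);                     \
        glDisable(GL_TEXTURE_2D);                     \
        glDisable(GL_TEXTURE_3D);                     \
        glDisable(GL_TEXTURE_CUBE_MAP_ARB);           \
    } while (0)

/* Usage: a 2D base map on unit 0, a cube map on unit 1, everything else off. */
void setup_two_stages(GLuint base, GLuint cube, int max_units)
{
    int i;
    ENABLE_TEX_TARGET(0, GL_TEXTURE_2D);
    glBindTexture(GL_TEXTURE_2D, base);
    ENABLE_TEX_TARGET(1, GL_TEXTURE_CUBE_MAP_ARB);
    glBindTexture(GL_TEXTURE_CUBE_MAP_ARB, cube);
    for (i = 2; i < max_units; ++i)
        DISABLE_TEX_UNIT(i);
}
```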

I’m planning on getting an NV30 when it comes out, and I REALLY REALLY hope this card, and even the Radeon 9700 for that matter, will have double-sided stencil support. I’m just dying to use that for my shadows.

-SirKnight

In “SIGGRAPH 2002: Interactive Geometric Computations Using Graphics Hardware” on the NVIDIA developer page, they say in the “graphics hardware futures” topic that there will be two-sided stencil testing. Now, this may be beyond the NV30, but all the rest of the things they mention describe the same thing as the CineFX paper. So I guess an unofficial answer to that is very likely yes.

As I remember from NVIDIA, even the official answer is yes, and ATI has stated it somewhere as well. Sure, never in an official press release, but they talk about it in papers as a feature of next-gen hardware.
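Nobody outside the IHVs has published the interface yet, but per-face stencil state is the obvious shape for it. Here is a sketch of single-pass z-fail shadow volumes under that assumption; the GL_STENCIL_TEST_TWO_SIDE_EXT token and the glActiveStencilFaceEXT entry point are guessed names, not a shipping spec:

```c
#include <GL/gl.h>
#include "glext.h"   /* assumed to provide the tokens used below */

/* Sketch: z-fail stencil shadow volumes in a single pass, assuming a
 * two-sided stencil interface with a per-face "active stencil face"
 * selector (names are assumptions, not a published spec). */
void draw_shadow_volume_two_sided(void (*draw_volume)(void))
{
    glEnable(GL_STENCIL_TEST);
    glEnable(GL_STENCIL_TEST_TWO_SIDE_EXT);   /* assumed token */
    glDepthMask(GL_FALSE);
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
    glDisable(GL_CULL_FACE);                  /* both faces in one pass */

    /* back faces: increment where the depth test fails */
    glActiveStencilFaceEXT(GL_BACK);          /* assumed entry point */
    glStencilFunc(GL_ALWAYS, 0, ~0u);
    glStencilOp(GL_KEEP, GL_INCR, GL_KEEP);

    /* front faces: decrement where the depth test fails */
    glActiveStencilFaceEXT(GL_FRONT);
    glStencilFunc(GL_ALWAYS, 0, ~0u);
    glStencilOp(GL_KEEP, GL_DECR, GL_KEEP);

    draw_volume();                            /* one pass instead of two */

    glDisable(GL_STENCIL_TEST_TWO_SIDE_EXT);
    glEnable(GL_CULL_FACE);
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    glDepthMask(GL_TRUE);
}
```

The win is that the volume geometry is transformed and rasterized once instead of twice (front faces, then back faces).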

Oh, and by the way, to all you fancy guys… now with the PS2.0-powered GPUs we can do 8 or more fully accurate per-pixel Phong-shaded lights in one pass. What do we do about the shadows? 8 stencil-volume passes, or 24 cube-map shadow-map passes, are needed for this. We’re still FAAAAAAAAAAAAAR away from soft shadows as far as I can see… any ideas (except doing raytracing of the shadows at low res, where you could do all 8 light sources in only 2 passes)?

How would you do the stencil test for multiple lights? I can see shadow buffers, but not more than one stencil-shadowed light in one pass. Maybe you could put it in other buffers? That would be great if you could.

I was just thinking about how awesome it was going to be to be able to do so many lights in one pass, and now daveperman has to go and ruin it all! :slight_smile:

But he is right. This plays right into what I was thinking the other day. I realized that shadow maps would eventually ‘win’ because they are analogous to the ‘battle’ between the z-buffer solution (which is fragment based like shadow mapping) and analytical solutions to hidden surfaces.

The z-buffer was initially slower than the painter’s algorithm (or other solutions), but its complexity grew linearly with the number of polygons. The complexity of sorting polygons grows much faster than linearly. So eventually it became cheaper, and its simplicity allowed it to be hardware accelerated easily.

The same is true of shadows. While the accuracy and fill-rate requirements of shadow maps leave a little to be desired currently (IMHO), eventually these will be solved. Also, as polygon counts go up, the burden of silhouette finding goes up significantly, and ways to hardware accelerate the process are hard to imagine (at least for me). One would not dream of using stencil shadows with hardware displacement mapping (if you have had such dreams, please tell me how ^_^).

For these reasons, I think that shadow maps will eventually be -the- solution for shadows. I think that Doom 3 is pretty much the greatest use we will ever see of stencil shadows.

Originally posted by Nakoruru:
[b]I was just thinking about how awesome it was going to be to be able to do so many lights in one pass, and now daveperman has to go and ruin it all! ^_^[/b]

sorry…

[b]But he is right. This plays right into what I was thinking the other day. I realized that shadow maps would eventually ‘win’ because they are analogous to the ‘battle’ between the z-buffer solution (which is fragment based like shadow mapping) and analytical solutions to hidden surfaces.

The z-buffer was initially slower than the painter’s algorithm (or other solutions), but its complexity grew linearly with the number of polygons. The complexity of sorting polygons grows much faster than linearly. So eventually it became cheaper, and its simplicity allowed it to be hardware accelerated easily.

The same is true of shadows. While the accuracy and fill-rate requirements of shadow maps leave a little to be desired currently (IMHO), eventually these will be solved. Also, as polygon counts go up, the burden of silhouette finding goes up significantly, and ways to hardware accelerate the process are hard to imagine (at least for me). One would not dream of using stencil shadows with hardware displacement mapping (if you have had such dreams, please tell me how ^_^).

For these reasons, I think that shadow maps will eventually be -the- solution for shadows. I think that Doom 3 is pretty much the greatest use we will ever see of stencil shadows.[/b]

sorry again

I hope not… at least for real shadow maps you need cube maps, and cube maps mean 6 renderings for a single point light. There, a simple shadow-volume pass and a lighting pass eat up much less… On the other hand, if we can get the boolean comparison BEFORE the filtering but right AFTER the sampling, we can get smooth shadows quite simply by sampling over the right area of the texture (with 32 samples or more from the anisotropic filter, it’s no problem to sample a long line…).
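That compare-before-filter trick is essentially percentage-closer filtering: do the depth comparison per sample, then average the 0/1 results, so the filtered value becomes a coverage fraction instead of a meaningless blend of depths. A small software sketch of the idea (names and the 2×2 footprint are only for illustration; the point is that hardware would do this inside the texture unit):

```c
#include <math.h>

/* Sketch: compare-before-filter shadow lookup (percentage-closer filtering).
 * Each tap is compared against the fragment's light-space depth FIRST, and
 * only the 0/1 results are averaged. Filtering raw depths and comparing
 * afterwards would give garbage at shadow edges. */
float shadow_lookup(const float *shadow_map, int map_size,
                    float u, float v, float fragment_depth)
{
    /* 2x2 footprint for illustration; an anisotropic kernel with 32 taps
     * would just loop over more samples the same way. */
    int   x0  = (int)floorf(u * map_size - 0.5f);
    int   y0  = (int)floorf(v * map_size - 0.5f);
    float lit = 0.0f;
    int   i, j;

    for (j = 0; j < 2; ++j) {
        for (i = 0; i < 2; ++i) {
            int x = x0 + i;
            int y = y0 + j;
            if (x < 0) x = 0;
            if (y < 0) y = 0;
            if (x >= map_size) x = map_size - 1;
            if (y >= map_size) y = map_size - 1;
            /* compare this sample, THEN add its boolean result */
            lit += (fragment_depth <= shadow_map[y * map_size + x]) ? 1.0f : 0.0f;
        }
    }
    return lit / 4.0f;   /* fraction of taps that are lit: a smooth edge */
}
```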

We’ll see, but I think that for now shadow volumes can win the race… for quite a while…

It just depends on the light sources (a lot of light sources can actually be projective).



Link change…
http://developer.nvidia.com/docs/IO/3121/ATT/CineFX-TechBrief.pdf

hehe…ya gotta love Nvidia’s marketing.

“NVIDIA’s “CineFX” architecture enables real-time cinematic-quality rendering for the first time ever!”

Didn’t they say that last time?