ps and vs on GeForce 2

First off, I'm a newbie when it comes to ps and vs (or let's call them fp and vp).

Is there really zero hardware support for these on GeForce2 GPUs?

Vertex programs get processed on the CPU and you get a reasonable frame rate. But fp requires hardware support, or else you get under 1 FPS.

Another thing, why 1 FPS? Is the image generated in system RAM and uploaded by the CPU, or does the CPU render to AGP memory and the card picks it up when it gets a swap-buffers call?

Unfortunately I haven't got the time to go through all the pages about fp & vp.


I’m not even sure you’ll always get “decent” performance with vertex programs. I have never tested it, but my guess is, if you store your vertices in video memory with VAR, and if the CPU needs to process them with a software vertex program, it will have to read from video memory, hence killing your framerate.

As for pixel shaders, well there’s limited support for them on the GF2. Actually, pixel shaders are split into two parts: texture shaders and register combiners. The first one is used to sample the texture maps (and do dependent texture reads). It’s not available on the GF2. However, the second one (register combiners), though more limited than on the GF3/GF4, is still usable at reasonable speeds. You can implement per-pixel lighting, dot3 bumpmapping and this sort of stuff with them.

Why 1 FPS? I’d say because each time you’re drawing a pixel you need to execute the instructions of the pixel shader, probably on top of a virtual machine. It might well take thousands of clocks just to draw one pixel. I’m not even sure it has anything to do with transfers of the final image to the video card…


*fp are not optimized the way vp are, because even if they were, performance would still suck (so why bother!)
*how many pixels are there on a 640x480 screen? 307,200, so roughly 300,000 (though not all are covered), say 100,000. try sticking 100,000 vertices on screen with vp + see your performance
*fp require the whole pipeline to be done in software
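
To put rough numbers on the points above, here is a back-of-the-envelope sketch. The CPU clock and the cycles-per-pixel figure are assumptions picked purely for illustration (the "thousands of clocks" guess from earlier in the thread), not measurements of any driver or rasterizer.

```python
# Rough upper bound on frame rate when every fragment is shaded on the CPU.
# CPU_HZ and CYCLES_PER_PIXEL are assumed values, not measurements.

WIDTH, HEIGHT = 640, 480
CPU_HZ = 1_000_000_000        # assumption: a 1 GHz CPU
CYCLES_PER_PIXEL = 3_000      # assumption: "thousands of clocks" per pixel


def software_fps(covered_pixels, cycles_per_pixel=CYCLES_PER_PIXEL, cpu_hz=CPU_HZ):
    """Frames per second if the CPU did nothing except shade pixels."""
    return cpu_hz / (covered_pixels * cycles_per_pixel)


full_screen = WIDTH * HEIGHT  # every pixel covered exactly once

print("pixels on screen:", full_screen)
print("fps, full coverage:", round(software_fps(full_screen), 2))
print("fps, 100,000 pixels:", round(software_fps(100_000), 2))
```

With these assumed figures the result lands in the low single digits, and that is before counting overdraw, texture sampling, or the rest of the software pipeline.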

OK, probably fp are not optimized at all, but that’s fine.

The thing is, I was experimenting with RenderMonkey, and even when I used the most basic vp and fp and set them both to version 1.0, it was running dog slow.

I know that vp-only programs run at decent framerates (D3D and GL), so I'm thinking either using fp on a GF2 is not possible or RenderMonkey is screwing something up.

I don't remember the exact program right now. I'll have to check when I get back home.

PS: even after deleting the fp, it was running at 1 FPS or so, and it says HAL is not supported.


Originally posted by zed:
*fp require the whole pipeline to be done in software

Reminds me of this fairly recent IOTD by the author of SoftWire:
I guess CPUs are getting so powerful now that the frontier between software and hardware rendering will become fuzzier…


hehe, yeah, nick’s rasterizer is amazing, and software rendering will get quite a bit more support with multiprocessor pcs (say, the new p4s, and the future amd processors), as you can then run the game independently of the graphics, which is the main problem when you do it in software…

i would love to see nick working for, as they code it all in plain c, with x86 compliant compiler settings. no sse, no 3dnow, not even mmx… i would love to see nick integrating softwire there in, would for sure be blazing fast… and it looks amazing as well…

btw, the “ps” on geforce2 is not very advanced, but quite useable anyways. it does not provide texture shaders, but the register combiners are partially in, and while complicated, they are damn powerful… you can get diffuse and specular bumpmapping with ^16 for the specular in one pass if you mess it up correctly. biggest problem is the lack of 4 texture units actually, the two general combiners for gf2 would be enough for most jobs…

it’s not much, but at least realtime…
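
To make the "diffuse and specular with ^16" remark concrete, here is a small sketch of the per-pixel math only. It is not register combiner setup code, and the thread does not say how the poster actually fit this into the GF2's combiner stages. Repeated squaring is just one common way to reach a 16th power using nothing but multiplies, and the function names here are made up for the example.

```python
import math


def normalize(v):
    """Scale a 3-component vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot3(a, b):
    """Dot product clamped to [0, 1], as used for dot3 bump mapping."""
    return min(1.0, max(0.0, sum(x * y for x, y in zip(a, b))))


def pow16(x):
    """Raise x to the 16th power with four squarings: x^2, x^4, x^8, x^16."""
    for _ in range(4):
        x = x * x
    return x


def shade(normal, light_dir, half_vec):
    """Return (diffuse, specular) terms for one pixel."""
    n = normalize(normal)
    diffuse = dot3(n, normalize(light_dir))          # N . L
    specular = pow16(dot3(n, normalize(half_vec)))   # (N . H)^16
    return diffuse, specular


# Normal taken from a bump map texel, light and half vectors in the same space.
print(shade((0.0, 0.0, 1.0), (0.0, 0.6, 0.8), (0.0, 0.3, 0.95)))
```

The only operations needed are dot products and multiplies, which is the kind of arithmetic register combiners provide.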

Originally posted by deepmind:
I guess CPUs are getting so powerful now that the frontier between software and hardware rendering will become fuzzier…

it is nice, but that demo does no fragment program effects. imagine if you started doing mathematical equations on each pixel (instead of the few simple mults etc. it does now): the fps would drop from 20fps to 0.2fps.
personally i'd love to see nick use his great talent on a raytracer or radiosity thingy (+ not on trying to emulate an nvidia tnt)

Don't forget that he says his screen is at 640x480

So here’s the fp (DX ps)

def c0, 1.0, 0.0, 0.0, 1.0

mov r0, c0

Here’s the vp (DX)

m4x4 oPos, v0, c0
mov oD0, c8
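
For reference, here is a rough software model of what those two shaders compute. It is only a sketch: the constant values loaded into c0..c3 and c8 on the vertex side are assumptions (an identity matrix and white), since the thread does not say what RenderMonkey puts there. Vertex shader and pixel shader constants are separate register files, so the two c0 registers are unrelated.

```python
def dp4(a, b):
    """Four-component dot product."""
    return sum(x * y for x, y in zip(a, b))


def vertex_program(v0, c):
    """Model of:  m4x4 oPos, v0, c0  /  mov oD0, c8"""
    # m4x4 expands to four dp4 instructions against c0, c1, c2, c3.
    o_pos = tuple(dp4(v0, c[i]) for i in range(4))
    o_d0 = c[8]  # diffuse colour copied straight from constant c8
    return o_pos, o_d0


def pixel_shader():
    """Model of:  def c0, 1.0, 0.0, 0.0, 1.0  /  mov r0, c0"""
    c0 = (1.0, 0.0, 0.0, 1.0)
    r0 = c0  # r0 is the output colour: opaque red for every pixel
    return r0


# Assumed vertex constants: identity transform in c0..c3, white in c8.
constants = {
    0: (1.0, 0.0, 0.0, 0.0),
    1: (0.0, 1.0, 0.0, 0.0),
    2: (0.0, 0.0, 1.0, 0.0),
    3: (0.0, 0.0, 0.0, 1.0),
    8: (1.0, 1.0, 1.0, 1.0),
}

print(vertex_program((1.0, 2.0, 3.0, 1.0), constants))
print(pixel_shader())
```

So the pixel shader ignores the interpolated colour and writes a constant, which is about as simple as a pixel shader gets.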


Can it be made any simpler so that it runs in hardware on a GeForce2? Or maybe the problem is one of the instructions.


Ok, since you said you’re using RenderMonkey to run these shaders, a couple of things: RenderMonkey only uses D3D. NVIDIA’s D3D drivers do not support pixel shaders on GF2-level hardware, since the spec for even version 1.0 pixel shaders is too much for a GF2. That’s why RenderMonkey told you that it can’t use the HAL (the hardware) device. Instead it’s using the D3D reference software rasterizer whenever you use pixel shaders. The D3D reference rasterizer does everything in software.
So no matter what you put into your pixel shader, it will always be slow. In other words: D3D pixel shaders on a GF2 won’t work. All you can do, as others have pointed out, is use the two register combiners that are available on a GF2 in OpenGL.

I figured that there would be some basic support at least. One of the D3D demos uses ps 1.0 and it too runs on the reference device. Yikes!

According to Ysayena, there is zero hardware support for NV_fragment_program. Register combiners are the way to go, but that’s old news now.

Thanks Asgard and all.


Also, does the new NV SDK contain demos of the texture shader extensions, NV_fragment_program and all the other recent extensions?

I don't want to download it and then find out it is no different than what I have.


Yeah, it has demos of all those. If you don’t want to download the whole thing, or want to look at the stuff first, check out their CVS