Nvidia, stencil buffers and 16bpp

I’m sure everybody knows this problem:
Using stencil buffering on a TNT2 (for example) in 16-bit color gives you a much lower frame rate than in 32-bit color mode.
From what I’ve read so far in some posts here, there is no room for a stencil buffer in 16bpp on Nvidia hardware, so it has to be emulated.
My question now is:
Is only the stencil buffer done in software, or does the whole rendering process fall back to software emulation (no hardware acceleration)?

Are there any Nvidia developers here who could enlighten me?

Simply do not use stencil in 16-bit depth-buffer mode on current NV chips. None of the 16-bit depth buffers support stencil in HW, and as you’ve seen, the frame rate will be horribly slow because rendering becomes software driven. If you enumerate all pixel formats, you may find a 24-8 depth-stencil buffer on a 16-bit color screen, but most of the NV chips I have don’t seem to offer this under OpenGL (Riva128, TNT1, GF1, GF2Go).

However, GF3 does support the 24-8 depth-stencil format with 16-bit color (565). I strongly suspect that this is a driver issue, though, as only my GF3 has the latest drivers.
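A rough sketch of the "enumerate all pixel formats" check, in plain C. The `PixelFormatInfo` struct and `find_hw_stencil_format` function are hypothetical stand-ins made up for illustration; on Windows the real values would come from calling `DescribePixelFormat` for each format index and reading the color, depth, and stencil bit counts and the generic-format flag from the `PIXELFORMATDESCRIPTOR`.

```c
#include <stddef.h>

/* Hypothetical, simplified description of one pixel format.
   This is NOT a real driver or Win32 structure. */
typedef struct {
    int color_bits;    /* e.g. 16 or 32 */
    int depth_bits;    /* e.g. 16 or 24 */
    int stencil_bits;  /* e.g. 0 or 8 */
    int accelerated;   /* nonzero if the driver reports a hardware format */
} PixelFormatInfo;

/* Returns the index of the first accelerated format that has the requested
   color depth and at least min_stencil_bits of stencil, or -1 if none. */
int find_hw_stencil_format(const PixelFormatInfo *formats, size_t count,
                           int color_bits, int min_stencil_bits)
{
    for (size_t i = 0; i < count; ++i) {
        const PixelFormatInfo *f = &formats[i];
        if (f->accelerated &&
            f->color_bits == color_bits &&
            f->stencil_bits >= min_stencil_bits) {
            return (int)i;
        }
    }
    return -1;
}
```

If this returns -1 for 16-bit color, the options left are to ask for 32-bit color or to drop the stencil effect.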

What may be best in your situation is to always test your frame rate against an acceptable minimum speed, and if you can’t achieve that over a reasonable measurement period, try a different pixel format.
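The frame-rate test could look something like the sketch below. The function names, the millisecond frame times, and the idea of keeping one measured fps value per candidate format are all assumptions for illustration, not anything a driver provides.

```c
#include <stddef.h>

/* Average frames per second over a window of frame times given in
   milliseconds. Returns 0.0 if there is no usable data. */
double average_fps(const double *frame_ms, size_t count)
{
    double total_ms = 0.0;
    for (size_t i = 0; i < count; ++i) {
        total_ms += frame_ms[i];
    }
    if (count == 0 || total_ms <= 0.0) {
        return 0.0;
    }
    return 1000.0 * (double)count / total_ms;
}

/* measured_fps[i] is the fps measured with candidate format i, listed in
   order of preference. Returns the first index that reaches min_fps,
   or -1 if no candidate is fast enough. */
int pick_format_by_fps(const double *measured_fps, size_t count,
                       double min_fps)
{
    for (size_t i = 0; i < count; ++i) {
        if (measured_fps[i] >= min_fps) {
            return (int)i;
        }
    }
    return -1;
}
```

Measuring over a window of frames rather than a single frame avoids rejecting a format because of one slow frame at startup.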

Thanx for your quick answer, but this wasn’t exactly what I wanted to know!
My question is:
Is there a way to emulate a stencil buffer in software without doing the whole rendering in software (all but the stencil hardware accelerated)?

I know, in 32bpp it is all hardware, but what happens in 16bpp when you use stencil buffering?

>>Is there a way to emulate a stencil buffer in software without doing the whole rendering in software (all but the stencil hardware accelerated)?<<

That’s what it actually does.
Perhaps you’re thinking that if you use the stencil buffer then everything will be done in software (e.g. texture mapping), but no: those will still be done in hardware, while the stencil will be done in software.

Thanx, that was exactly what I wanted to know!

> >>Is there a way to emulate a stencil buffer in software without doing the whole rendering in software (all but the stencil hardware accelerated)?<<
>
> That’s what it actually does.
> Perhaps you’re thinking that if you use the stencil buffer then everything will be done in software (e.g. texture mapping), but no: those will still be done in hardware, while the stencil will be done in software.

Are you sure about that? We ran some tests of our current title on a range of nVidia cards a few weeks ago. GeForce3 seemed to perform fine in 16-bit colour depth, but the others didn’t.

The performance drop was enormous, so I suspect it was a full software fallback. I have never heard of this special mode where only stencil is done in software. How do I activate it? Is it restricted to certain cards, and how does it fit with the invariance rules of the OpenGL pipeline? It certainly sounds interesting, if it can speed up stenciling in those modes.

Thanks!

  • Alex

Is there any reason to use a 16-bit framebuffer anymore?

> Is there any reason to use a 16-bit framebuffer anymore?

Some customers want that option, no matter what.

  • Alex

Ironically, the GeForce3 is the first Nvidia chip that supports stenciling in hardware while in 16bpp mode. But with the fastest chip on the market you don’t really need 16bpp anymore…

About the complete software fallback:
Yes, it gets extremely slow, but where I saw it, it was way too fast for pure software, IMHO.

Is there anyone here who developed these chips and could enlighten us?

It doesn’t make any sense to have a mode where everything is hardware accelerated but the stencil buffer…

I would say such a mode does not exist but I’d be happy if Matt or Cass can confirm that…

Regards.

Eric

Why would such a mode not make any sense?
Stenciling doesn’t work in 16bpp because of technical limitations, so why not emulate only stenciling then?
The other question is whether it is possible, as stenciling is a per-pixel operation…

Think about it a little… Most people must know that the T&L hardware has no way to hand transformed vertices back out once they’ve been transformed; that only makes sense. That means when you have a software stencil, the T&L hardware is automatically out. I’d also guess that the software path and the hardware path are not invariant for just about everything. That means if you want stencil to agree with the rest of the scene, it’s all software or nothing.

It doesn’t make any sense because when the card is rendering what you ask it to, it has to access the buffers really quick in order to achieve some performance.

When all the buffers are in damn-fast RAM stuck on the GPU, everything is fine… If one buffer happened to be in CPU RAM, then the info would have to transit over the bus (this is a good one!) and the card would have to wait for this info before going on…

Anyway… The card does communicate with main RAM for some things (e.g. textures), but I do not think it would be wise to do this on a per-fragment basis, which is what you would be trying to do if you had a SW stencil…
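To put a rough number on the per-fragment point, here is a back-of-the-envelope sketch in C. The function and its parameters are made up for illustration, and it only counts bytes; it ignores the latency of each round trip over the bus, which would make things worse.

```c
/* Estimated stencil traffic in bytes per second if every fragment had to
   touch a stencil buffer that lives in system RAM.
   All parameters are illustrative assumptions, not measured values:
   overdraw           = average number of fragments drawn per screen pixel
   bytes_per_fragment = stencil bytes read plus written for each fragment */
unsigned long long stencil_bus_bytes_per_second(unsigned width,
                                                unsigned height,
                                                unsigned overdraw,
                                                unsigned fps,
                                                unsigned bytes_per_fragment)
{
    return (unsigned long long)width * height * overdraw * fps *
           bytes_per_fragment;
}
```

For example, 640x480 at 60 fps with no overdraw and one byte read plus one byte written per fragment gives `stencil_bus_bytes_per_second(640, 480, 1, 60, 2)`, i.e. 36,864,000 bytes per second, and every one of those accesses would stall the card while it waits.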

In the end, it is probably possible to design a card that would do such a thing. I just doubt that any of the current ones can work like this…

Most probably, if you require a stencil buffer and it is not supported in HW, you won’t get ANY HW acceleration…

Once again, I am no HW expert so I would be glad if someone from ATI or NVIDIA (Matrox ? Just kidding…) could help…

Regards.

Eric

Not that I disagree with the last statements, but AlexH says Geforce3 performs OK. What’s with that? Is it hw accel in 16 bit mode?

V-man

If you have to do software emulation of a feature, it’s not feasible to use the hardware to do anything upstream in the pipeline.

GeForce2MX was the first NVIDIA hardware to support 16bpp color and Z32S8 at the same time. GeForce3 also supports that mode.

Note that you can get hardware acceleration when you have stencil testing disabled, but that’s not very interesting.
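As a small model of the rule described above (not actual driver logic), the decision could be written like this in C. The `RenderState` struct and `uses_hardware_path` function are hypothetical names invented for the sketch.

```c
/* Hypothetical description of a rendering setup; not a driver API. */
typedef struct {
    int hw_stencil_for_format; /* nonzero if the chosen pixel format has
                                  a hardware stencil buffer */
    int stencil_test_enabled;  /* nonzero if the application has enabled
                                  the stencil test */
} RenderState;

/* Returns 1 if rendering can stay on the hardware path, 0 if the whole
   pipeline falls back to software. There is no mixed mode: once stencil
   has to be emulated, nothing upstream can use the hardware either. */
int uses_hardware_path(const RenderState *s)
{
    if (!s->stencil_test_enabled) {
        return 1; /* stencil unused, so nothing needs emulating */
    }
    return s->hw_stencil_for_format != 0;
}
```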

Thanks -
Cass

32bits for the z-buffer plus 8 for the stencil?!

Oops. That’s Z24S8.

Cass