Tests I can use before the fragment program

from looking at the spec, it seems per-pixel fragment programs run before the alpha, stencil, depth, scissor tests etc.
basically, since my fragment programs are pretty expensive (less than 5 fps for a 640x480 window), I'm wondering if there's any way I can bypass the fragment program if a certain condition is met, e.g. only run it on pixels whose framebuffer alphas are greater than 0.5.

though I believe what I want isn't possible,
my next best bet is to discard the fragment as soon as possible in the shader.
what's the best way of doing this?

FWIW, I would have assumed this, stuck right at the start of the shader:

```glsl
if ( a > b )
    discard;
```

BUT on my gffx5200 with the 66.00 drivers that runs a lot slower than wrapping the whole shader body in the opposite test:

```glsl
if ( a < b )
{
    // ... entire shader body here ...
}
```

cheers zed

Conditionals are not cheap. More importantly, on most cards, discard does not actually stop fragment processing; it merely says that the results will be ignored.

What you’re wanting to do cannot be done.

Depending on what you're trying to achieve, it may be worthwhile doing a cheap masking pass first, and then applying the more expensive pass only inside the mask.

cheers, I was afraid of that.
I was wondering how rigorously drivers stick to the spec, e.g. by rights the order of the tests is

1/ expensive fragment shader
2/ scissor test

Now, from my understanding the fragment shader can't affect the scissor test and vice versa, but since the scissor test is probably 50x quicker than the fragment shader, I'm betting a lot of drivers swap the order of operations?

*note: the scissor test is of bugger-all use to me in my case, I'm just using it as an example :slight_smile:

The following might be utterly useless, but anyway:

If possible, “discard” as much as you can in the vertex program. There might be no clear-cut solution for your problem, but anything from clip planes, distance tests and vertex scaling to simply tagging pixels could be useful.
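As one concrete illustration of the clip-plane idea, a fixed-function user clip plane rejects geometry before any fragments are generated, so the expensive fragment program never runs on the clipped parts. This is only a sketch: `lightRange` and the plane orientation are made-up values, and the plane equation is specified in eye space (it is transformed by the current modelview matrix at the time of the call).

```c
/* Keep only geometry with eye-space z >= -lightRange; triangles beyond
 * the light's reach are clipped and never reach the fragment program. */
GLdouble plane[4] = { 0.0, 0.0, 1.0, lightRange };
glClipPlane(GL_CLIP_PLANE0, plane);   /* transformed by the modelview */
glEnable(GL_CLIP_PLANE0);
/* ... draw the lit geometry with the expensive shader bound ... */
glDisable(GL_CLIP_PLANE0);
```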

Originally posted by zed:
from looking at the spec, it seems per-pixel fragment programs run before the alpha, stencil, depth, scissor tests etc.
basically, since my fragment programs are pretty expensive (less than 5 fps for a 640x480 window), I'm wondering if there's any way I can bypass the fragment program if a certain condition is met, e.g. only run it on pixels whose framebuffer alphas are greater than 0.5.

Since stencil can't be affected by the fragment program, it can be done before.

If you don't modify the depth, then here too you should get early z-reject and other such optimizations.

Ditto with the scissor test.

For alpha, I guess they will always do it after, because you are expected to write to color.

Less than 5 fps for a 640x480 window?

If you write very long shaders, you may become shader-limited. Shorten it and do some tests.

>>If possible, “discard” as much as you can in the vertex program. There might be no clear-cut solution for your problem, but anything from clip planes, distance tests and vertex scaling to simply tagging pixels could be useful.>>

How do I discard them in the vertex program? I.e., I was under the impression that fragments always get passed on to the pixel shader, and from there they can only be tested.

>>Since stencil can't be affected by the fragment program, it can be done before.
If you don't modify the depth, then here too you should get early z-reject and other such optimizations.
Ditto with the scissor test>>

Hmm, so you're saying that drivers don't obey the pipeline to the letter (which is good news).

>>less than 5fps for a 640x480 window?>>

The shader does quite a bit: lighting, shadows, parallax mapping.
FWIW, I've just switched over to calculating attenuation linearly, which lets me say more accurately where a light's influence ends, enabling more accurate pixel rejection.
cheers

The stencil, scissor and depth tests can be hoisted by the card to before the shader (as long as you don't write depth in the fragment shader).

How expensive is it to just calculate your alpha contribution? You might want to do a pass that writes only depth information, using the alpha test while calculating your alpha; then in the next pass you don't calculate alpha, but set the depth test function to EQUAL and trust the early Z test to save you.

Sadly, though, early Z doesn't always work for EQUAL on all vendors' hardware, so you may not see a big improvement. Another approach is to put the alpha into stencil, again using a cheap first pass.
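A minimal sketch of that two-pass idea in OpenGL state terms. This assumes a cheap alpha-only shader is bound for the first pass and the expensive shader for the second; `drawScene` is a placeholder, and the shader-binding calls are elided:

```c
/* Pass 1: depth-only. The cheap shader computes alpha; the alpha test
 * rejects fragments so only "interesting" pixels lay down depth.     */
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
glDepthMask(GL_TRUE);
glDepthFunc(GL_LESS);
glEnable(GL_ALPHA_TEST);
glAlphaFunc(GL_GREATER, 0.5f);
drawScene();                      /* cheap alpha-only shader bound */

/* Pass 2: the expensive shader, but only fragments whose depth equals
 * what pass 1 wrote survive the (hopefully early) depth test.        */
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glDepthMask(GL_FALSE);
glDepthFunc(GL_EQUAL);
glDisable(GL_ALPHA_TEST);
drawScene();                      /* expensive shader bound */
```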

gffx5200
That explains the performance right there. The FX 5200 cards are not what you would call “performance friendly”. They're budget cards, and while they have all of the features of their brethren, they have none of the performance. I wouldn't expect a glslang shader of any real length to run at a decent speed on that card.

OK, I removed the (*) z-correct bump mapping stuff, laid down a depth pass first, and then used EQUAL with z-writes off; this is approx 10% quicker (compared to not doing a depth pass first).
Sorry, I feel as if I've gotten sidetracked from the main issue, which is how to reject a pixel quickly in a fragment program.
E.g.:

Current state with the 66.00 drivers:

A/ discard doesn't help
B/ return is not supported (yet)
C/ if (…) doesn't help

Is this due to the hardware, the drivers, or what?

PS. I'm aware the 5200 isn't the quickest card (I had the choice between this and a 5700 but chose the 5200 because it makes me work harder; plus I don't play games full stop, so speed isn't important).

cheers

(*) No great loss, as it really doesn't add much visually to the scene; am I the only one who thinks this?
Then again, if I could just add depth mapping in on the z-correct bump maps... not the easiest problem :slight_smile:

A rough front-to-back rendering order of your objects could also help a little bit, without the need to add the additional z-only pass…
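A minimal sketch of such a rough front-to-back sort; the `Object` layout and function names here are made up for illustration. Drawing nearer objects first lets the depth test (ideally early Z) reject occluded pixels before the expensive fragment program runs on them:

```c
#include <stdlib.h>

/* Sort opaque objects front-to-back by squared distance to the eye so
 * that farther objects fail the depth test and the expensive fragment
 * program never runs on their occluded pixels. */
typedef struct { float x, y, z; } Object;   /* object centre */

static float g_eye[3];                      /* eye position for qsort */

static float dist2(const Object *o)
{
    float dx = o->x - g_eye[0];
    float dy = o->y - g_eye[1];
    float dz = o->z - g_eye[2];
    return dx * dx + dy * dy + dz * dz;
}

static int cmp_front_to_back(const void *a, const void *b)
{
    float da = dist2((const Object *)a);
    float db = dist2((const Object *)b);
    return (da > db) - (da < db);           /* nearest first */
}

void sort_front_to_back(Object *objs, size_t n, const float eye[3])
{
    g_eye[0] = eye[0]; g_eye[1] = eye[1]; g_eye[2] = eye[2];
    qsort(objs, n, sizeof *objs, cmp_front_to_back);
}
```

An exact sort isn't necessary; even a coarse per-object ordering like this captures most of the early-Z benefit without a separate z-only pass.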

@dorbie: what hardware/driver/vendor has problems with GL_EQUAL and early z-reject? And what about GL_LEQUAL?

I’m not dorbie, but I’m assuming you meant to ask me: LEQUAL is fine on all hardware. We’re told that EQUAL reduces acceleration on at least the Radeons (I forget about the others).

It’s a hardware issue. You can’t do early out without (ab)using early z and stencil testing on anything except the GeForce 6800 series.

Originally posted by zed:
[b]from looking at the spec, it seems per-pixel fragment programs run before the alpha, stencil, depth, scissor tests etc.
basically, since my fragment programs are pretty expensive (less than 5 fps for a 640x480 window), I'm wondering if there's any way I can bypass the fragment program if a certain condition is met, e.g. only run it on pixels whose framebuffer alphas are greater than 0.5.

though I believe what I want isn't possible,
my next best bet is to discard the fragment as soon as possible in the shader.
what's the best way of doing this?

FWIW, I would have assumed this, stuck right at the start of the shader:

if ( a > b )
    discard;

BUT on my gffx5200 with the 66.00 drivers that runs a lot slower than

if ( a < b )
{
    // ... entire shader body here ...
}

cheers zed[/b]
First, one thing to note regarding the < and > operators: try to avoid using >, and use >= instead whenever possible. Likewise, avoid <= and use < instead whenever possible. This is because < maps directly to SLT and >= maps directly to SGE (which is essentially just SLT with reversed operands); other comparisons require extra instructions.

Regarding early-outing from a fragment shader, take a look at this demo:
http://www.humus.ca/?page=3D&id=50

Basically, I draw the compare function to alpha in a first pass and use the alpha test to set stencil to 1 for the selected pixels. In the next pass, early stencil rejection throws away the masked-out fragments, which saves loads of fragment processing.

In this case I'm using the light radius as the determining factor, so anything outside the light radius will be rejected, but you can use any rejection criteria.
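For reference, the state setup for that kind of stencil mask might look roughly like the following. This is a sketch under my own assumptions, not the demo's actual code; `drawGeometry` is a placeholder, and the cheap masking shader is assumed to write the rejection criterion to alpha:

```c
/* Pass 1: the cheap shader writes the compare value to alpha; the
 * alpha test plus the stencil op turn that into a 1-bit mask.       */
glClearStencil(0);
glClear(GL_STENCIL_BUFFER_BIT);
glEnable(GL_STENCIL_TEST);
glStencilFunc(GL_ALWAYS, 1, 0xFF);
glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE);  /* set 1 where alpha passes */
glEnable(GL_ALPHA_TEST);
glAlphaFunc(GL_GREATER, 0.5f);
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
drawGeometry();                   /* cheap masking shader bound */

/* Pass 2: the expensive shader runs only where stencil == 1; early
 * stencil rejection discards the rest before shading.               */
glDisable(GL_ALPHA_TEST);
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glStencilFunc(GL_EQUAL, 1, 0xFF);
glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);
drawGeometry();                   /* expensive shader bound */
```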

Cheers Humus. Yeah, from further research it seems if (…) statements (unless they test constant bools) are useless for early-out on GeForce FX cards. I'm going to have to implement something like you suggest on your site, involving extra passes :frowning:

>>If possible, “discard” as much as you can in the vertex program.>>

Oh yeah, to clarify: this isn't possible, is it? I.e., no matter what happens in the vertex program, the result will always get passed to the pixel shader.

From the documents, discard is for fragment shaders only. The compiler should flag an error if you use it in a vertex shader.

I don’t see how a video card would handle a “discard” on the vertex processor anyway.
It would be like killing a polygon, plus every other polygon that shares that vertex.

Ah, something for future hardware. Discard on vertices to perform polygon reduction in realtime. :smiley: :wink:
