Should I turn off AA with fullscreen quads?

If you draw a triangle to cover the entire screen, I think the hw will clip it and you will end up with 2 triangles still.
I would guess that it doesn’t actually clip triangles to the scissor box. It merely culls fragments outside the box.

My main question is this: if I use a quad, I am guaranteed pixel-perfect alignment. How can I get that guarantee when I’m going to get floating-point rounding error on interpolation?

Wouldn’t the triangle have to be very large, compared to the size of the screen?
Does it matter? Surely clipping a small part of the triangle isn’t much better than clipping a lot of it. (My mention of adding an offset of one was more for safety against rounding.)

I’m not sure if there’s a “best” method. As long as it covers the entire [-1, 1] range in clip space it’s fine. Two layouts I’ve been using are:
(-1, -1), (3, -1), (-1, 3)
and
(0, 2), (3, -1), (-3, -1)
Related to the above: method A (the first layout) clips less than method B (though ultimately, does it matter?). Method A seems more natural as well, since you align a triangle edge with the screen’s edge.
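To make the comparison of the two layouts concrete, here is a minimal sketch that checks whether each triangle covers the [-1, 1] square and how much area falls outside it. The helper names (`edge`, `contains`, `covers_screen`) are made up for this illustration, and it is plain 2D geometry on the CPU, not a model of what any particular GPU does when clipping.

```python
def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def area(tri):
    """Unsigned area of a triangle given as three (x, y) tuples."""
    return abs(edge(*tri)) / 2

def contains(tri, p):
    """Inclusive point-in-triangle test, valid for either winding order."""
    e = [edge(tri[i], tri[(i + 1) % 3], p) for i in range(3)]
    return all(v >= 0 for v in e) or all(v <= 0 for v in e)

def covers_screen(tri):
    """A triangle is convex, so covering the four corners covers the square."""
    corners = [(-1, -1), (1, -1), (1, 1), (-1, 1)]
    return all(contains(tri, c) for c in corners)

LAYOUT_A = [(-1, -1), (3, -1), (-1, 3)]
LAYOUT_B = [(0, 2), (3, -1), (-3, -1)]
SCREEN_AREA = 4  # the [-1, 1] x [-1, 1] square

for name, tri in (("A", LAYOUT_A), ("B", LAYOUT_B)):
    # name, covers the screen?, triangle area, area outside the screen
    print(name, covers_screen(tri), area(tri), area(tri) - SCREEN_AREA)
```

Both layouts cover the square. Layout A has area 8 (4 units outside the screen) and layout B has area 9 (5 units outside), which matches the remark that A clips less. Note that in both layouts at least one screen corner lies exactly on a triangle edge, which is presumably why a small safety offset against rounding was suggested.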

sqrt[-1] mentioned something, though, about dividing the screen up into smaller quads (and I think I’ve heard something similar as well). Personally it doesn’t seem logical (but what would I know; OK, the card processes everything in chunks, so perhaps there’s something to it).

In my game I have (at least):

draw fullscreen quad depth pass
draw fullscreen quad horizontal bloom
draw fullscreen quad vertical bloom
draw fullscreen quad horizontal bloom
draw fullscreen quad vertical bloom

draw fullscreen quad horizontal DOF
draw fullscreen quad vertical DOF
draw fullscreen quad horizontal DOF
draw fullscreen quad vertical DOF

plus, I think, a particle buffer and another depth pass

So that’s quite a few fullscreen(*) quads I’m drawing, and improving this even by a single percent is worthwhile, since it’s a hell of a lot of pixels. (It matters more at lower resolutions; then again, the trend is toward higher resolutions, so the importance is less, though the counterpoint is that there is always some new postprocessing effect to add.)

(*) Note: fullscreen is actually 1:1, 1:4 or 1:16 sized :slight_smile: depending on the render buffer’s mapping.

BTW wizard, 3 posts in 6 years :slight_smile: great,
still trying to decipher that last one though.

(rant mode)
deleted
(/rant)

(edit) Actually, this would be a great topic for a PDF from NVIDIA or AMD: ‘how to draw a fullscreen quad’, which today is more pertinent than ever.

What’s funny is, there’s no consensus here, and it’s probably the simplest thing that a person can do in graphics.

zed: Ain’t it great. I’ve been working not posting :wink: But I promise I’ll be writing more in the future, lol.

Korval: I’m sure clipping is done in any case. Rasterizing areas outside the viewport and then discarding them would be a waste of time.

Originally posted by sqrt[-1]:
I find this interesting as in some console hardware docs, they recommend doing the fullscreen passes in a grid of quads (6x8 tiles? - not sure)

Something about not flooding the fragment pipe or something… (and most consoles use PC-like hardware)
I’m not a console guy, but I believe those tiles are screenspace points, so they are rasterized as squares instead of as two triangles. You could try implementing something similar on PC with pointsprites, but that would add some math to the shader for texture coordinate computation, so I’m not sure if that would be a gain.

Predicated Tiling .

Originally posted by V-man:
If you draw a triangle to cover the entire screen, I think the hw will clip it and you will end up with 2 triangles still.
Not unless it goes outside the guardband. It’s a bit old, but there’s a fairly good overview of how it works here:
http://developer.nvidia.com/object/Guard_Band_Clipping.html

I’m sure clipping is done in any case. Rasterizing areas outside the viewport and then discarding them would be a waste of time.
If clipping were happening, then there would not simply be one diagonal line as in the quad case; there would be many. Which would make this a totally meaningless idea from a performance standpoint.

Normally, clipping only happens if it is absolutely necessary. That is, if the polygon would break the plane of the camera.

Originally posted by Korval:
My main question is this: if I use a quad, I am guaranteed pixel-perfect alignment. How can I get that guarantee when I’m going to get floating-point rounding error on interpolation?
I really don’t think this would ever matter for anything. Not sure if you’d be “pixel-perfect aligned” with quads even. The triangle would be twice as large as the quad, so I assume at worst you lose one bit of precision.
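As a rough sanity check of the alignment question, the sketch below interpolates texture coordinates across the oversized triangle and compares them, at every pixel center, with the values a screen-sized quad would produce. Everything here is an assumption for illustration: the triangle is the (-1, -1), (3, -1), (-1, 3) layout with texcoords (0, 0), (2, 0), (0, 2), the function names are made up, and the arithmetic is double-precision on the CPU, so it says nothing definitive about a real rasterizer's interpolators.

```python
def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def interpolate(tri, attrs, p):
    """Barycentric interpolation of 2-component vertex attributes at point p."""
    a, b, c = tri
    total = edge(a, b, c)
    wa = edge(b, c, p) / total  # weight of vertex a
    wb = edge(c, a, p) / total  # weight of vertex b
    wc = edge(a, b, p) / total  # weight of vertex c
    return tuple(wa * attrs[0][i] + wb * attrs[1][i] + wc * attrs[2][i]
                 for i in range(2))

TRI = [(-1.0, -1.0), (3.0, -1.0), (-1.0, 3.0)]   # oversized triangle
TRI_UV = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]    # texcoords doubled to match

def max_uv_error(width, height):
    """Largest deviation from the quad's (x + 0.5) / size texcoords."""
    worst = 0.0
    for py in range(height):
        for px in range(width):
            u = (px + 0.5) / width
            v = (py + 0.5) / height
            ndc = (2.0 * u - 1.0, 2.0 * v - 1.0)
            iu, iv = interpolate(TRI, TRI_UV, ndc)
            worst = max(worst, abs(iu - u), abs(iv - v))
    return worst

print(max_uv_error(320, 180))
```

The setup is mathematically equivalent to the quad, so any difference comes purely from rounding. Whether that rounding is ever visible on actual hardware would have to be tested on the hardware itself.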

Well, I tried using a scissored triangle in place of a quad. It did show a very slight speedup. Thanks for the tip!

This “Predicated Tiling” reminds me of … Tiled rendering on the PowerVR-based cards … been a long time.

ZbuffeR, predicated tiling is a very old idea; you could go back to Pixel-Planes and see it implemented.

Various contemporary architectures have similar styles of framebuffer management, but it has long been understood that it is not free.

http://www.cs.unc.edu/~pxfl/

Thanks Dorbie, for the background info.

The full screen Triangle vs Quad performance seems to be a bit better known in the GPGPU community. GPUBench has a test dedicated to this. You can see that using a full screen triangle is slightly faster. Check the third graph on this page for results on 7800GTX: http://graphics.stanford.edu/projects/gpubench/results/7800GTX-7772/

I looked in console docs about the grid for fullscreen passes and it states that it can be better due to the GPU’s rasterization rules and minimizing texture cache misses. (8x1 grid seemed to be good for 1280x720)

Perhaps if someone is really keen they could write a test that cycles through a lot of different grid patterns for a given resolution to find the optimal one for different cards?

I’ll do some tests tonight.
So that’s 8 quads of (1280/8) x 720.
I’ll try dividing it up vertically as well.
Also, perhaps 4 triangles centered on the screen center is the way to go.
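For anyone wanting to try different grid patterns, here is a minimal sketch of how the tile rectangles could be generated. The function names and the (x, y, w, h) tuple layout are just assumptions for this example; drawing the tiles and timing them is left to whatever test harness you use.

```python
def grid_tiles(width, height, cols, rows):
    """Split a width x height target into cols x rows pixel rectangles.

    Returns (x, y, w, h) tuples. Integer division keeps the tiles
    gap-free even when the size is not an exact multiple of the grid.
    """
    tiles = []
    for r in range(rows):
        y0 = r * height // rows
        y1 = (r + 1) * height // rows
        for c in range(cols):
            x0 = c * width // cols
            x1 = (c + 1) * width // cols
            tiles.append((x0, y0, x1 - x0, y1 - y0))
    return tiles

def to_ndc(tile, width, height):
    """Convert a pixel rectangle to (left, bottom, right, top) in [-1, 1]."""
    x, y, w, h = tile
    return (2 * x / width - 1, 2 * y / height - 1,
            2 * (x + w) / width - 1, 2 * (y + h) / height - 1)

tiles = grid_tiles(1280, 720, 8, 1)  # the 8x1 split mentioned for 1280x720
for t in tiles:
    print(t, to_ndc(t, 1280, 720))
```

Swapping the last two arguments, `grid_tiles(1280, 720, 1, 8)`, gives the other split (8 strips of 1280 x 90), and looping over several `cols`/`rows` pairs would be one way to search for the best pattern on a given card.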

There is something to splitting the screen up into smaller areas.

using GPUBench
fpfilltest -r triangle -c1 -k 256 -n == ~1120m/pix
fpfilltest -r triangle -c1 -n == ~1090m/pix