Question about frustum culling

When performing frustum culling of points against the six extracted frustum planes, what space do the points have to be in? World space (object space for static objects), or view space?


Generally you construct the six planes out of the view-projection matrix, so the points should be in world space.
(The idea is to avoid having to transform those points/spheres/bounding boxes.)

It’s totally up to you. There is no right answer. With infinite precision arithmetic, they’re all mathematically equivalent. Though you may find it is more efficient to do culling in one space rather than another.

Culling in eye space (sometimes incorrectly called viewing space) can often be a better option. For instance, in cases where your “world space” is huge (i.e. requires double-precision or larger arithmetic to represent coordinates exactly), you’d rather not cull with doubles. If you cull in eye space, you don’t have to.

Culling in view-space is called the “radar” approach. Spheres are sometimes culled in view space, as the frustum plane equations are simpler there. Also, culling in view space means you don’t need to transform the frustum planes into world space (even though this is not usually a problem).

These days, you’ll often need occlusion culling more than frustum culling.


So in your engine, you throw all 360 degrees around the eyepoint down the pipe?

Seems this would be limited to pretty simple scenes, or odd cases where most/all of the scene objects are already in front of you.

Not all occlusion culling techniques are the same. If you use GL occlusion queries at runtime, you don’t need to build a PVS beforehand. I find it solves most problems for me; frustum culling gives only moderate gains in this case.

BTW: Photon did you check out my simple cull algorithm? It actually works :slight_smile:

Interesting, how many objects per frame is this? Any hierarchy? Frustum cull with spheres or octree nodes is basically free, so I can’t imagine a faster approach :slight_smile: .

Thanks everyone for your answers, and sorry for the late reply.

I still don’t have hierarchical culling in my engine, and I plan to use frustum culling for light culling in my deferred shader.

I didn’t say that I don’t do frustum culling; I do. But if I turn it off, the fps does not drop much, sometimes not at all. The card is an NVIDIA 8600. The scene is an outdoor scene, with geometry stuffed into AABBs.

Example from a specific viewpoint:

184248 triangles, without frustum cull
177016 triangles, with frustum cull

I get 18 fps in both instances. I don’t know how my card does its magic.

EDIT: I didn’t mention that I transform (project, if you like) the frustum AABB into the tile plane first and render only the tiles within the transformed frustum AABB. This might be viewed as a frustum cull in itself, though no tests are performed. Even with ~700000 triangles the fps drops to only 17, however.

View frustum culling is very important when rendering large outdoor scenes. I experienced a huge speedup after implementing an optimized radar approach on the CPU (it is specific to my terrain algorithm and cannot be generalized, because it requires a ring-based space subdivision, and furthermore, each ring has to be divided into sectors; in my case, quadratic blocks). The following numbers show how important view frustum culling is to my algorithm (the test scene consists of more than 1.7M triangles organized into more than 900 VBOs):

Without VFC

per frame CPU time: ~20ms
per frame GPU time: ~20ms (~50 fps)/weak GPU

With VFC looking toward horizon

per frame CPU time: ~6.5ms (CPU utilization reduced to 1/3)
per frame GPU time: ~14.5ms (~69 fps)/weak GPU

With VFC looking directly to the ground

per frame CPU time: ~0.11ms (CPU is idle)
per frame GPU time: ~6.2ms (~161 fps)/weak GPU

[For the tests I have used my laptop with 8600M GT]

Comparing the first and the second results, we can deduce that the GPU does some internal culling: although everything is sent down the pipeline without VFC, the speed drop is not 3x, but only about 38%. Of course, there are also some state changes and transformations that do not depend on the number of VBOs drawn. In the test example, the number of state changes is exactly the same in the first and second tests. Because of the many different parameters involved, the efficiency of the internal culling cannot be easily estimated, but it cannot replace explicit culling.

Nevertheless, culling relieves the CPU by reducing the number of API calls (as we all know, the driver executes on the CPU). In the worst-case scenario above, drawing with VFC is about 3 times faster than without it.

A good stat to cite with your benchmarks is the FOV you used.

Obviously for games that run with huge FOV (e.g. 160 deg), the max gain is less significant (maybe 2X or a little more for a horizon view).

But for those that run with one in the more reasonable 40-60 deg range, the horizon view cull speed-up can be huge. Theory says: 88% (40 deg FOV) to 83% (60 deg FOV) reduction in the amount of “stuff” you send down the pipe in draw and that you have to “touch” in CPU memory in cull.

This of course presumes you use some sort of log(N) spatial acceleration data structure for culling such as a BVH or spatial subdivision.

I’m sorry, I forgot to state the FOV. :frowning:
Well, this application tries to mimic the real lenses built into Axis 233D cameras (F1.4-4.2, focal length 3.4-110mm). Tests are done with a vertical FOV of 40.5 degrees (but because of my monitor’s wide screen, the horizontal FOV is about 76). Because of the very rough (but incredibly fast) approximation with spheres, the speed gain is about 3x, which matches the number of drawn objects/blocks.

When an FOV of 40-60 degrees is mentioned, it usually means the vertical FOV. The horizontal FOV, due to widescreen monitors and additional toolbars and status bars in the application, is usually wider. Also, culling algorithms are not very precise: the faster the algorithm, the less precise it is. So a speedup of about 3x while looking toward the horizon is pretty realistic.

Depends on the application I suppose. I was talking horizontal. And for the case of multi-monitor/combined video, then the 40-60 deg horizontal heuristic is sometimes significantly greater than reality.

So of course we each have to tune carefully to match our application’s use cases.

Sure! :wink:

Thanks for the clarification! Soon we will try to make such application with split view and multiple video-beams (projected from the back of the screens). Maybe then I’ll have more questions about the issues with it. :slight_smile:

Aleksandar, post some screenshot maybe?

Here they are:

The engine is called from the sample application. In the status bar you can see framerates.

  • Eng FPS is FPS of the engine (inverse CPU time just for drawing).
  • GPU FPS is inverse GPU time.
  • View FPS is inverse time to draw a scene + additional calculations not related to drawing itself.
  • Real FPS is inverse redraw time in idle state (without movements).

Those are screenshots of two different datasets: the famous Puget Sound, and a certain area in my country.

Maybe the screenshots are inappropriate for what you wanted to see; they are just the datasets I used for the experiment mentioned above. I also posted wireframe models so you can see the density of the grid. If you need more details, just ask. Both CPU and GPU times on my desktop machine with a GTX 470 are about 2ms for the presented scenes (which implies about 500fps), but I posted screenshots that correspond to the results given previously.