Improving Performance

Rodrix · March 25, 2006, 1:33am

Hi all!
How are you doing!?

I am trying to boost the speed of my game; let’s see if anyone can give me some tips…

Let me tell you about my program first…
My program consists of an open large environment with a single complex object in the middle that is animated. (balls in a pool table).

Environment consists of a plane floor, a mountain box, a skybox, three scrolling cloud layers, a fog scrolling plane, and some billboarded trees (static for now, since they are very far away, so illusion is still kept). Skybox Textures are 1024x1024 24-bit and MountainBox Textures are 1024x1024 32-bit.

Performance is:
-24fps| Any camera position from ground standing position that involves having the Complex Object in view.

-36fps| ANY camera position from ground standing position that does NOT involve having the Complex Object in view (That is just the view of the mountains and sky along the fog layer). Billboard trees don’t affect fps at all.

-18fps| Some (not all) camera positions that involve zooming in on the Complex Object very close to the viewer, so that only the Complex Object is seen. The object is actually large, so you don’t really have to zoom in that much, but rather get close to it.

-60fps| Top view, 3 cloud layers moving and skybox.

It looks like the mountainbox is taking a lot of fps, as there is like a 15 fps difference between (skyview) and (mountain & skyview).

Question 1) What can I do to improve performance? Apart from back face culling, is frustum culling useful if I have only one object?! Should I divide the object into many parts!? Or should I research “Cell-based occlusion culling” or “PVS-based arbitrary geometry occlusion culling” (I just read that they exist)?

Question 2) My camera works with inverse movement, that is, the camera is always fixed at one position and the world moves in the opposite direction through model transformation matrices in a glPushMatrix/glPopMatrix nest. Should I use gluLookAt and really move the camera?! Which method is recommended for my ‘world’?

Question 3) My far and near planes are 4000 and 100, and my fov is 45. Will changing these values improve my performance?

Thank you so much all in advance!

Cheers,
Rod

H.Stony · March 25, 2006, 1:59am

Q2: I would use gluLookAt, but I don’t think it will change performance.
Q3: if your field of view is larger, your graphics card may have to rasterize more.

Try, for example, gluProject to determine if the object is in view.

Komat · March 25, 2006, 7:01am

How do you draw your geometry (immediate mode, memory arrays, display lists, vbo)?

ZbuffeR · March 25, 2006, 9:14am

When chasing bottlenecks, it is better to take into account (milli)seconds per frame, because fps does not add up linearly.

so you have :

17 ms sky box
28 ms sky box + mountain box
42 ms normal view (sky box + mountain box + complex object)
56 ms complex object filling screen
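
A minimal sketch of the conversion above, in C. The function names are made up for illustration; the fps values in the note come from the first post.

```c
/* Time spent on one frame, in milliseconds, for a given frame rate. */
double ms_per_frame(double fps)
{
    return 1000.0 / fps;
}

/* Extra time per frame, in milliseconds, caused by whatever was added
   between a faster view (fps_without) and a slower view (fps_with). */
double added_cost_ms(double fps_without, double fps_with)
{
    return ms_per_frame(fps_with) - ms_per_frame(fps_without);
}
```

For example, added_cost_ms(60.0, 36.0) is about 11 ms for a drop of 24 fps, while added_cost_ms(36.0, 24.0) is about 14 ms for a drop of only 12 fps. This is why comparing fps differences directly can be misleading.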

-> Beware: it looks like vsync is enabled, which is not good for measuring performance optimizations!

about optimizing the background :

for the mountain box, use alpha testing to discard fully transparent fragments, so there will be less alpha blending overhead.
try with a huge texture lod bias (check opengl spec) and see if perf improves; if it does, that would mean you are texture bandwidth limited.

what is your card btw ?

generic (and nv specific) tips for finding/fixing bottlenecks :
http://www.gremedy.com/help/gDEBuggerHelpFile_web/HDI%20-%20Find%20performance%20bottlenecks.htm
ftp://download.nvidia.com/developer/presentations/2004/SIGGRAPH/GPU_Performance_Tools.pdf
http://developer.nvidia.com/object/gpu_programming_guide.html

rgpc · March 25, 2006, 3:35pm

And what order do you draw your complex object, mountain box and sky box in?

Rodrix · March 26, 2006, 1:30am

Hey guys thank you so much for all your answers and for being so supportive!

I really deeply appreciate all your time! THANKS!

Originally posted by Komat:
How do you draw your geometry (immediate mode, memory arrays, display lists, vbo)?
I use Display Lists for the Complex Object, Skybox, and Mountain box. Cloud Layers are drawn on the fly, but I think I should put it on a list too (although the sky doesn’t seem to be the problem). Floor, scrolling fog layer, and billboard trees are also drawn on the fly.

(ZbuffeR) what is your card btw ?
My card is an NVIDIA GeForce FX 5500 256 MB (Ver. 6.14) . Is it too slow?!

Originally posted by ZbuffeR:
beware, looks like vsync is enabled, it is not good to measure performance optimization !

What is vSync and how should I disable it?! I am indeed using a simple home-made synchronization function myself, which corrects the objects’ movements by a factor determined by the milliseconds per frame.
The Factor takes 17 ms per frame as reference, and if the game is running at that speed then Factor=1. If a frame takes more than 17 ms then Factor>1. All object movements get multiplied by Factor (so that if frames are skipped, objects move further to make up for the missing frames).
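
A small sketch of the Factor scheme described above, in C. The names are hypothetical and this is only one way to write it; the 17 ms reference is the value from the post.

```c
/* Reference frame time from the post: 17 ms, roughly 60 fps. */
#define REFERENCE_FRAME_MS 17.0

/* Factor = 1 when a frame takes the reference time, > 1 when it takes longer. */
double movement_factor(double frame_ms)
{
    return frame_ms / REFERENCE_FRAME_MS;
}

/* Move an object by its per-reference-frame speed, scaled by Factor,
   so that slow frames cover proportionally more distance. */
double advance(double position, double speed_per_reference_frame, double frame_ms)
{
    return position + speed_per_reference_frame * movement_factor(frame_ms);
}
```

With a 34 ms frame the factor is 2, so an object moving 2 units per reference frame advances 4 units. Note that this only keeps movement speed constant; it does not synchronize with the display the way vsync does.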

(ZbuffeR) about optimizing the background :

for the mountain box, use alpha testing to discard fully transparent fragments, so there will be less alpha blending overhead.

try with a huge texture lod bias (check opengl spec), see it perf improves = would mean you are texture bandwidth limited.
Ok! I have replaced alpha blending with alpha testing for the mountain box and the billboarded trees. I got mixed results: the frame time changes depending on which side of my mountain box I am looking at, but it is much better than before!
~17 ms sky box + mountain box (some views)
~28 ms sky box + mountain box (some other views)
Can’t understand yet why (it appears it’s not related to the amount of mountain I see)

Sorry, I didn’t get that part about “the huge texture lod bias”. Should I search for those keywords in the OpenGL specification books? Maybe you are talking about this: my “Implementation Specifics” (from GLInfo) say that my max texture size is 4096x4096. I am using 1024x1024…

(Rgpc) And what order do you draw your complex object, mountain box and sky box in?
I am drawing first:
1)skybox
2)3 cloud layers (uses blending)
3)mountainbox (uses alpha-test)
4)floor
5)scrolling fog (uses blending)
6)“complex object shadow”-textured quad (uses blending)
7)trees (uses alpha test)
8)complex object

Would frustum culling help me here for optimizing the complex object filling screen view?

Thank you so much everyone, I truly appreciate all your help!!! THANKS!

I will keep you updated if I manage to do some more tests…

Cheers,
Rod

ZbuffeR · March 26, 2006, 5:39am

Sorry I didn’t get that part about “the huge texture lod bias”. Should I search for those keywords in the Opengl Specifications Books?
The extension became part of opengl 1.4 :
http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_lod_bias.txt
Go for the pdf spec, search for lod : http://www.opengl.org/documentation/specs/
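
To make the lod bias idea more concrete, here is a rough sketch in C of how a positive bias pushes sampling toward smaller mipmap levels. It assumes the texture has mipmaps and a mipmapped minification filter (otherwise the bias cannot select a smaller level) and that the unbiased level would be 0. The helper names are invented, and the level selection done by real hardware is more involved than this.

```c
/* Edge length in texels of a given mip level of a square texture. */
int mip_edge(int base_edge, int level)
{
    int edge = base_edge >> level;
    return edge > 0 ? edge : 1;   /* the smallest level is 1x1 */
}

/* Simplified model: with an unbiased level of 0, a positive bias of b
   selects level b, clamped to the smallest level that exists. */
int biased_level(float lod_bias, int max_level)
{
    int level = lod_bias > 0.0f ? (int)lod_bias : 0;
    return level < max_level ? level : max_level;
}
```

For a 1024x1024 texture (levels 0 to 10) and a bias of 4, mip_edge(1024, biased_level(4.0f, 10)) gives 64, so far fewer texels are fetched. If the frame time drops noticeably in that test, texture bandwidth is a likely bottleneck; the blurry result is only meant as a diagnostic.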

The way you draw geometry does not look like a bottleneck, especially if your cloud layers are simple quads. Maybe the billboard trees.

About vsync: vertical synchronisation means that a rendered frame is displayed only once the display has completely drawn the current frame. So you end up with only “submultiples” (is that the English term?) of the display refresh rate.
Under Windows, call wglSwapIntervalEXT(0) (from the WGL_EXT_swap_control extension) to disable this and draw as fast as possible. Or you can turn it off in the video driver settings (“always off”).
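
The “submultiples” effect can be sketched in C as follows. This is a simplified model that assumes double buffering and a constant render time per frame; the function name is made up for illustration.

```c
/* Frame rate seen on screen when vsync is on.
   refresh_hz: display refresh rate; render_ms: time needed to render one frame. */
double vsync_fps(double refresh_hz, double render_ms)
{
    double refresh_ms = 1000.0 / refresh_hz;
    int intervals = 1;  /* a frame always occupies at least one refresh */

    /* The finished frame is shown at the first refresh after rendering ends. */
    while (intervals * refresh_ms < render_ms)
        intervals++;

    return refresh_hz / intervals;
}
```

At 60 Hz, a frame that takes 20 ms to render is shown at 30 fps, and one that takes 40 ms at 20 fps. On an assumed 72 Hz display the possible rates would be 72, 36, 24, 18 and so on, so measurements that sit exactly on such steps hint that vsync is on.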

Try to render roughly front to back, to benefit from early z-culling (often present on modern hardware). That way background parts that would have been obscured by the complex object are not drawn.
For blended parts, keep the current order.
so you would have something like : 8 - 7 - 4 - 3 - 1 ; 2 - 5 - 6
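
A sketch in C of the ordering suggested above: opaque and alpha-tested items sorted front to back, followed by the blended items in their original order. The struct, the function names and the distance field are assumptions for illustration, not code from the thread.

```c
#include <stdlib.h>

typedef struct {
    int   id;        /* position in the original draw list (1..8) */
    float distance;  /* hypothetical distance from the camera */
    int   blended;   /* 1 if drawn with blending, 0 if opaque or alpha-tested */
} DrawItem;

static int compare_items(const void *a, const void *b)
{
    const DrawItem *x = (const DrawItem *)a;
    const DrawItem *y = (const DrawItem *)b;

    /* Non-blended items come first. */
    if (x->blended != y->blended)
        return x->blended - y->blended;

    /* Non-blended items: nearest first, so early z-culling can reject
       fragments hidden behind them. */
    if (!x->blended) {
        if (x->distance < y->distance) return -1;
        if (x->distance > y->distance) return 1;
    }

    /* Blended items (and ties): keep the original order. */
    return x->id - y->id;
}

void sort_draw_list(DrawItem *items, size_t count)
{
    qsort(items, count, sizeof items[0], compare_items);
}
```

With items 2, 5 and 6 marked as blended and the complex object nearest to the camera, this gives the order 8 - 7 - 4 - 3 - 1 ; 2 - 5 - 6 for suitable distances.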

Thank you so much everyone I truly appreciate all your help!!! THANKS!
Is it possible to be rewarded with at least some screenshots ?

Komat · March 26, 2006, 6:59am

Would frustum culling help me here for optimizing the complex object filling screen view?

If you cull at the level of entire objects, it will help if you have many objects and many of them are not visible in each view. For a single complex object it will help only when the object is not visible at all. Attempts at culling on a per-polygon level will likely make things worse because, unless you are limited by vertex processing power or vertex upload rate, the card is really effective at throwing away invisible polygons.
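
Object-level culling of this kind is often done with a bounding sphere tested against the six frustum planes. A minimal sketch in C, assuming the planes have already been extracted, have normalized normals and point toward the inside of the frustum; the type and function names are hypothetical.

```c
/* Plane a*x + b*y + c*z + d = 0 with a unit normal (a, b, c)
   pointing toward the inside of the frustum. */
typedef struct { float a, b, c, d; } Plane;

/* Bounding sphere around a whole object. */
typedef struct { float x, y, z, radius; } Sphere;

/* Returns 0 if the sphere is completely outside at least one plane
   (the object can be skipped), 1 if it may be visible. */
int sphere_in_frustum(const Sphere *s, const Plane planes[6])
{
    int i;
    for (i = 0; i < 6; i++) {
        float dist = planes[i].a * s->x + planes[i].b * s->y
                   + planes[i].c * s->z + planes[i].d;
        if (dist < -s->radius)
            return 0;
    }
    return 1;
}
```

This is one test per object per frame, which is cheap, but as noted above it only pays off when whole objects fall outside the view.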

There is one issue with current nVidia cards and alpha test. Use of alpha testing disables early z-culling until the next clear (maybe not in all cases, but it is likely if z-buffer writes are enabled), so try to draw all opaque geometry that is not alpha tested before the alpha-tested geometry.

nVidia has a handy tool called NVPerfKit which can be used to determine performance bottlenecks on the card. Unfortunately it requires at least a GF6600 card, if I remember correctly.

inaam · March 26, 2006, 7:54am

well, something i want to tell u, and that i’ve noticed, is that opengl is a very slow API; games made in opengl are much slower than those made in DirectX.

opengl does not support changing the resolution of the screen… that’s why it is very slow.

if u want to increase the performance of your game then make it again in DirectX and feel the difference…

jide · March 26, 2006, 9:52am

Originally posted by inaam:
well, something i want to tell u, and that i’ve noticed, is that opengl is a very slow API; games made in opengl are much slower than those made in DirectX.

opengl does not support changing the resolution of the screen… that’s why it is very slow.

if u want to increase the performance of your game then make it again in DirectX and feel the difference…
Well… ehh… no.

It’s been a while since I last used Windows, but at that time performance was almost the same.

Resolution changes are not handled by GL but by the window system (Windows, X…).

One thing that could make your GL programs slow might be that you didn’t have the proper drivers. Another might be that the programs were badly implemented.

Rodrix · March 30, 2006, 8:53pm

Thanks guys for everything!
I am now a bit stuck with college work,
so as soon as I get more time I’ll try all your suggestions, and eventually post some screenshots!

Cheers!
Rod