Has anyone had any experience analysing the performance of OpenGL functions?
Intel’s VTune is very useful to find CPU bottlenecks, but it would be nice to be able to do something similar with the GPU.
I have my own opengl32.dll replacement which wraps each GL function and records statistics such as per-frame call counts and an estimate of the total CPU time spent in each function. I would not want to rely on the timing figures, however.
I would guess that each video card vendor has their own analysis tool which can query the hardware directly, but I can’t imagine that ever being available publicly.
I’d be very interested to hear if anyone has ever done any work in this area.
You should realize that time-per-function is almost useless for modern OpenGL implementations. They’re very likely to just prepare/preprocess/copy your data and append it to a large asynchronous DMA FIFO, from which the card pulls commands when it’s ready. When the queue gets full, whatever command happens to be issued at that time may stall, waiting for the queue to drain.
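To make the stall effect concrete, here is a rough sketch in C of a toy simulation, not real driver code. The queue depth, the costs, and the names SimQueue and sim_issue are all made up for illustration; a real driver's queue is far larger and its behaviour is vendor-specific. It only shows how a wrapper that times each call can end up blaming a cheap call for a wait that earlier, expensive commands caused.

```c
#include <assert.h>

#define QUEUE_CAPACITY 4  /* hypothetical depth; real FIFOs are much larger */

/* Toy model of an asynchronous command queue (all names are illustrative). */
typedef struct {
    double done_at[QUEUE_CAPACITY]; /* GPU completion time of each queued command */
    int head;                       /* index of the oldest queued command */
    int count;                      /* number of commands still queued */
    double cpu_now;                 /* simulated CPU clock, in ms */
    double gpu_free;                /* time at which the GPU finishes its last command */
} SimQueue;

/* Simulates one "GL call" and returns the CPU time a per-call wrapper would
   measure for it. cpu_cost is the time spent preparing the command, gpu_cost
   is the time the GPU later needs to execute it. */
double sim_issue(SimQueue *q, double cpu_cost, double gpu_cost)
{
    assert(cpu_cost >= 0.0 && gpu_cost >= 0.0);
    double start = q->cpu_now;

    /* Retire commands the GPU has already finished. */
    while (q->count > 0 && q->done_at[q->head] <= q->cpu_now) {
        q->head = (q->head + 1) % QUEUE_CAPACITY;
        q->count--;
    }

    /* Queue full: this call stalls until the oldest command completes. */
    if (q->count == QUEUE_CAPACITY) {
        q->cpu_now = q->done_at[q->head];
        q->head = (q->head + 1) % QUEUE_CAPACITY;
        q->count--;
    }

    /* CPU-side preparation, then enqueue behind whatever the GPU is doing. */
    q->cpu_now += cpu_cost;
    double begin = (q->gpu_free > q->cpu_now) ? q->gpu_free : q->cpu_now;
    q->gpu_free = begin + gpu_cost;
    q->done_at[(q->head + q->count) % QUEUE_CAPACITY] = q->gpu_free;
    q->count++;

    return q->cpu_now - start;
}
```

With a zero-initialised SimQueue, four calls of sim_issue(&q, 0.01, 10.0) each appear to cost about 0.01 ms, while a fifth, trivially cheap call such as sim_issue(&q, 0.01, 0.01) appears to cost nearly 10 ms, because it is the one that happens to find the queue full.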
What you need to know for good graphics performance is whether you’re CPU, Fill or Transfer/Transform bound. You can find out by testing your program on different set-ups, or by doing things like calling glCullFace(GL_FRONT_AND_BACK) (with face culling enabled) to turn off fill, or making the window smaller/bigger.
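The culling experiment could be turned into a simple check along these lines. This is only a sketch: the function name classify_fill, the FillVerdict enum and the speed-up threshold are my own assumptions, and the two frame times are expected to be measured by you over whole frames (for example, timing many frames and calling glFinish before reading the clock) rather than per GL call.

```c
#include <assert.h>

/* Hypothetical verdict type for the experiment described above. */
typedef enum { NOT_FILL_BOUND = 0, FILL_BOUND = 1 } FillVerdict;

/* baseline_ms: average frame time of the normal scene.
   no_fill_ms:  average frame time with polygons culled via
                glCullFace(GL_FRONT_AND_BACK), so almost nothing is rasterised.
   min_speedup: how much faster the no-fill run must be before the result is
                called fill bound; the value is a judgment call, e.g. 1.2. */
FillVerdict classify_fill(double baseline_ms, double no_fill_ms, double min_speedup)
{
    assert(baseline_ms > 0.0 && no_fill_ms > 0.0 && min_speedup >= 1.0);

    /* If removing the fill work barely changes the frame time, the
       bottleneck lies elsewhere (CPU or transfer/transform). */
    return (baseline_ms / no_fill_ms >= min_speedup) ? FILL_BOUND : NOT_FILL_BOUND;
}
```

For example, classify_fill(16.0, 6.0, 1.2) reports FILL_BOUND, whereas classify_fill(16.0, 15.5, 1.2) reports NOT_FILL_BOUND, in which case you would go on to separate CPU from transfer/transform with a different experiment. The same function works for the window-size test if no_fill_ms is the frame time with a much smaller window.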
Ah yes, I’d not taken the queue into account.
I’ll have to have a rethink about how to test performance.