Problem Identifying a Performance Bottleneck

I can’t identify the bottleneck in my program. I’m using gDEBugger. The current frame rate is 3 FPS, but when I enable “Eliminate all draw commands”, it climbs to 7 or more. According to the gDEBugger User Guide, this means the bottleneck is NOT on the CPU/bus side, so it should be on the GPU side. However, the “GPU idle counter” in gDEBugger shows that the GPU is idle about 50% of the time. I’m wondering how this can happen. Would someone be willing to help me?

If you use a lot of commands that require synchronization, you could end up with results like this.
Commands such as glReadPixels, glGetTexImage, or glFinish can cause it.
For example, if you draw some polygons, then call glReadPixels and process the received image on the CPU, you get the following:

  1. CPU sends commands
  2. GPU renders image and CPU waits for GPU inside glReadPixels command
  3. CPU reads image from GPU
  4. CPU processes the image, GPU waits

Try to draw some sort of timing diagram: if you place all the operations performed by the CPU and the GPU on it, you will probably find some that could be done in a different order and therefore overlap more.
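
To make the timing-diagram idea a bit more concrete, here is a toy model of the four steps above in C++. It is only a sketch: the struct `FrameCosts`, the function names, and any millisecond values you plug in are illustrative assumptions, not measurements, and the "overlapped" case assumes an idealized pipeline where the CPU works on the previous frame while the GPU renders the current one.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-frame costs in milliseconds (illustrative only).
struct FrameCosts {
    double cpu_submit;   // step 1: CPU sends commands
    double gpu_render;   // step 2: GPU renders the image
    double readback;     // step 3: image is transferred to the CPU
    double cpu_process;  // step 4: CPU processes the image
};

// Fully serialized frame: every step waits for the previous one.
double serialized_ms(const FrameCosts& c) {
    return c.cpu_submit + c.gpu_render + c.readback + c.cpu_process;
}

// Idealized overlap: frame time is bounded by the busier processor.
// The readback is counted on both sides because both take part in it.
double overlapped_ms(const FrameCosts& c) {
    const double cpu = c.cpu_submit + c.readback + c.cpu_process;
    const double gpu = c.gpu_render + c.readback;
    return std::max(cpu, gpu);
}

// Fraction of a serialized frame during which the GPU has nothing to do.
double gpu_idle_fraction_serialized(const FrameCosts& c) {
    const double total = serialized_ms(c);
    assert(total > 0.0);
    return (total - (c.gpu_render + c.readback)) / total;
}
```

With made-up costs of 2, 10, 3 and 5 ms, the serialized frame takes 20 ms with the GPU idle for 35% of it, while the idealized overlapped frame takes 13 ms.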

Thanks for your reply. I don’t know whether it’s a synchronization problem. The features of my program are as follows:

  • It issues only drawing commands.
  • It’s using glMultiDrawArraysEXT.
  • It’s using a vertex buffer object.
  • It’s using a 1024*1024 texture.

P.S. I’m using a GeForce 6600 GT with 256 MB of memory.

So it’s not CPU-GPU synchronization.
My proposal is to add some benchmarking to your application and measure the time needed to execute each major part of the code. Always call glFinish before you start and before you end such a measurement. This will tell you how much total time the CPU and GPU spend on each part of the code and therefore narrow the problem down. You could then post some code that you think should execute faster than it does, and we can take a look at it.

Then would you please explain, or just give me an example of, using glFinish to measure the time spent by the CPU and GPU on certain parts of the code?

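   glFinish();   // wait for previously issued GL work to complete
   // start counting time
   // do GL stuff
   glFinish();   // wait for the measured GL work to complete
   // end counting time, and print it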

You will have more overhead because of the glFinish calls, but it will allow you to make a relative comparison between code sections.
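
As a slightly fuller sketch of the same pattern, the helper below wraps the measurement in a C++ function. The name `time_section_ms` and its parameters are hypothetical; the `sync` argument stands in for glFinish (in a real program you would pass something like `[]{ glFinish(); }`) so that the sketch compiles without OpenGL headers.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Times one section of code in milliseconds.
// `sync` is called before the clock starts and again before it stops;
// in an OpenGL program it would call glFinish (assumption: a current
// GL context exists on the calling thread).
double time_section_ms(const std::function<void()>& sync,
                       const std::function<void()>& section) {
    assert(sync && section);           // both callables must be set
    using clock = std::chrono::steady_clock;

    sync();                            // drain work queued before this section
    const auto start = clock::now();

    section();                         // the GL calls being measured

    sync();                            // wait until the GPU has finished them
    const auto stop = clock::now();

    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling it once per major part of the frame and printing the results gives the relative comparison described above. The absolute numbers include the extra stalls introduced by the synchronization itself.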

Hi Wilburn

If I understand you correctly (please correct me if I am wrong), you are referring to NVIDIA’s NVPerfKit “gpu_idle” counter.

The following appears in the NVPerfKit 2.1 user guide:
“gpu_idle counts the number of clock ticks that the GPU was idle since the last call. This value is automatically divided by the total number of clock ticks to give the percentage of time that the GPU was idle”

This means that “gpu_idle” is a hardware counter. Note that the graphics system also contains software parts, which are executed on the CPU: the system’s OpenGL module (OpenGL32.dll) and NVIDIA’s driver. It might be that some time is spent in these parts, leaving the GPU idle. When you press “Eliminate all draw commands”, all activities, including most of those software activities, are eliminated.
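
To put numbers on that definition, here is a small C++ sketch of the arithmetic. The function names are hypothetical and the tick counts in the example are made up. Only the 3 FPS and roughly 50% idle figures come from this thread.

```cpp
#include <cassert>

// Percentage of clock ticks during which the GPU was idle,
// following the quoted definition: idle ticks / total ticks.
double idle_percent(double idle_ticks, double total_ticks) {
    assert(total_ticks > 0.0);
    return 100.0 * idle_ticks / total_ticks;
}

// Milliseconds spent on one frame at a given frame rate.
double frame_ms(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Approximate time per frame during which the GPU is actually busy.
double gpu_busy_ms(double fps, double idle_pct) {
    return frame_ms(fps) * (1.0 - idle_pct / 100.0);
}
```

At 3 FPS a frame takes about 333 ms, so a 50% idle reading would mean the GPU is busy for only about 167 ms of each frame and waiting on something else for the rest.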

To identify the bottleneck, I would recommend the following:
a. Check that there is no driver software fallback.
For this, you can use gDEBugger’s GLExpert integration:

  • Breakpoints -> NVIDIA GLExpert settings.
  • Mark “Break on GLExpert reports”
  • Mark “Report Software Fallback messages”

b. Use the function calls statistics view to check whether you are executing too many OpenGL commands, redundant state changes, etc.
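
To illustrate what a redundant state change looks like from the application side, here is a minimal C++ sketch that tracks texture binds. `TextureBindTracker` is a hypothetical helper, not part of OpenGL or gDEBugger. It assumes the application routes every bind request through it and calls glBindTexture only when `request` returns true.

```cpp
#include <cstdint>

// Remembers the last requested texture and counts redundant requests.
class TextureBindTracker {
public:
    // Returns true when the texture differs from the current one,
    // i.e. when the caller actually needs to issue the bind.
    bool request(std::uint32_t texture) {
        if (bound_ && texture == current_) {
            ++redundant_;              // same texture again: skip the call
            return false;
        }
        current_ = texture;
        bound_ = true;
        ++issued_;
        return true;
    }

    unsigned issued() const { return issued_; }
    unsigned redundant() const { return redundant_; }

private:
    std::uint32_t current_ = 0;
    bool bound_ = false;               // texture 0 is valid, so track this separately
    unsigned issued_ = 0;
    unsigned redundant_ = 0;
};
```

A high `redundant()` count relative to `issued()` suggests the kind of wasted calls that the statistics view is meant to reveal.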

Are you using any unusual OpenGL state values or data formats?


Yaki Tebeka
The gDEBugger team

I think I understand you now, k_szczech and ZbuffeR. I’ll use it in my program. Thank you, guys.

Hi Tebeka,

I’ve just posted my reply to you. Please check it when you’re free. Thanks.