Performance of glReadPixels() question


I’m developing an application that uses glReadPixels(). I was wondering if there is any information available about how the function performs with respect to different sized readbacks. For example, how much faster will a 10x10 readback be, compared with a 100x100 readback? Does it make any difference?



glReadPixels can slow down your app in two different ways:

  • it breaks GPU/CPU parallelism: the driver must wait for the GPU to complete all of its pending commands before the pixels can be read, much like glFinish().
  • the data throughput needed for the actual pixel data: even 100x100 RGBA (40 KB) should not be noticeable, though.

So you will probably not see much difference between 10x10 and 100x100.
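To put those sizes in perspective, here is a tiny plain-C helper (no GL context needed; the function name is made up for illustration, it is not part of the GL API) that computes the raw transfer size of a readback:

```c
#include <stddef.h>

/* Raw size in bytes of a width x height readback at the given bytes
   per pixel (4 for GL_RGBA with GL_UNSIGNED_BYTE). Hypothetical
   helper for illustration only. */
size_t readback_bytes(int width, int height, int bytes_per_pixel)
{
    return (size_t)width * (size_t)height * (size_t)bytes_per_pixel;
}
```

readback_bytes(10, 10, 4) is 400 bytes and readback_bytes(100, 100, 4) is 40000 bytes; both are small next to the cost of the pipeline stall itself.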

First of all, consider using the PBO (pixel buffer object) extension for readbacks.

To your actual question: a 10x10 readback may be faster than a 100x100 one, but the only way to find out is to benchmark it yourself. The second case transfers 100 times more data than the first, but that does not mean it will be 100 times slower, since there is some fixed per-call driver overhead. The actual result will depend on the implementation.

Still, with modern cards and up-to-date drivers a 100x100 readback should be very fast.
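The PBO path mentioned above can be sketched roughly like this (a sketch only, assuming a current context with GL 2.1+ or ARB_pixel_buffer_object; depending on your platform you may also need <GL/glext.h> or a loader like GLEW, and error checking is omitted):

```c
#include <GL/gl.h>

/* Asynchronous readback via a pixel pack buffer: with a PBO bound to
   GL_PIXEL_PACK_BUFFER, glReadPixels returns without stalling and the
   copy happens on the GPU side; mapping the buffer later (ideally a
   frame or more afterwards) fetches the data without a full sync. */
void readback_with_pbo(int width, int height)
{
    GLuint pbo;
    GLsizeiptr size = (GLsizeiptr)width * height * 4; /* RGBA8 */

    glGenBuffers(1, &pbo);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    glBufferData(GL_PIXEL_PACK_BUFFER, size, NULL, GL_STREAM_READ);

    /* With a pack PBO bound, the last argument is an offset into the
       buffer, not a client memory pointer. */
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, (void *)0);

    /* Later, ideally after doing other work, map to reach the pixels. */
    const void *pixels = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
    if (pixels) {
        /* ... use pixels ... */
        glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
    }

    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
    glDeleteBuffers(1, &pbo);
}
```

In practice you would use two PBOs in a ping-pong fashion, reading into one while mapping the other, so the CPU never waits on the frame it just requested.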

Edit: ZbuffeR was faster :slight_smile:

Readback speed can also depend on the color channel ordering. If the implementation has to touch the data to reorder it from its native ordering to the user-defined ordering, the readback will be slower than if it stays in the native ordering. As said above, it’s rather implementation dependent.
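One way to stay on the native path is to ask the driver which format it prefers. A sketch, assuming OpenGL ES or desktop GL 4.1+ where the GL_IMPLEMENTATION_COLOR_READ_FORMAT/TYPE queries are available (on older desktop GL you would instead benchmark GL_BGRA against GL_RGBA yourself):

```c
#include <GL/gl.h>

/* Query the driver's preferred readback format/type, then read with
   them so no per-pixel reordering is needed. Assumes a current GL
   context and that `pixels` is large enough for the result. */
void readback_native_order(int width, int height, void *pixels)
{
    GLint fmt = 0, type = 0;
    glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_FORMAT, &fmt);
    glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_TYPE, &type);
    glReadPixels(0, 0, width, height, (GLenum)fmt, (GLenum)type, pixels);
}
```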

There is a demo on pixel buffer objects on NVIDIA’s developer site. Search there
for “PBO Texture Performance”.