glDrawPixels+Shaders = slow?

I encountered a problem on a GeForce 8800 GTS (ForceWare 174.74). When a GLSL shader is active, glDrawPixels takes ages to complete (a couple of seconds!).

Can anyone confirm this? Is this to be expected or is it a driver bug? I did not see this on an ATI card (yet). I can easily disable the shader while glDrawPixels() is being called (which fixes the problem). I just wonder…

I’ve noticed the same thing when using glDrawPixels with not-directly-supported formats.

GL_UNSIGNED_BYTE and GL_BGRA should not be unusual :)

I had the same problem, so I just abandoned it. I simply tried to swizzle inside the fragment shader. I had this problem on the GeForce 7800 GTX as well.

I gave up on it, too. I think NVIDIA didn’t bother implementing this functionality in hardware since it’s seldom used anymore.

Isn’t there a good use for it with pixel buffer objects? I remember a good use of glReadPixels, so it may be the same with glDrawPixels… with a pixel buffer as the source? (I’m really not sure about that.)

I guess I could see it being useful for drawing something pixel-perfect without messing with transformations… hmm, it’s a shame it doesn’t work in HW.

I had the same problem, so I just abandoned it. I simply tried to swizzle inside the fragment shader. I had this problem on the GeForce 7800 GTX as well.

What same problem? glDrawPixels+Shaders or glDrawPixels+BGRA?
I can choose to use RGBA instead, if it helps (will try tomorrow).
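
If the pixels already sit in client memory as BGRA bytes, switching to RGBA could also be done on the CPU before the glDrawPixels call. A minimal sketch, assuming tightly packed 8-bit channels; `bgra_to_rgba_inplace` is a made-up helper name, not a GL function, and nothing in this thread shows that the format is actually what triggers the slow path:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: swap the B and R bytes of every 4-byte pixel in place.
 * Assumes tightly packed 8-bit BGRA data (4 bytes per pixel, no row padding). */
void bgra_to_rgba_inplace(uint8_t *pixels, size_t pixel_count)
{
    assert(pixels != NULL || pixel_count == 0);
    for (size_t i = 0; i < pixel_count; ++i) {
        uint8_t *p = pixels + 4 * i;
        uint8_t blue = p[0];
        p[0] = p[2];   /* red moves to byte 0 */
        p[2] = blue;   /* blue moves to byte 2 */
        /* green (p[1]) and alpha (p[3]) stay where they are */
    }
}
```

When the data comes straight from glReadPixels, simply requesting GL_RGBA there avoids the extra pass.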

Maybe I need to clarify why I’m using glDrawPixels at all. It’s for transferring an offscreen pbuffer image into the main framebuffer with the fewest possible extensions. I implemented it with FBOs in the first place, but that didn’t work out well on some customers’ machines :-/ Also, this function doesn’t need to be super-fast. It just must not take seconds instead of milliseconds ;)

I guess I could see it being useful for drawing something pixel-perfect without messing with transformations… hmm, it’s a shame it doesn’t work in HW.

When using glDrawPixels you end up messing with many more states than you might imagine:

  • set up an orthographic projection
  • identity modelview
  • disable all clipping planes
  • disable any shader (?!)
  • set appropriate depthfunc
  • set proper depth/color/stencil masks
  • glRasterPos()
  • finally glDrawPixels()
  • reset former state
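
To illustrate why the projection, modelview and glRasterPos() items in the list belong together, here is the arithmetic worked out in plain C (no GL calls are made). It assumes glOrtho(0, w, 0, h, -1, 1), an identity modelview and glViewport(0, 0, w, h); `raster_pos_to_window` and `WindowPos` are hypothetical names used only for this sketch:

```c
#include <assert.h>

/* Hypothetical type and helper, not part of OpenGL. */
typedef struct { double x, y; } WindowPos;

/* Reproduces what the fixed-function pipeline does to the x/y of a raster
 * position under the assumed state:
 *   projection = glOrtho(0, vp_w, 0, vp_h, -1, 1)
 *   modelview  = identity
 *   viewport   = glViewport(0, 0, vp_w, vp_h)                            */
WindowPos raster_pos_to_window(double x, double y, int vp_w, int vp_h)
{
    assert(vp_w > 0 && vp_h > 0);

    const double l = 0.0, r = (double)vp_w;
    const double b = 0.0, t = (double)vp_h;

    /* x and y rows of the glOrtho matrix applied to (x, y, z, 1) */
    double ndc_x = (2.0 / (r - l)) * x - (r + l) / (r - l);
    double ndc_y = (2.0 / (t - b)) * y - (t + b) / (t - b);

    /* viewport transform from normalized device coords to window coords */
    WindowPos p;
    p.x = (ndc_x + 1.0) * (vp_w / 2.0);
    p.y = (ndc_y + 1.0) * (vp_h / 2.0);
    return p;
}
```

Under these assumptions the coordinates passed to glRasterPos() come out unchanged as window coordinates, which is the whole point of the setup; with any other projection or modelview the image would land somewhere else.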

Another advantage of glReadPixels()/glDrawPixels() over glBlitFramebufferEXT() is that MSAA resolves apparently come out much better with glReadPixels() than with glBlitFramebufferEXT().

Did you try a texture-mapped, screen-aligned quad?

Oh, that’s good to know. Did you set any OpenGL hints for that?

Did you try a texture-mapped, screen-aligned quad?

OpenGL doesn’t support render-to-multisampled-texture (I need AA).

Oh, that’s good to know. Did you set any OpenGL hints for that?

No, that’s just my personal experience on NVIDIA cards below the GF8800. Somehow, on GF6/7-class hardware, the MSAA resolve blit makes the image very blurry and even offsets it a bit… very weird! It can also fail completely (as already stated in some other threads). glReadPixels works reliably on all cards.

Hmm, I just went over the spec again:

Is ReadPixels (or CopyPixels or CopyTexImage) permitted when
bound to a multisample framebuffer object?

     RESOLVED, no

     Resolved by consensus, prior to May 9, 2005

     No, those operations will produce INVALID_OPERATION.  To read
     the contents of a multisample framebuffer, it must first be
     "downsampled" into a non-multisample destination, then read
     from there.  For downsample, see EXT_framebuffer_blit.
     The concern is fallback due to out of memory conditions.  Even
     if no memory is available to allocate a temporary buffer at the
     time ReadPixels is called, an implementation should be able to
     make this work by pre-allocating a small tile and doing the
     downsample in tiles, or by falling back to software to copy a
     pixel at a time.
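
For readers unsure what the "downsample" in that spec text amounts to, a rough CPU sketch of a plain box-filter resolve follows. It is an illustration only: real drivers and hardware may store and filter samples differently, the contiguous per-pixel sample layout is an assumption, and `resolve_box_filter` is a made-up name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: average all samples of each pixel into one RGBA8 pixel.
 * Assumed layout: the samples of one pixel are stored back to back,
 * 4 bytes (RGBA) per sample. 'out' must hold pixel_count * 4 bytes. */
void resolve_box_filter(const uint8_t *samples, size_t pixel_count,
                        unsigned samples_per_pixel, uint8_t *out)
{
    assert(samples != NULL && out != NULL);
    assert(samples_per_pixel > 0);

    for (size_t p = 0; p < pixel_count; ++p) {
        for (size_t c = 0; c < 4; ++c) {
            unsigned sum = 0;
            for (unsigned s = 0; s < samples_per_pixel; ++s)
                sum += samples[(p * samples_per_pixel + s) * 4 + c];
            /* rounded integer average of this channel */
            out[p * 4 + c] =
                (uint8_t)((sum + samples_per_pixel / 2) / samples_per_pixel);
        }
    }
}
```

The tiled fallback the spec mentions would run the same kind of loop over one small block of pixels at a time instead of over the whole image.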

I’ve also had some problems blitting from an MSAA FBO to the main render window, but blitting from an MSAA FBO to a single-sample FBO seemed to work just fine…

I do not use glReadPixels on a multisampled FBO. I have two code paths for doing the job:

  1. FBO + glBlitFramebufferEXT (including a separate path with intermediate downsampling)
  2. PBuffer + glReadPixels/glDrawPixels (downsampling inherent)

I’d really like to use FBO only, but first there must be 100% working drivers and then the customers must be willing to update their drivers.