Is this bug being worked on? Occurs on an 8800GTX and 1500M.
Yes. It will be fixed in a future driver release. Thank you for taking the time to produce a simple repro app; it helped us greatly.
This issue is caused by a problem that can occur when a downsample blit is done directly into the window. To work around the issue on the current drivers, you could modify your application to perform the downsample blit into a second single-sample FBO, and then do a 1:1 blit from the single-sample FBO to the window.
Thanks for acknowledging this. Is a 1:1 inverted blit slower than a regular 1:1 blit?
Here’s what I’ve been using as a workaround, which looks exactly like what you suggested:
GLuint drawFramebuffer = 0;
glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, drawFramebuffer );
CHECKGL;
glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, rt.fbo->_handle );
CHECKGL;
GL_CHECK_FRAMEBUFFER_STATUS( GL_FRAMEBUFFER_EXT );
CHECKGL;
GL_CHECK_FRAMEBUFFER_STATUS( GL_DRAW_FRAMEBUFFER_EXT );
CHECKGL;
GL_CHECK_FRAMEBUFFER_STATUS( GL_READ_FRAMEBUFFER_EXT );
CHECKGL;
const int srcWidth = appWindow.width;
const int srcHeight = appWindow.height;
const bool flip = GetBool("r_postProcessFlipFBBlit") ? true : false;
const float scale = GetFloat("r_postProcessScaleFBBlit");
#pragma warning( disable : 4244 )
const int dstWidth = (float)appWindow.width * scale;
const int dstHeight = (float)appWindow.height * scale;
#pragma warning( default : 4244 )
const GLenum filtering = GetBool("r_postProcessFilterFBBlit") ? GL_LINEAR : GL_NEAREST;
// Blit 1: render target -> window.
glBlitFramebufferEXT( 0, 0, srcWidth, srcHeight,
                      0, 0, dstWidth, dstHeight,
                      GL_COLOR_BUFFER_BIT, filtering );
if( flip ){
    // Blit 2: window -> g_rb1, with the source Y range reversed.
    glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, 0 );
    CHECKGL;
    glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, g_rb1.fbo->_handle );
    CHECKGL;
    glBlitFramebufferEXT( 0, srcHeight, srcWidth, 0, // reverse Y
                          0, 0, dstWidth, dstHeight,
                          GL_COLOR_BUFFER_BIT, filtering );
    // Blit 3: g_rb1 -> window.
    glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, g_rb1.fbo->_handle );
    CHECKGL;
    glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, 0 );
    CHECKGL;
    glBlitFramebufferEXT( 0, 0, srcWidth, srcHeight,
                          0, 0, dstWidth, dstHeight,
                          GL_COLOR_BUFFER_BIT, filtering );
    CHECKGL;
}
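The two things the snippet above does by hand, scaling the destination size and reversing Y, could be pulled into small helpers. This is only a sketch: BlitRect, scaled_extent, and source_rect are hypothetical names that are not part of the repro app, and rounding to the nearest pixel is an assumption (the original cast truncates).

```c
#include <assert.h>

/* Hypothetical rectangle in glBlitFramebufferEXT order: (x0, y0) to (x1, y1). */
typedef struct {
    int x0, y0, x1, y1;
} BlitRect;

/* Scale a window extent with an explicit cast, so no compiler warning has to
   be suppressed. Assumption: rounds to the nearest pixel and clamps to at
   least 1 pixel; the original snippet truncates instead. */
static int scaled_extent(int extent, float scale)
{
    assert(extent > 0 && scale > 0.0f);
    int v = (int)((float)extent * scale + 0.5f);
    return v < 1 ? 1 : v;
}

/* Source rectangle covering the whole surface. Swapping y0 and y1 requests a
   vertically inverted blit, as in the "reverse Y" call above. */
static BlitRect source_rect(int width, int height, int flip)
{
    BlitRect r;
    r.x0 = 0;
    r.x1 = width;
    r.y0 = flip ? height : 0;
    r.y1 = flip ? 0 : height;
    return r;
}
```

With these, the flipped blit would pass r.x0, r.y0, r.x1, r.y1 as its source coordinates and scaled_extent( appWindow.width, scale ) as dstWidth.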
Originally posted by CatAtWork:
Is a 1:1 inverted blit slower than a regular 1:1 blit?
As far as I know, a 1:1 inverted blit should always run at the same speed as a 1:1 non-inverted blit.
Here’s what I’ve been using as a workaround, which looks exactly like what you suggested
Almost. In spirit these have the same semantics. In practice it looks like your snippet does one more blit than I had in mind.
glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, rt.fbo->_handle );
if( flip ) {
    glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, g_rb1.fbo->_handle );
} else {
    glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, 0 );
}
glBlitFramebufferEXT( 0, 0, srcWidth, srcHeight,
                      0, 0, dstWidth, dstHeight,
                      GL_COLOR_BUFFER_BIT, filtering );
if( flip ){
    glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, g_rb1.fbo->_handle );
    glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, 0 );
    glBlitFramebufferEXT( 0, 0, srcWidth, srcHeight,
                          0, 0, dstWidth, dstHeight,
                          GL_COLOR_BUFFER_BIT, filtering );
}
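To make the difference in blit counts concrete, here is a GL-free sketch that only lists the read/draw framebuffer pairs each path issues when flip is enabled. BlitStep, plan_two_blit, and plan_three_blit are hypothetical names; the handles stand in for rt.fbo->_handle and g_rb1.fbo->_handle, and 0 is the window.

```c
#include <assert.h>
#include <stddef.h>

/* One blit, described by the framebuffers bound for reading and drawing.
   Handle 0 is the window, as in the snippets above. */
typedef struct {
    unsigned read_fbo;
    unsigned draw_fbo;
} BlitStep;

/* Suggested path: render target -> single-sample FBO -> window. */
static size_t plan_two_blit(unsigned rt_fbo, unsigned single_fbo, BlitStep out[3])
{
    assert(out != NULL);
    out[0].read_fbo = rt_fbo;     out[0].draw_fbo = single_fbo;
    out[1].read_fbo = single_fbo; out[1].draw_fbo = 0;
    return 2;
}

/* Earlier path: render target -> window, window -> single-sample FBO
   (Y reversed), single-sample FBO -> window. */
static size_t plan_three_blit(unsigned rt_fbo, unsigned single_fbo, BlitStep out[3])
{
    assert(out != NULL);
    out[0].read_fbo = rt_fbo;     out[0].draw_fbo = 0;
    out[1].read_fbo = 0;          out[1].draw_fbo = single_fbo;
    out[2].read_fbo = single_fbo; out[2].draw_fbo = 0;
    return 3;
}
```

The only blit that targets the window in the two-blit plan reads from the single-sample FBO, which is the point of the workaround; the three-blit plan still begins with a blit straight from the render target into the window.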
Oh, I see, the inversion only happens when blitting from a multisample FBO to the window, not from multisample to single-sample.
I’ve added the two-blit path, but it’s significantly slower! Looking into it now.
I’m not sure why the two-blit approach is slower than the three-blit one, but here’s another repro app.
http://www.effloresce.com/cat/opengl.org/fbo_blit_perf-20061213.zip
It’s 12 MB, because I didn’t have much time to prune it.
I would maximize the window to something large, ideally 1600 to 1900 pixels wide.
My only thought is that after the window is maximized, the allocation order of the framebuffers is not optimal. They’re created when r_postProcessEnable 1 is executed, not when the GL context is created.
r_postProcessMultisamples X (I used 8 and 16)
r_postProcessFlipFBBlit 1 (enables a flip in the three-blit path)
r_postProcessEnable 1
r_postProcessAllowFBBlit 2 for the three-blit path that I posted
r_postProcessAllowFBBlit 3 for your path, Jeff
r_timeGL 1 for EXT_timer_query-based FPS
ocean_useShader 1 to perform some heavy per-pixel work
image_anisotropic 8 or 16 to get rid of the texture2DProj artifacts at a distance
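On the r_timeGL readout: EXT_timer_query reports elapsed GPU time in nanoseconds, so turning one frame’s query result into an FPS figure might look like the sketch below. How the repro app actually computes its number is not shown in this thread, and fps_from_elapsed_ns is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Convert one frame's elapsed-time query result (in nanoseconds, as returned
   for GL_TIME_ELAPSED_EXT) into frames per second. */
static double fps_from_elapsed_ns(uint64_t elapsed_ns)
{
    assert(elapsed_ns > 0);
    return 1.0e9 / (double)elapsed_ns;
}
```

Averaging the elapsed time over several frames before converting would give a steadier number when comparing the blit paths.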