Switching Framebuffers vs. Attachment

bloodtoes · March 28, 2011, 10:42am

Just a quick query if anyone here happens to know this before I spend too much time on a test case.

I need to render to a series of textures offscreen, which will later be sampled. Which method should I expect to be faster: a different framebuffer for each pass, each with a 2D texture attached, or a single framebuffer whose color attachment is switched for each pass?

For example:

// method 1: multiple framebuffers (fb1, fb2, etc.),
// each with its corresponding texture already attached
// to GL_COLOR_ATTACHMENT0
glBindFramebuffer( GL_FRAMEBUFFER, fb1 );
// pass 1
glBindFramebuffer( GL_FRAMEBUFFER, fb2 );
// pass 2

or

// method 2: single framebuffer, switching out 
// attachment (tex1, tex2, etc...)
glBindFramebuffer( GL_FRAMEBUFFER, fb );
glFramebufferTexture( GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, tex1, 0 );
// pass 1
glFramebufferTexture( GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, tex2, 0 );
// pass 2

I suspect the former, but I want to be sure.

malexander · March 28, 2011, 11:22am

Both of them are likely to be quick enough that the difference is insignificant in the grand scheme of things. There’s also a third option: attach multiple textures to the same FBO’s color attachments, and switch which one is being rendered to via glDrawBuffers (passing the relevant GL_COLOR_ATTACHMENT#). This would allow you to reuse a depth buffer, for example (much like option 2). How useful that is depends on the number of textures you are updating and on the GL_MAX_COLOR_ATTACHMENTS value.
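
As a rough sketch of how that third option scales with the attachment limit: if the textures are packed into as few FBOs as possible, each pass maps to one FBO plus one draw-buffer enum. The helper names (`target_for_pass`, `fbos_needed`) and the `SKETCH_COLOR_ATTACHMENT0` constant are made up for illustration; the constant mirrors the value of GL_COLOR_ATTACHMENT0 only so the sketch compiles without GL headers. Real code would use the header's enum and query the limit with glGetIntegerv(GL_MAX_COLOR_ATTACHMENTS, ...).

```c
#include <assert.h>

/* Same value as GL_COLOR_ATTACHMENT0 in the GL headers; defined locally
   (an assumption of this sketch) so it compiles without them. */
#define SKETCH_COLOR_ATTACHMENT0 0x8CE0u

typedef struct {
    unsigned fbo_index;   /* which framebuffer object to bind for this pass */
    unsigned draw_buffer; /* enum to hand to glDrawBuffers for this pass */
} PassTarget;

/* Map a pass number to (FBO, attachment), packing max_attachments
   textures into each FBO. Relies on GL_COLOR_ATTACHMENTi being
   GL_COLOR_ATTACHMENT0 + i. */
static PassTarget target_for_pass(unsigned pass, unsigned max_attachments)
{
    PassTarget t;
    assert(max_attachments > 0);
    t.fbo_index = pass / max_attachments;
    t.draw_buffer = SKETCH_COLOR_ATTACHMENT0 + pass % max_attachments;
    return t;
}

/* Number of FBOs needed to hold texture_count textures (rounds up). */
static unsigned fbos_needed(unsigned texture_count, unsigned max_attachments)
{
    assert(max_attachments > 0);
    return (texture_count + max_attachments - 1) / max_attachments;
}
```

For example, with 10 textures and a limit of 8, `fbos_needed(10, 8)` is 2, and pass 9 lands on FBO index 1 with draw buffer GL_COLOR_ATTACHMENT1. In the render loop, the result would feed a glBindFramebuffer call (only when `fbo_index` changes) followed by glDrawBuffers(1, &buf) with `buf` set to `draw_buffer`.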

What will likely matter more is the order in which you render them, and then sample from them.

bloodtoes · March 28, 2011, 11:57am

I ended up doing the test anyway. They’re nearly identical in average time, but the 2nd method was more predictable. The variability of the first method followed a pattern that repeated every 5 iterations (each iteration consisting of 2000 binding switches with drawing). I was seeing stuff like:

method1: 74ms, method2: 69ms
method1: 72ms, method2: 69ms
method1: 68ms, method2: 69ms
method1: 64ms, method2: 69ms
method1: 62ms, method2: 69ms

And the pattern would repeat like that, every 5 iterations being in the same area (+/- 1%). In the end:

method1 avg: 68.4ms, method2 avg: 68.8ms

I will try the other technique you mention and see if that is any different.

It was a pretty sterile and hacky test case and may mean nothing in reality. But in the end it does look more like a question of code design than performance.
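
To put a number on "more predictable", one could summarize each method's timings by mean and spread (max minus min). This is a minimal sketch; the `Summary` struct and `summarize` function are hypothetical names, and the arrays hold only the five illustrative samples quoted above, not the full run.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double mean; /* average time in ms */
    double min;  /* fastest iteration in ms */
    double max;  /* slowest iteration in ms */
} Summary;

/* The five sample timings quoted in the post (ms per iteration). */
static const double method1_ms[] = { 74.0, 72.0, 68.0, 64.0, 62.0 };
static const double method2_ms[] = { 69.0, 69.0, 69.0, 69.0, 69.0 };

/* Compute mean, min and max over n timings (n must be at least 1). */
static Summary summarize(const double *ms, size_t n)
{
    Summary s;
    double sum = 0.0;
    assert(n > 0);
    s.min = ms[0];
    s.max = ms[0];
    for (size_t i = 0; i < n; ++i) {
        sum += ms[i];
        if (ms[i] < s.min) s.min = ms[i];
        if (ms[i] > s.max) s.max = ms[i];
    }
    s.mean = sum / (double)n;
    return s;
}
```

On these five samples alone, method 1 has a mean of 68.0 ms with a 12 ms spread, while method 2 has a mean of 69.0 ms with no spread. That differs slightly from the full-run averages reported above, since the quoted lines are only a sample.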