Just a quick query if anyone here happens to know this before I spend too much time on a test case.
I need to render offscreen to a series of textures, which will later be sampled. Which method should I expect to be faster: a separate framebuffer for each pass, each with its own 2D texture attached, or a single framebuffer whose color attachment I switch for each pass?
For example:
// method one: multiple framebuffers (fb1, fb2, etc..)
// with corresponding texture attachment on GL_COLOR_ATTACHMENT0
// which has already been set up
glBindFramebuffer( GL_FRAMEBUFFER, fb1 );
// pass 1
glBindFramebuffer( GL_FRAMEBUFFER, fb2 );
// pass 2
or
// method 2: single framebuffer, switching out
// attachment (tex1, tex2, etc...)
glBindFramebuffer( GL_FRAMEBUFFER, fb );
glFramebufferTexture( GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, tex1, 0 );
// pass 1
glFramebufferTexture( GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, tex2, 0 );
// pass 2
I suspect the former, but I want to be sure.
Both of them are likely to be quick enough to be insignificant in the grand scheme of things. There’s also a third option: attach multiple textures to the same FBO’s color attachments and switch which one is being rendered to via glDrawBuffer(GL_COLOR_ATTACHMENTi) (or the array form, glDrawBuffers). This would allow you to reuse a depth buffer, for example (much like Option 2). How useful that is depends on the number of textures you are updating and on the GL_MAX_COLOR_ATTACHMENTS limit.
What will likely matter more is the order in which you render them, and then sample from them.
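A rough sketch of that third option (one FBO, several color attachments, switching the draw buffer per pass) might look like the following. To keep it compilable without GL headers or a context, the calls go through a made-up GlFuncs table, and record_bind / record_draw_buffer are stand-ins that only log what was called; render_passes is likewise just an illustrative name. In real code you would call glBindFramebuffer and glDrawBuffer directly, after attaching tex1, tex2, etc. to GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1, etc. once at setup.

```c
#include <assert.h>
#include <stddef.h>

/* Types and enum values as defined by the GL headers, repeated here only so
   the sketch compiles without them. */
typedef unsigned int GLenum;
typedef unsigned int GLuint;
#define GL_FRAMEBUFFER       0x8D40
#define GL_COLOR_ATTACHMENT0 0x8CE0

/* Hypothetical table of GL entry points; in real code these would be
   glBindFramebuffer and glDrawBuffer. */
typedef struct {
    void (*BindFramebuffer)(GLenum target, GLuint framebuffer);
    void (*DrawBuffer)(GLenum buf);
} GlFuncs;

/* Stand-ins that only record calls, so the sketch runs without a context. */
static GLuint last_bound_fbo;
static GLenum last_draw_buffer;
static int    draw_buffer_calls;
static void record_bind(GLenum target, GLuint fb) { (void)target; last_bound_fbo = fb; }
static void record_draw_buffer(GLenum buf) { last_draw_buffer = buf; ++draw_buffer_calls; }

/* Bind the FBO once, then point rendering at attachment i for pass i.
   Assumes texture i is already attached to GL_COLOR_ATTACHMENT0 + i and
   that pass_count does not exceed GL_MAX_COLOR_ATTACHMENTS. */
void render_passes(const GlFuncs *gl, GLuint fbo, int pass_count,
                   void (*draw_pass)(int pass))
{
    assert(gl != NULL && pass_count > 0);
    gl->BindFramebuffer(GL_FRAMEBUFFER, fbo);
    for (int i = 0; i < pass_count; ++i) {
        gl->DrawBuffer(GL_COLOR_ATTACHMENT0 + (GLenum)i);
        if (draw_pass)
            draw_pass(i);   /* issue the draw calls for pass i */
    }
}
```

The framebuffer is bound only once; the per-pass cost is a single draw-buffer switch, and any depth attachment on the FBO is shared by all passes.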
I ended up doing the test anyway. They’re nearly identical in average time, but the second method was more predictable. The variability of the first method followed a pattern that repeated every 5 iterations (each iteration consisting of 2000 binding switches with drawing). I was seeing stuff like:
method1: 74ms, method2: 69ms
method1: 72ms, method2: 69ms
method1: 68ms, method2: 69ms
method1: 64ms, method2: 69ms
method1: 62ms, method2: 69ms
And the pattern would repeat like that, with each cycle of 5 iterations landing in the same range (+/- 1%). In the end:
method 1 avg: 68.4ms, method2 avg: 68.8ms
I will try the other technique you mention and see if that is any different.
It was a pretty sterile and hacky test case and may mean nothing in reality. But in the end it does look more like a question of code design than performance.
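For anyone wanting to repeat the measurement, a minimal timing harness could look something like this. It is only a sketch and not the code used for the numbers above: time_iteration, PassFn and count_pass are hypothetical names, and the pass callback is where the per-method work (binding fb1/fb2, or swapping the attachment, plus the draw) would go. The optional finish callback is assumed to wrap glFinish(), since without it you would mostly be timing command submission rather than the GPU work.

```c
#include <assert.h>
#include <time.h>

/* Callback that sets up and draws one pass. In a real test this would bind
   the framebuffer (method 1) or swap the attachment (method 2), then draw. */
typedef void (*PassFn)(int pass, void *user);

/* Wall-clock time in milliseconds, using C11 timespec_get. */
static double now_ms(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec * 1000.0 + (double)ts.tv_nsec / 1.0e6;
}

/* Time one iteration of `switches` passes and return the elapsed ms.
   `finish` may be NULL; for GL work it would typically call glFinish(). */
double time_iteration(PassFn pass, void (*finish)(void), int switches, void *user)
{
    assert(pass != 0 && switches > 0);
    if (finish)
        finish();               /* drain pending work before starting */
    double start = now_ms();
    for (int i = 0; i < switches; ++i)
        pass(i, user);
    if (finish)
        finish();               /* wait for the timed work to complete */
    return now_ms() - start;
}

/* Hypothetical stand-in pass that only counts how often it was called. */
static void count_pass(int pass, void *user)
{
    (void)pass;
    ++*(int *)user;
}
```

Calling time_iteration once per method with switches set to 2000, and repeating that several times, would give per-iteration figures comparable in shape to the ones listed above.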