Performance Problem with Multiple Render Targets

Hello everyone,

I’m working on a deferred shading implementation. One of my shaders (GLSL version 410 core) runs once per light in the scene and writes the lighting contributions to a framebuffer, using additive blending. This all works, but performance takes a huge dive with deferred shading, from 50 fps to a little under 20 fps.

It seems like the bottleneck is where I write my lighting calculations into the FBO. With everything else held constant, if I directly output my lighting calculations to the screen then everything works fine, but if I write to the framebuffer then the performance suffers.

I guess this has to do with the way that I’m initializing the textures in my framebuffer, but I’m at a loss for how to fix it. Below is my code for initializing the framebuffer and for writing to it:



    // Create one framebuffer with two color attachments: diffuse and specular light textures
    GLuint lightPass;
    GLuint diffuseAttachment, specularAttachment;

    glGenFramebuffers( 1, &lightPass);
    glBindFramebuffer( GL_FRAMEBUFFER, lightPass);

    glActiveTexture( GL_TEXTURE0 );
    glGenTextures( 1, &diffuseAttachment);
    glBindTexture( GL_TEXTURE_2D, diffuseAttachment);
    glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST );
    glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST );
    glTexImage2D( GL_TEXTURE_2D, 0, GL_RGBA16, width, height, 0, GL_RGBA, GL_FLOAT, NULL );
    glFramebufferTexture2D( GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, diffuseAttachment, 0);

    glActiveTexture( GL_TEXTURE1 );
    glGenTextures( 1, &specularAttachment);
    glBindTexture( GL_TEXTURE_2D, specularAttachment);
    glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST );
    glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST );
    glTexImage2D( GL_TEXTURE_2D, 0, GL_RGBA16, width, height, 0, GL_RGBA, GL_FLOAT, NULL );
    glFramebufferTexture2D( GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT1, GL_TEXTURE_2D, specularAttachment, 0);


Before I run the shader, I bind my framebuffer:


    glBindFramebuffer(GL_FRAMEBUFFER, lightPass);

    // Select both attachments before clearing; glClear only affects the current draw buffers
    GLenum buffersToDraw[] = { GL_COLOR_ATTACHMENT0 , GL_COLOR_ATTACHMENT1 };
    glDrawBuffers( 2, buffersToDraw );

    // Set the clear color before the clear that uses it
    glClearColor( 0.0f, 0.0f, 0.0f, 0.0f);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    glDisable(GL_DEPTH_TEST);
    glDepthMask(GL_FALSE);
    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE, GL_ONE);


And here is the shader that writes to the two framebuffer attachments:



    #version 410 core

    layout(location=0) out vec4 out0;
    layout(location=1) out vec4 out1;

    // uniforms etc...

    void main(){

        vec3 diffLight, specLight;

        //Lighting calculations...

        out0 = vec4(diffLight, 1);
        out1 = vec4(specLight, 1);

    }


I do not think this is a problem with the lighting calculations since, as I said, the problem only arises when the last two lines of the shader (the writes to out0 and out1) are in place. I didn’t have this issue when I worked on a Linux machine with more RAM, so I think this is a hardware problem.

My machine runs OS X with Intel Iris graphics (1536 MB) and 8 GB of 1600 MHz DDR3 memory.

Any thoughts on why writing to the texture is so slow?

Yep. Do a web search for "Intel Iris deferred" and you’ll see what Intel recommends you do. It’s not multipass additive blending. The issue is likely that you’re using an integrated GPU, which has a tiny fraction of the memory bandwidth of a decent discrete GPU, and it’s not a tile-based GPU.
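
To put the bandwidth point in rough numbers, here is a minimal back-of-the-envelope sketch in C. It assumes that every light is drawn as a full-screen pass, that additive blending reads and then writes each destination pixel once, and that a GL_RGBA16 texel takes 8 bytes. The function name, the 1280x720 resolution and the light count used below are made up for illustration; the original post states neither the resolution nor the number of lights.

```c
#include <stdint.h>

/* Hypothetical helper: estimated bytes of render-target traffic per frame
 * for additive light accumulation.
 * Assumptions (not from the original post): each light touches every pixel,
 * and blending costs one read plus one write of every destination pixel. */
uint64_t blend_traffic_bytes(uint32_t width, uint32_t height,
                             uint32_t bytes_per_pixel,
                             uint32_t num_targets, uint32_t num_lights)
{
    uint64_t pixels = (uint64_t)width * height;
    uint64_t read_plus_write = 2; /* blend = read destination, write result */
    return pixels * bytes_per_pixel * num_targets * read_plus_write * num_lights;
}
```

Under these assumptions, blend_traffic_bytes(1280, 720, 8, 2, 32) comes to 943,718,400 bytes, so roughly 0.94 GB of render-target traffic per frame for 32 full-screen lights. Passing 4 bytes per pixel instead of 8 halves the estimate, which is one reason a smaller format is worth benchmarking.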

Also, check that RGBA16 is a format natively supported by your drivers, both for texture lookups and for render targets. As a performance comparison, bench against RGB8 and RGB5_A1.

And what resolution are you rendering? Try cutting your resolution by half and compare performance.
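
One way to read the result of the reduced-resolution test is sketched below in C. It assumes a simple linear model in which the frame time is a fixed cost plus a cost proportional to the number of pixels; the function name is hypothetical and the model is only an approximation. It takes pixel counts rather than dimensions, so it works whether you halve the pixel count or halve each dimension.

```c
/* Hypothetical helper: estimates what fraction of the full-resolution frame
 * time scales with pixel count, assuming time = fixed + per_pixel * pixels.
 * A result near 1.0 suggests the frame is bound by per-pixel work
 * (fill rate or bandwidth); a result near 0.0 suggests it is not. */
double pixel_bound_fraction(double full_ms, double reduced_ms,
                            double full_pixels, double reduced_pixels)
{
    double time_saved = 1.0 - reduced_ms / full_ms;           /* relative drop in frame time */
    double pixels_saved = 1.0 - reduced_pixels / full_pixels; /* relative drop in pixel count */
    return time_saved / pixels_saved;
}
```

For example, a frame that takes 50 ms at full resolution (a little under 20 fps) and 12.5 ms at a quarter of the pixels would give a fraction of about 1.0, meaning nearly all of the time goes to per-pixel work. The 12.5 ms figure is invented for the example, not a measurement.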