Is disabling an unneeded color attachment an optimization when using MRT?

Hi all,

I have a renderer that uses some shaders with MRT, but sometimes (depending on whether I apply filters) not all of the targets are needed.

So I am disabling these color attachments with GL_NONE when they are not necessary.

My question is whether, in shaders where some output color attachments are disabled, the driver discards the operations that feed those disabled outputs (color attachments).

Or does disabling the output improve performance in some other way?

Thanks in advance,
Alberto

Changing FBO state on an active framebuffer is one of the most expensive kinds of state changes there is. So changing shaders would probably be faster, but there’s no guarantee on that.

Either way, it’s not going to be cheap. You should definitely sort objects based on this.
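As a rough illustration of sorting by this state, here is a minimal C++ sketch. The `DrawItem` record and its fields are hypothetical (nothing in this thread defines them), and `outputMask` is just an assumed way of encoding which color attachments a draw writes to. The idea is only that draws sharing an output configuration end up adjacent, so the draw-buffer state changes as rarely as possible.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical draw record; the field names are illustrative only.
struct DrawItem {
    std::uint32_t outputMask;  // bit i set = color attachment i is written
    std::uint32_t shaderId;    // which shader permutation to bind
    int meshId;                // what to draw
};

// Group draws so that items sharing an output configuration (and then a
// shader) are adjacent. stable_sort keeps submission order within a group.
inline void sortByOutputState(std::vector<DrawItem>& draws) {
    std::stable_sort(draws.begin(), draws.end(),
        [](const DrawItem& a, const DrawItem& b) {
            if (a.outputMask != b.outputMask) return a.outputMask < b.outputMask;
            return a.shaderId < b.shaderId;
        });
}

// Count how often the output configuration changes while walking the list
// in order, i.e. how many draw-buffer updates would have to be issued.
inline std::size_t countOutputStateChanges(const std::vector<DrawItem>& draws) {
    std::size_t changes = 0;
    for (std::size_t i = 0; i < draws.size(); ++i) {
        if (i == 0 || draws[i].outputMask != draws[i - 1].outputMask) ++changes;
    }
    return changes;
}
```

Whether this ordering is compatible with other sorting needs (blending, depth order) depends on the renderer, so treat the sort key as an assumption to adapt.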

It should be noted that Vulkan doesn’t even allow this sort of thing at the granularity you’re using. If you want to disable writes to a color attachment, you must use the write mask. And write masks are part of pipeline state and cannot be dynamic. So that means having two pipelines.

Thanks Alfonse,

but I have a question: I’m only changing the framebuffer once per frame, when I bind it to render the whole scene.

Is that bad practice?

glDrawBuffers modifies framebuffer state. So you are in fact modifying the framebuffer more than once per frame. You may not be binding a new FBO, but you will be paying the cost of a framebuffer state change.
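To make that concrete, here is a small C++ sketch of the array that ends up being passed to glDrawBuffers, plus a filter for redundant updates. It is only a sketch under assumptions: the two constants are copied from the OpenGL headers so the code compiles without them, `kMaxTargets` is an assumed limit of 4 outputs, and `DrawBufferCache` is a hypothetical helper that would be kept per FBO (the draw-buffer list is framebuffer object state).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Values copied from the OpenGL headers so the sketch builds without them.
constexpr std::uint32_t kGlNone = 0;                   // GL_NONE
constexpr std::uint32_t kGlColorAttachment0 = 0x8CE0;  // GL_COLOR_ATTACHMENT0

constexpr std::size_t kMaxTargets = 4;  // assumption: at most 4 MRT outputs

using DrawBufferList = std::array<std::uint32_t, kMaxTargets>;

// Build the list for glDrawBuffers: slot i is GL_COLOR_ATTACHMENTi when
// bit i of enabledMask is set, otherwise GL_NONE.
inline DrawBufferList makeDrawBufferList(std::uint32_t enabledMask) {
    DrawBufferList list{};
    for (std::size_t i = 0; i < kMaxTargets; ++i) {
        const bool enabled = ((enabledMask >> i) & 1u) != 0;
        list[i] = enabled ? kGlColorAttachment0 + static_cast<std::uint32_t>(i)
                          : kGlNone;
    }
    return list;
}

// Hypothetical per-FBO helper that remembers the last list applied.
class DrawBufferCache {
public:
    // Returns true when the wanted list differs from the last one seen,
    // i.e. when the caller would actually need to call glDrawBuffers.
    bool needsUpdate(const DrawBufferList& wanted) {
        if (valid_ && wanted == current_) return false;
        current_ = wanted;
        valid_ = true;
        return true;
    }

private:
    DrawBufferList current_{};
    bool valid_ = false;
};
```

In a real renderer, a `true` result would be followed by something like `glDrawBuffers(kMaxTargets, list.data())` with the list converted to `GLenum`. The cache only avoids redundant calls; each real change still pays the framebuffer state change cost described above.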

On your second question (whether disabling the output helps performance in some other way): there is a measurable performance cost to writing to 2 attachments vs. 1. Memory bandwidth is the cost you’d expect. But also…

On your first question (and the second as well): yes, there’s a cost to the shader needlessly computing output values that aren’t needed (more registers, more cycles, more memory bandwidth, lower occupancy, etc.). However, it’s implementation-dependent whether the driver ever discards the operations feeding those disabled fragment color outputs, and if it does, which states will cause it to do so. (**)

Moreover, even if the driver does support this “state-based recompile” feature (e.g. for glColorMask*(), glDrawBuffers*(), etc.), there’s a serious real-time rendering performance cost to your app when the driver just up-and-decides that it needs to recompile your shader in the middle of rendering because you’ve never rendered with that shader + state combination before this run (or possibly ever!). For websearching, “state-based recompiles” are also referred to as shader patching, shader recompilation, shader reoptimization, or shader relinking. I’ve also included a few links on this below.

Bottom-line: On the shader side, as much as possible, I wouldn’t depend on the driver doing squat for you besides basic dead code elimination. And to the extent that it does more, you will want to bubble knowledge of that up to your shader permutation generator level so that you can “pre-bake” the permutations needed to avoid the driver needing to recompile your shader at run-time. This is one example of many where you should do this.

As to your question specifically, I would ensure that your shader doesn’t write to outputs that shouldn’t be written to in memory. Then the issue is all internalized to your shader code. If you can’t/don’t want to presume aggressive dead code elimination by the GLSL compiler, then you need to #ifdef out all code feeding those unused outputs. However, if you can (and want to) depend on the vendor’s GLSL compiler to aggressively eliminate dead code in your shader (as NVIDIA does), then you can do something like the following snippet. That is, have the code always compute all outputs in scratch vars (MY_ColorOut), but only declare and write to the ones that should be written to for this shader permutation (e.g. ColorOut[ i ] for i < NUM_COLOR_BUFFERS, a preprocessor define that you set at shader compile time).

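#define NUM_COLOR_BUFFERS 1   // set per shader permutation at compile time

// Only the outputs this permutation actually writes are declared.
layout( location = 0 ) out vec4 ColorOut[ NUM_COLOR_BUFFERS ];

void main()
{
    // Scratch values: always computed, only conditionally written out.
    vec4 MY_ColorOut[ 2 ];

    // ... Compute and write to MY_ColorOut[0..1] ...

    ColorOut[0] = MY_ColorOut[0];
#if NUM_COLOR_BUFFERS >= 2
    ColorOut[1] = MY_ColorOut[1];
#endif
}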
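On the application side, setting that define at shader compile time could look roughly like the C++ sketch below. The function names are hypothetical, the `#version 330 core` line is an assumption about the target GLSL version, and the shader body passed in is assumed not to contain its own `#version` line or `#define NUM_COLOR_BUFFERS`. Each resulting string is what you would hand to glShaderSource, ideally at load time so that every permutation is pre-baked before rendering starts.

```cpp
#include <string>
#include <vector>

// Prepend a #version line and the NUM_COLOR_BUFFERS define to a shader body.
// Assumption: the body has no #version line and no define of its own.
inline std::string buildPermutationSource(const std::string& body,
                                          int numColorBuffers) {
    std::string src = "#version 330 core\n";  // assumed GLSL version
    src += "#define NUM_COLOR_BUFFERS " + std::to_string(numColorBuffers) + "\n";
    src += body;
    return src;
}

// Build the source for every permutation from 1 to maxColorBuffers outputs,
// so all of them can be compiled up front instead of on first use.
inline std::vector<std::string> buildAllPermutations(const std::string& body,
                                                     int maxColorBuffers) {
    std::vector<std::string> sources;
    for (int n = 1; n <= maxColorBuffers; ++n) {
        sources.push_back(buildPermutationSource(body, n));
    }
    return sources;
}
```

Selecting the permutation whose NUM_COLOR_BUFFERS matches the enabled targets keeps the decision in your own code rather than relying on the driver to patch the shader.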

(**) There is evidence that NVIDIA’s driver will do this, probably for both glColorMask*() and glDrawBuffers*() (NV_command_list, Stuttering in Game Graphics: Detection and Solutions). However, you have no guarantees that any driver will do this, or do this reliably.

State-based Recompile Links

Thanks for your time and this detailed answer, Dark_Photon.

Very useful information.

Some good, recent additions to this list: