Changing FBO state on an active framebuffer is one of the most expensive kinds of state changes there is. So changing shaders would probably be faster, but there’s no guarantee on that.
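Either way, it’s not going to be cheap. You should definitely sort your objects so that those sharing the same draw-buffer configuration (and, within that, the same shader) are rendered together.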
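To make the sorting advice concrete, here is a minimal C++ sketch. The `DrawItem` struct and its field names are hypothetical (they are not part of OpenGL or of the original post); the idea is only that the most expensive state goes first in the sort key, so it changes as rarely as possible.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-draw record; the field names are illustrative only.
struct DrawItem {
    std::uint32_t drawBufferConfig; // e.g. 0 = one attachment, 1 = two attachments
    std::uint32_t shaderId;         // program object to bind
    std::uint32_t meshId;           // what to draw
};

// Order draws so the most expensive state (the draw-buffer configuration)
// changes least often, then group by shader within each configuration.
inline void sortByState(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
        [](const DrawItem& a, const DrawItem& b) {
            if (a.drawBufferConfig != b.drawBufferConfig)
                return a.drawBufferConfig < b.drawBufferConfig;
            return a.shaderId < b.shaderId;
        });
}

// Count how many times the draw-buffer configuration would have to be set
// if the list were submitted in its current order.
inline int countConfigChanges(const std::vector<DrawItem>& items) {
    int changes = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0 || items[i].drawBufferConfig != items[i - 1].drawBufferConfig)
            ++changes;
    }
    return changes;
}
```

With a list that alternates between two configurations, `countConfigChanges` drops from one change per draw to one change per configuration after `sortByState`. Whether that ordering is acceptable depends on your blending and depth requirements, which this sketch ignores.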
It should be noted that Vulkan doesn’t even allow this sort of thing at the granularity you’re using. If you want to disable writes to a color attachment, you must use the write mask. And in core Vulkan, write masks are part of pipeline state and cannot be dynamic (unless you rely on an extension such as VK_EXT_extended_dynamic_state3). So that means having two pipelines.
glDrawBuffers modifies framebuffer state. So you are in fact modifying the framebuffer more than once per frame. You may not be binding a new FBO, but you will be paying the cost of a framebuffer state change.
To the latter, there is a measurable performance cost to writing to 2 attachments vs. 1. The extra memory bandwidth is the part you’d expect. But also…
To the former (and the latter), yes, there’s a cost to the shader computing output values that aren’t needed (more registers, more cycles, more memory bandwidth, lower occupancy, etc.). However, it’s implementation-dependent whether the driver ever “discards the operations that involves these disabled [fragment color] outputs”, and if it does, which states will cause it to do so. (**)
Moreover, even if the driver does support this “state-based recompile” feature (e.g. for glColorMask*(), glDrawBuffers*(), etc.), there’s a serious real-time rendering performance cost to your app when the driver just up-and-decides that it needs to recompile your shader in the middle of rendering, because you’ve never rendered with that shader + state combination before in this run (or possibly ever!). If you want to search the web for this, “state-based recompiles” are also referred to as shader patching, shader recompilation, shader reoptimization, or shader relinking. I’ve also included a few links on this below.
Bottom line: On the shader side, as much as possible, I wouldn’t depend on the driver doing squat for you besides basic dead code elimination. And to the extent that it does more, you will want to bubble knowledge of that up to your shader permutation generator, so that you can “pre-bake” the permutations needed and avoid the driver having to recompile your shader at run-time. This is just one of many cases where pre-baking permutations pays off.
As to your question specifically, I would ensure that your shader doesn’t write to outputs that shouldn’t be written to in memory. Then the issue is all internalized to your shader code. If you can’t/don’t want to presume aggressive dead code elimination by the GLSL compiler, then you need to #ifdef out all code feeding those unused outputs. However, if you can (and want to) depend on the vendor’s GLSL compiler to aggressively eliminate dead code in your shader (as NVIDIA does), then you can do something like the following snippet. That is, have the code always compute all outputs in scratch vars (MY_ColorOut), but only declare and write to the ones that should be written to for this shader permutation (e.g. ColorOut[ i ] for i < NUM_COLOR_BUFFERS, a preprocessor define that you set at shader compile time).
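As a rough illustration of the application side of this idea, here is a minimal C++ sketch of a permutation generator that prepends NUM_COLOR_BUFFERS to a fragment shader source. It is not the poster's code: the helper names (`buildPermutation`, `prebakeAll`, `kMaxColorBuffers`), the GLSL version, and the placeholder color values are all assumptions made for the example. The embedded GLSL follows the description above (compute everything into MY_ColorOut, write only the declared ColorOut entries) and relies on the compiler to remove the dead code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumed upper bound for this example; it must match the scratch array size below.
constexpr int kMaxColorBuffers = 2;

// Illustrative GLSL body, stored without a #version line so the generator
// can emit "#version" first and the defines right after it.
static const char* kFragmentBody = R"GLSL(
layout(location = 0) out vec4 ColorOut[NUM_COLOR_BUFFERS];

void main()
{
    // Always compute every candidate output into scratch variables.
    vec4 MY_ColorOut[2];
    MY_ColorOut[0] = vec4(1.0, 0.0, 0.0, 1.0); // placeholder values
    MY_ColorOut[1] = vec4(0.0, 1.0, 0.0, 1.0);

    // Only the declared outputs are written; code feeding the others is
    // dead and is left for the GLSL compiler to eliminate.
    for (int i = 0; i < NUM_COLOR_BUFFERS; ++i)
        ColorOut[i] = MY_ColorOut[i];
}
)GLSL";

// Build the source for one permutation: version line, define, then the body.
inline std::string buildPermutation(int numColorBuffers) {
    assert(numColorBuffers >= 1 && numColorBuffers <= kMaxColorBuffers);
    std::string src = "#version 330 core\n";
    src += "#define NUM_COLOR_BUFFERS " + std::to_string(numColorBuffers) + "\n";
    src += kFragmentBody;
    return src;
}

// Generate every permutation up front so each can be compiled at load time.
inline std::vector<std::string> prebakeAll() {
    std::vector<std::string> sources;
    for (int n = 1; n <= kMaxColorBuffers; ++n)
        sources.push_back(buildPermutation(n));
    return sources;
}
```

Each generated string would then go through glShaderSource and glCompileShader at load time, and you pick the matching program per draw instead of toggling outputs on a single program. If your compiler's dead code elimination can't be trusted, the same generator works with the #ifdef approach: only the shader body changes.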