Bind different-sized render targets

nostalgic · February 10, 2015, 2:48am

Hello there,

is there any way to draw to multiple render targets at once which have different sizes or number of samples? It has not been possible for years, but I hope I just missed some extension that could help me. Here is the thing: In various applications, the geometry to be rendered is generated on the fly in either the tessellation or geometry shaders. This can be quite an expensive task, so you’d want this step to be executed only once per frame. Now you want to shade the result once for your main render target (RGBA8, Full-HD with MSAA), and once you need e.g. the normals in RGB16F for SSAO, but a quarter of the screen resolution will suffice for this. (How) can you render this in one pass in OpenGL?

As far as I know, traditional MRT still does not support different sizes among the textures/renderbuffers. The second possibility is layered rendering, which also requires the layers to be of the same texture and thus the same size and format, and furthermore requires rasterizing the geometry twice. It is clearly not an option to make the normal texture a Full-HD 8x multisample texture just for the sake of simplicity; the waste of VRAM (of which I have too little, thus the question) is insane.
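To put a rough number on that waste, here is a minimal sketch of the arithmetic. It assumes RGB16F is padded to 8 bytes per texel and that "a quarter of the screen resolution" means half the width and half the height; both are assumptions, and real drivers add padding and metadata that this ignores. The function name `attachmentBytes` is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Rough size of one color attachment in bytes.
// Ignores driver padding, alignment and compression metadata, which vary by GPU.
inline std::uint64_t attachmentBytes(std::uint32_t width, std::uint32_t height,
                                     std::uint32_t samples, std::uint32_t bytesPerTexel)
{
    assert(samples >= 1 && bytesPerTexel >= 1);
    return static_cast<std::uint64_t>(width) * height * samples * bytesPerTexel;
}

// Assumption: RGB16F is stored as if it were RGBA16F (8 bytes per texel).
const std::uint32_t kAssumedRgb16fBytes = 8;

// Full-HD normals with 8 samples per pixel.
inline std::uint64_t fullResMsaaNormals()
{
    return attachmentBytes(1920, 1080, 8, kAssumedRgb16fBytes);
}

// Assumption: quarter resolution = 960x540, single sample.
inline std::uint64_t quarterResNormals()
{
    return attachmentBytes(960, 540, 1, kAssumedRgb16fBytes);
}
```

Under these assumptions the multisampled full-resolution target comes out at roughly 127 MiB against roughly 4 MiB for the quarter-resolution one, a factor of 32.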

I had the idea to do this all in the fragment shader via image load/store and write the normals there, but to be honest, this does not seem like the proper way to do it.

What I need and want is:

- writing multiple values from one fragment shader invocation
- the render targets for the values have different size and sample count (or even mixed TEXTURE_2D and TEXTURE_2D_MULTISAMPLE)
- only one geometry shader invocation
- in one draw call

Anyone? I’d be glad to get some comments on this.

Regards,
nostalgic

Osbios · February 10, 2015, 3:41am

When your vertex calculations are expensive, you should use https://www.opengl.org/wiki/Transform_Feedback

You basically calculate all the vertex data once and write them into a buffer, and then use this buffer for future drawing commands.

nostalgic · February 10, 2015, 4:59am

[QUOTE=Osbios;1264287]When your vertex calculations are expensive, you should use https://www.opengl.org/wiki/Transform_Feedback

You basically calculate all the vertex data once and write them into a buffer, and then use this buffer for future drawing commands.[/QUOTE]

Thank you for your quick answer, but transform feedback is not what I need here. I tried to abstract my application a bit to formulate my needs most generally, but imagine the following: I render my terrain with view-dependent tessellation and culling, so this is usually only valid data for one frame. Buffering this is not worth the space it occupies. Drawing the terrain again from the feedback buffer only avoids the tessellation (which would be a performance gain, though), but still requires two separate draw calls and two rasterizations of the exact same geometry.

I realize that the rasterization depends on the render target size and samples, but it still feels like it should be possible to combine render targets, at least with the same size and varying numbers of samples. As I mentioned, it is possible to write to a multisample render target in the fragment shader and additionally write (a single sample) to a regular TEXTURE_2D via imageStore. I can even do depth testing manually and render only a subset of the processed geometry with correct depth sorting. I’d just like to do this in a way provided by the API: two framebuffers with two different sets of render targets of different sizes, written to in one draw call.

Alfonse_Reinheart · February 10, 2015, 7:53am

[QUOTE=nostalgic]What I need and want is:

- writing multiple values from one fragment shader invocation
- the render targets for the values have different size and sample count (or even mixed TEXTURE_2D and TEXTURE_2D_MULTISAMPLE)
- only one geometry shader invocation
- in one draw call[/QUOTE]

Well that’s just not going to happen.

You can use layered rendering to render different fragment outputs to different images. But these outputs must be from a single layered texture (array or cubemap), whose layers must use the same image format. Furthermore, layered rendering requires a geometry shader, and a GS that’s going to either generate multiple primitives or use instancing to be invoked more than once.

Also, you aren’t allowed to have multiple images attached to an FBO if those images have different sample counts. So that’s a non-starter. Different sizes are OK (and by “OK”, I mean that the total framebuffer size will be the minimum of all attached images). But using different sample counts does not work. And really, how could it? The framebuffer’s sample count determines how a primitive is rasterized; to have different images in an FBO using different sample counts would require that the primitive be rasterized differently for each.
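As a rough CPU-side model of those two rules (not actual OpenGL calls), the sketch below takes a list of attachments and either reports the usable render area or rejects the combination. The types `Attachment` and `RenderArea` and the function `effectiveArea` are hypothetical names for illustration; in real OpenGL the rejection would show up as `glCheckFramebufferStatus` returning `GL_FRAMEBUFFER_INCOMPLETE_MULTISAMPLE`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical description of one attached image.
// samples == 0 stands for a regular, non-multisampled image.
struct Attachment { int width; int height; int samples; };

struct RenderArea { int width; int height; };

// Returns false if the attachments mix sample counts (framebuffer incomplete).
// Otherwise writes the usable area, the minimum over all attachments, to 'out'.
inline bool effectiveArea(const std::vector<Attachment>& attachments, RenderArea& out)
{
    assert(!attachments.empty());
    const int samples = attachments.front().samples;
    RenderArea area{attachments.front().width, attachments.front().height};
    for (const Attachment& a : attachments) {
        if (a.samples != samples) {
            return false;  // mixed sample counts are not allowed
        }
        area.width = std::min(area.width, a.width);
        area.height = std::min(area.height, a.height);
    }
    out = area;
    return true;
}
```

Note what this means for the original question: attaching a 960x540 image next to a 1920x1080 one does not give a downscaled copy, it just restricts rendering to the 960x540 corner of both.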

What you seem to want is multiple rasterization pipelines. And that’s just not really possible.

[QUOTE=nostalgic]Now you want to shade the result once for your main render target (RGBA8, Full-HD with MSAA), and once you need e.g. the normals in RGB16F for SSAO, but a quarter of the screen resolution will suffice for this.[/QUOTE]

Normals do not need RGB16F precision. GL_RGB10_A2 or GL_R11F_G11F_B10F are perfectly acceptable formats for normals, which don’t require a half-float’s worth of precision.
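To give a feel for the precision involved, here is a small sketch that quantizes a unit normal to three 10-bit unsigned-normalized channels and back, which is roughly what storing `n * 0.5 + 0.5` in a GL_RGB10_A2 target does. The bit layout and the helper names (`packNormal`, `unpackNormal`) are illustrative assumptions, not anything mandated by OpenGL.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Vec3 { float x, y, z; };

// Map one component from [-1, 1] to a 10-bit unsigned-normalized integer.
inline std::uint32_t toUnorm10(float c)
{
    assert(c >= -1.0f && c <= 1.0f);
    return static_cast<std::uint32_t>(std::lround((c * 0.5f + 0.5f) * 1023.0f));
}

// Inverse mapping: low 10 bits back to [-1, 1].
inline float fromUnorm10(std::uint32_t bits)
{
    return (static_cast<float>(bits & 0x3FFu) / 1023.0f) * 2.0f - 1.0f;
}

// Illustrative layout: x in bits 0-9, y in bits 10-19, z in bits 20-29.
// The two remaining (alpha) bits are left at zero.
inline std::uint32_t packNormal(Vec3 n)
{
    return toUnorm10(n.x) | (toUnorm10(n.y) << 10) | (toUnorm10(n.z) << 20);
}

inline Vec3 unpackNormal(std::uint32_t packed)
{
    return { fromUnorm10(packed), fromUnorm10(packed >> 10), fromUnorm10(packed >> 20) };
}
```

With 1023 steps across [-1, 1], the round-trip error per component stays at about 0.001 or below, and the whole normal fits in 32 bits per texel.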

And, just an aside: I find it rather odd that you’re using SSAO, but not some form of deferred rendering. The two generally go together.

And your example is a good case for that. You say that you don’t want the cost of an 8x multisample image just for normals and SSAO (I wouldn’t want the cost of an 8x multisample image for anything, but that’s just me). If you were doing deferred rendering, that cost wouldn’t be wasted, since you need to use the right normals with the right samples to compute lighting.

[QUOTE=nostalgic]As I mentioned, it is possible to write to a multisample render target in the fragment shader and additionally write (a single sample) to a regular TEXTURE_2D via imageStore.[/QUOTE]

That’s an arbitrary write to an arbitrary memory location. It’s not even remotely the same thing as rasterization.

However, since it does exactly what you need it to do, why not use that? This kind of special case stuff is exactly why generic features like image load/store exist.

[QUOTE=nostalgic]I can even do depth testing manually[/QUOTE]

You most certainly cannot. The depth test requires a read/modify/write operation (when the test passes at least). And you can’t really do that from a fragment shader. Not within the same render call. Not even with atomic image load/store operations, because there is no ordering guarantee for fragment processing.
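The read/modify/write hazard can be illustrated with a tiny CPU-side model. This is only a sketch of one possible interleaving of two fragment shader invocations hitting the same pixel, not GLSL and not a statement about what any particular GPU does; the types and function names are made up for the example.

```cpp
struct Pixel { float depth; int color; };
struct Fragment { float depth; int color; };

// What the fixed-function depth test guarantees: each fragment's
// test-and-write finishes before the next fragment for that pixel starts.
inline Pixel serialDepthTest(Pixel p, Fragment a, Fragment b)
{
    if (a.depth < p.depth) { p = Pixel{a.depth, a.color}; }
    if (b.depth < p.depth) { p = Pixel{b.depth, b.color}; }
    return p;
}

// One possible interleaving of a hand-written test in the shader:
// both invocations load the stored depth before either one stores.
inline Pixel racyDepthTest(Pixel p, Fragment a, Fragment b)
{
    const float seenByA = p.depth;  // A loads the stored depth
    const float seenByB = p.depth;  // B loads the same, soon stale, value
    if (a.depth < seenByA) { p = Pixel{a.depth, a.color}; }  // A stores
    if (b.depth < seenByB) { p = Pixel{b.depth, b.color}; }  // B overwrites A
    return p;
}
```

With a stored depth of 1.0, a near fragment at 0.3 and a far fragment at 0.7, the serial version keeps the near fragment, while the interleaved version ends up with the far fragment's depth and color.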

nostalgic · February 11, 2015, 10:00am

Thank you for pointing this out to me! I was not aware this is what I was actually looking for. You are right.

[QUOTE=Alfonse_Reinheart]Normals do not need RGB16F precision. GL_RGB10_A2 or GL_R11F_G11F_B10F are perfectly acceptable formats for normals, which don’t require a half-float’s worth of precision.

And, just an aside: I find it rather odd that you’re using SSAO, but not some form of deferred rendering. The two generally go together.

And your example is a good case for that. You say that you don’t want the cost of an 8x multisample image just for normals and SSAO (I wouldn’t want the cost of an 8x multisample image for anything, but that’s just me). If you were doing deferred rendering, that cost wouldn’t be wasted, since you need to use the right normals with the right samples to compute lighting.[/QUOTE]

This is also true, but then again, I only tried to explain my general problem/requirement with an understandable example. In my actual application (which uses forward rendering), I use the normals for several purposes, including interaction with 3D objects via picking, for which I prefer to use at least half-floats. Anyways, the format is not the problem, the multisampling is. I can easily afford storing the normals in RGB32F once, but I do not want to create a multisample layer with 8 samples.

[QUOTE=Alfonse_Reinheart]You most certainly cannot. The depth test requires a read/modify/write operation (when the test passes at least). And you can’t really do that from a fragment shader. Not within the same render call. Not even with atomic image load/store operations, because there is no ordering guarantee for fragment processing.[/QUOTE]

I haven’t thought this through.