You’re right. I was looking at the OpenGL 4.4 spec.
Things changed between OpenGL 4.4 and OpenGL 4.5.
Yes, specific language about the conversion was added. But it did not change the overall meaning; or at least, not in the way you think it did.
The exact wording from GL 4.4 is that “f’ is then cast to an unsigned binary integer value with exactly b bits”. The fact that it says “cast” tells you nothing about rounding vs. flooring. So 4.4 guarantees you nothing about how the conversion is done.
Really, all the GL 4.5 wording does is clean things up a bit. By declaring the conversion to be a “function” of sorts, it implicitly means that the function must behave identically everywhere. The 4.4 wording of “cast” did not make it clear that the cast operation had to work the same everywhere. The 4.5 wording also makes it clear that the only viable results are the two representable values nearest f’. The wording of “cast” didn’t make that explicit.
So no, you should not assume that pre-4.5 implementations always round down.
If you really, truly, absolutely need to ensure rounding behavior, then you need to do the normalization yourself. That means using integer texture formats and doing the normalization as you output it, and when you read it in your shader.
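If it helps, here is a minimal GLSL sketch of that idea (the variable names and the GL_RGBA8UI choice are just illustrative, not a prescription):

    // Geometry pass fragment shader, writing albedo to an 8-bit unsigned
    // *integer* target (e.g. GL_RGBA8UI), so no implicit normalization happens.
    in vec3 materialAlbedo;                      // whatever your material pass produces
    layout(location = 0) out uvec4 outAlbedo;

    void main()
    {
        // Explicit round-to-nearest; swap round() for floor() if you want
        // flooring behavior instead.
        outAlbedo = uvec4(round(clamp(vec4(materialAlbedo, 1.0), 0.0, 1.0) * 255.0));
    }

    // Reading it back in a later pass (usampler2D bound to the same texture):
    //     vec4 albedo = vec4(texelFetch(gAlbedo, ivec2(gl_FragCoord.xy), 0)) / 255.0;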
This is not true. It is significant. With a deferred renderer it is possible to be rendering many light sources (e.g., it’s not uncommon to be rendering into the hundreds if VPL methods are being used). So this means a value in the GBuffer can be read, lit, and accumulated many times during the course of rendering a frame.
Yes, deferred rendering does accumulation. That doesn’t mean that the difference will be significant.
Take the albedo as an example. The maximum possible error for an 8-bit unsigned normalized value, if it was floored, is 1/255 or ~0.004. So let’s say that the red channel is off by exactly that: ~0.004. That is, the computed value was actually 0.004 higher than the stored value. Now, let’s say there are 1000 lights in the scene, and the light intensity that reaches the point is 1.0 for all of them. And let’s just multiply the light intensity directly by the albedo, to make the math as simple as possible.
So, after doing all of the multiplications and additions, the total lighting intensity you get will be 4 units lower than the lighting intensity you should have computed. That sounds significant. But what do you do next?
Well, having done all of this lighting, you now need to employ tone mapping, since the maximum intensity from 1000 lights could be up to 1000 units. If we do tone mapping by a simple division by the maximum intensity, then we divide by 1000, leaving us with an error of… 0.004. If you allow for 10x overbrightening at this point (maximum intensity is 100), then the tone mapping will cause the error to still be only 0.04. You would have to use lower maximum intensities to make the error get to even 5% of the maximum value.
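To spell the arithmetic out in one line: accumulated error ≈ N · I · Δ = 1000 · 1.0 · (1/255) ≈ 4, and after dividing by the tone-mapping maximum of N · I = 1000, you’re right back at 1/255 ≈ 0.004.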
Even if you use a different tone mapping algorithm, the error gets tone-mapped right along with it. And thus, while the error will seem to have a high absolute value, after tone-mapping, the significance of it will be proportional to your tone mapping.
Oh, and let’s not forget that 0.004 is the maximum error; the average error will be 0.5/255 or ~0.002. So it will be even less significant.
To be sure, other kinds of terms can have different error characteristics. If you’re using an exponential specular “shininess”, obviously the error will vary with the exponent. But generally speaking, no matter how many lights accumulate into the value, tone mapping will render the absolute error relatively insignificant.
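(To put a rough number on the specular case: if the stored base x carries an absolute error δ, then x^n is off by roughly n · x^(n−1) · δ, so large exponents really do amplify the stored error; but the tone-mapping argument still applies to the result.)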
If you want evidence of this, just try it. Do what I suggested about normalizing the values manually. See whether you can tell the difference between flooring and rounding normalization, in a scene with hundreds of lights.
As long as you are maintaining precision during your lighting passes, then you should be fine.
Well… with the number of passes we’re doing, I have noticed a performance penalty with halfFloats and uint16. This is why I’m experimenting with different storage mechanisms for our parameters (which is what prompted this post to start with).
You could also use GL_RGB10_A2. That makes the average error even more insignificant: 0.5/1023 or ~0.0005. Or you can use GL_RG16F if you have a few values that truly need higher precision; you shouldn’t lose much performance from using that. Even GL_R11F_G11F_B10F is an option, if the values have a large(ish) range but don’t need a lot of mantissa precision.
But again, you shouldn’t be trying these things unless you can actually see the difference in precision, not just if you think it could be a problem.
Another option is that I use an integer-based render target (e.g. GL_R32UI) and perform the packing myself in the shader.
But this prevents me from performing any filtered sampling…
You should not be doing filtering in your lighting passes. The neighboring texels in your map do not necessarily correspond to neighboring fragments from the same object. So blending with them is decidedly inappropriate.
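For what it’s worth, the integer route fits naturally with that. A rough GLSL sketch, assuming GLSL 4.00+ (or ARB_shading_language_packing) for the pack functions, a single GL_R32UI attachment, and illustrative names (gBuffer, etc.); the built-in pack functions are defined to round to nearest, and texelFetch means no filtering is ever involved:

    // Geometry pass: pack four unorm8 values into one uint.
    in vec3 materialAlbedo;
    in float roughness;
    layout(location = 0) out uint outPacked;

    void main()
    {
        // packUnorm4x8 clamps to [0, 1] and rounds to nearest, so the
        // conversion behavior is pinned down regardless of GL version.
        outPacked = packUnorm4x8(vec4(materialAlbedo, roughness));
    }

    // Lighting pass: fetch exactly one texel; neighboring texels never blend in.
    uniform usampler2D gBuffer;
    out vec4 fragColor;

    void main()
    {
        uint bits = texelFetch(gBuffer, ivec2(gl_FragCoord.xy), 0).r;
        vec4 data = unpackUnorm4x8(bits);   // rgb = albedo, a = roughness
        fragColor = vec4(data.rgb, 1.0);    // ...your lighting math goes here
    }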