ATI's new normal-map compression

Hi there

I couldn't find any spec for ATI's new compression technology.

What I'd actually like to know is whether it will also be available on cards other than the X800, or whether it needs special hardware support.

Another thing: AFAIK it only compresses two-component normal maps. That means I have to reconstruct the third component in the shader.

  1. Are there any tools that create two-component normal maps, or do I have to write my own?

  2. Computing the third component requires a square root. Are graphics cards faster at this than CPUs? If not, doesn't that mean this extension pushes us into a shader-bound situation even sooner?

Thanks,
Jan.

FWIW, it seems that this compression method is not yet available in OpenGL, only in D3D.

Humus seems to have been playing with it and has made a tool available; look at: http://esprit.campus.luth.se/~humus/comments.php?newsID=113

Well, ATI showed off their invention with screenshots from Serious Sam 2. Serious Sam 1 uses OpenGL, so I assumed SS 2 does too, in which case ATI should already have the extension working.

Jan.

Originally posted by Jan:
What I'd actually like to know is whether it will also be available on cards other than the X800, or whether it needs special hardware support.
It needs special hardware support, which is currently only available on the X800.

Originally posted by Jan:
1. Are there any tools that create two-component normal maps, or do I have to write my own?
I have posted one on my website as mentioned by Nicolas Lelong:
http://esprit.campus.luth.se/~humus/?page=Cool
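For anyone who wants to roll their own, here is a minimal sketch of the step that produces a two-component map from ordinary three-component normals. It is not how the tool above works internally, and it does not perform the hardware block compression; it only shows dropping z and quantizing x and y. The function names and the 8-bit [-1, 1] to [0, 255] mapping are assumptions made for illustration.

```python
import math

def encode_xy(normal):
    """Normalize a tangent-space normal and keep only x and y as 8-bit values.

    Assumes z >= 0 (the usual case for tangent-space normal maps), so z can
    be rebuilt later from x and y alone.
    """
    x, y, z = normal
    length = math.sqrt(x * x + y * y + z * z)
    x, y = x / length, y / length
    # Map [-1, 1] to [0, 255], rounding half up (hypothetical layout).
    return (int((x * 0.5 + 0.5) * 255.0 + 0.5),
            int((y * 0.5 + 0.5) * 255.0 + 0.5))

def decode_xy(bx, by):
    """Map the two stored bytes back to the [-1, 1] range."""
    return (bx / 255.0 * 2.0 - 1.0, by / 255.0 * 2.0 - 1.0)
```

A flat normal such as (0, 0, 1) ends up at the middle of both channels, and decoding returns x and y to within the quantization step.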

Originally posted by Jan:
2. Computing the third component requires a square root. Are graphics cards faster at this than CPUs? If not, doesn't that mean this extension pushes us into a shader-bound situation even sooner?
GPUs are fast at this. A reciprocal square root is typically a single cycle. To get the square root you multiply that result by the original value, so that's another cycle. Even though you need to reconstruct the third component, keep in mind that you would typically want to normalize the bump map anyway, so it costs three instructions either way. The only real difference is that the DP3 instruction in the normalize is changed to a DP2ADD.
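To make the three-instruction sequence concrete, here is a rough CPU-side emulation in Python. It is only a sketch of the arithmetic, not shader code: the helpers `dp2add` and `rsq` are hypothetical stand-ins for the corresponding shader instructions, and the guard against a non-positive value is an addition for safety on the CPU.

```python
import math

def dp2add(a, b, c):
    """Stand-in for DP2ADD: two-component dot product plus a scalar."""
    return a[0] * b[0] + a[1] * b[1] + c

def rsq(v):
    """Stand-in for the reciprocal square root instruction."""
    return 1.0 / math.sqrt(v)

def reconstruct_z(x, y):
    """Rebuild z of a unit normal from x and y: z = sqrt(1 - x*x - y*y)."""
    w = dp2add((x, y), (-x, -y), 1.0)   # instruction 1: 1 - x*x - y*y
    if w <= 0.0:
        # Quantization error can push w to zero or below; clamp z to 0.
        return 0.0
    inv = rsq(w)                        # instruction 2: 1 / sqrt(w)
    return w * inv                      # instruction 3: w * (1 / sqrt(w)) = sqrt(w)
```

With x = 0.6 and y = 0 this gives z of about 0.8, which is the same dot product, reciprocal square root, multiply pattern a normalize would use, with the dot product swapped for the two-component version.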