Shadow Mapping FPS Woes - Pos. ATI Driver issue

Ah, that one also scores a massive 4 fps…

OK, so it’s a driver bug; that’s all I wanted to know.

Try contacting ATI for support; maybe it will be fixed in a future version of Catalyst.

Alternatively, you can rewrite the whole shadow mapping in OpenGL by:

  • Writing a vertex and fragment shader for shadow map generation (1st pass).

  • Writing a vertex and fragment shader for the shadow map test (2nd pass). You could also implement a 2x2 PCF at the same time.

  • Using a PBuffer (with floating-point buffers, for example).

yeah, i’ve already done the shaders method before now (packing into a RGBA texture via GLSL), but having a broken copy command is kinda annoying anyways :wink:

I’ll poke ATI devrel at some point (I’m tempted to after the next driver release, which is about a week or so away), although I wouldn’t expect a fix soon due to the driver pipeline being rather deep :expressionless:

I’m still amused that something like this managed to make it into the drivers in the first place and that it doesn’t appear to have been noticed before… :eek:

So, why don’t you use a PBuffer?

  • GL_RGB_FLOAT16_ATI.

(Sadly you can’t use the new EXT_framebuffer_object. It won’t be supported by ATI, hardware issue again, due to the render to the depth buffer which is not supported).

Then you won’t have to copy. Maybe glCopyTexImage2D is simply broken, so don’t use it.

It would be nice to have a kind of ‘certified OpenGL driver’, like the ‘certified DirectX driver’, where drivers are tested thoroughly by SGI or another big company. The drivers would have to pass rendering tests, tests of all the extensions, etc…

Finally, it’s bad news for me. I’ve implemented shadow mapping using the ARB_shadow extension, and I need to write a completely new version using vertex / fragment shaders because, apparently, it doesn’t work on the X800.
:mad:

Originally posted by execom_rt:
Sadly you can’t use the new EXT_framebuffer_object. It won’t be supported by ATI, hardware issue again, due to the render to the depth buffer which is not supported
omg i hope you’re joking…
Are you sure they won’t support a partial or an emulated implementation?

Well it is not supported.

DST is not an official DX9 feature, and ATI didn’t implement it.

(Remember the story with 3DMark’03 and DST support? Google ‘3dmark DST’: it looked different, better in fact, on nVidia than on ATI.)

  • Creating a DST in DirectX 9 is not possible: the function returns ‘not available’.

  • There is no proprietary OpenGL extension that supports it (no equivalent of WGL_NV_render_depth_texture).

Finally, EXT_framebuffer_object is not available.

I think the specification says that you need to support GL_DEPTH_COMPONENT as a render target, but this will fail.

So either ATI implements a ‘partial’ EXT_framebuffer_object with just render to texture, or chooses not to implement it at all (what a surprise: ATI was against the ‘ARBisation’ of GL_EXT_framebuffer_object, and now you can guess why…).

But with the ARB_fragment_program and GLSL, it is possible to write your own shadow mapping without using depth component texture (using ATI texture float).

Of course it will be much slower than the nVidia render path, but it will be a ‘portable’ solution.

ATI has had a long history of problems with the Z-buffer: glReadPixels with GL_DEPTH_COMPONENT didn’t work well on the Radeon 8500, mainly because of the ‘HyperZ’ feature that compresses the Z-buffer. Reading back the Z-buffer requires the hardware to decompress the HyperZ data, and that operation seems to be buggy at some point.

I think what’s happening is that HyperZ has changed again on the X800 (it is now HyperZ™ HD; before it was HyperZ III™+), so reading it back is even slower.

Since no (?) OpenGL games are using that feature (DST), ATI didn’t bother to optimize that part and has focused on other problems.

“The framebuffer object extension is currently available in beta drivers from both NVIDIA and ATI. It will be fully supported on NV30 and R300 and later, and possibly on NV1x and R200 as well.”

and until someone from ATI says otherwise I’ll keep believing we’ll get it (even if it doesn’t support rendering to depth directly, I’m not that bothered, as I don’t intend to use pure depth-only rendering for shadow mapping anyway, so the minor cost of packing and unpacking isn’t an issue)

Since no (?) OpenGL games are using that feature (DST)
mine does, perhaps thats why ati hardware cant run it!

Originally posted by execom_rt:
(Sadly you can’t use the new EXT_framebuffer_object. It won’t be supported by ATI, hardware issue again, due to the render to the depth buffer which is not supported).
Exactly where did you get this information or are you just speculating? FBOs will be supported. I don’t know about what the plans are for depth-renderables, but it’s of course possible to implement by simply doing a copy under the hood.

Edit: And I don’t see anything in the spec that makes depth-renderables required for FBOs.
I’ll take a closer look at this issue tomorrow and report back what I find out.

I guess that GL_EXT_framebuffer_object has some error handling for when one tries to create a ‘depth component’ render target.

At worst, it would return an ‘unsupported’ error code, which is fine. That won’t stop this extension from shipping on ATI graphics boards.

But what I wanted to say is that this extension is not shipping yet as of 5 April 2005, so the immediate solution right now is to use PBuffers with WGL_TYPE_RGBA_FLOAT_ATI for a 32-bit floating-point texture, write a shader that outputs the depth into that buffer, and not use glCopyTexImage2D at all.

And this is valid for everybody who tries to implement shadow mapping in OpenGL. I guess ATI should write a ‘white paper’ on how to implement shadow mapping on their boards, and more generally on how to implement shadow mapping in 2005 (using PBuffers, 32-bit FP textures, implementing a 2x2 PCF in GLSL, etc…).

The document ‘Shadow mapping in Today’s OpenGL Hardware’ is getting quite old. I’m saying this for everybody: don’t read that document anymore; there are now better solutions, which are far more portable.

To finish, I’m linking that post

With an implementation of shadow mapping in GLSL. Nicely explained. Alas, it is missing the depth map generation and a 2x2 PCF, which are easily doable.

I will probably post an implementation I’ve just done in GLSL in the near future.

A demo using GL_EXT_framebuffer_object, GLSL, a 32-bit FP texture, 2x2 PCF and, why not, a blur effect on the texture would probably be a nice demo plan for you, Humus :wink:

heh, oddly enough it was that tutorial you linked to above where I learnt how to do the whole shadow mapping lark.

As for FBO, I’m holding out some hope it’ll be in the next driver update (possibly by the 7th, if not early next week by my guess); it really depends on the depth of ATI’s driver pipeline… if not, well, it’s a wait until next month, but I’ve got plenty to do in the meantime anyway :slight_smile: (in case you are wondering, I very quickly developed a dislike for the whole pbuffer system, so I refuse to touch ’em with someone else’s bargepole…)

hmmm 2x2 PCF… hadn’t thought about adding that… might look into how that works and see if I can come up with my own demo… might be handy for an upcoming project :wink:

For the 2x2 PCF you could write something like this:

  • Based on the source code from the link; just define RcpSampleSize = 1.0 / TEXTURE_SIZE (i.e. 1.0 / 256.0 for a 256x256 map). r is the final value.
// Position in texel units; its fractional part gives the bilinear weights.
vec2 texelpos = projectiveBiased.xy / RcpSampleSize;
vec2 lerps = fract( texelpos );
float z = projectiveBiased.z;
// Each tap is 1.0 when the stored depth is nearer than the fragment, else 0.0.
vec4 k;
k.x = texture2D( shadowMap, projectiveBiased.xy ).r < z ? 1.0 : 0.0;
k.y = texture2D( shadowMap, projectiveBiased.xy + vec2(RcpSampleSize, 0.0) ).r < z ? 1.0 : 0.0;
k.z = texture2D( shadowMap, projectiveBiased.xy + vec2(0.0, RcpSampleSize) ).r < z ? 1.0 : 0.0;
k.w = texture2D( shadowMap, projectiveBiased.xy + vec2(RcpSampleSize, RcpSampleSize) ).r < z ? 1.0 : 0.0;
// Blend the four comparison results.
float r = mix( mix( k.x, k.y, lerps.x ), mix( k.z, k.w, lerps.x ), lerps.y );

This code really ‘shines’ with Floating point texture.
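To make the filtering step easier to check outside a shader, here is a CPU-side C++ sketch of the same 2x2 PCF: four depth comparisons blended by the fractional texel position. It is only a reference under stated assumptions; DepthMap, fetch and pcf2x2 are hypothetical names, lookups are nearest-neighbour with clamp-to-edge, and a result of 1.0 means fully in shadow, following the comparison used in the shader above.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical square depth map with nearest-neighbour, clamp-to-edge lookup.
struct DepthMap {
    int size;                   // width and height in texels
    std::vector<float> texels;  // row-major, size * size entries

    float fetch(int x, int y) const {
        x = std::min(std::max(x, 0), size - 1);
        y = std::min(std::max(y, 0), size - 1);
        return texels[static_cast<std::size_t>(y) * size + x];
    }
};

// Returns the filtered shadow factor in [0, 1] for texture coordinates (u, v)
// and receiver depth z; 1.0 means all four taps report an occluder.
float pcf2x2(const DepthMap& map, float u, float v, float z) {
    const float tx = u * static_cast<float>(map.size);
    const float ty = v * static_cast<float>(map.size);
    const int x0 = static_cast<int>(std::floor(tx));
    const int y0 = static_cast<int>(std::floor(ty));
    const float fx = tx - static_cast<float>(x0);  // bilinear weights
    const float fy = ty - static_cast<float>(y0);

    const float k00 = map.fetch(x0, y0) < z ? 1.0f : 0.0f;
    const float k10 = map.fetch(x0 + 1, y0) < z ? 1.0f : 0.0f;
    const float k01 = map.fetch(x0, y0 + 1) < z ? 1.0f : 0.0f;
    const float k11 = map.fetch(x0 + 1, y0 + 1) < z ? 1.0f : 0.0f;

    const float top = k00 + (k10 - k00) * fx;
    const float bottom = k01 + (k11 - k01) * fx;
    return top + (bottom - top) * fy;
}
```

For example, with a 2x2 map whose left column stores 0.2 and right column stores 0.8, a receiver at depth 0.5 is fully shadowed at u = 0 and half shadowed at u = 0.25, which is the soft edge the filter is meant to produce.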

Otherwise, for generating the Z-depth:

 
// Vertex shader
void main(void)
{
   gl_Position = ftransform();
}

// Fragment shader
void main(void)
{
    gl_FragColor = vec4(gl_FragCoord.z);
}
 

Alas, it is slow on my machine: reading gl_FragCoord.z silently forces a fallback to software rendering (I hate when it does that). I guess there is a better method for this one.

Originally posted by execom_rt:
Alas, it is slow on my machine: reading gl_FragCoord.z silently forces a fallback to software rendering (I hate when it does that). I guess there is a better method for this one.
Using depth bias and reading gl_FragCoord.z will get you into software rendering. Without the depth bias it should work fine. You can solve the problem by putting the bias on the other pass instead.

Yes, removing the glEnable(GL_POLYGON_OFFSET_FILL); fixed it, it’s running at correct speed now.
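One way to read "putting the bias on the other pass" is to leave the depth pass unbiased and add a small constant during the shadow comparison instead. A minimal C++ sketch of that comparison follows; shadowTap and its parameters are hypothetical names, and the bias value is scene-dependent rather than anything recommended in this thread.

```cpp
// Hypothetical helper: 1.0f when the receiver is in shadow, 0.0f otherwise.
// The bias is added to the stored depth at comparison time, so the depth
// pass itself can run without GL_POLYGON_OFFSET_FILL.
float shadowTap(float storedDepth, float receiverDepth, float bias) {
    return (storedDepth + bias < receiverDepth) ? 1.0f : 0.0f;
}
```

With a bias of zero, a surface whose own depth lands just behind the stored value shadows itself; a small positive bias removes that case while still catching occluders that are clearly in front.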

Hi,

I noticed the same problem about 6 months ago when I tried to implement depth peeling on my X800XT PE. The problem occurred only when I copied the depth buffer into a texture. The problem did not seem to exist on my 9600 Pro.

I just put the project on ice while waiting for ATI to accept my registration in their developer program so I can report bugs and get access to more information. I guess they forgot all about me. Well, I will try to register for the third time in 6 months.

I will be watching this thread closely to see if anyone comes up with a solution.

Cheers!

Just for the record, the OGL driver in Cat 5.4 still has the same problem (and no FBO extension either, so that’s another month to wait…)

Speaking of Shadow mapping, does this demo run on ATi boards?
http://www.realityflux.com/abba/C++/Point%20Shadow%20Maps/PointShadowMaps.zip
http://www.realityflux.com/abba/C++/Point%20Shadow%20Maps/PointShadowMaps.jpg

Originally posted by Java Cool Dude:
Speaking of Shadow mapping, does this demo run on ATi boards?
http://www.realityflux.com/abba/C++/Point%20Shadow%20Maps/PointShadowMaps.zip
http://www.realityflux.com/abba/C++/Point%20Shadow%20Maps/PointShadowMaps.jpg

yep it does (radeon 9700 cat 5.1)

Originally posted by Java Cool Dude:
Speaking of Shadow mapping, does this demo run on ATi boards?
http://www.realityflux.com/abba/C++/Point%20Shadow%20Maps/PointShadowMaps.zip
http://www.realityflux.com/abba/C++/Point%20Shadow%20Maps/PointShadowMaps.jpg

I get an ‘<X>Unable to create pbuffer’ error in the log file.

I have a GF5900 Ultra 75.90.

Originally posted by Adrian:
Originally posted by Java Cool Dude:
Speaking of Shadow mapping, does this demo run on ATi boards?
http://www.realityflux.com/abba/C++/Point%20Shadow%20Maps/PointShadowMaps.zip
http://www.realityflux.com/abba/C++/Point%20Shadow%20Maps/PointShadowMaps.jpg

I get an ‘<X>Unable to create pbuffer’ error in the log file.

I have a GF5900 Ultra 75.90.

Just figured I’d pass along a little info that we found when running this app. On the FX5900, we get the same error, but this is due to an app bug. For some reason, the program is passing an invalid pixelFormat value to wglCreatePbufferARB() when run on an nv35. The value is very large, much larger than the index of the last pixel format in the list, and so wglCreatePbufferARB correctly fails to create the pbuffer with this invalid format.
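The failure described above suggests a cheap guard before the pbuffer is created. This is a minimal C++ sketch, assuming the application keeps the number of pixel formats reported by the driver; the function name and parameters are hypothetical, no WGL call is made here, and the only fact relied on is that WGL pixel format indices are 1-based.

```cpp
// Hypothetical guard: a usable WGL pixel format index lies in
// [1, formatCount], where formatCount is the number of formats the
// driver reported for the device context.
bool isValidPixelFormatIndex(int index, int formatCount) {
    return formatCount > 0 && index >= 1 && index <= formatCount;
}
```

Checking the index this way before calling wglCreatePbufferARB would turn a confusing ‘Unable to create pbuffer’ log line into a clear report that the format selection step went wrong.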

Thanks!

-B