extensive pbuffer use == system hangs (ATI)

Hi,

I have the problem that with extensive pbuffer and render_to_texture use my system hangs and I need to do a hardware reset (sometimes VPU recovery manages to recover the system).
I check for OpenGL errors and lost pbuffers, but nothing shows up there, and all the pbuffers get created without problems.
Basically my pbuffer wrapper seems to be OK, and simple examples work without problems. But when using several pbuffers (one RGBA8 1024x1024 plus two RGBA32 float 1024x1024), the system always hangs when rendering to the later pbuffers.
I use wglShareLists for all pbuffers and GLSL shaders for rendering. wglShareLists and wglMakeCurrent return without errors.

So far it seems that the crash only happens when rendering to the later-created pbuffers (the 2nd float pbuffer).
And depending on the pixel format of that 2nd pbuffer, my system crashes either within the first few frames rendered to it, or only after I switch to the Task Manager or other applications.

My first guess was that perhaps there was not enough graphics memory for the pbuffer, or that the buffer gets lost during the application switch. But the pbuffer is created without errors and I check for lost pbuffers. And according to the ARB spec, rendering to a lost pbuffer should nevertheless work without crashing.

I guess it’s a strange driver bug, but I’m not sure what exactly causes the crash… Has anyone else had such system crashes when using several pbuffers?
Any ideas what could cause these crashes or how to track the cause down?

using: Radeon 9800 + catalyst 4.10

Is your application single-threaded or multi-threaded?

single-threaded

I have had similar problems to this (on a 9800 too). The only way I could get around it was to call glFinish at the end of every frame.

Hmm, I already had a glFinish after my SwapBuffers. Putting it before doesn’t change anything. As I render lots of things before SwapBuffers, I also tried putting a glFinish in the render loop where the crash happens, but it still crashes.

OK, after some testing and countless reboots it seems to be a problem related to pbuffers and GLSL.

One combination of a pbuffer and a specific GLSL shader crashes my system; if I use the same shader with other pbuffers, or other shaders with the same pbuffer, it works without problems.
It seems to be somehow related to vertex shader attributes, because if I modify the crashing shader and add another vertex attribute, the shader doesn’t crash anymore. Alternatively, I can disable all the vertex arrays with glDisableVertexAttribArrayARB and it doesn’t crash.

So what can this mean? I’m totally out of ideas how to find and fix this bug. The strange thing is that so far this has only happened with this one combination of a specific pbuffer and a specific GLSL shader.

Are there any known problems with pbuffers/GLSL on ATI hardware?
Any ideas what could cause these crashes?

Originally posted by zeckensack:
Is your application single-threaded or multi-threaded?
Are there any known bugs with respect to ATI/pbuffers/multithreading?

valoh, can you give details on which specific pbuffer parameters and GLSL program are causing these crashes?

Originally posted by GKW:
Are there any known bugs with respect to ATI/pbuffers/multithreading?
None that I’m aware of. But there are a few gotchas with Windows GDI and user functions in multi-threaded applications. Worst case is an application hang.

Originally posted by zeckensack:
None that I’m aware of. But there are a few gotchas with Windows GDI and user functions in multi-threaded applications. Worst case is an application hang.
Can you elaborate on these gotchas? I have a very simple pbuffer app that works just fine on NVIDIA but fails on ATI. wglCreatePbufferARB fails (returns NULL) with an error code of 0 on ATI cards. All the function inputs are valid and the context is current only on the calling thread.

Do you actually check what you get back after you create the pbuffer? You might get back something that you didn’t request if the request couldn’t be satisfied. You can use wglQueryPbufferARB() to get info about the pbuffer you just created.

You should probably also query WGL_PBUFFER_LOST_ARB to determine whether your surface was lost during rendering. Anything could happen to it after it gets created.
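
A minimal sketch of those two checks in C, assuming the driver call is hidden behind a function pointer so the logic can be exercised without a GL context. `check_pbuffer`, `QueryPbufferFn`, `PbufferStatus`, `FakePbuffer` and `fake_query` are hypothetical names made up for this illustration; only the three attribute tokens come from the WGL_ARB_pbuffer specification. In a real program the callback would be a thin wrapper around the wglQueryPbufferARB entry point obtained through wglGetProcAddress.

```c
#include <stddef.h>

/* Attribute tokens from the WGL_ARB_pbuffer specification. */
#define WGL_PBUFFER_WIDTH_ARB  0x2034
#define WGL_PBUFFER_HEIGHT_ARB 0x2035
#define WGL_PBUFFER_LOST_ARB   0x2036

/* Hypothetical abstraction over wglQueryPbufferARB(hPbuffer, attribute, &value):
   returns nonzero on success and stores the attribute in *value. */
typedef int (*QueryPbufferFn)(void *pbuffer, int attribute, int *value);

typedef enum {
    PBUFFER_OK = 0,
    PBUFFER_QUERY_FAILED,
    PBUFFER_SIZE_MISMATCH,
    PBUFFER_LOST
} PbufferStatus;

/* Ask the driver what was actually allocated and whether it still exists. */
static PbufferStatus check_pbuffer(QueryPbufferFn query, void *pbuffer,
                                   int want_width, int want_height)
{
    int width = 0, height = 0, lost = 0;

    if (!query(pbuffer, WGL_PBUFFER_WIDTH_ARB, &width) ||
        !query(pbuffer, WGL_PBUFFER_HEIGHT_ARB, &height) ||
        !query(pbuffer, WGL_PBUFFER_LOST_ARB, &lost))
        return PBUFFER_QUERY_FAILED;

    if (lost)
        return PBUFFER_LOST;
    if (width != want_width || height != want_height)
        return PBUFFER_SIZE_MISMATCH;
    return PBUFFER_OK;
}

/* Stand-in for the driver, included only so the sketch runs anywhere. */
typedef struct { int width, height, lost; } FakePbuffer;

static int fake_query(void *pbuffer, int attribute, int *value)
{
    const FakePbuffer *pb = (const FakePbuffer *)pbuffer;

    switch (attribute) {
    case WGL_PBUFFER_WIDTH_ARB:  *value = pb->width;  return 1;
    case WGL_PBUFFER_HEIGHT_ARB: *value = pb->height; return 1;
    case WGL_PBUFFER_LOST_ARB:   *value = pb->lost;   return 1;
    default:                     return 0;
    }
}
```

Running `check_pbuffer` once right after creation and again before each render pass would cover both suggestions. As far as I know, a size that differs from the request can only come back when WGL_PBUFFER_LARGEST_ARB was set at creation time, so the size check mostly matters in that case.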

Originally posted by GKW:
Can you elaborate on these gotchas?
“The system creates a thread’s message queue when the thread makes its first call to one of the USER or GDI functions.”
Sometimes you don’t want that. The system may place messages in that queue for your thread to pick up. Problems can arise if you don’t regularly run a “message pump” in that thread, i.e. the well-known GetMessage/PeekMessage plus DispatchMessage loop, because messages in the queue will not be processed.
At a minimum, you’ll have an extra “not responding” application listed in the task manager, for every thread that has called a GDI/user function and does not have the “pump”. But that’s harmless most of the time.

Worst case:
a) You start two threads.
b) You create a window in thread A, but don’t run a pump. The window will work fine. Resizing/moving/destroying/whatever will actually work without the pump.
c) You SendMessage(anything) to that window from thread B.
The process will hang at that point, because SendMessage is specified not to return before the message has been processed. But it never will be. The window procedure will not catch the message automatically. The system will only deliver it inside a GetMessage or PeekMessage call.
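
The worst case above can be reproduced safely with a small Win32 sketch. Note the swapped-in call: it uses SendMessageTimeout instead of SendMessage, purely so the demonstration reports a timeout rather than hanging. `NoPumpShared`, `no_pump_thread` and `cross_thread_send_times_out` are hypothetical names, the 200 ms timeout is arbitrary, and the code is Win32-only (it links against user32).

```c
#ifdef _WIN32
#include <windows.h>

typedef struct {
    HWND   hwnd;   /* window owned by thread A */
    HANDLE ready;  /* signalled once the window exists */
    HANDLE quit;   /* signalled to let thread A exit */
} NoPumpShared;

/* Thread A: creates a window, then blocks without ever pumping messages. */
static DWORD WINAPI no_pump_thread(LPVOID param)
{
    NoPumpShared *s = (NoPumpShared *)param;

    s->hwnd = CreateWindowExA(0, "STATIC", "no pump", 0, 0, 0, 0, 0,
                              HWND_MESSAGE, NULL, GetModuleHandleA(NULL), NULL);
    SetEvent(s->ready);
    WaitForSingleObject(s->quit, INFINITE);  /* no GetMessage/PeekMessage here */
    if (s->hwnd)
        DestroyWindow(s->hwnd);
    return 0;
}

/* Thread B: returns 1 if the cross-thread send timed out (where a plain
   SendMessage would block), 0 if it was processed, -1 on setup failure. */
static int cross_thread_send_times_out(void)
{
    NoPumpShared s;
    HANDLE thread;
    DWORD_PTR reply = 0;
    int result = -1;

    s.hwnd  = NULL;
    s.ready = CreateEventA(NULL, TRUE, FALSE, NULL);
    s.quit  = CreateEventA(NULL, TRUE, FALSE, NULL);
    thread  = (s.ready && s.quit)
            ? CreateThread(NULL, 0, no_pump_thread, &s, 0, NULL)
            : NULL;

    if (thread) {
        WaitForSingleObject(s.ready, INFINITE);
        if (s.hwnd) {
            LRESULT ok = SendMessageTimeoutA(s.hwnd, WM_NULL, 0, 0,
                                             SMTO_NORMAL, 200, &reply);
            result = ok ? 0 : 1;
        }
        SetEvent(s.quit);
        WaitForSingleObject(thread, INFINITE);
        CloseHandle(thread);
    }
    if (s.ready) CloseHandle(s.ready);
    if (s.quit)  CloseHandle(s.quit);
    return result;
}
#endif
```

Since thread A never calls GetMessage or PeekMessage, the sent message is never handed to the window procedure and the call should time out; with a plain SendMessage, thread B would wait indefinitely instead.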

And there are a lot of things you can only do with a window from the thread that created it: setting active window status, keyboard/mouse focus, destroying the window, etc. AttachThreadInput can relax some of these restrictions, but not all.

I actually tried to replace a remote DestroyWindow (which isn’t allowed) with WM_DESTROY and WM_NCDESTROY messages a few days ago, and ended up with a hung application instead of a memory leak. That’s why it was still on my mind.

Leaving the message pump running seems to have fixed my problem. I guess ATI does things a little differently than Nvidia. Thanks.

@zeckensack:

When I said single-threaded, I meant that I do all my OpenGL work within one thread. No pbuffers or render contexts are used across several threads. But I use Windows.Forms, which I guess is multithreaded internally; I also link against the multithreaded runtime libs.

The hangs/crashes are somehow non-deterministic. I’ve done lots of testing and can’t find the exact reason…

I don’t understand your gotcha explanation and how it relates to pbuffers. The problem only arises with extensive use of pbuffers, and even then only in some cases.
Could you perhaps elaborate a little bit more? I don’t know much about this Win32 stuff; with Windows.Forms you only have to create some event handlers. The main questions I have are: how does this relate to pbuffers, and how can I fix it? What does “Leaving the message pump running” mean in source code?
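
For reference, in plain Win32 terms a non-blocking message pump that can be called from inside a render or preprocessing loop looks roughly like the sketch below. `pump_pending_messages` is a hypothetical helper name; PeekMessage, TranslateMessage and DispatchMessage are the standard Win32 calls. Whether running such a pump has any effect on the driver hang discussed here is not established in this thread.

```c
#ifdef _WIN32
#include <windows.h>

/* Drain every message currently queued for the calling thread without
   blocking. Returns 0 once WM_QUIT has been seen, nonzero otherwise. */
static int pump_pending_messages(void)
{
    MSG msg;

    while (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE)) {
        if (msg.message == WM_QUIT)
            return 0;
        TranslateMessage(&msg);
        DispatchMessage(&msg);
    }
    return 1;
}
#endif
```

A long-running loop would call this once per frame or per pass and stop when it returns 0. With Windows.Forms the framework normally runs this loop itself inside Application.Run; Application.DoEvents() is the closest counterpart for code that keeps the UI thread busy for a long time.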

Originally posted by valoh:
… using: Radeon 9800 + catalyst 4.1
Why not start with downloading new drivers … 4.9 are out.

Originally posted by 1024:
Why not start with downloading new drivers … 4.9 are out.
The 4.10 drivers are already out, and those are the ones I’m using, if you reread my post.

Doh! Weird, somehow I got the impression that the new drivers I downloaded were 4.9, but it appears they are 4.10. LOL. (Did 4.1 actually have GLSL support?)

Originally posted by 1024:
Doh! Weird, somehow I got the impression that the new drivers I downloaded were 4.9, but it appears they are 4.10. LOL. (Did 4.1 actually have GLSL support?)

AFAIK there was already beta GLSL support in Catalyst 3.8 or 3.9, but it was really, really beta/experimental (shudder).

Can anyone help me with the pbuffer crashes?

I didn’t quite understand the explanation about threading and the message pump, or how it relates to pbuffers. But with some further testing I have now identified the following driver crash behaviour:

It crashes only during one particular pbuffer rendering pass at application start, and only if I switch applications. If I don’t switch, it runs without crashing. As this is a somewhat longer preprocessing pass, it is really annoying not to be able to switch to other apps while it runs.
The strange thing is that not every pbuffer rendering pass crashes the driver. For the preprocessing I use several pbuffers, and only one of them crashes on application switches.
So it’s really weird behaviour. And I’m really sure it is not a problem with my pbuffers or shaders, as in other cases they work fine. I test for all error states, lost pbuffers, shader validation and so on, plus it only crashes on these application switches…

Could it be related to that thread message thingy? How could I fix it?

I am doing some offscreen rendering so I created a thread with a hidden window. I then set up a DC and a RC and grabbed the extension string for the driver. After that I figured that there would not be any user input so I freed the context I created and then explicitly paused that thread. I should have just let GetMessage run and since there is no user input it would in effect pause that thread. I discovered yesterday that it didn’t totally fix my problems but it is better.
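
A minimal sketch of what "just let GetMessage run" could look like for such a worker thread, assuming plain Win32 in C. `run_message_pump` is a hypothetical name; the loop itself is the standard GetMessage/TranslateMessage/DispatchMessage pattern. GetMessage blocks while the queue is empty, so a thread with no input sits idle here instead of being paused explicitly.

```c
#ifdef _WIN32
#include <windows.h>

/* Blocking pump for the thread that owns the hidden window.
   Returns the WM_QUIT exit code, or -1 if GetMessage reports an error. */
static int run_message_pump(void)
{
    MSG msg;
    BOOL ret;

    while ((ret = GetMessage(&msg, NULL, 0, 0)) != 0) {
        if (ret == -1)
            return -1;            /* GetMessage failed */
        TranslateMessage(&msg);
        DispatchMessage(&msg);    /* hands the message to the window procedure */
    }
    return (int)msg.wParam;       /* value passed to PostQuitMessage */
}
#endif
```

The thread would call this after creating its window and contexts, and leave the loop only when something posts WM_QUIT to it, for example via PostQuitMessage from its own window procedure.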