Strange Issues

Hi,

I’ve been doing a lot of work on the internals of my rendering library. Having finished, I noticed that some code that uses it crashes at random times in simple functions such as glCallList, glDisable, and glUniform1f.

In my limited tests, calling the same function immediately after catching and ignoring the crash works fine, but then subsequent OpenGL calls fail.

There’s no repeatability: sometimes a program will work fine for the duration of a test; other times, starting the exact same program results in a crash after a few seconds.

The problems I’ve noticed occur in a shadowmapping demo and in a GPU cloth collision detection demo I wrote. In the former, I noticed that disabling materials within the object’s display list seems to fix the problem–but after a thorough search of the materials code, there’s nothing wrong–perhaps materials trigger the effects of a problem elsewhere?

Other strange things are happening too: one glPopMatrix() did not throw a stack underflow error, even though it had no corresponding glPushMatrix(). Removing the call resulted in no change.

I’m not using threads, and I already updated my card’s driver to the latest version, with no effect.

I’m guessing all the problems have to do with a buffer being bound and not unbound later, or something small like that, but because the problems are irreproducible, and come from what appears to be perfectly sound code, I’m at a loss for what to do!

I don’t fancy redoing all my work. Any ideas for what could possibly be going wrong? Any at all? I don’t need an exact solution. The library, although concise, is still enormous, and searching it by hand is impossible. I just need to know what sort of situations to look for: what causes these sorts of errors?

I’ve already asked IRC, GameDev, and several other sources. Please help! Anything!

Thanks,
Geometrian

This is a very general question. I don’t really see how anyone can help you much apart from pointing out the obvious stuff to try and eliminate things…

Have you tried running GPU intensive demos and benchmark programs on your system that rely on proven and tested code? This would possibly eliminate any fault or problem with your GPU.

Conversely, have you tried running your library on other hardware? Does it exhibit the same problems?

If you get problems running other tried and tested code then you may have a bad GPU - although this seems unlikely.

If your code crashes on other hardware too, then the problem is almost certainly in your code.

At that point you need to test individual modules, slowly adding more complexity in with other features until you get a reproducible crash and then work from there to find the issue / issues.

I don’t know what other advice people can give you without you supplying more specifics yourself. :)

You might consider using / investing in an OpenGL Profiler which will break on OpenGL errors in your code and will probably at least point you to the cause of the crash, which in turn may lead you to the root of the problem.
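
If a profiler isn't an option, a rough stand-in is to poll glGetError() yourself after suspect calls. OpenGL doesn't throw on errors such as a stack underflow; it just sets an error flag that stays set until you read it. Below is a minimal sketch in C of that idea. Everything named here (check_gl_errors, gl_error_name, get_error_fn, fake_get_error) is made up for illustration. The error query is passed in as a function pointer only so the sketch can run without a GL context. In a real program you would include your GL header and call glGetError directly or through a thin wrapper.

```c
#include <stdio.h>

/* Error codes as defined by the OpenGL specification. A real program gets
   these from its GL header; they are repeated here only so the sketch
   compiles on its own. */
#ifndef GL_NO_ERROR
#define GL_NO_ERROR          0
#define GL_INVALID_ENUM      0x0500
#define GL_INVALID_VALUE     0x0501
#define GL_INVALID_OPERATION 0x0502
#define GL_STACK_OVERFLOW    0x0503
#define GL_STACK_UNDERFLOW   0x0504
#define GL_OUT_OF_MEMORY     0x0505
#endif

/* Hypothetical type: a function with the same shape as glGetError. */
typedef unsigned int (*get_error_fn)(void);

/* Translate an error code into a readable name for logging. */
static const char *gl_error_name(unsigned int code)
{
    switch (code) {
    case GL_NO_ERROR:          return "GL_NO_ERROR";
    case GL_INVALID_ENUM:      return "GL_INVALID_ENUM";
    case GL_INVALID_VALUE:     return "GL_INVALID_VALUE";
    case GL_INVALID_OPERATION: return "GL_INVALID_OPERATION";
    case GL_STACK_OVERFLOW:    return "GL_STACK_OVERFLOW";
    case GL_STACK_UNDERFLOW:   return "GL_STACK_UNDERFLOW";
    case GL_OUT_OF_MEMORY:     return "GL_OUT_OF_MEMORY";
    default:                   return "unknown GL error";
    }
}

/* Read and log every pending error flag; return how many were found.
   'where' is a free-form label, e.g. the call that was just made. */
static int check_gl_errors(get_error_fn get_error, const char *where)
{
    int count = 0;
    unsigned int code;

    while ((code = get_error()) != GL_NO_ERROR) {
        fprintf(stderr, "GL error %s (0x%04X) after %s\n",
                gl_error_name(code), code, where);
        /* Safety cap: stop if the query keeps reporting errors, which can
           happen, for example, when no context is current. */
        if (++count >= 32)
            break;
    }
    return count;
}

/* Stand-in for glGetError so the sketch runs without a context:
   reports one pending stack underflow, then nothing. */
static unsigned int fake_pending = GL_STACK_UNDERFLOW;

static unsigned int fake_get_error(void)
{
    unsigned int code = fake_pending;
    fake_pending = GL_NO_ERROR;
    return code;
}
```

Calling check_gl_errors(fake_get_error, "glPopMatrix") logs the underflow and returns 1, and a second call returns 0 because the flag has been cleared. Sprinkling a check like this after each block of GL calls, then narrowing down, should at least tell you which call first leaves an error behind. It won't catch a crash inside the driver itself, but an error flagged shortly before the crash is a good place to start looking.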