ATI weirdness

ugluk · May 16, 2010, 4:02am

I am using the ATI 10.4 beta drivers for the Radeon HD 5450 on Linux. If I want to use a shader program (successfully loaded, compiled and linked, no errors), I first need to call glValidateProgram(); otherwise I get a segfault. I don’t recall the spec requiring this, and in fact on my NVIDIA card I can leave out the commented-out code below without problems.

So is the call to glValidateProgram() an informal requirement? Is this weirdness a well known thing?


/*
  glValidateProgram(program);
  GL_DEBUG();

  GLint valid;

  glGetProgramiv(program, GL_VALIDATE_STATUS, &valid);
  GL_DEBUG();

  if (!valid)
  {
    throw std::runtime_error(gl_get_shader_error(program));
  }
  // else do nothing
*/

  glUseProgram(program);
  GL_DEBUG();

Heiko · May 16, 2010, 6:55am

I do not experience such odd behavior. Sure there isn’t some kind of memory leak elsewhere in your code that causes the segfault? Or some off-spec glsl(hlsl/cg) in your shader code?

I have quite some experience with Ati hardware (HD3200, HD4870, HD5770), shaders ranging from simple ones to very complex ones. I haven’t observed the behaviour you describe trying lots of different drivers in the past 6 months (both on Linux and Windows).

Any chance you can reproduce it in a minimal code example?

Dark_Photon · May 16, 2010, 12:21pm

Didn’t experience this on the latest ATI beta drivers, and you shouldn’t have to call glValidateProgram at all.

Unlike NV (AFAICT), ATI does actually validate the program against the existing GL state (e.g. bound texture types) which is the whole purpose of the API, so you do want to ensure that’s set up properly first.

Suggest you run a memory debugger on your app. Sounds like either you or the ATI driver (or both) have memory problems to find.

ugluk · May 16, 2010, 1:28pm

This is what gdb says in a backtrace. I have no idea why the basic_streambuf destructor is called after I call GLPrograma::use().

#0 0x0000003cf77c24c9 in ?? () from //usr/lib64/opengl/ati/lib/libGL.so.1
#1 0x0000000000000001 in ?? ()
#2 0x0000000000407db0 in std::basic_streambuf<char, std::char_traits<char> >::~basic_streambuf ()
#3 0x0000000000415f09 in GLPrograma::use (this=0x9e95c0) at utility/gl_programa.cpp:94
#4 0x0000000000408146 in main (argc=1, argv=0x7fffffffdca8) at test/test_array_storea.cpp:162

This backtrace is weird. Also, the app crashes as soon as it calls glUseProgram() at initialization, unless I validate the program first, of course. So I do not have much chance to trash memory before that point. Could it be a GLEW problem, I wonder? I’ll try to produce a minimal example.

ugluk · May 16, 2010, 2:22pm

Here’s the minimal sample. Do you recall the weird destructor call shown in the backtrace? It comes from the GL_DEBUG() macro, which I have expanded below. My guess is that g++ sets up some objects on the stack upon entry into the use() method and the ATI driver somehow tramples them, which causes the crash. If I remove the debugging code, glUseProgram() works perfectly.

In fact, if I put anything on the stack before calling glUseProgram() I get a crash. That is, any local variable causes a crash, even a bool.

  glUseProgram(program);
  // if I snip below all works fine
  do {
    GLenum error(glGetError());
    if (GL_NO_ERROR != error)
    {
      std::cerr << gl_error_string(error) << std::endl;
      BOOST_ASSERT(0);
    }
  } while(0);
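
For readers without the macro definition, the check that GL_DEBUG() expands to can be sketched as a small helper. This is only an illustration under stated assumptions: error_name() is a hypothetical stand-in for gl_error_string(), the ErrorCode constants mirror the numeric values of the standard GL error enums so the snippet builds without GL headers, assert replaces BOOST_ASSERT, and the error source is passed in as a callable (glGetError in a real program). Unlike the single check above, it loops until no error is reported, since each glGetError call returns only one recorded error flag.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Numeric values match GL_NO_ERROR, GL_INVALID_ENUM, etc. in the OpenGL
// headers. The local names are an assumption of this sketch so that it
// compiles without <GL/gl.h>.
typedef unsigned int ErrorCode;
const ErrorCode kNoError = 0;
const ErrorCode kInvalidEnum = 0x0500;
const ErrorCode kInvalidValue = 0x0501;
const ErrorCode kInvalidOperation = 0x0502;
const ErrorCode kOutOfMemory = 0x0505;

// Hypothetical stand-in for the gl_error_string() helper used in the thread.
inline const char* error_name(ErrorCode error) {
  switch (error) {
    case kNoError:          return "GL_NO_ERROR";
    case kInvalidEnum:      return "GL_INVALID_ENUM";
    case kInvalidValue:     return "GL_INVALID_VALUE";
    case kInvalidOperation: return "GL_INVALID_OPERATION";
    case kOutOfMemory:      return "GL_OUT_OF_MEMORY";
    default:                return "unknown GL error";
  }
}

// Calls get_error() until it reports no error and returns one line per
// error seen (an empty string if there were none).
template <typename GetError>
std::string collect_gl_errors(GetError get_error) {
  std::ostringstream out;
  for (ErrorCode error = get_error(); error != kNoError; error = get_error()) {
    out << "GL error: " << error_name(error) << '\n';
  }
  return out.str();
}

// Debug-style check in the spirit of GL_DEBUG(): print the errors, then assert.
template <typename GetError>
void debug_check(GetError get_error) {
  const std::string errors = collect_gl_errors(get_error);
  if (!errors.empty()) {
    std::cerr << errors;
    assert(false && "OpenGL reported an error");
  }
}
```

With a current GL context, a call such as debug_check(glGetError) placed right after glUseProgram(program) would play roughly the same role as GL_DEBUG().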

Dark_Photon · May 16, 2010, 2:43pm

Sounds like you’re running Linux. Run your app under valgrind and it’ll usually point you directly to the error.

Before doing this, you can often get better stack traces by compiling with: gcc -O0 -g3 -ggdb3 -fno-inline …

ugluk · May 16, 2010, 3:24pm

This is what valgrind said:

==19307== More than 10000000 total errors detected. I’m not reporting any more.
==19307== Final error counts will be inaccurate. Go fix your program!
==19307== Rerun with --error-limit=no to disable this cutoff. Note
==19307== that errors may occur in your program without prior warning from
==19307== Valgrind, because errors are no longer being displayed.
==19307==

But all the reported errors occurred outside my app. How am I supposed to fix those errors?

Dark_Photon · May 16, 2010, 4:22pm

You’ll need to work with the owner of the software they are in to get them fixed. If they are in the ATI driver, report them. These are beta drivers, and they do expect to get bug reports (with test programs to reproduce the bugs).

Heiko · May 17, 2010, 12:26am

Yup, that is indeed what valgrind does, and it has done so for years with ATI drivers. I’ve been told that the ATI drivers do some tricks with memory that valgrind’s memcheck does not understand. The valgrind webpage does indeed say that it can generate false positives (although they should be rare).
