EGL error on cluster but works fine on local PC

Hi, all,

I am using an EGL-dependent code base for my experiment (EGL is used to render a 3D simulator). I built a Docker image locally, pushed it to Docker Hub, and used Singularity to pull it onto my cluster. It works fine locally, but on the cluster it fails with RuntimeError: EGL error 0x3002 at eglInitialize. I have no knowledge of EGL, so here is the code base I am using: GitHub - peteanderson80/Matterport3DSimulator: AI Research Platform for Reinforcement Learning from Real Panoramic Images. The EGL-related code is in ./src/lib/MatterSim.cpp and ./src/lib/NavGraph.cpp. My Dockerfile is modified from the one in the repo root; I basically
do

mkdir build && cd build
cmake -DEGL_RENDERING=ON ..
make
cd ../

inside the Dockerfile.
I am very confused by the different behavior between my local and cluster runs, especially since I am using a container, which should not have this kind of problem…
I realize my problem description may be vague, sorry about that. Any clue is helpful.

From egl.h:

#define EGL_BAD_ACCESS                    0x3002

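And from the EGL spec, EGL_BAD_ACCESS means that EGL cannot access a requested resource (for example, a context is bound in another thread).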

Apparently, this section of init code is failing on your singularity cluster setup:

        // Initialize EGL
        eglDpy = eglGetDisplay(EGL_DEFAULT_DISPLAY);
        assertEGLError("eglGetDisplay");

        EGLint major, minor;

        eglInitialize(eglDpy, &major, &minor);
        assertEGLError("eglInitialize");

I would modify this section to explicitly check the return from eglGetDisplay() to ensure that it is not NULL (aka EGL_NO_DISPLAY). Note that this API call doesn’t register EGL errors. I would also explicitly check the return from eglInitialize() so you can see what’s going on more clearly. For instance, replace that section of code with the following and retry your cluster test:

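        // Initialize EGL
        // eglGetDisplay() does not set an EGL error, so check its return value directly.
        eglDpy = eglGetDisplay(EGL_DEFAULT_DISPLAY);
        if ( eglDpy == EGL_NO_DISPLAY )
        {
            fprintf( stderr, "ERROR: eglGetDisplay() returned EGL_NO_DISPLAY!\n" );
            exit(1);
        }

        EGLint major, minor;

        // eglInitialize() returns EGL_FALSE on failure; eglGetError() then gives the reason.
        EGLBoolean success = eglInitialize(eglDpy, &major, &minor);
        if ( !success )
        {
            fprintf( stderr, "ERROR: eglInitialize() failed!  eglGetError() = 0x%x\n", eglGetError() );
            exit(1);
        }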
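
If the raw hex value is hard to read in the cluster logs, a small lookup along these lines could print the symbolic name instead. This is only a sketch: eglErrorName is a hypothetical helper of my own, not something in Matterport3DSimulator or in EGL itself. The numeric values are the standard error codes defined in egl.h.

```cpp
#include <string>

// Hypothetical helper (not part of EGL or Matterport3DSimulator):
// maps an EGL error code, as returned by eglGetError(), to its name in egl.h.
std::string eglErrorName(int code)
{
    switch (code)
    {
        case 0x3000: return "EGL_SUCCESS";
        case 0x3001: return "EGL_NOT_INITIALIZED";
        case 0x3002: return "EGL_BAD_ACCESS";
        case 0x3003: return "EGL_BAD_ALLOC";
        case 0x3004: return "EGL_BAD_ATTRIBUTE";
        case 0x3005: return "EGL_BAD_CONFIG";
        case 0x3006: return "EGL_BAD_CONTEXT";
        case 0x3007: return "EGL_BAD_CURRENT_SURFACE";
        case 0x3008: return "EGL_BAD_DISPLAY";
        case 0x3009: return "EGL_BAD_MATCH";
        case 0x300A: return "EGL_BAD_NATIVE_PIXMAP";
        case 0x300B: return "EGL_BAD_NATIVE_WINDOW";
        case 0x300C: return "EGL_BAD_PARAMETER";
        case 0x300D: return "EGL_BAD_SURFACE";
        case 0x300E: return "EGL_CONTEXT_LOST";
        default:     return "UNKNOWN_EGL_ERROR"; // code not in the core egl.h list
    }
}
```

In the failure branch above you could then pass eglErrorName(eglGetError()).c_str() to fprintf with a %s format, so the log reads EGL_BAD_ACCESS rather than 0x3002.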

Thanks a lot!!! I will try it on my cluster and get back later. (I don’t have root on the cluster to install dependencies, so I need to build the image locally and upload it, which takes a long time…)