Strange segfault error when calling glDrawArrays

Hi,
It took much longer than expected, but after much wailing and gnashing of teeth I managed to create a trimmed version of the code. I learned a lot about the code base and about how it uses OpenGL, but I still couldn’t find the bug. I also learned how to create a stubbed version of C code, a new life skill! Initially I just dumped the vertices and drew them in a demo program, but that worked and told me nothing about the bug. Instead, I started from the full code base and skeletonised it down to a single case that reproduces the bug. I left in the camera controls so that you can look around the scene:
https://dl.dropboxusercontent.com/u/3440275/trimmed.zip
Once the code is built, you can see it working by running:
./build/viewer
If you press s, the camera moves backwards and you can see the axes.
If you run:
./build/viewer -b test.bin
it opens a mesh, which crashes on my machine but runs on every other machine I have tested it on.

Again, thanks for all your help so far; it has been a good learning experience. I don’t expect you to debug the code. Just letting me know whether you can recreate the bug would be helpful.

[QUOTE=jonathanbyrn;1285530]I managed to create a trimmed version of the code. …:
https://dl.dropboxusercontent.com/u/3440275/trimmed.zip[/QUOTE]

Great job! I had no problem fetching, building, and running it here.

[QUOTE]Once the code is built, you can see it working by running:
./build/viewer
If you press s, the camera moves backwards and you can see the axes.[/QUOTE]

Works fine here.

[QUOTE]If you run:
./build/viewer -b test.bin
it opens a mesh, which crashes on my machine but runs on every other machine I have tested it on.

I don’t expect you to debug the code. Just letting me know whether you can recreate the bug would be helpful.[/QUOTE]

Good news. It crashes on startup here as well. So it’s not just something odd about your machine’s setup.

Details:

  • GPU: NVidia GeForce GTX 760
  • GPU Driver: NVidia 370.28

To give you more info, I went back, compiled the app with “-g” (which adds debugging info), and ran it under valgrind’s memcheck tool. Here’s what I see:


==6514== Memcheck, a memory error detector
==6514== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==6514== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==6514== Command: build/viewer -b test.bin
==6514== 
==6514== Conditional jump or move depends on uninitialised value(s)
==6514==    at 0x4E43356: svo_set(svo*, unsigned long, unsigned long, unsigned long) (svo.c:682)
==6514==    by 0x4E4287B: svo_aset(svo_acc*, unsigned long, unsigned long, unsigned long) (svo.c:423)
==6514==    by 0x4E39E54: binvox(svo_acc*, unsigned char*, unsigned long, float*) (files.c:201)
==6514==    by 0x4E39811: loadBinvox(svo*, char const*, float*) (files.c:59)
==6514==    by 0x402CB8: main (viewer.c:263)
==6514== 
==6514== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
==6514==    at 0x5792060: __sendmsg_nocancel (in /lib64/libpthread-2.18.so)
==6514==    by 0x97B65E6: ??? (in /usr/lib64/libGLX_nvidia.so.370.28)
==6514==    by 0x97B2448: ??? (in /usr/lib64/libGLX_nvidia.so.370.28)
==6514==    by 0x974AB0D: ??? (in /usr/lib64/libGLX_nvidia.so.370.28)
==6514==    by 0xAC50165: ??? (in /usr/lib64/libnvidia-glcore.so.370.28)
==6514==    by 0xAC48903: ??? (in /usr/lib64/libnvidia-glcore.so.370.28)
==6514==    by 0xAC4A562: ??? (in /usr/lib64/libnvidia-glcore.so.370.28)
==6514==    by 0xAC4AD37: ??? (in /usr/lib64/libnvidia-glcore.so.370.28)
==6514==    by 0x979F8D9: ??? (in /usr/lib64/libGLX_nvidia.so.370.28)
==6514==    by 0xAC46EC7: ??? (in /usr/lib64/libnvidia-glcore.so.370.28)
==6514==    by 0xAC47B08: ??? (in /usr/lib64/libnvidia-glcore.so.370.28)
==6514==    by 0x974ADB5: ??? (in /usr/lib64/libGLX_nvidia.so.370.28)
==6514==  Address 0x7feffdfdc is on thread 1's stack
==6514== 
Renderer
Volume dimensions: 64 * 64 * 64
Volume size = 10.03 KiB
Extracting mesh... ==6514== Thread 3:
==6514== Conditional jump or move depends on uninitialised value(s)
==6514==    at 0x4E42326: svo_aget_colour(svo_acc*, unsigned long, unsigned long, unsigned long, Colour*) (svo.c:284)
==6514==    by 0x4E424E6: FEMAGetSurroundingVoxels(svo_acc*, unsigned long, unsigned long, unsigned long, Voxel*) (svo.c:322)
==6514==    by 0x4E3CF11: extractNoColour(svo_acc*, Region*) (render.c:687)
==6514==    by 0x4E3E3D7: getMesh(void*) (render.c:1063)
==6514==    by 0x578B0DA: start_thread (in /lib64/libpthread-2.18.so)
==6514==    by 0x62A9E3C: clone (in /lib64/libc-2.18.so)
==6514== 
==6514== Thread 2:
==6514== Conditional jump or move depends on uninitialised value(s)
==6514==    at 0x4E42334: svo_aget_colour(svo_acc*, unsigned long, unsigned long, unsigned long, Colour*) (svo.c:284)
==6514==    by 0x4E424E6: FEMAGetSurroundingVoxels(svo_acc*, unsigned long, unsigned long, unsigned long, Voxel*) (svo.c:322)
==6514==    by 0x4E3CF11: extractNoColour(svo_acc*, Region*) (render.c:687)
==6514==    by 0x4E3E3D7: getMesh(void*) (render.c:1063)
==6514==    by 0x578B0DA: start_thread (in /lib64/libpthread-2.18.so)
==6514==    by 0x62A9E3C: clone (in /lib64/libc-2.18.so)
==6514== 
==6514== Conditional jump or move depends on uninitialised value(s)
==6514==    at 0x4E42342: svo_aget_colour(svo_acc*, unsigned long, unsigned long, unsigned long, Colour*) (svo.c:284)
==6514==    by 0x4E424E6: FEMAGetSurroundingVoxels(svo_acc*, unsigned long, unsigned long, unsigned long, Voxel*) (svo.c:322)
==6514==    by 0x4E3CF11: extractNoColour(svo_acc*, Region*) (render.c:687)
==6514==    by 0x4E3E3D7: getMesh(void*) (render.c:1063)
==6514==    by 0x578B0DA: start_thread (in /lib64/libpthread-2.18.so)
==6514==    by 0x62A9E3C: clone (in /lib64/libc-2.18.so)
==6514== 
==6514== Thread 1:
==6514== Invalid read of size 4
==6514==    at 0x405FF4A: ??? (in /var/tmp/.glqOjG41 (deleted))
==6514==    by 0xAE01BB3: ??? (in /usr/lib64/libnvidia-glcore.so.370.28)
==6514==    by 0xAE06DC7: ??? (in /usr/lib64/libnvidia-glcore.so.370.28)
==6514==    by 0xA9E81C7: ??? (in /usr/lib64/libnvidia-glcore.so.370.28)
==6514==    by 0x403CCF: main (viewer.c:501)
==6514==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==6514== 
==6514== 
==6514== Process terminating with default action of signal 11 (SIGSEGV)
==6514==  Access not within mapped region at address 0x0
==6514==    at 0x405FF4A: ??? (in /var/tmp/.glqOjG41 (deleted))
==6514==    by 0xAE01BB3: ??? (in /usr/lib64/libnvidia-glcore.so.370.28)
==6514==    by 0xAE06DC7: ??? (in /usr/lib64/libnvidia-glcore.so.370.28)
==6514==    by 0xA9E81C7: ??? (in /usr/lib64/libnvidia-glcore.so.370.28)
==6514==    by 0x403CCF: main (viewer.c:501)
==6514==  If you believe this happened as a result of a stack
==6514==  overflow in your program's main thread (unlikely but
==6514==  possible), you can try to increase the size of the
==6514==  main thread stack using the --main-stacksize= flag.
==6514==  The main thread stack size used in this run was 8388608.
finished in 6.98 seconds
Voxel mesh size = 0.49 MiB
Controls:
    Move camera: w, a, s, d
    Rotate camera: h, j, k, l
    Toggle grid: o
    Toggle axes: x
    Toggle wireframe: v
    Toggle lighting: n
    Toggle level of detail: t
    Change level of detail: 0-9
    Toggle flashlight: g
    Exit: q
==6514== 
==6514== HEAP SUMMARY:
==6514==     in use at exit: 6,067,577 bytes in 8,628 blocks
==6514==   total heap usage: 16,868 allocs, 8,240 frees, 226,016,506 bytes allocated
==6514== 
==6514== LEAK SUMMARY:
==6514==    definitely lost: 488 bytes in 8 blocks
==6514==    indirectly lost: 201,120 bytes in 8 blocks
==6514==      possibly lost: 1,331,997 bytes in 3,266 blocks
==6514==    still reachable: 4,533,972 bytes in 5,346 blocks
==6514==         suppressed: 0 bytes in 0 blocks
==6514== Rerun with --leak-check=full to see details of leaked memory
==6514== 
==6514== For counts of detected and suppressed errors, rerun with: -v
==6514== Use --track-origins=yes to see where uninitialised values come from
==6514== ERROR SUMMARY: 1981 errors from 6 contexts (suppressed: 2 from 2)

As you can see, there are some uninitialized memory reads in your code, followed by a bad read of address 0x0 (a NULL pointer) down in the NVidia driver, inside the glDrawArrays() call for your mesh (“Invalid read of size 4”), just as you said was happening. My suspicion is that your app is handing this NULL pointer to GL to read, but I’ll look into it as time permits.
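On the “Conditional jump or move depends on uninitialised value(s)” reports: svo.c isn’t shown in this thread, so the following is only a generic sketch of the usual cause and remedy, not a diagnosis of your code. Memcheck raises that message when a branch tests memory that was allocated (for example with malloc()) but never written. The `Node` type and both functions below are hypothetical names made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical node type: the real svo structures are not shown in the thread. */
typedef struct Node {
    unsigned char flags;        /* tested in branches when the node is visited */
    struct Node *children[8];   /* one slot per child */
} Node;

/* Allocate a node and give every field a defined value before use.
   malloc() alone leaves the bytes undefined, so a later `if (n->flags ...)`
   is the kind of branch memcheck flags. */
static Node *node_new(void)
{
    Node *n = malloc(sizeof *n);
    if (n != NULL)
        *n = (Node){0};   /* zeroes flags and sets all child pointers to NULL */
    return n;
}

/* Returns 1 if the node has no children, 0 otherwise. */
static int node_is_leaf(const Node *n)
{
    for (size_t i = 0; i < 8; i++)
        if (n->children[i] != NULL)
            return 0;
    return 1;
}
```

Rerunning with --track-origins=yes, as the valgrind output itself suggests, should point at the allocation that the undefined values come from.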

Ok, it looks like you need to do some length checking on your mesh vertex arrays to make sure they provide sufficient bytes to satisfy all vertices in the draw call.

Here’s a hack (not a fix) to your program that will reveal one problem with it (the one that was causing the crash in the driver on the voxel mesh glDrawArrays draw call) and get it to at least come up:

In GLMesh(), make this change:


    < if (mesh.colours.array) {
    ---
    > if (mesh.colours.array && mesh.colours.length) {

You’ll notice that the colors array has a non-null pointer but a 0 length. Oops. With the way your code is written, that ends up allocating a VBO containing 0 bytes, and the subsequent draw call instructs OpenGL (the GPU+driver) to go romping off the end of that 0-byte VBO to fetch colors for each vertex in that draw call.
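A more general version of that length check could look something like the sketch below. It is only an illustration under assumptions: `FloatArray` and `attrib_covers_draw` are hypothetical names, and `length` is assumed to count floats rather than bytes (the thread only shows that `array` and `length` fields exist).

```c
#include <stddef.h>

/* Hypothetical stand-in for the viewer's attribute arrays.
   `length` is assumed to be the number of floats stored. */
typedef struct {
    const float *array;
    size_t length;
} FloatArray;

/* Returns 1 if the array holds at least `components` floats for each of
   `vertex_count` vertices, 0 otherwise. A draw of 0 vertices needs no data. */
static int attrib_covers_draw(const FloatArray *a, size_t components,
                              size_t vertex_count)
{
    if (vertex_count == 0)
        return 1;
    if (a == NULL || a->array == NULL || components == 0)
        return 0;
    /* Divide instead of multiplying so the check cannot overflow. */
    return a->length / components >= vertex_count;
}
```

The idea would be to run a check like this for each attribute (positions, normals, colours) before uploading it and enabling it for the draw call, and to leave an attribute disabled when the check fails instead of letting the draw read past the end of its buffer.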

[QUOTE=Dark Photon;1285542]Ok, it looks like you need to do some length checking on your mesh vertex arrays to make sure they provide sufficient bytes to satisfy all vertices in the draw call. …

You’ll notice that the colors array has a non-null pointer but a 0 length. …[/QUOTE]

You hero! I was sure it had something to do with the vertices, but I was looking at the wrong array. Well, now I have enough info to work out what is going wrong. This has been one of the most helpful forums I have been on. When I started on this forum I had no tools for debugging an OpenGL error that happens at runtime; now I have a couple of useful tools and approaches under my belt.
Thanks again for going above and beyond the call of duty!

Sure thing! Good luck with your project.