I have been puzzled by a bug in my OpenCL program for a week, and I have now narrowed it down to a problem with sub buffer creation.
Basically, in my program the main buffer is calculated and filled in by a kernel, and everything is correct up to this point. However, the subsequent kernel needs to read parts of this main buffer, and I decided to do that with a set of sub buffers referring to the main one. This is where the problem appears: when I create a sub buffer, the part of the memory covered by that sub buffer seems to be reset (the numbers inside become random).
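To make the setup concrete, here is a minimal sketch of how the sub buffer regions could be laid out. It is not my actual code: `region_t`, `layout_regions`, and `origin_is_aligned` are hypothetical names, and `region_t` only mirrors the two fields of `cl_buffer_region` so that the sketch compiles without the OpenCL headers. The alignment check reflects the requirement that a sub buffer origin be aligned to `CL_DEVICE_MEM_BASE_ADDR_ALIGN`, which is reported in bits; the equal-sized split and the `CL_MEM_READ_ONLY` flag in the comment are assumptions for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in with the same two fields as cl_buffer_region,
   so this sketch compiles without the OpenCL headers. */
typedef struct {
    size_t origin; /* byte offset into the main buffer */
    size_t size;   /* length of the region in bytes */
} region_t;

/* Returns 1 if origin is a multiple of the device alignment.
   CL_DEVICE_MEM_BASE_ADDR_ALIGN is reported in bits, so convert to bytes. */
int origin_is_aligned(size_t origin, size_t align_bits)
{
    size_t align_bytes = align_bits / 8;
    if (align_bytes == 0) {
        return 0; /* treat a nonsensical alignment as a failure */
    }
    return origin % align_bytes == 0;
}

/* Splits a main buffer of total_bytes into count equal regions (assumed layout).
   Returns 0 on success, -1 if the split is uneven or an origin is misaligned. */
int layout_regions(size_t total_bytes, size_t count, size_t align_bits,
                   region_t *out)
{
    if (count == 0 || total_bytes % count != 0) {
        return -1;
    }
    size_t chunk = total_bytes / count;
    for (size_t i = 0; i < count; ++i) {
        size_t origin = i * chunk;
        if (!origin_is_aligned(origin, align_bits)) {
            return -1;
        }
        out[i].origin = origin;
        out[i].size = chunk;
        /* Real code would then pass a cl_buffer_region r with these values:
           clCreateSubBuffer(main_buf, CL_MEM_READ_ONLY,
                             CL_BUFFER_CREATE_TYPE_REGION, &r, &err); */
    }
    return 0;
}
```

With a 4096-byte main buffer, four regions, and an alignment of 1024 bits (128 bytes), this gives origins 0, 1024, 2048, and 3072. A split whose origins are not multiples of the alignment is rejected here, which is the case where `clCreateSubBuffer` would report `CL_MISALIGNED_SUB_BUFFER_OFFSET`.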
Is this the intended behavior? The problem only occurs on NVIDIA hardware (every device I tested), though. I tested the program on AMD, where it runs fine and generates correct results. I also spent 10 hours running the program through Oclgrind; no error was reported, and the program again yielded correct results.
For now I have replaced the sub buffer creation with a regular buffer and a buffer-to-buffer copy, and the program runs fine. But I would love not to have to do that.
Thanks for any input.