I’m developing an OpenCL program on my Intel HD Graphics 4000 GPU and I have run into a really weird problem.
I have several arrays defined as:
unsigned short *a = new unsigned short[SIZE];
unsigned short *b = new unsigned short[SIZE];
(SIZE here is 2000)
Then I create buffers for these arrays like this:
cl_mem memA = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, SIZE * sizeof(a), a, &status);
cl_mem memB = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, SIZE * sizeof(b), b, &status);
However, when I run the program, the kernel fails with error code -5 (CL_OUT_OF_RESOURCES).
I’ve checked my device information with the clGetDeviceInfo API and found that the memory size is about 1.5 GB. (Am I misunderstanding something?)
Since the data size of the arrays is way below that limit, and I didn’t specify the local work size, I don’t understand what’s happening.
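To double-check the sizes involved, here is a minimal sketch that compares the number of bytes each host array actually holds with the number of bytes the size argument of clCreateBuffer (SIZE times sizeof(a)) asks for. The helper names host_bytes and requested_bytes are hypothetical and only for illustration; the second value depends on the pointer size of the platform, so I have not hard-coded it.

```cpp
#include <cstddef>

// Element count from the post.
constexpr std::size_t SIZE = 2000;

// Bytes held by one host array: SIZE elements of unsigned short.
// (hypothetical helper, not part of the original program)
constexpr std::size_t host_bytes() {
    return SIZE * sizeof(unsigned short);
}

// Bytes requested when the size is written as SIZE * sizeof(a),
// where a is an unsigned short*. sizeof(a) is the size of the
// pointer itself, which is platform dependent (commonly 4 or 8).
// (hypothetical helper, not part of the original program)
constexpr std::size_t requested_bytes() {
    return SIZE * sizeof(unsigned short*);
}
```

Printing both values on the target machine shows whether the requested buffer size matches the host allocation that CL_MEM_USE_HOST_PTR points at.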
And, the weirdest thing is that if I define larger arrays like:
unsigned short *a = new unsigned short[SIZE_L]; // SIZE_L is, say, 4000
unsigned short *b = new unsigned short[SIZE_L];
while still creating the buffers with only SIZE (which is 2000), it all works!
How can it be? It’s driving me crazy. Any help will be appreciated, thank you.
EDIT: Just to provide some additional information: I intend to process a whole line of my image by dividing it into several chunks of data, so the kernel is executed inside a for loop. (Does that matter?)