Hello All,
I am new to OpenCL. I want to parallelize an image processing algorithm that currently runs on the CPU.
I am confused by two values:

(1) CL_DEVICE_MAX_WORK_GROUP_SIZE in clGetDeviceInfo which gives me 256
(2) CL_KERNEL_WORK_GROUP_SIZE in clGetKernelWorkGroupInfo which gives me 128

I launched the kernel with localWorkgroup = {16,16} and globalWorkGroup = {width,height}.
But it gives me the error ‘CL_INVALID_WORK_GROUP_SIZE’.

After that I launched the kernel with localWorkgroup = {11,11} and globalWorkGroup = {width,height}, which worked for me.
So is it using the second value, CL_KERNEL_WORK_GROUP_SIZE, to launch the kernel?
I then tried local work-group sizes of {12,12}, {13,13}, {14,14} and {15,15}, but each gives me the same error.

Your help much appreciated…
Thanks in advance…

CL_KERNEL_WORK_GROUP_SIZE takes precedence over CL_DEVICE_MAX_WORK_GROUP_SIZE. The device query has no knowledge of a kernel's resource footprint or of kernel attributes that may specify the maximum or required work-group dimensions. The kernel query does have that information, because by that point the kernel source or binary has been compiled. With a kernel limit of 128 work-items, {11,11} (121 work-items) fits, while {12,12} (144) and {16,16} (256) do not.
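
To make the limit concrete, here is a minimal sketch in plain C of checking a 2D local size against the per-kernel limit before launching. The helper names local_size_fits and largest_square_side are hypothetical, not part of OpenCL. The kernel_max argument is assumed to be the value returned by clGetKernelWorkGroupInfo for CL_KERNEL_WORK_GROUP_SIZE (128 in this thread).

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the 2D local size {lx, ly} has at most
   kernel_max work-items in total, else 0. kernel_max is assumed to be the
   value reported for CL_KERNEL_WORK_GROUP_SIZE. Division is used instead of
   lx * ly so the check cannot overflow. */
int local_size_fits(size_t lx, size_t ly, size_t kernel_max)
{
    return lx > 0 && ly > 0 && lx <= kernel_max / ly;
}

/* Hypothetical helper: largest side s such that an {s, s} local size
   satisfies s * s <= kernel_max. Returns 0 if kernel_max is 0. */
size_t largest_square_side(size_t kernel_max)
{
    size_t s = 0;
    while ((s + 1) <= kernel_max / (s + 1)) {
        s++;
    }
    return s;
}
```

With kernel_max = 128, local_size_fits(16, 16, 128) is 0 and largest_square_side(128) is 11, which matches what was observed above. This only covers the total work-item count. The device also reports per-dimension limits through CL_DEVICE_MAX_WORK_ITEM_SIZES, and in OpenCL 1.x each global size must be evenly divisible by the matching local size, so a real launch would need to check those too.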

Thanks for the reply, Sean.

That is possible. To test whether it really takes precedence, I made the kernel empty: just a kernel function with nothing in it, and not a single allocation on the device.
So now CL_KERNEL_WORK_GROUP_SIZE should give me 256.
But it still gives me 128, and I can only launch the kernel with localWorkgroup = {11,11}.

So what is the “CL_DEVICE_MAX_WORK_GROUP_SIZE” information for? What does it mean?

My device supports OpenCL 1.1 Embedded Profile according to “clGetDeviceInfo”. Is that the reason?

Thanks in advance…

There is no need to test the precedence, as it is a requirement of the spec (clarified in the most recent 2.1 spec). There is no requirement that the two values be the same even for an empty kernel, though it would make sense if they were. I would report your finding to the vendor and see what they have to say.

Thanks Sean,

I am working on an Adreno 320 GPU, which supports OpenCL 1.1 Embedded Profile.