How to create On Device Queue

I'm trying to create a command queue on the device. I use these commands to create the command queue:

cl_queue_properties proprt[] = { CL_QUEUE_PROPERTIES, CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE | CL_QUEUE_ON_DEVICE | CL_QUEUE_ON_DEVICE_DEFAULT, CL_QUEUE_SIZE, 100,0 }; //test proprt on calculation

gpuControlData->cmdQueue = clCreateCommandQueueWithProperties(gpuControlData->context, gpuControlData->device, proprt, &err);

CodeXL, which I use as the debugger, reports “A parameter is not an expected value.” and the return code is “CL_INVALID_VALUE”.

I checked the amount of memory available for a command queue on the device and it is 5 MB, but I still cannot create the queue on the device.

Thanks for any help.

Sorry, the 5 MB value is wrong. I don't know how to get the queue size.

The minimum value for CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE is 16 KB, so 100 bytes is probably too small a value to be accepted by clCreateCommandQueueWithProperties().
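As a rough sketch of how a size for CL_QUEUE_SIZE could be picked: the helper below is plain C and makes no OpenCL calls. It assumes that `preferred` and `max_size` were already read with clGetDeviceInfo (CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE and CL_DEVICE_QUEUE_ON_DEVICE_MAX_SIZE). The name pick_device_queue_size is hypothetical, and raising a too-small request to the preferred size is my assumption, not something the spec requires.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the OpenCL API.
 * requested : size the application would like, in bytes
 * preferred : value of CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE
 * max_size  : value of CL_DEVICE_QUEUE_ON_DEVICE_MAX_SIZE
 * Returns the size to pass as CL_QUEUE_SIZE, or 0 if the device
 * reports no room for an on-device queue. */
static uint32_t pick_device_queue_size(uint32_t requested,
                                       uint32_t preferred,
                                       uint32_t max_size)
{
    uint32_t size = requested;

    if (max_size == 0)
        return 0;            /* device cannot host a queue */
    if (size < preferred)
        size = preferred;    /* assumption: do not go below the preferred size */
    if (size > max_size)
        size = max_size;     /* CL_QUEUE_SIZE must not exceed the maximum */
    return size;
}
```

With the 100 bytes from the question and a device reporting 16 KB preferred and 256 KB maximum, this would yield 16384. A return value of 0 means an on-device queue should not be attempted at all.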

OK, I checked the size of the memory available for the command queue on my devices. My ATI device reports 0 B. So I tried another GPU, an NVIDIA GeForce 940MX, which reports 64 KB. But when I try to create the command queue there, the debugger reports an error and the program crashes.
The debugger reports a “Second Chance Exception” after the clCreateCommandQueue call. The error text is:

The thread tried to read or write data that is misaligned on hardware that does not provide alignment. For example, 16-bit values must be aligned on 2-byte boundaries; 32-bit values on 4-byte boundaries, and so on.

I use these commands to create the command queue:

typedef struct GPUControl {
    cl_context context;
    cl_command_queue cmdQueue;
    cl_command_queue maskCorrelQueue;
    cl_program program;
    cl_kernel kernel;
    cl_device_id device;
} GPUControl;

cl_command_queue_properties proprt1[] = { CL_QUEUE_PROPERTIES, CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, 0 };
cl_command_queue_properties proprt2 = 0;
gpuControlData->cmdQueue = clCreateCommandQueueWithProperties(gpuControlData->context, gpuControlData->device, proprt1, &err);

I tried all three variables defined above, but it made no difference. This implementation works on my ATI GPU; the problem occurs only on the NVIDIA GPU.

I found the cause: the NVIDIA GPU does not provide clCreateCommandQueueWithProperties, so only clCreateCommandQueue is available.
But with clCreateCommandQueue I cannot pass the properties for a queue on the device.

How do I implement a queue on the device?

On-device command queues are an OpenCL 2 feature. OpenCL 2 is currently not supported by NVIDIA device drivers, because NVIDIA would rather have you write code in CUDA and lock yourself into their hardware.

I think you could save yourself from many headaches by using the OpenCL platform and device queries (clGetPlatformInfo / clGetDeviceInfo) in order to know which OpenCL version (CL_PLATFORM_VERSION) and features (read the spec for your OpenCL version) your hardware actually supports.
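A minimal sketch of such a check, in plain C without any OpenCL calls: it assumes the version string was already fetched with clGetPlatformInfo(CL_PLATFORM_VERSION) or clGetDeviceInfo(CL_DEVICE_VERSION), which the spec formats as "OpenCL <major>.<minor> <vendor-specific text>". The helper names parse_cl_version and may_support_device_queue are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: extract major/minor from a string such as
 * "OpenCL 1.2 CUDA 8.0.0". Returns 1 on success, 0 otherwise. */
static int parse_cl_version(const char *s, int *major, int *minor)
{
    return s != NULL && sscanf(s, "OpenCL %d.%d", major, minor) == 2;
}

/* On-device queues arrived with OpenCL 2.0, so anything older can be
 * ruled out immediately. A result of 1 only means "worth checking
 * further": in OpenCL 3.0 the feature is optional, so the device
 * queue size queries should still be consulted. */
static int may_support_device_queue(const char *version)
{
    int major = 0, minor = 0;

    if (!parse_cl_version(version, &major, &minor))
        return 0;            /* unrecognized string: assume unsupported */
    return major >= 2;
}
```

If this returns 0 for the device in use, clCreateCommandQueueWithProperties should not be called for it, and the code has to fall back to clCreateCommandQueue without an on-device queue.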