OpenCL Host code: Multiple elements of input vector processed by one work-item

I have an OpenCL program whose kernel takes (__global int4* array) as input. In the host code, I have a (std::vector<cl_int> vec) that is passed to the kernel for processing.
My question is: is there a way to enqueue the kernel so that each work-item processes multiple elements from vec?
For instance, suppose the input vector's size is 32 in the host code, i.e. std::vector<cl_int> vec(32);
Is there a way to specify that each instance of the kernel should process 8 adjacent elements of vec, as in the code below?

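    // "process" is a placeholder name; "kernel" itself is a reserved word in OpenCL C
    __kernel void process(__global int4* array) {
        int i = get_global_id(0);
        // I want to use 8 elements that are processed in a work-item here
        // note that int8 = (int4, int4)
        int8 result = (int8)(array[i], array[i + 1]);
    }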

I am new to OpenCL so any help is deeply appreciated.
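
One common way to get this effect is to shrink the global work size rather than look for a special enqueue option: request vec.size() / 8 work-items in clEnqueueNDRangeKernel, and have the work-item with global id i read elements 8*i through 8*i + 7. With an int4 pointer that means array[2*i] and array[2*i + 1], or vload8(i, ptr) on an int pointer. The following is only a minimal host-side sketch of that index arithmetic, assuming vec.size() is a multiple of 8. The helper names global_work_size and elements_for_item are hypothetical, and plain int stands in for cl_int so that it compiles without the OpenCL headers. It emulates the mapping on the CPU and does not call the OpenCL runtime.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Elements each work-item should consume (one int8 = two adjacent int4 values).
constexpr std::size_t kElemsPerItem = 8;

// Hypothetical helper: number of work-items to pass as the global work size
// for n input elements. Assumes n is a multiple of kElemsPerItem; otherwise
// the buffer would need padding first.
std::size_t global_work_size(std::size_t n) {
    assert(n % kElemsPerItem == 0);
    return n / kElemsPerItem;
}

// Hypothetical CPU stand-in for what the work-item with global id `gid`
// would read: elements gid*8 .. gid*8+7. In the kernel this corresponds to
// array[2*gid] and array[2*gid + 1] when array is an int4 pointer.
std::array<int, kElemsPerItem> elements_for_item(const std::vector<int>& vec,
                                                 std::size_t gid) {
    std::array<int, kElemsPerItem> out{};
    for (std::size_t k = 0; k < kElemsPerItem; ++k) {
        out[k] = vec[gid * kElemsPerItem + k];
    }
    return out;
}
```

For the 32-element vector in the question, global_work_size(32) gives 4 work-items, and the work-item with id 1 would see elements 8 through 15. Note that indexing with array[i] and array[i + 1] would instead make neighbouring work-items overlap by one int4.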
