Creating objects in shared memory

Hi all,
I need to create an array in shared memory so that all the threads of my program can access it, either to perform computations on it or to search it for some value.
How can I do this?
I thought of writing a kernel function that creates and initializes this array in shared memory (__local), and then calling my main kernel function (another kernel) afterwards to work on this array. Is this feasible? If so, how can I do it?

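__local memory is not persistent across multiple calls to clEnqueueNDRangeKernel(), so it’s not possible to initialize __local memory with one NDRange and then perform some computation with another NDRange. Its contents only live as long as the work-group that uses them, and they are visible only to the work-items of that one work-group, so every work-group has to fill its own copy.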

However, what you can do is create a kernel function that calls into two functions: the first one to initialize the __local storage and the second one to perform the actual computation.

It would look something like this:


void initLocalMem(__local float* localMem, int localMemSize)
{
    // Initialize local memory here.
    // Remember to call "barrier(CLK_LOCAL_MEM_FENCE)"
    // at the end to ensure that all work-items have finished initializing
    // the local memory
}

void compute(__local float* localMem, int localMemSize, __global float *in, __global float *out)
{
    // Process data here
}

__kernel void doSomething(__local float* localMem, int localMemSize, __global float *in, __global float *out)
{
    initLocalMem(localMem, localMemSize);

    compute(localMem, localMemSize, in, out);
}
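
To make the skeleton above a bit more concrete, here is a minimal sketch of how the two helpers could be filled in for the "search for some value" case from the question. The names searchTable, findInLocal, table, keys and the int output buffer are my own assumptions for illustration, not part of the code above. The #ifndef __OPENCL_VERSION__ block is only a stand-in that emulates a work-group of a single work-item so the file also compiles as plain C for a quick check; an OpenCL compiler skips it.

```c
#ifndef __OPENCL_VERSION__
/* Plain-C stand-in (assumption, for host-side checking only): emulates a
   work-group made of exactly one work-item. OpenCL compilers skip this. */
#include <stddef.h>
#define __kernel
#define __global
#define __local
#define barrier(flags) ((void)0)
static size_t get_local_id(unsigned dim)   { (void)dim; return 0; }
static size_t get_local_size(unsigned dim) { (void)dim; return 1; }
static size_t get_global_id(unsigned dim)  { (void)dim; return 0; }
#endif

/* Each work-item copies a strided slice of the table into __local memory. */
void initLocalMem(__local float *localMem, int localMemSize,
                  __global const float *table)
{
    int lid = (int)get_local_id(0);
    int lsz = (int)get_local_size(0);
    for (int i = lid; i < localMemSize; i += lsz)
        localMem[i] = table[i];
    /* Wait until every work-item in the work-group has finished writing. */
    barrier(CLK_LOCAL_MEM_FENCE);
}

/* Linear search of the local copy: first matching index, or -1 if absent. */
int findInLocal(__local const float *localMem, int localMemSize, float key)
{
    for (int i = 0; i < localMemSize; ++i)
        if (localMem[i] == key)
            return i;
    return -1;
}

/* Hypothetical kernel: one key per work-item, one result per work-item. */
__kernel void searchTable(__local float *localMem, int localMemSize,
                          __global const float *table,
                          __global const float *keys,
                          __global int *out)
{
    initLocalMem(localMem, localMemSize, table);

    size_t gid = get_global_id(0);
    out[gid] = findInLocal(localMem, localMemSize, keys[gid]);
}
```

On the host side, the __local argument is not backed by a buffer: you pass only its size, i.e. clSetKernelArg(kernel, 0, localMemSize * sizeof(cl_float), NULL). This sketch assumes keys and out each hold one element per work-item, and that the table fits into the device's local memory.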

Hi.

This tutorial may help you understand how to use shared memory.

http://gpgpu-computing4.blogspot.com/2009/10/matrix-multiplication-3-opencl.html

Thanks a lot, david.garcia and wonwoolee :)