I need some advice about buffers created from a context with multiple devices.
I guess it’s better (performance-wise) to use a context with only one device, because when you create a memory object from that context it will be allocated specifically for that device.
Is that right? And if so, is there really any situation where it’s useful to use a context with several devices?
So the way I interpret OpenCL, creating the buffer merely allocates the space needed on that set of devices. The memory is then initialized with clEnqueueWriteBuffer, which takes a specific command queue (and a command queue can only correspond to one device). Therefore, a single context with a single buffer can be allocated, and multiple command queues are then used to write different input data to each device. This allows the same kernel to be run across multiple devices, but with different input data. Depending on the other arguments to the kernel function, you may or may not need a separate cl_kernel object for every device.
That’s correct. The reason for devices to share a context is so that OpenCL can arrange for them to share cl_mem objects. How efficiently cl_mem objects are allocated depends on the implementation. A naive implementation would allocate the memory on every device in the context, but that would be wasteful because the buffer might only be used on one. A decent implementation should give you similar performance either way.