According to the 1.2 specification, clGetSupportedImageFormats(context…) yields “a union of image formats supported by all devices in the context”.
Is there a way to determine what image formats are supported for each device without creating (and then destroying) a context for each device individually?
Alternatively, if there is a reason why this question is not well founded (e.g. there are interactions between devices that limit supported image formats) would someone please share a citation or an explanation?
Memory objects such as images are shared across the whole context, so it is not feasible to create an image just for one device. This is the likely reason why OpenCL doesn’t bother with reporting individual device capabilities.
Can you be more specific about the use case for this? If it is only a matter of curiosity, and not something which ends up on your performance-critical path, creating a simple program which spawns one context per device should be fine.
My understanding of memory objects is that they can be made available for computation on any device - so if an image is created in the CPU accessible memory it is copied/moved to the GPU accessible memory when a kernel is executed on the GPU. (This is likely a naïve/flawed model, so if you can recommend any writing on this topic I would be grateful.) With this model in mind I would expect to have to confirm image format support on a device before enqueueing a kernel.
The reason that I posted my question is this: with so much information available about each device, I was surprised that this information was not included and suspected that I must be missing something about how images are handled in a context. For example, it might be that the support of image format X on device A makes it possible (with minimal performance penalties) to run a kernel on device B using an image with format X, even though device B does not support format X. Since samplers provide a format-independent interface this situation would make some sense…
That being said, my use case is to inspect a system when an application initializes and configure a context appropriately. Since I’m not expecting drastic & unanticipated changes it would be sufficient to create a preference file and only revise it if changes are detected.
AFAIK, the OpenCL spec purposely says very little about what happens on devices when memory objects are created, so as to allow the implementation to carry out optimizations. For example, the creation of an image object could asynchronously trigger the allocation of device memory and the configuration of one or more texturing units for sampling that image, on all devices within the context. This way, when data has to be moved in, everything is already set up on the device side, so the data transfer/kernel enqueue can occur more quickly. If a device’s security policy dictates that memory should be zeroed out before use, this is another thing that could be done at this stage.
I don’t think that support of image format X on device A could enable support on device B, since image support is often at least partially implemented in hardware (at least on GPUs), which means that there are hardware constraints on which formats a device’s texturing units can read efficiently. Avoiding these constraints could only be done by emulating the image support on device B or transforming the image format on either device A or the host, both of which have nontrivial costs.
But if you really want to know about individual device capabilities from a context, one possibility would be this:
[ul]
[li]Query the list of devices in the context using clGetContextInfo[/li]
[li]Create one context for each of these devices[/li]
[li]Query the contexts to get individual device capabilities[/li]
[li]Possibly cache the results to avoid doing this repeatedly[/li]
[/ul]