OpenCL C++ Bindings

I’ve asked internally if we can open up the entire Khronos Registry area to public RO Subversion access, like we already have with the OpenGL Registry. It will take a few weeks to get people inside Khronos engaged on the discussion and come to a conclusion. I’m hopeful of a yes answer towards the end of September.

Great job! However, Program::getInfo() does not work (e.g., CL_PROGRAM_BINARY_SIZES) if the context has multiple devices.

I’m looking forward to this. I would like to start a project that provides SWIG bindings around the C++ bindings. In theory, that would let you use OpenCL from any language SWIG supports: Python, Java, C#, etc. We have extensive SWIG experience we can contribute.

Hi,

I can’t find any methods for creating a buffer from an OpenGL renderbuffer or texture. Do they not exist, or did I just not find them?
If they don’t exist, when will they be added?

They can be found at:
http://www.khronos.org/registry/cl/
in particular:
http://www.khronos.org/registry/cl/api/1.0/cl_gl.h
http://www.khronos.org/registry/cl/extensions/khr/cl_khr_gl_sharing.txt

But you may still need to wait a while longer for NVidia/ATI to implement these fully.

I found them there. I meant that I didn’t find them in the C++ bindings.
Will they be added in the near future?
How can I look up what the NVIDIA/ATI drivers support so far (e.g., where can I see which OpenCL extensions they support)?

You can query a device to see what extensions it supports. If cl/gl sharing is not listed, then it won’t support those extensions. I know those are supported on the Mac, but I’m not sure about AMD/Nvidia.
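As a rough sketch of the check described above: in the C API the extension list is the space-separated string returned by clGetDeviceInfo with CL_DEVICE_EXTENSIONS. The helper below (hasExtension is a made-up name, not part of any binding) only shows how to test that string for a whole-token match, so that a longer extension name sharing the same prefix is not mistaken for cl_khr_gl_sharing. It assumes the string has already been fetched.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `name` appears as a whole token in a
// space-separated extension list such as the one a device reports.
bool hasExtension(const std::string& extensionList, const std::string& name)
{
    std::istringstream tokens(extensionList);
    std::string token;
    while (tokens >> token) {
        if (token == name) {
            return true;  // exact match only, no substring matches
        }
    }
    return false;
}
```

With a real device you would pass the queried string as the first argument and "cl_khr_gl_sharing" as the second.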

Any update on this, being end of October now?

First of all, thanks for the C++ bindings. Being new to the bindings, I was looking at Platform::get() and assumed it accepted NULL in order to query just the number of available platforms. But when NULL is passed, it crashes, because the pointer is not checked. Querying just the number of platforms is probably not very useful after all, but in that case I’d recommend making the argument a reference instead of a pointer, to emphasize that the argument is mandatory and NULL is not a valid value. In general, I highly recommend to “Use references when you can, and pointers when you have to.”, see [1].

[1] “When should I use references, and when should I use pointers?”
http://www.parashift.com/c++-faq-lite/r … ml#faq-8.6

BTW, I’m aware now that my assumption was complete nonsense, as get() does not return the number of available platforms, but an int to indicate the success of the call. But because the static get() method is undocumented, I did not realize this until I looked at the source code =:-)

It is a good point about references over pointers. The reason pointers are used in certain places is that they map directly to arguments that are pointers in the underlying C model. In general, I agree that references should be used, but I do not particularly like the fact that C++ allows an argument to be modified without any indication of it at the call site. I understand the argument about the NULL pointer and will think more about this.
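To make the trade-off in this exchange concrete, here is a minimal sketch with two hypothetical functions (getIdsByPointer and getIdsByReference, along with the error codes, are illustrative names and not the signatures used in cl.hpp). The pointer version shows the output argument at the call site through the & but has to reject NULL at run time. The reference version cannot be handed NULL but looks like an ordinary by-value call.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical error codes for this sketch only.
const int kSuccess = 0;
const int kInvalidValue = -1;

// Pointer style: callers write getIdsByPointer(&ids), so the side effect is
// visible at the call site, but NULL must be checked explicitly.
int getIdsByPointer(std::vector<int>* ids)
{
    if (ids == NULL) {
        return kInvalidValue;
    }
    ids->assign(2, 7);  // fill with two placeholder IDs
    return kSuccess;
}

// Reference style: NULL cannot be passed at all, but the call
// getIdsByReference(ids) does not reveal that ids is modified.
int getIdsByReference(std::vector<int>& ids)
{
    ids.assign(2, 7);  // fill with two placeholder IDs
    return kSuccess;
}
```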

We have C++ bindings for the GL extensions that we will be releasing in the near future.

I assume you mean CL extensions?

cl_extension for OpenGL context sharing :wink:

Can you announce it here when you release them?

thanks

Yes I did mean CL :slight_smile:

I will be sure to comment here when they go up.

Hi Ben, I’m really enjoying reading the C++ bindings header - great job!

I’m also trying to learn from it how to use the C API properly. One thing I notice is that in cl::Platform::get() you call clGetPlatformIDs twice: first to get the number of available platforms, and then to read them. The first call has the following parameters: (num_entries = 0, *platforms = NULL, *num_platforms = &n). According to Section 4.1 of the standard:

"The number of OpenCL platforms returned is the mininum of the value specified by num_entries or the number of OpenCL platforms available."

But if the number of returned platforms is the minimum, then it should be 0, not 1 as returned both by AMD’s and NVIDIA’s implementations. So where’s the bug – in the standard or in the code? :slight_smile:

I guess the bug is with you :slight_smile: The “number of OpenCL platforms returned” refers to the number of platforms written to the “platforms” pointer, not to the integer returned in “num_platforms”.
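The distinction can be sketched with a stand-in function. fakeGetIds below is hypothetical and only imitates the two behaviours discussed here: it writes at most num_entries items into the output array, and it always reports the total available count through the last argument. It does not model the error checks of the real call.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a C-style query; pretends three IDs exist.
int fakeGetIds(unsigned numEntries, int* ids, unsigned* numIds)
{
    static const int available[] = {10, 20, 30};
    const unsigned total = 3;
    if (ids != 0) {
        // Number of entries written: min(numEntries, total).
        const unsigned toWrite = std::min(numEntries, total);
        std::copy(available, available + toWrite, ids);
    }
    if (numIds != 0) {
        *numIds = total;  // always the full count, independent of numEntries
    }
    return 0;
}

// Two-call idiom: ask for the count first, then fetch the entries.
std::vector<int> queryAllIds()
{
    unsigned count = 0;
    fakeGetIds(0, 0, &count);           // writes nothing, reports the count
    std::vector<int> ids(count);
    if (count > 0) {
        fakeGetIds(count, &ids[0], 0);  // fills the vector
    }
    return ids;
}
```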

eyebex, you are right. But still there is a bug - a typo in the word “mininum” :).

P.S. I don’t like when an API call can serve two separate functions: in this case, to get the number of available platforms and to get the platforms proper. But I see how it reduces the total number of API functions, which is a good thing.

I have recently started using OpenCL via the C++ bindings, which are of great help. I have been trying to pass int2 arguments to a kernel and found that the most obvious way to do it does not work with the C++ bindings.

const char* OpenCLSource =
 "__kernel void test1(__global uint* out, int2 a){"
 "  unsigned int i = get_global_id(0);"
 "  out[i] = (i % 2) ? a.y : a.x;}";  // int2 components are .x/.y, not a[...]
int main(void) {
  cl_int2  a = {0,1};
  const int sz = 128;
  cl::Buffer _out( context,CL_MEM_WRITE_ONLY, sz*sizeof(unsigned int) );
  // Setup code omitted for brevity..............
  cl::KernelFunctor func = kernel.bind(queue,cl::NDRange(sz),cl::NDRange()   );
  func(_out, a).wait(); // Generates incorrect calls to C API:
  //   Currently: clSetKernelArg(object_, 1, sizeof(int*), &a);
  //   Should be: clSetKernelArg(object_, 1, sizeof(int[2]), a );
}

The workaround is to use

kernel.setArg(1,sizeof(cl_int2), a);

But the first method is cleaner and is the default way to do it unless you know better. I have implemented a fix, which I will post in the next message.
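The size mismatch above looks like ordinary array-to-pointer decay, assuming cl_int2 is an array typedef, as the sizeof(int[2]) comment suggests. The following stand-alone sketch (the function names are made up for illustration and are not part of the bindings) shows what a template sees when an array is passed by value versus by const reference.

```cpp
#include <cstddef>

// By value: an array argument decays, so T is deduced as a pointer type
// and sizeof(T) is the size of a pointer.
template <typename T>
std::size_t sizeSeenByValue(T)
{
    return sizeof(T);
}

// By const reference: T is deduced as the array type itself,
// so sizeof(T) is the size of the whole array.
template <typename T>
std::size_t sizeSeenByReference(const T&)
{
    return sizeof(T);
}
```

For int a[2], the first function reports sizeof(int*) and the second reports sizeof(int[2]).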

As described in my previous message, fixed-size vector arguments are not handled correctly. To support correct handling of arguments of type float2, int4, etc., I propose the following change:


--- cl_orig.hpp	Thu Dec 17 16:21:25 2009
+++ cl.hpp	Thu Dec 17 15:40:17 2009
@@ -2990,10 +2991,39 @@
 template <typename T>
 struct KernelArgumentHandler
 {
-    static ::size_t size(const T& value) { return sizeof(T); }
-    static T* ptr(T& value) { return &value; }
+  static ::size_t size(const T& value)   { return sizeof(T); }
+  static void const* ptr(T const& value) { return &value;    }
 };
 
+template <typename T>
+struct KernelArgumentHandler< T[2] >
+{
+  static ::size_t size(const T value[2])   { return sizeof(T[2]); }
+  static void const* ptr(const T value[2]) { return value;        }
+};
+
+template <typename T>
+struct KernelArgumentHandler< T[4] >
+{
+  static ::size_t size(const T value[4])   { return sizeof(T[4]); }
+  static void const* ptr(const T value[4]) { return value;        }
+};
+
+template <typename T>
+struct KernelArgumentHandler< T[8] >
+{
+  static ::size_t size(const T value[8])   { return sizeof(T[8]); }
+  static void const* ptr(const T value[8]) { return value;        }
+};
+
+template <typename T>
+struct KernelArgumentHandler< T[16] >
+{
+  static ::size_t size(const T value[16])   { return sizeof(T[16]);}
+  static void const* ptr(const T value[16]) { return value;        }
+};
+
+
 template <>
 struct KernelArgumentHandler<LocalSpaceArg>
 {
@@ -3225,7 +3255,7 @@
      * generated.
      */
     template <typename T>
-    cl_int setArg(cl_uint index, T value)
+    cl_int setArg(cl_uint index, T const & value)
     {
         return detail::errHandler(
             ::clSetKernelArg(

Basically, I define template specialisations for the data types T[2], T[4], T[8] and T[16] for any basic type T. Also, with my compiler (Microsoft ® 32-bit C/C++ Optimizing Compiler Version 13.10.3052 for 80x86), I had to change

Kernel::setArg(index, T value)   // TO
Kernel::setArg(index, T const & value)

for this scheme to work, but I think this might be a more proper way of doing it anyway.
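As a possible variation on the patch, and not something the patch itself does, a single partial specialisation over T[N] can cover every array length at once. The sketch below is self-contained and uses hypothetical names (ArgHandler, argSize, argPtr) instead of the real KernelArgumentHandler. It assumes a compiler that handles partial specialisation on array types with a deduced length. Once the argument is taken by const reference, the primary template would already give the same size and address for an array, so the specialisation here mostly documents the intent.

```cpp
#include <cstddef>

// Hypothetical stand-in for the handler in the patch; not the cl.hpp code.
template <typename T>
struct ArgHandler
{
    static std::size_t size(const T&) { return sizeof(T); }
    static const void* ptr(const T& value) { return &value; }
};

// One partial specialisation for every fixed-size array T[N].
template <typename T, std::size_t N>
struct ArgHandler<T[N]>
{
    static std::size_t size(const T (&)[N]) { return sizeof(T[N]); }
    static const void* ptr(const T (&value)[N]) { return value; }
};

// Taking the argument by const reference keeps array types intact,
// mirroring the setArg(index, T const & value) change.
template <typename T>
std::size_t argSize(const T& value) { return ArgHandler<T>::size(value); }

template <typename T>
const void* argPtr(const T& value) { return ArgHandler<T>::ptr(value); }
```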