OpenCL C++ Bindings

oddhack · September 2, 2009, 8:38pm

I’ve asked internally if we can open up the entire Khronos Registry area to public RO Subversion access, like we already have with the OpenGL Registry. It will take a few weeks to get people inside Khronos engaged on the discussion and come to a conclusion. I’m hopeful of a yes answer towards the end of September.

duanmu · September 22, 2009, 10:41am

Great job! However, Program::getInfo() does not work (e.g., CL_PROGRAM_BINARY_SIZES) if the context has multiple devices.

coleb · October 1, 2009, 5:06pm

[quote="oddhack"]I’ve asked internally if we can open up the entire Khronos Registry area to public RO Subversion access, like we already have with the OpenGL Registry. It will take a few weeks to get people inside Khronos engaged on the discussion and come to a conclusion. I’m hopeful of a yes answer towards the end of September.[/quote]

I’m looking forward to this. I would like to start a project that provides SWIG bindings around the C++ bindings. Then you could, in theory, use OpenCL from any language SWIG supports: Python, Java, C#, etc. We have extensive SWIG experience that we can contribute.

system · October 15, 2009, 10:48am

Hi,

I can’t find any methods for creating a buffer from an OpenGL renderbuffer or texture. Do they not exist, or did I just miss them?
If they don’t exist, when will they be added?

danbartlett · October 15, 2009, 10:58am

They can be found at:
http://www.khronos.org/registry/cl/
in particular:
http://www.khronos.org/registry/cl/api/1.0/cl_gl.h
http://www.khronos.org/registry/cl/extensions/khr/cl_khr_gl_sharing.txt

But you may still need to wait a while longer for NVIDIA/ATI to implement these fully.

system · October 15, 2009, 11:36am

I found them there. I meant that I didn’t find them in the C++ bindings.
Will they be added in the near future?
How can I look up what the NVIDIA/ATI drivers support so far (e.g. where can I see which OpenCL extensions they support)?

dbs2 · October 15, 2009, 11:57pm

You can query a device to see which extensions it supports. If CL/GL sharing is not listed, then the device does not support it. I know it is supported on the Mac, but I’m not sure about AMD/NVIDIA.
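As a rough sketch of that check: the C API reports extensions as one space-separated string (queried with clGetDeviceInfo and CL_DEVICE_EXTENSIONS), so the test boils down to a whole-token search. The helper name `hasExtension` and the sample string below are made up for illustration; the actual query call is left out so the snippet does not depend on an OpenCL installation.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns true if `name` appears as a whole token in a space-separated
// extension list (the format used by CL_DEVICE_EXTENSIONS).
bool hasExtension(const std::string& extensions, const std::string& name)
{
    std::istringstream tokens(extensions);
    std::string token;
    while (tokens >> token) {
        if (token == name) return true;
    }
    return false;
}

// Usage with a made-up extension string; a real one comes from the driver.
void exampleExtensionCheck()
{
    const std::string exts = "cl_khr_byte_addressable_store cl_khr_gl_sharing";
    assert(hasExtension(exts, "cl_khr_gl_sharing"));
    assert(!hasExtension(exts, "cl_khr_gl"));  // a prefix is not a whole token
}
```

Comparing whole tokens rather than using a substring search avoids false positives when one extension name is a prefix of another.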

eyebex · October 27, 2009, 9:09am

Any update on this, now that it is the end of October?

eyebex · October 27, 2009, 9:17am

First of all, thanks for the C++ bindings. Being new to the bindings, I was looking at Platform::get() and assumed it accepted NULL as a way to query just the number of available platforms. But when NULL is passed, it crashes, because the pointer is not checked. It’s probably not very useful to query just the number of platforms after all, but in that case I’d recommend making the argument a reference instead of a pointer, to emphasize that the argument is mandatory and NULL is not a valid value. In general, I highly recommend following the advice “Use references when you can, and pointers when you have to.”, see [1].

[1] “When should I use references, and when should I use pointers?”
http://www.parashift.com/c++-faq-lite/r … ml#faq-8.6

eyebex · October 27, 2009, 10:28am

BTW, I’m aware now that my assumption was complete nonsense, as get() does not return the number of available platforms, but an int that indicates whether the call succeeded. But because the static get() method is undocumented, I did not realize this until I looked at the source code =:-)

bgaster · November 13, 2009, 2:56pm

That is a good point about pointers versus references. Pointers are used in certain places because they map directly to the pointers in the underlying C API. In general, I agree that references should be used, but I do not particularly like the fact that C++ allows an argument to be modified as a side effect without any indication at the call site. I understand the argument about the NULL pointer and will think more about this.
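To make the trade-off in this exchange concrete, here is a minimal sketch of both styles. The functions `fillByPointer` and `fillByReference`, and the error code -1, are hypothetical and are not taken from the bindings.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pointer version: NULL compiles, so the callee has to check for it.
int fillByPointer(std::vector<int>* out)
{
    if (out == NULL) return -1;  // hypothetical error code
    out->assign(3, 7);
    return 0;
}

// Reference version: there is no way to pass "nothing".
int fillByReference(std::vector<int>& out)
{
    out.assign(3, 7);
    return 0;
}

void exampleFill()
{
    std::vector<int> v;
    assert(fillByPointer(NULL) == -1);  // compiles; only caught at run time
    assert(fillByPointer(&v) == 0);     // '&' hints that v may be modified
    assert(fillByReference(v) == 0);    // no such hint at the call site
    assert(v.size() == 3);
}
```

The reference version rules out the NULL crash at compile time, while the pointer version keeps the visible `&v` at the call site, which is the property the second post values.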

bgaster · November 13, 2009, 2:59pm

We have C++ bindings for the GL extensions that we will be releasing in the near future.

eyebex · November 13, 2009, 3:12pm

I assume you mean CL extensions?

system · November 13, 2009, 3:24pm

The CL extension for OpenGL context sharing.

Can you announce it here when you release them?

Thanks

bgaster · November 13, 2009, 3:35pm

Yes, I did mean CL.

I will be sure to comment here when they go up.

anton · November 26, 2009, 3:28am

Hi Ben, I’m really enjoying reading the C++ bindings header - great job!

I’m also trying to learn from it how to use the C API properly. One thing I notice is that in cl::Platform::get() you call clGetPlatformIDs twice: first to get the number of available platforms, and then to read them. The first call has the following parameters: (num_entries = 0, *platforms = NULL, *num_platforms = &n). According to Section 4.1 of the standard:

"The number of OpenCL platforms returned is the mininum of the value specified by num_entries or the number of OpenCL platforms available."

But if the number of returned platforms is the minimum, then it should be 0, not 1 as returned both by AMD’s and NVIDIA’s implementations. So where’s the bug – in the standard or in the code?

eyebex · November 26, 2009, 3:43am

I guess the bug is with you. The “number of OpenCL platforms returned” refers to the number of platforms written to the “platforms” pointer, not to the integer returned in “num_platforms”.

anton · November 26, 2009, 3:52am

eyebex, you are right. But still there is a bug - a typo in the word “mininum” :).

P.S. I don’t like it when an API call serves two separate functions: in this case, getting the number of available platforms and getting the platforms themselves. But I see how it reduces the total number of API functions, which is a good thing.
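The distinction settled above (entries written versus the count reported) can be sketched with a stand-in function. `mockGetIDs` below is not an OpenCL call; it only imitates the shape of a query like clGetPlatformIDs, with two made-up IDs and 0 as an assumed success code, and it skips the argument validation a real implementation performs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a C query such as clGetPlatformIDs (illustration only).
int mockGetIDs(unsigned num_entries, int* ids, unsigned* num_ids)
{
    static const int available[] = {101, 202};
    const unsigned total = 2;
    if (ids != NULL) {
        // At most min(num_entries, total) entries are written.
        const unsigned n = std::min(num_entries, total);
        std::copy(available, available + n, ids);
    }
    if (num_ids != NULL) {
        *num_ids = total;  // the number available, not the number written
    }
    return 0;
}

// Two-call idiom: ask for the count, size the storage, then fetch.
std::vector<int> getAllIDs()
{
    unsigned n = 0;
    mockGetIDs(0, NULL, &n);
    std::vector<int> ids(n);
    if (n > 0) mockGetIDs(n, &ids[0], NULL);
    return ids;
}
```

With num_entries = 0 and a NULL output array nothing is written, yet the count still comes back as 2, which matches the behaviour reported for the AMD and NVIDIA implementations.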

kirill · December 16, 2009, 9:14pm

I have recently started using OpenCL via the C++ bindings, which are a great help. I have been trying to pass int2 arguments to a kernel and found that the most obvious way to do it does not work with the C++ bindings.

const char* OpenCLSource =
  "__kernel void test1(__global uint* out, int2 a){"
  "  unsigned int i = get_global_id(0);"
  "  out[i] = a[i%2];}";

int main(void) {
  cl_int2 a = {0, 1};
  const int sz = 128;
  cl::Buffer _out(context, CL_MEM_WRITE_ONLY, sz * sizeof(unsigned int));
  // Setup code omitted for brevity..............
  cl::KernelFunctor func = kernel.bind(queue, cl::NDRange(sz), cl::NDRange());
  func(_out, a).wait(); // Generates incorrect calls to C API:
  //   Currently:  clSetKernelArg(object_, 1, sizeof(int*), &a);
  //   Should be:  clSetKernelArg(object_, 1, sizeof(int[2]), a);
  return 0;
}

The workaround is to use

kernel.setArg(1,sizeof(cl_int2), a);

But the first method is cleaner and is what you would try by default unless you know better. I have implemented a fix, which I will post in the next message.

kirill · December 16, 2009, 9:40pm

As described in my previous message, fixed-size vector arguments are not handled correctly. To support correct handling of arguments of type float2, int4, etc., I propose the following change:


--- cl_orig.hpp	Thu Dec 17 16:21:25 2009
+++ cl.hpp	Thu Dec 17 15:40:17 2009
@@ -2990,10 +2991,39 @@
 template <typename T>
 struct KernelArgumentHandler
 {
-    static ::size_t size(const T& value) { return sizeof(T); }
-    static T* ptr(T& value) { return &value; }
+  static ::size_t size(const T& value)   { return sizeof(T); }
+  static void const* ptr(T const& value) { return &value;    }
 };
 
+template <typename T>
+struct KernelArgumentHandler< T[2] >
+{
+  static ::size_t size(const T value[2])   { return sizeof(T[2]); }
+  static void const* ptr(const T value[2]) { return value;        }
+};
+
+template <typename T>
+struct KernelArgumentHandler< T[4] >
+{
+  static ::size_t size(const T value[4])   { return sizeof(T[4]); }
+  static void const* ptr(const T value[4]) { return value;        }
+};
+
+template <typename T>
+struct KernelArgumentHandler< T[8] >
+{
+  static ::size_t size(const T value[8])   { return sizeof(T[8]); }
+  static void const* ptr(const T value[8]) { return value;        }
+};
+
+template <typename T>
+struct KernelArgumentHandler< T[16] >
+{
+  static ::size_t size(const T value[16])   { return sizeof(T[16]);}
+  static void const* ptr(const T value[16]) { return value;        }
+};
+
+
 template <>
 struct KernelArgumentHandler<LocalSpaceArg>
 {
@@ -3225,7 +3255,7 @@
      * generated.
      */
     template <typename T>
-    cl_int setArg(cl_uint index, T value)
+    cl_int setArg(cl_uint index, T const & value)
     {
         return detail::errHandler(
             ::clSetKernelArg(

Basically I define template specialisations for data types T[2],T[4],T[8] and T[16] for any basic type T. Also with my compiler (Microsoft ® 32-bit C/C++ Optimizing Compiler Version 13.10.3052 for 80x86) I had to change

Kernel::setArg(index, T value)   // TO
Kernel::setArg(index, T const & value)

for this scheme to work, but I think this might be a more proper way of doing it anyway.
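The reason the signature change matters can be shown without OpenCL: a function template that takes T by value deduces a pointer type for an array argument, while one that takes T const& keeps the array type, so the array specialisation gets picked. The sketch below is a variation on the patch, not the patch itself: `ArgHandler`, `sizeByValue`, `sizeByReference` and `int2_t` are hypothetical names, a single partial specialisation over any length N replaces the four explicit ones, and it assumes (as the posts above imply) that cl_int2 is an array typedef.

```cpp
#include <cassert>
#include <cstddef>

// Primary template, as in the patch: plain values are passed by address.
template <typename T>
struct ArgHandler
{
    static std::size_t size(const T&) { return sizeof(T); }
    static const void* ptr(const T& value) { return &value; }
};

// One partial specialisation for arrays of any length N.
template <typename T, std::size_t N>
struct ArgHandler<T[N]>
{
    static std::size_t size(const T (&)[N]) { return sizeof(T[N]); }
    static const void* ptr(const T (&value)[N]) { return value; }
};

// By value: an array argument decays, so T is deduced as a pointer type.
template <typename T>
std::size_t sizeByValue(T) { return sizeof(T); }

// By const reference: T keeps the array type, so the specialisation is used.
template <typename T>
std::size_t sizeByReference(const T& value) { return ArgHandler<T>::size(value); }

template <typename T>
const void* ptrByReference(const T& value) { return ArgHandler<T>::ptr(value); }

typedef int int2_t[2];  // hypothetical stand-in for an array-typedef'd cl_int2

void exampleArgHandler()
{
    int2_t a = {0, 1};
    assert(sizeByValue(a) == sizeof(int*));        // the size the bug passes
    assert(sizeByReference(a) == sizeof(int[2]));  // the size the kernel expects
    assert(ptrByReference(a) == static_cast<const void*>(a));
    int scalar = 5;
    assert(sizeByReference(scalar) == sizeof(int));
}
```

Generalising over N also accepts lengths that are not valid OpenCL vector widths, so listing 2, 4, 8 and 16 explicitly, as the patch does, is the stricter choice.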