OpenCL GPGPU simplification

Hi all,
If anyone’s interested in taking a look and giving some feedback, I wrote an extension to the OpenCL API to reduce overhead and simplify kernel calling. The project’s on GitLab; you can find it under noam_abadi/cl_simple.

I’m not a professional programmer, but I use parallel computing for simulations and I’m quite happy with this, so I wanted to get other people’s opinions on what’s missing, what could be removed, and what’s just plain bad. I don’t know if this is the right place to post it, but it seemed like a good start (sorry if I’m out of place here).

Kind regards and thank you,

Thank you for your work. I am also new to this.

You’re very welcome! I’m glad it’s useful.

I looked at your repo. I’m not familiar with GitLab, so I searched for your username, Noam Abadi, in the search bar under the users column. I only had a cursory glance, but in my humble opinion, and speaking as a fairly experienced LaTeX user, I would still recommend Markdown as the documentation format.

I also have a problem I’d like help with. On Windows 10, my NVIDIA GPU can run my OpenCL programs, but how do I profile them? Can I use NVIDIA Nsight Compute or nvprof? I want to measure the GFLOPS the program achieves while it is running.

As far as I understand, NVIDIA GPUs should be able to run OpenCL code, but I’ve never tried it. I wrote this on a computer with an Intel UHD 620 integrated GPU. Recently I contacted a friend who has an NVIDIA card to start testing with him, but we haven’t tried this out yet. I am not familiar with Nsight or nvprof. I used the latter only once and I think it’s specific to CUDA code, so I’m not sure you’d be able to use it with OpenCL. As for GFLOPS performance, that really depends on your GPU’s characteristics. From a quick search, as a rough reference, many NVIDIA GPUs seem to have ~1000 cores running at ~1000 MHz, which means each core can do ~1 GOPS and the whole GPU peaks at roughly 1000 GFLOPS in theory. Real programs usually land well below that peak, so you’d need to push it a little to get close, I think (an RTX 3080 or above has more headroom).
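To make the estimate above concrete, here is a minimal sketch in plain C of the two numbers involved: the theoretical peak (cores × clock × operations per cycle) and the throughput a kernel actually achieves (operations performed divided by elapsed time). The function names `peak_gflops` and `achieved_gflops` are made up for illustration and are not part of cl_simple or OpenCL. The sketch assumes you already have start and end timestamps in nanoseconds; in OpenCL these can be read with `clGetEventProfilingInfo` (`CL_PROFILING_COMMAND_START` and `CL_PROFILING_COMMAND_END`) on a queue created with `CL_QUEUE_PROFILING_ENABLE`, and you have to count the floating-point operations of your kernel yourself.

```c
#include <stdint.h>

/* Theoretical peak in GFLOPS: cores x clock (GHz) x flops per core per cycle.
   flops_per_cycle is an assumption you choose: 1.0 for the rough estimate
   above, 2.0 if a fused multiply-add is counted as two operations. */
double peak_gflops(double cores, double clock_mhz, double flops_per_cycle)
{
    return cores * (clock_mhz / 1000.0) * flops_per_cycle;
}

/* Achieved GFLOPS for one kernel run.
   flop_count: floating-point operations the kernel performed (counted by hand).
   start_ns, end_ns: timestamps in nanoseconds.
   1 GFLOPS = 1e9 flop/s = 1 flop/ns, so flops per nanosecond is already GFLOPS. */
double achieved_gflops(double flop_count, uint64_t start_ns, uint64_t end_ns)
{
    if (end_ns <= start_ns) {
        return 0.0; /* invalid or empty interval: no meaningful rate */
    }
    return flop_count / (double)(end_ns - start_ns);
}
```

For example, `peak_gflops(1000, 1000, 1.0)` reproduces the ~1000 GFLOPS figure, and an N×N matrix multiplication performs about 2·N³ floating-point operations, which is the `flop_count` you would pass for that kernel. If the test program runs for only a few microseconds, timing many repetitions and averaging gives a more stable number.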

I found a program called CLtracer, but it’s hard to say whether it’s usable or useful. Maybe my test program’s footprint is too small to be captured. Thanks for the info.
