Noob question here.
I have some existing CUDA-based code (PyTorch and some custom math). I have a new algorithm that adds to that existing CUDA code. The new algorithm needs to be reusable on both a CPU and a GPU. My understanding is that OpenCL code can run on both CPUs and GPUs. If that is true, can the new OpenCL code be compiled into the same process as the existing CUDA code and run on the same GPU? Our goal is to get all of this code running entirely on the GPU for maximum speed in machine learning training.
So, can OpenCL code coexist with CUDA code in the same process and still run fast and efficiently? By that I mean: would it be faster than running the OpenCL code on the CPU and passing data to the CUDA code on the GPU each iteration?