Question about HW/SW platform selection


I have finished my OpenCL code on an Intel CPU and verified its correctness with AMD APP and the Intel OpenCL tool set.
Now I want to run the code on a GPU platform to see how well it performs. My questions are:

  1. For a complicated kernel (relatively large code size, with heavy use of local and private memory), which performs better: an Nvidia or an AMD GPU?
  2. In terms of the convenience and completeness of the development tools and the available support resources, which is better: Nvidia or AMD?
  3. I have heard that CUDA performs better than OpenCL on Nvidia GPUs because Nvidia favors CUDA over OpenCL. Is that true? Does it mean that AMD pays more attention to OpenCL and offers better tools?
  4. Can anybody recommend a graphics card with a good performance-to-price ratio?

Thank you so much!



Given that the latest stuff from Nvidia has dropped GPGPU tuning in favour of graphics performance, AMD seems like the only choice now: Nvidia is pushing compute users towards its high-end, high-price parts. Nvidia obviously considered GPGPU not mainstream enough to warrant the overheads, while AMD has no other option for making its CPUs faster (its long-term HSA stuff). Actually, as far as OpenCL goes, AMD has been the likely choice for a while, since Nvidia only seems to have it as a bullet point on the box and only ever talks about CUDA.

Both have equally shitty tools! Actually they are OK, but it depends on what your expectations are; some people seem to expect way too much. AMD people frequent the AMD forums; Nvidia people don't seem to frequent theirs (but I haven't looked for a while).

Question 3 just doesn't make sense: CUDA and OpenCL C are much more alike than different. Apart from that, on Nvidia both are translated into the same intermediate representation (PTX) and both run on the same hardware. One could probably choose an algorithm that performs badly on one or the other, but they're essentially the same thing.
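To show how close the two languages are, here is a minimal sketch of a single kernel source that builds as OpenCL C, as CUDA, or as plain C for a quick host-side check. The macro names (`KERNEL`, `GLOBAL_PTR`, `GLOBAL_ID`) and the helper `run_saxpy_on_host` are made up for this post and do not come from either vendor's SDK; the host branch only emulates one work-item at a time and says nothing about GPU performance.

```c
/* Hypothetical portability macros (illustration only, not a vendor API).
   They cover the few keywords that differ between OpenCL C and CUDA. */
#if defined(__OPENCL_VERSION__)            /* built by an OpenCL compiler */
  #define KERNEL      __kernel
  #define GLOBAL_PTR  __global
  #define GLOBAL_ID() ((size_t)get_global_id(0))
#elif defined(__CUDACC__)                  /* built by nvcc */
  #include <stddef.h>
  #define KERNEL      __global__
  #define GLOBAL_PTR
  #define GLOBAL_ID() ((size_t)(blockIdx.x * blockDim.x + threadIdx.x))
#else                                      /* plain C: emulate work-items serially */
  #include <stddef.h>
  static size_t host_global_id;
  #define KERNEL      static
  #define GLOBAL_PTR
  #define GLOBAL_ID() (host_global_id)
#endif

/* y[i] = a * x[i] + y[i]; the body is identical for all three targets. */
KERNEL void saxpy(GLOBAL_PTR float *y, GLOBAL_PTR const float *x,
                  float a, unsigned int n)
{
    size_t i = GLOBAL_ID();
    if (i < n)                 /* guard against extra work-items/threads */
        y[i] = a * x[i] + y[i];
}

#if !defined(__OPENCL_VERSION__) && !defined(__CUDACC__)
/* Host-only helper (hypothetical): runs the kernel once per emulated work-item. */
static void run_saxpy_on_host(float *y, const float *x, float a, unsigned int n)
{
    for (host_global_id = 0; host_global_id < n; ++host_global_id)
        saxpy(y, x, a, n);
}
#endif
```

Only the qualifiers and the index lookup change between targets. Launching the kernel (`clEnqueueNDRangeKernel` on the OpenCL side, the `<<<...>>>` syntax on the CUDA side) is where the host code really differs, and that part is not shown here.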

similar question:

benchmarks of every card under the sun (OpenCL benchmarks are lower down; not great, but better than nothing): … s,135.html

On paper and in benchmarks, the 7xxx series (GCN) is better at more complex code than the earlier generations, although I don't yet have first-hand experience with it (will soon, though).