Dive into Registers

Hi there,

I’m still not very sure about registers. A small example:

__kernel void foo(const float param)
{
  uint id = get_global_id(0);
  float val = param * id;
}

Ignoring that this code will be optimized away completely, how many registers would this kernel need? Two for the two internal values? Three for the internals plus the constant parameter?
I know it is not that simple, because many operations need additional registers, and I think even get_global_id needs some (judging by the asm generated when compiling that kernel).
And what about param? Is it in constant memory (and therefore cached global memory with register speed) or somewhere local? And if I have 30 kernel parameters, will they all fit into that same cached memory?
Last question: does the compiler recycle registers? Does it check whether a value won't be needed anymore, so that its register can be reused for further local variables?

Thanks to all for any help.

The answer to this question depends entirely on the hardware vendor and the compiler they ship with their OpenCL implementation. The vendors are free to allocate registers using whatever scheme they choose.
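
On the register recycling question specifically: the details are vendor-specific, but reusing a register once the value in it is no longer needed is standard practice in optimizing compilers. Below is a toy sketch in plain C of the underlying idea (counting how many values are live at the same time). It is not any vendor's actual allocator, and the instruction sequence for foo, the names LiveRange, max_live and foo_register_pressure, and the extra store are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* A value's live range: index of the instruction that defines it and
   index of the last instruction that reads it. */
typedef struct {
    const char *name;
    int def;
    int last_use;
} LiveRange;

/* Peak number of values that are live at the same time in a straight-line
   program of n_instr instructions. A value is live at the point after
   instruction p if it was defined at or before p and is read later. */
int max_live(const LiveRange *ranges, size_t n, int n_instr)
{
    assert(n_instr > 0);
    int peak = 0;
    for (int p = 0; p < n_instr; ++p) {
        int live = 0;
        for (size_t i = 0; i < n; ++i) {
            if (ranges[i].def <= p && ranges[i].last_use > p)
                ++live;
        }
        if (live > peak)
            peak = live;
    }
    return peak;
}

/* Hypothetical lowering of foo (assumed, not real compiler output):
     0: param = load kernel argument
     1: id    = get_global_id(0)
     2: idf   = convert id to float
     3: val   = param * idf
     4: store val   (added so that val is not dead code) */
int foo_register_pressure(void)
{
    static const LiveRange ranges[] = {
        { "param", 0, 3 },
        { "id",    1, 2 },
        { "idf",   2, 3 },
        { "val",   3, 4 },
    };
    return max_live(ranges, sizeof ranges / sizeof ranges[0], 5);
}
```

Under these assumptions foo_register_pressure() returns 2: there are four values, but id is dead once idf exists, and val can take over a register that param or idf no longer needs. A real compiler also has to handle branches, loops, address computations and hardware-specific constraints, so the count it reports can differ.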

:frowning: :frowning: :frowning:
Sad to hear. But I think on NVIDIA it should work the same way as in CUDA, am I right?