Limiting number of compute units?

llaves · March 7, 2010, 12:28pm

Can I limit the number of compute units assigned to my kernel? For example, the device I choose might report 8 compute units, but I want only 4 to be dedicated to my job. I suppose launching only four work-groups would do it, but many algorithms cannot easily be made to fit a fixed number of work-groups.

dominik · March 7, 2010, 3:12pm

I don’t think there’s an official way to do this. The only way I can think of is to limit the number of work-groups, as you said.

I think this is an interesting question with regard to the new NVIDIA Fermi architecture, which allows you to execute multiple kernels concurrently. Will there be a way of influencing the mapping of compute units to kernels, or is this entirely managed by the hardware scheduler?

dbs2 · March 9, 2010, 10:14am

The only way to limit this is to launch a total of only 4 work-groups (i.e., global work size = 4 * work-group size). Since each work-group runs on a single compute unit, this guarantees that you use at most 4. However, there is no way to launch a large number of work-groups and specify that they use only a certain part of the device. As new devices with multi-kernel capabilities come out, the standard will have to evolve to enable priorities or some more sophisticated form of resource management.
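
To make the "4 work-groups" approach concrete, here is a minimal sketch rather than a definitive implementation. It assumes a 1-D problem; the kernel name `scale`, the helper names `capped_global_size` and `items_covered`, and the doubling operation are all made up for illustration. The kernel loops with a stride of the global size, so a fixed number of work-groups can still cover an arbitrary element count. The host-side emulation is only there to check that the stride loop visits every element exactly once.

```c
#include <assert.h>
#include <stddef.h>

/* OpenCL C source for a hypothetical kernel "scale". Each work-item walks
   the buffer with a stride of the global size, so the launch can use a
   fixed number of work-groups regardless of n. */
static const char *kernel_src =
    "__kernel void scale(__global float *data, const unsigned int n)\n"
    "{\n"
    "    for (size_t i = get_global_id(0); i < n; i += get_global_size(0))\n"
    "        data[i] *= 2.0f;\n"
    "}\n";

/* Global work size that yields exactly max_groups work-groups
   (hypothetical helper, not part of the OpenCL API). */
static size_t capped_global_size(size_t max_groups, size_t local_size)
{
    assert(local_size > 0);
    return max_groups * local_size;
}

/* Host-side emulation of the kernel's stride loop: counts how many
   elements are visited in total by all work-items. */
static size_t items_covered(size_t n, size_t global_size)
{
    size_t covered = 0;
    for (size_t gid = 0; gid < global_size; ++gid)
        for (size_t i = gid; i < n; i += global_size)
            ++covered;
    return covered;
}

/* On the host, the launch would then look roughly like this
   (queue and kernel are assumed to be set up already):

       size_t local  = 64;
       size_t global = capped_global_size(4, local);   // 256 work-items
       clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
                              &global, &local, 0, NULL, NULL);
*/
```

Note that this only caps how many compute units can be busy at once. It does not let you pick which ones, and the runtime is still free to use fewer than 4.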