How can I split my workload on a GPU with OpenCL?

I know my question is simple, but I am confused about something.

The difference between a work-group and a wavefront is not clear to me.

First, I have the global work size, which is the size of my workload and, in principle, the number of kernel instances (work-items).

The local work size is the number of kernel instances on each compute unit, so there is one work-group per compute unit.

Is all of the above correct?

But when I read ATI Stream Computing OpenCL, I came across the concept of a wavefront. It explains that a wavefront is a set of work-items, and that up to 4 work-items can be pipelined in a stream core for processing. It also says that the wavefront size is specific to each GPU board configuration.

I think each wavefront goes to one compute unit. So what is the function of a work-group? Where do I send a work-group?
If a work-group is a group of wavefronts, it is not clear to me how I should split my workload on a GPU.

I think my problem is a conceptual misunderstanding.

I know my question is simple, but it is not clear to me.



Hi Luiz,

Standard OpenCL doesn’t have the notion of a “wavefront”, also called a “warp” in NVidia’s terms. Knowing about warps is important when you are writing high-performance applications, but it’s not necessary to worry about them as a beginner in my opinion.

The global work size represents the total amount of work that you want to execute in parallel. The local work size, also called the “work-group size”, represents how the global work size is divided into smaller pieces. Work-groups are important because all work-items inside the same work-group can communicate efficiently using local memory and local atomics. Local memory cannot be used to communicate between two different work-groups.
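To make the relationship concrete, here is a minimal sketch in plain Python (a model of the index arithmetic, not OpenCL API calls) of how a one-dimensional global work size is cut into work-groups. The function names `num_work_groups` and `decompose` are made up for illustration. The sketch assumes OpenCL 1.x rules, where the global size must be an exact multiple of the local size, and no global offset, in which case the kernel built-ins satisfy `get_global_id(0) == get_group_id(0) * get_local_size(0) + get_local_id(0)`.

```python
def num_work_groups(global_size, local_size):
    """Number of work-groups in a 1-D NDRange.

    Assumes OpenCL 1.x rules: the global size must be an exact
    multiple of the local (work-group) size.
    """
    if global_size % local_size != 0:
        raise ValueError("global size must be a multiple of local size")
    return global_size // local_size


def decompose(global_id, local_size):
    """Split a global id into (group_id, local_id), assuming no global offset."""
    return divmod(global_id, local_size)
```

For example, with a global size of 1024 and a local size of 64, `num_work_groups(1024, 64)` gives 16 work-groups, and `decompose(130, 64)` gives `(2, 2)`: work-item 130 is the third work-item of the third work-group.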

My advice is to think only in terms of work-groups and local memory. Once you become familiar with these two things and how to use them effectively, it will be a good time to read more about warps/wavefronts.

Thanks, David, for your help.

My problem is that I have to write high-performance code for my master's
degree, so I really do have to understand these wavefront concepts.

Do you know where I can find a good reference on wavefronts?

I want to ask about something you said in your previous message.
Can I associate a work-group with a compute unit? A work-item with a
stream processor?

Many thanks,

Luiz Drumond.

Before we run, we need to learn how to walk :)

NVidia’s OpenCL and CUDA guides talk about warps; you may want to give them a look. (Again, I wouldn’t worry about warps at this time)
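As a rough illustration of why the warp/wavefront size eventually matters for performance, the sketch below counts how many wavefronts a work-group occupies and how many hardware lanes sit idle when the work-group size is not a multiple of the wavefront size. This is a plain-Python model, not an OpenCL API. The function names are illustrative, and the sizes used in the example (64 and 32) are assumptions: 64 is a common wavefront size on AMD GPUs and 32 is the warp size on NVIDIA GPUs, but the real value depends on the device.

```python
def wavefronts_per_group(local_size, wavefront_size):
    """Wavefronts (or warps) needed to cover one work-group, rounded up."""
    return (local_size + wavefront_size - 1) // wavefront_size


def idle_lanes(local_size, wavefront_size):
    """Lanes left without a work-item in the last, partially filled wavefront."""
    used = wavefronts_per_group(local_size, wavefront_size) * wavefront_size
    return used - local_size
```

Under these assumptions, a work-group of 256 work-items fills exactly 4 wavefronts of 64 with no idle lanes, while a work-group of 100 work-items needs 2 wavefronts and leaves 28 lanes idle. That is the usual reason for choosing a work-group size that is a multiple of the wavefront size.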

Can I associate a work-group with a compute unit?

No. You can only assign a whole NDRange to a device. The OpenCL implementation will take care of assigning work-groups to the available compute units inside the device.

A work-item to a stream processor?

No, for the same reason. On a GPU, at least, this kind of scheduling is done in hardware.
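To picture what "the implementation takes care of it" means, here is a toy model in plain Python. It is not how any particular OpenCL implementation schedules work: the round-robin policy and the name `toy_schedule` are assumptions made purely for illustration, and real devices typically hand out work-groups dynamically as compute units become free. The only points it is meant to show are that each work-group runs on a single compute unit and that one compute unit may end up running several work-groups.

```python
from collections import defaultdict


def toy_schedule(num_groups, num_compute_units):
    """Toy model: hand out work-groups to compute units round-robin.

    Hypothetical policy for illustration only; the application never
    chooses this mapping, the OpenCL implementation or hardware does.
    """
    assignment = defaultdict(list)
    for group_id in range(num_groups):
        assignment[group_id % num_compute_units].append(group_id)
    return dict(assignment)
```

With 6 work-groups and 4 compute units, this toy policy gives `{0: [0, 4], 1: [1, 5], 2: [2], 3: [3]}`. From the host side, all you control is the NDRange (global and local sizes); the mapping itself is not visible to your code.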