I’m puzzled about wavefronts after reading some material, and I have two questions.
(1) If a single compute unit contains four SIMD units (16 lanes each) and more than 64 work-items are running on that compute unit, does the wave scheduler spread those work-items across all four SIMDs, or place them on just one SIMD? Depending on the answer, is the wavefront size 16 or 64?
(2) If two kernels are running on a GPU, can threads that belong to different kernels (submitted through different command queues) run on the same compute unit simultaneously?
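To make question (1) concrete, here is a small sketch of the arithmetic behind the readings I am trying to choose between. It is only my own illustration: the function names `wavefront_count` and `issue_passes` are made up, the work-group size of 256 is an arbitrary example, and the code says nothing about which reading the hardware actually implements.

```python
import math

LANES_PER_SIMD = 16   # from the question: each SIMD has 16 lanes
SIMDS_PER_CU = 4      # from the question: four SIMDs per compute unit


def wavefront_count(work_items, wavefront_size):
    """Number of wavefronts needed to cover the given work-items."""
    return math.ceil(work_items / wavefront_size)


def issue_passes(wavefront_size, lanes):
    """Passes needed to push one wavefront through the given number of lanes."""
    return math.ceil(wavefront_size / lanes)


work_items = 256  # hypothetical example size, more than 64

# Reading A: a wavefront is 16 work-items and fits one SIMD in one pass.
print("A:", wavefront_count(work_items, 16), "wavefronts,",
      issue_passes(16, LANES_PER_SIMD), "pass each")

# Reading B: a wavefront is 64 work-items spread over all four SIMDs at once.
print("B:", wavefront_count(work_items, 64), "wavefronts,",
      issue_passes(64, LANES_PER_SIMD * SIMDS_PER_CU), "pass each")

# Reading C: a wavefront is 64 work-items on one SIMD, issued over several passes.
print("C:", wavefront_count(work_items, 64), "wavefronts,",
      issue_passes(64, LANES_PER_SIMD), "passes each")
```

So what I am really asking is which of A, B, or C (if any) matches what the wave scheduler does.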
Thanks in advance for your help!