Global memory alignment

In order to have coalesced access to global memory, memory addresses must increase sequentially across the work-items in the wavefront and start on a 128-byte alignment boundary.

My very newbie questions are: how are buffers created with clCreateBuffer aligned (and, in general, every argument to a kernel function)? Does the alignment also depend on the flags chosen at creation? Is there a way to check whether global memory accesses are coalesced on an AMD platform?


There’s no real way to control this alignment. Since the driver handles the data movement, allocation, and management, you can reasonably assume that global buffers will be suitably aligned for you. You are only responsible for aligning your own accesses as needed.
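
To make "aligning your accesses" concrete, here is a rough host-side sketch in plain C that checks the two conditions quoted in the question against a list of per-work-item byte offsets. It is not an AMD or OpenCL API: the names lane_offset, is_coalesced, and COALESCE_ALIGN are made up for illustration, the offsets are relative to the buffer start, and the buffer base itself is assumed to be 128-byte aligned.

```c
#include <stdbool.h>
#include <stddef.h>

/* Alignment boundary quoted in the question, in bytes. */
#define COALESCE_ALIGN 128u

/* Hypothetical helper: byte offset (relative to the buffer start) touched by
   work-item `lane` when each work-item reads one element of `elem_size`
   bytes, beginning at element index `first_elem`. */
static size_t lane_offset(size_t first_elem, size_t lane, size_t elem_size)
{
    return (first_elem + lane) * elem_size;
}

/* Hypothetical check: true when the offsets of `lanes` work-items
   (1) start on a COALESCE_ALIGN boundary and
   (2) increase sequentially by exactly one element per work-item.
   Assumes the buffer base address is itself COALESCE_ALIGN-aligned. */
static bool is_coalesced(const size_t *offsets, size_t lanes, size_t elem_size)
{
    if (lanes == 0)
        return true;
    if (offsets[0] % COALESCE_ALIGN != 0)
        return false;
    for (size_t i = 1; i < lanes; ++i) {
        if (offsets[i] != offsets[i - 1] + elem_size)
            return false;
    }
    return true;
}
```

For example, with 64 work-items (assumed here as the wavefront size) each reading one 4-byte float, offsets built with lane_offset(0, lane, 4) pass the check, while starting at element 1 (first offset 4, not on a 128-byte boundary) or reading every second element (stride of 8 bytes) fails it. In a kernel, the passing pattern corresponds to indexing the buffer directly by get_global_id(0).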