memory coalescing

nagar781 · February 7, 2013, 10:44pm

Hello,
I am working on memory coalescing. I have searched a lot but have found hardly any code that demonstrates it. Does anyone have example code for this?

clint3112 · February 7, 2013, 11:46pm

I think the best explanation of this is in the lectures here:
https://developer.nvidia.com/cuda-training
For Memory Coalescing, have a look at

CUDA University Courses

University of Illinois : ECE 498AL
Taught by Professor Wen-mei W. Hwu and David Kirk, NVIDIA CUDA Scientist.
-> Memory Bank Conflicts (115 MB)

Dithermaster · February 9, 2013, 7:18am

There are many nuances and details, but for simple kernels the key point is this: adjacent work items should access adjacent memory. Sometimes this means doing things in a counter-intuitive way. An example is using a 1D kernel to process a 2D image (there are reasons why you'd want to do this): run it over columns rather than rows (i.e., interpret get_global_id(0) as X), because then on each iteration of the Y loop inside the kernel the work items access horizontally adjacent pixels, which sit next to each other in memory when the image is stored row by row.