Local extremum?


I have a 2D image and I run a classical local extremum (min/max) detection on it by reading each pixel’s 8-neighborhood. My code is simple, but I suspect it is far from optimal for a GPU.
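
For reference, here is a minimal CPU-side sketch in plain C of the per-pixel test described above. It is not the original kernel: the function name, the float pixel type, the strict comparison, and the choice to never report border pixels as extrema are all assumptions made for illustration.

```c
#include <stddef.h>

/* Classification of a pixel relative to its 8 neighbours (hypothetical names). */
enum { NOT_EXTREMUM = 0, LOCAL_MAX = 1, LOCAL_MIN = -1 };

/* Returns LOCAL_MAX if pixel (x, y) is strictly greater than all 8 neighbours,
 * LOCAL_MIN if it is strictly smaller than all of them, NOT_EXTREMUM otherwise.
 * Assumption: border pixels are never reported as extrema. */
int classify_pixel(const float *img, int width, int height, int x, int y)
{
    if (x < 1 || y < 1 || x >= width - 1 || y >= height - 1)
        return NOT_EXTREMUM;

    float c = img[(size_t)y * width + x];
    int is_max = 1, is_min = 1;

    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            if (dx == 0 && dy == 0)
                continue;                      /* skip the centre pixel itself */
            float n = img[(size_t)(y + dy) * width + (x + dx)];
            if (n >= c) is_max = 0;            /* a neighbour ties or beats the centre */
            if (n <= c) is_min = 0;            /* a neighbour ties or undercuts the centre */
        }
    }
    return is_max ? LOCAL_MAX : (is_min ? LOCAL_MIN : NOT_EXTREMUM);
}
```

On a GPU, each work-item would run this test for one pixel, which means 9 reads per pixel with heavy overlap between neighbouring work-items.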

Is there a known algorithm for this task? (I haven’t found one yet.)

I was thinking about overlapping blocks, where each block reads its part of the image plus a one-pixel border, puts it in a local buffer, and then performs the local min/max test by reading only from this buffer.
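
To make the overlapping-block idea concrete, here is a rough sketch in plain C that emulates it on the CPU. The stack array stands in for GPU local memory, and the two loops stand in for what the work-items of one work-group would do before and after a barrier. TILE, HALO, the function names, the clamped border handling, and the 1 / -1 / 0 output encoding are all assumptions for illustration, not a tuned implementation.

```c
#include <stddef.h>

#define TILE 16                 /* hypothetical work-group edge length */
#define HALO 1                  /* one-pixel border needed by the 8-neighbourhood */
#define BUF  (TILE + 2 * HALO)  /* edge length of the staged buffer */

static int clampi(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Processes the tile whose top-left pixel is (x0, y0).
 * out[y * width + x] becomes 1 (strict max), -1 (strict min) or 0. */
void extrema_tile(const float *img, int width, int height,
                  int x0, int y0, signed char *out)
{
    float buf[BUF][BUF];        /* stands in for local (work-group) memory */

    /* Stage 1: load the tile plus halo; coordinates are clamped to the image,
     * so out-of-range neighbours repeat the nearest edge pixel. */
    for (int by = 0; by < BUF; ++by)
        for (int bx = 0; bx < BUF; ++bx) {
            int sx = clampi(x0 + bx - HALO, 0, width - 1);
            int sy = clampi(y0 + by - HALO, 0, height - 1);
            buf[by][bx] = img[(size_t)sy * width + sx];
        }

    /* Stage 2: every neighbour read now hits the staged buffer only. */
    for (int ty = 0; ty < TILE; ++ty)
        for (int tx = 0; tx < TILE; ++tx) {
            int x = x0 + tx, y = y0 + ty;
            if (x >= width || y >= height)
                continue;       /* partial tile at the right/bottom edge */
            float c = buf[ty + HALO][tx + HALO];
            int is_max = 1, is_min = 1;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    if (dx == 0 && dy == 0)
                        continue;
                    float n = buf[ty + HALO + dy][tx + HALO + dx];
                    if (n >= c) is_max = 0;
                    if (n <= c) is_min = 0;
                }
            out[(size_t)y * width + x] =
                (signed char)(is_max ? 1 : (is_min ? -1 : 0));
        }
}

/* Covers the whole image with tiles; only the halos overlap. */
void extrema_image(const float *img, int width, int height, signed char *out)
{
    for (int y0 = 0; y0 < height; y0 += TILE)
        for (int x0 = 0; x0 < width; x0 += TILE)
            extrema_tile(img, width, height, x0, y0, out);
}
```

Because of the clamping, a pixel on the image border always sees a copy of itself among its neighbours, so with strict comparisons it is never reported as an extremum. Each staged value is reused by up to 9 pixels, which is the whole benefit of the scheme; the cost is the extra halo loads and the synchronization between the two stages.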


I don’t think it’s that bad for the GPU if you use Image2D, because the neighborhood will be cached by the texture unit, so the cache misses won’t hurt much. You could load the data into local memory, but I don’t think it will speed things up much when you only read that few neighbours.

Yep, you were right, it’s already pretty fast! It’s definitely not the kernel I have to optimize…
Thanks Clint.