Has anyone seen a performance boost from using UBOs?
I just searched for Ilian’s posts on UBOs, since I know he’s one poster who has commented on standard uniform vs. UBO perf. It sounds like UBOs may be stored in device global memory and pulled by the shader, whereas plain ol’ uniforms may be stored in device local memory and pushed to the shader (to use OpenCL memory space names). You might expect the latter to be faster to access.
Worth reverifying though…
AFAIK, UBOs are in fact stored in device global memory, but once they’re bound the content is moved to device local memory, so the shader does not have to pull it from global memory. That is the reason behind the severe size limitation of UBOs (typically 64KB on current hardware; the actual limit is whatever GL_MAX_UNIFORM_BLOCK_SIZE reports).
Check my article about it: http://rastergrid.com/blog/2010/01/uniform-buffers-vs-texture-buffers/
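To make the size limit a bit more concrete, here is a rough sketch of how you might work out how many UBO-sized chunks a big uniform array needs. The names `UboPlan` and `plan_ubo_blocks` are made up for illustration and are not part of OpenGL. The block size is passed in as a plain number so the snippet compiles without a GL context; in a real program it would come from `glGetIntegerv(GL_MAX_UNIFORM_BLOCK_SIZE, ...)`. The 64-byte element size used in the example assumes a `mat4` laid out with std140.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type, not part of OpenGL: describes how an array
// of fixed-size elements is split across uniform blocks.
struct UboPlan {
    std::size_t elems_per_block;  // whole elements that fit in one block
    std::size_t block_count;      // blocks needed to hold every element
};

// max_block_size is assumed to be the value the driver reports for
// GL_MAX_UNIFORM_BLOCK_SIZE (e.g. 65536 on hardware with a 64KB limit).
inline UboPlan plan_ubo_blocks(std::size_t elem_count,
                               std::size_t elem_size,
                               std::size_t max_block_size) {
    // An element must be non-empty and must fit in a single block.
    assert(elem_size > 0 && elem_size <= max_block_size);

    // Only whole elements count; any leftover bytes in a block stay unused.
    const std::size_t per_block = max_block_size / elem_size;

    // Ceiling division: a partially filled last block still costs a block.
    const std::size_t blocks = (elem_count + per_block - 1) / per_block;

    return UboPlan{per_block, blocks};
}
```

For example, `plan_ubo_blocks(2500, 64, 65536)` says that 1024 matrices fit per block and that 2500 of them need 3 blocks, which is the point where a texture buffer starts to look attractive.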