Hi,
I am starting to learn compute shaders after working with OpenCL,
and I am not sure I understand the differences and how to use them.
With OpenCL you give the global range, for example (1024,1024), and then the local (x,y) range, for example (2,2). This means (1024,1024) is divided into work-groups of (2,2), so (1024*1024)/4 = 262144 work-groups of 4 threads, each covering a (2,2) tile of the array.
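To make that arithmetic concrete, here is a minimal sketch in C of how I count the work-groups for a 2D range. The function name `opencl_work_group_count_2d` is a hypothetical helper of mine, not part of the OpenCL API, and it assumes the global size is an exact multiple of the local size in each dimension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL call): number of work-groups created
   for a 2D range of global_x by global_y items split into local_x by
   local_y tiles. Assumes the global size divides evenly by the local size. */
size_t opencl_work_group_count_2d(size_t global_x, size_t global_y,
                                  size_t local_x, size_t local_y)
{
    assert(local_x > 0 && local_y > 0);
    assert(global_x % local_x == 0 && global_y % local_y == 0);

    /* Groups along each axis, multiplied together. */
    return (global_x / local_x) * (global_y / local_y);
}
```

For the example above, `opencl_work_group_count_2d(1024, 1024, 2, 2)` gives (1024/2) * (1024/2) = 512 * 512 = 262144 work-groups.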
If I want to do the same with a compute shader, I need to give the number of work-groups in glDispatchCompute(x,y,z), and also give the work-group size to the shader:
layout(local_size_x = X, local_size_y = Y, local_size_z = Z) in;
According to the ARM documentation, the layout in the shader sets the number of threads each work-group uses, in my case at most 128 threads per work-group. The maximum number of work-groups is 65535.
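A minimal sketch in C of how these two limits could be checked before dispatching. The constants are the values I read for my device and are assumptions here; a real program would query them with glGetIntegerv(GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, ...) and glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_COUNT, axis, ...). Since the second query takes an axis index, the sketch assumes the 65535 limit applies to each of x, y and z separately rather than to their product. The function names are hypothetical helpers.

```c
#include <stdbool.h>

/* Assumed device limits (values from my reading of the documentation;
   query them at run time in a real program). */
enum {
    MAX_INVOCATIONS_PER_GROUP = 128,   /* limit on local_x * local_y * local_z */
    MAX_GROUPS_PER_DIMENSION  = 65535  /* assumed to apply per axis */
};

/* Hypothetical helper: does the shader's local size fit in one work-group? */
bool local_size_fits(unsigned lx, unsigned ly, unsigned lz)
{
    unsigned long long threads = (unsigned long long)lx * ly * lz;
    return threads > 0 && threads <= MAX_INVOCATIONS_PER_GROUP;
}

/* Hypothetical helper: is each glDispatchCompute argument within the limit? */
bool group_count_fits(unsigned gx, unsigned gy, unsigned gz)
{
    return gx <= MAX_GROUPS_PER_DIMENSION &&
           gy <= MAX_GROUPS_PER_DIMENSION &&
           gz <= MAX_GROUPS_PER_DIMENSION;
}
```

Under that per-axis assumption, `local_size_fits(8, 8, 1)` and `group_count_fits(512, 512, 1)` both hold, while a (16,16,1) local size would not fit because it needs 256 threads.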
I can clearly see that 65535 work-groups in a compute shader is far fewer than the 262144 I get with OpenCL.
Does that mean that, to do the same as in OpenCL, I should use one of these:
glDispatchCompute(512,512,1) and in the shader layout(local_size_x = 2, local_size_y = 2) in;
to get work-groups of 4 threads and (512*512) work-groups
glDispatchCompute(256,256,1) and in the shader layout(local_size_x = 4, local_size_y = 4) in;
to get work-groups of 16 threads and (256*256) work-groups
glDispatchCompute(128,128,1) and in the shader layout(local_size_x = 8, local_size_y = 8) in;
to get work-groups of 64 threads and (128*128) work-groups
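The three combinations above follow the same rule: the number of work-groups per axis is the global size divided by the local size. A minimal sketch in C of that mapping, where `GroupCount2D` and `dispatch_groups_2d` are hypothetical names of mine and not part of OpenGL. It rounds up, which is an assumption for sizes that do not divide evenly (the shader would then need a bounds check).

```c
#include <assert.h>

/* Hypothetical type: work-group counts for the x and y arguments
   of glDispatchCompute. */
typedef struct {
    unsigned x;
    unsigned y;
} GroupCount2D;

/* Hypothetical helper: work-group counts needed to cover a global_x by
   global_y problem with local_x by local_y threads per group. Rounds up
   so that a partial tile at the edge still gets a work-group. */
GroupCount2D dispatch_groups_2d(unsigned global_x, unsigned global_y,
                                unsigned local_x, unsigned local_y)
{
    GroupCount2D groups;

    assert(local_x > 0 && local_y > 0);
    groups.x = (global_x + local_x - 1) / local_x;
    groups.y = (global_y + local_y - 1) / local_y;
    return groups;
}
```

With `GroupCount2D g = dispatch_groups_2d(1024, 1024, 8, 8);` the call would be glDispatchCompute(g.x, g.y, 1), that is (128,128,1), paired with layout(local_size_x = 8, local_size_y = 8) in; in the shader. The local size passed here has to match the one declared in the shader.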
I need to use shared memory, which is why I am asking. I want to be sure that I have understood ;))
So if I want to implement my OpenCL kernel as an OpenGL compute shader, do I need to think about it differently, or have I misunderstood something?
Thanks for any explanation.