Let’s say I’ve got a shader, probably a compute shader, that does a whole lot of matrix and vector math and finishes with a dot product between two vectors, which yields a single scalar value. This shader may be executed a large number of times on different data, and since it’s a shader, those executions run essentially in parallel. The net result is that each “thread” of the shader produces its own scalar value, so I end up with an array of these values, one element per “thread”. Is there any way to have the shader merge the individual values into a single number? Examples would be taking the average of all the values, or producing a single boolean based on the signs of all the scalar values.
I realize that I could just use the CPU to iterate through all the results from the shader and compute such a value that way, but I thought it might be more efficient if the shader, which is already doing lots of computation on the values, could simply merge the results of multiple threads into the single value that I’m interested in.
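For reference, the CPU-side fallback I have in mind looks roughly like the Python sketch below. The `results` list is a made-up stand-in for the per-thread scalars read back from the GPU, and `average_of` and `all_non_negative` are placeholder names of my own. I’m also assuming that “a boolean based on the signs” means “true only if no value is negative”, which is just one possible reading.

```python
def average_of(results):
    """Mean of the per-thread scalars (assumes a non-empty list)."""
    return sum(results) / len(results)


def all_non_negative(results):
    """One possible sign test (my assumption): True only if no value is negative."""
    return all(value >= 0.0 for value in results)


# Hypothetical values standing in for the per-thread dot products read back from the GPU.
results = [0.5, -1.25, 2.0, 3.75]

average = average_of(results)
sign_flag = all_non_negative(results)
print(average, sign_flag)
```

Both reductions are a single pass over the array, so the cost on the CPU grows linearly with the number of shader “threads”, plus whatever it costs to read the results back.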
One possibility that I thought could make sense is some kind of “global” variable location in GLSL. For the averaging example, it could have an initial value of 0, and each shader “thread” would add its result to this variable. In the end I’d have a single value that just needs to be divided by the number of threads, and I’d have the average. However, I suspect that doing this would force the threads to take turns adding their results to the global variable, which amounts to iteration in the end, and I’d be better off using my CPU.
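To make the “global variable” idea concrete, here is a rough CPU-side analogy in Python rather than GLSL. Everything in it is an assumption for illustration: the `results` values are made up, `shader_thread` is a hypothetical stand-in for one shader invocation, and the `threading.Lock` stands in for whatever synchronization a GPU would actually need.

```python
import threading

# Hypothetical per-thread results; in the real case each would come from a dot product.
results = [0.5, -1.25, 2.0, 3.75]

total = 0.0                # the shared "global" accumulator, starting at 0
lock = threading.Lock()    # stand-in for whatever synchronization the GPU would need


def shader_thread(value):
    """Simulates one shader invocation adding its result to the shared total."""
    global total
    with lock:             # only one thread may update the total at a time
        total += value


threads = [threading.Thread(target=shader_thread, args=(v,)) for v in results]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Dividing the accumulated sum by the thread count gives the average.
average = total / len(results)
print(average)
```

The lock is exactly the turn-taking I’m worried about: the additions still happen one at a time, just in an unpredictable order.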
Anyway, I’m still learning about graphics cards, and I was mostly just curious to know whether, and how, separate GL threads can effectively share and condense results. I’ve heard of people using GLSL to implement sorting algorithms in parallel, which seems like a related challenge to me.