I have a process in which each thread performs a convergence test, and I would like to determine at which step every thread has converged. I could keep an int array in which each thread writes 0 or 1 depending on whether it satisfies the convergence test, but how can each thread then check that the array contains only zeros? More generally, how is this usually achieved?
Rather than testing whether the array contains only zeros, think about what you really want to know, which is “are any threads non-converged?”. You could do this with a global flag that is cleared before each step and set by any thread that isn’t converged. If this were a counter, you’d need it to be atomic, but if non-converged threads only ever write the same single value, I think it could be non-atomic. If you already have a return buffer, the flag could be a single value in that buffer; otherwise, create a new buffer for it. You could also try writing to an image instead, to see whether the texture write caching changes the performance.
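To make the flag idea concrete, here is a minimal CPU-side sketch in C++, not GPU code: each pass over the loop stands in for one kernel dispatch, and each `std::thread` stands in for one GPU thread. The names `newton_step` and `steps_until_all_converged`, and the Newton iteration itself, are made up for illustration. One deliberate difference from the suggestion above: in standard C++ an unsynchronized write from several threads is a data race even when they all write the same value, so the sketch uses a relaxed `std::atomic` store. Whether a plain store is acceptable on your GPU depends on the API's memory model.

```cpp
#include <atomic>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical per-thread work: one Newton step toward sqrt(target).
static double newton_step(double x, double target) {
    return 0.5 * (x + target / x);
}

// Runs one "dispatch" per step until no worker reports non-convergence.
// Returns the 1-based step at which every worker had converged,
// or -1 if that did not happen within max_steps.
int steps_until_all_converged(std::vector<double>& values,
                              const std::vector<double>& targets,
                              double tol, int max_steps) {
    for (int step = 1; step <= max_steps; ++step) {
        // The shared flag, reset to 0 before every step.
        std::atomic<int> any_unconverged{0};

        std::vector<std::thread> workers;
        for (std::size_t i = 0; i < values.size(); ++i) {
            workers.emplace_back([&, i] {
                double next = newton_step(values[i], targets[i]);
                // Only non-converged workers touch the flag, and they
                // all write the same value, so no counter is needed.
                if (std::fabs(next - values[i]) > tol) {
                    any_unconverged.store(1, std::memory_order_relaxed);
                }
                values[i] = next;  // each worker owns its own element
            });
        }
        for (auto& w : workers) {
            w.join();  // plays the role of waiting for the dispatch
        }

        // A single value to read back instead of scanning an array.
        if (any_unconverged.load() == 0) {
            return step;
        }
    }
    return -1;
}
```

The caller only ever reads one value per step, which is the point of the approach: no thread has to inspect the other threads' results, and the host (or a later pass) decides whether to continue.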