Nondeterministic kernel behavior

Hello everyone,

I think the best way to approach this is to attach my code and explain what is going on. (I can't seem to find an attach option on this forum.)

I want to parallelize Brandes' betweenness centrality algorithm.

For example, running ./hello_world -parallel -grid 3 4 with two work-groups (6 threads per work-group) constructs the following grid graph:
0 1 2 3
4 5 6 7
8 9 10 11

Initially, sigma is 1 for node 0 (the root) and 0 for all other nodes. The kernel computes sigma for all the other nodes in parallel: a node's sigma becomes the sum of its own sigma and the sigma of each neighbor through which it is reached from the root.

So sigma of nodes 1 and 4 will be 1, sigma of node 5 will be 2, and so on.
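To make the expected values concrete, here is a minimal sequential sketch in Python. It is not the OpenCL kernel from the post (that code was never attached); it only counts shortest paths from node 0 level by level. The function name `grid_sigma` and the 4-neighbor connectivity are assumptions based on the grid drawn above.

```python
from collections import deque

def grid_sigma(rows, cols, root=0):
    """Count shortest paths (sigma) from root to every node of a rows x cols grid.

    Assumes each node is connected to its up/down/left/right neighbors.
    """
    n = rows * cols
    sigma = [0] * n      # number of shortest paths from root
    dist = [-1] * n      # BFS level, -1 means not yet visited
    sigma[root] = 1
    dist[root] = 0
    queue = deque([root])
    while queue:
        v = queue.popleft()
        r, c = divmod(v, cols)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                w = nr * cols + nc
                if dist[w] < 0:
                    # first time w is reached: it sits one level below v
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    # v is a predecessor of w on a shortest path
                    sigma[w] += sigma[v]
    return sigma

sigma = grid_sigma(3, 4)
for r in range(3):
    print(sigma[r * 4:(r + 1) * 4])
# prints:
# [1, 1, 1, 1]
# [1, 2, 3, 4]
# [1, 3, 6, 10]
```

A parallel version should produce the same values on every run, so this can serve as a reference when checking the kernel output.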

The problem is in the loop “while(count_priv<nr_roots)”. count_priv is initially 0, and roughly once every 4-5 runs of the program it does not get incremented for one of the work-groups. That is, the iteration runs with count_priv = 0 and the increment executes, yet when the loop condition (count_priv<nr_roots) is tested again the value is still 0. Any ideas why that might be?

Thank you for your time. I hope someone has a solution for me!

Well, I notice you’re not using a local barrier at the start of the loop (the first 2 barriers are global ones). Remember that a barrier with a global memory fence still only synchronizes work-items within a single work-group; it is not ‘global’ across work-groups.
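As a rough CPU analogy for that point (plain Python threads, not OpenCL, and not the poster's kernel), the sketch below gives each simulated work-group its own barrier. The names `run_groups`, `GROUPS` and `GROUP_SIZE` are made up for illustration; the sizes mirror the two work-groups of 6 from the question.

```python
import threading

GROUPS = 2        # two work-groups, as in the question
GROUP_SIZE = 6    # six work-items per work-group

def run_groups():
    """Return, per group, how many of its group's writes each thread saw."""
    # One barrier per group: it knows nothing about the other group's threads.
    barriers = [threading.Barrier(GROUP_SIZE) for _ in range(GROUPS)]
    written = [[0] * GROUP_SIZE for _ in range(GROUPS)]
    seen = [[0] * GROUP_SIZE for _ in range(GROUPS)]

    def work_item(group, local_id):
        written[group][local_id] = 1                 # phase 1: write own slot
        barriers[group].wait()                       # waits only for this group
        seen[group][local_id] = sum(written[group])  # phase 2: read group's slots

    threads = [threading.Thread(target=work_item, args=(g, i))
               for g in range(GROUPS) for i in range(GROUP_SIZE)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen

print(run_groups())
# prints: [[6, 6, 6, 6, 6, 6], [6, 6, 6, 6, 6, 6]]
```

Every thread reliably sees all 6 writes of its own group, because the barrier orders them. Nothing orders group 0 against group 1, so a thread that read the other group's slots after the barrier could see anything from 0 to 6 writes. A kernel that relies on one work-group observing another's progress at a barrier can show the same kind of run-to-run variation.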

Semaphores in general are problematic for GPU code, since work-items within a work-group are not independent threads. See the archives.

I know, but that’s not what is wrong. I know exactly what’s wrong, but it’s really hard to explain. What are my options? I really need to get this done, my master’s thesis depends on it :(. Do you know anyone I can talk to? Thanks!

If you know “exactly what’s wrong”, then why did you ask why it isn’t working?

That’s no way to get help.

Anyway, master’s thesis or homework, any school work is yours and yours alone.

Well, the problem is that the program behaves in a way I don’t understand. I just need someone who can help me find where my misunderstanding is. Thank you for your time!