Crash when accessing struct fields

vandop · May 20, 2010, 3:34am

Hello,

Wow, my first experience with OpenCL has been a wealth of annoying and weird crashes :\

Now I get an OUT_OF_RESOURCES error (for me, the most common error when something is wrong in GPU code) when accessing a struct field. Here is a code snippet showing what works and what does not:

__kernel void RunSimulation(__global char* outBuffer,
		__global int* numberOfChilds,
		__global Event* currentList,
		__global Event* schedulerList,
		__global Model* models,
		__global int* indexDiagraph,
		__global int* diagraph,
		int bufferSize,
		int clock) {

    unsigned int idx = get_global_id(0);
    if (idx >= MAX_EVENTS)
        return;

    if (!currentList[idx].dirty && currentList[idx].time <= clock) {
        for (int j = 0; j < numberOfChilds[currentList[idx].origId]; j++) {
            if (models[diagraph[indexDiagraph[currentList[idx].origId] + j]].type == NOT_GATE) {
                event = schedulerList[idx];   /* WORKING */
                schedulerList[idx].dirty = 0; /* NOT WORKING */
            }

I have one more clue about what works and what does not. If I move the failing statement out of the if, it starts working:

    if (!currentList[idx].dirty && currentList[idx].time <= clock) {
        for (int j = 0; j < numberOfChilds[currentList[idx].origId]; j++) {
            schedulerList[idx].dirty = 0; /* NOW IT'S WORKING */
        }

The Event struct is defined as:

typedef struct Event {
	short dirty;
	short origId;
	int time;
	int out;
}__attribute__ ((aligned(16))) Event;
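
Because the host fills these buffers, its view of Event has to match the kernel's byte for byte. The following is a minimal host-side sketch in C, assuming GCC or Clang and assuming the host mirrors the struct with fixed-width types (int16_t and int32_t stand in for cl_short and cl_int so that it compiles without the OpenCL headers). HostEvent and event_buffer_bytes are hypothetical names, not part of the posted code.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* int16_t, int32_t, SIZE_MAX */

/* Hypothetical host-side mirror of the kernel's Event struct. */
typedef struct HostEvent {
    int16_t dirty;
    int16_t origId;
    int32_t time;
    int32_t out;
} __attribute__((aligned(16))) HostEvent;

/* The fields occupy 12 bytes; aligned(16) pads every element to 16 bytes,
 * so the array stride on the host must be 16 as well. */
_Static_assert(sizeof(HostEvent) == 16, "Event stride must be 16 bytes");
_Static_assert(offsetof(HostEvent, dirty) == 0, "dirty offset");
_Static_assert(offsetof(HostEvent, origId) == 2, "origId offset");
_Static_assert(offsetof(HostEvent, time) == 4, "time offset");
_Static_assert(offsetof(HostEvent, out) == 8, "out offset");

/* Hypothetical helper: number of bytes a buffer of n_events needs. */
static size_t event_buffer_bytes(size_t n_events)
{
    assert(n_events <= SIZE_MAX / sizeof(HostEvent)); /* no overflow */
    return n_events * sizeof(HostEvent);
}
```

If any of the static assertions fails to compile, the host and the device disagree about the layout, and a buffer sized with a smaller stride would be overrun by the kernel.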

Driver: NVIDIA UNIX x86_64 Kernel Module 195.36.15 on Fedora 12

Sorry about my English.

Thank you

vandop · May 20, 2010, 4:14am

Additional info:

When I change the ‘models’ buffer to constant memory, it starts working… Unfortunately, I need it in global memory because I have to write to it elsewhere in the code :( And the 64 KB of constant memory will be too small for future development of my system…

I hope this is not a driver bug and that you can point me to the right way to solve it :\

Thank you

coleb · May 20, 2010, 9:52am

The code you posted is incomplete. Where is the “event” variable used in the inner loop declared? If it’s declared at program scope it has to be a __constant variable and can’t be assigned to.

Also, one of the best ways to determine if it’s a driver bug is to try your code on different implementations. There are 3 more implementations out there: Apple, AMD, and IBM. Even compiling for the CPU on Apple and AMD can give you a lot of debugging help.

-Brian

vandop · May 21, 2010, 3:31am

Unfortunately, I don’t have any other implementation available to use :\

Is it possible to run AMD’s OpenCL implementation on Intel processors?

By the way, the event variable is not very important; it is only there to show that the access works in some forms, just not in the one I need. Here is the complete, real code in its working state. The only change I made was replacing __global Model* models with __constant Model* models… But I need to write to models, and with __constant I can’t :\

__kernel void RunSimulation(__global char* outBuffer,
		__global int* numberOfChilds,
		__global Event* currentList,
		__global Event* schedulerList,
		__constant Model* models,
		__global int* indexDiagraph,
		__global int* diagraph,
		int bufferSize,
		int clock) {
  
    __global Event *event;

    unsigned int idx = get_global_id(0);
    if (idx >= MAX_EVENTS)
        return;

    if (!currentList[idx].dirty && currentList[idx].time <= clock) { // We don't have a window yet
        for (int j = 0; j < numberOfChilds[currentList[idx].origId]; j++) { // For each NOT child, process the Event
            switch (models[diagraph[indexDiagraph[currentList[idx].origId] + j]].type) { // On GPU, models is a buffer that holds only the model's type
            case NOT_GATE:
                event = &schedulerList[idx];
                schedulerList[idx].dirty = false;
                event->origId = origId;
                event->time = time;
                event->out = output;
                break;
            case OBSERVER:
                break;
            default:
                break;
            }
        }
    }
}

coleb · May 21, 2010, 9:19am

Everything looks kosher to me, though that doesn’t preclude an issue on the host side.
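
One host-side thing that may be worth ruling out is an index that runs past the end of a buffer, since the kernel follows a chain of lookups (origId, then indexDiagraph, then diagraph, then models). Here is a rough sketch in plain C of a check you could run on the host arrays before uploading them. The function name and the length parameters are hypothetical, and it assumes numberOfChilds, indexDiagraph and models all have one entry per model, which the posted code does not confirm.

```c
#include <stddef.h>   /* size_t */

/* Returns 1 if every index the kernel would follow for origin model
 * `orig_id` stays inside its array, 0 otherwise.
 * Assumption: number_of_childs, index_diagraph and models each hold
 * n_models entries; diagraph holds n_diagraph entries. */
static int child_indices_in_bounds(int orig_id,
                                   const int *number_of_childs,
                                   const int *index_diagraph,
                                   size_t n_models,
                                   const int *diagraph,
                                   size_t n_diagraph)
{
    if (orig_id < 0 || (size_t)orig_id >= n_models)
        return 0;

    int count = number_of_childs[orig_id]; /* loop bound used by the kernel */
    int base = index_diagraph[orig_id];    /* first child slot in diagraph */
    if (count < 0 || base < 0)
        return 0;
    if ((size_t)base + (size_t)count > n_diagraph)
        return 0;

    /* Each child entry is later used as an index into models. */
    for (int j = 0; j < count; j++) {
        int model_index = diagraph[base + j];
        if (model_index < 0 || (size_t)model_index >= n_models)
            return 0;
    }
    return 1;
}
```

Calling it for every origId that appears in currentList, before enqueueing the kernel, would show whether the data itself can send a work-item out of range. A clean result does not prove the kernel is correct, only that this particular chain of indices is safe.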

AMD’s implementation will run on Intel CPUs, so give it a try.

-Brian

vandop · May 24, 2010, 2:08am

Thank you, I will :)