Wow, my first experience with OpenCL has been a wealth of annoying and weird crashes :\
Now I get an OUT_OF_RESOURCES error (for me, the most common error when something is wrong in GPU code) when accessing a struct field. Here is the code snippet showing what works and what does not:
When I changed the ‘models’ buffer to constant memory, it started working… Unfortunately, I need it in global memory because I have to write to it elsewhere in the code :( And the 64 KB of constant memory will be too small for future development of my system…
I hope this is not a driver bug and that you can point me to the right way to solve it :\
The code you posted is incomplete. Where is the “event” variable used in the inner loop declared? If it’s declared at program scope it is implicitly a __constant variable and can’t be assigned to.
Also, one of the best ways to determine if it’s a driver bug is to try your code on different implementations. There are 3 more implementations out there: Apple, AMD, and IBM. Even compiling for the CPU on Apple and AMD can give you a lot of debugging help.
Unfortunately, I don’t have any other implementation available to use :\
Is it possible to run the AMD OpenCL implementation on Intel processors?
By the way, the event variable is not very important there; it is only there to show that the code works in some way, just not in the desired way. Here is the complete, real code that works;
the only change I made was changing __global Model* models to __constant Model* models… But I need to write to models, and with __constant I can’t :\
__kernel void RunSimulation(__global char* outBuffer,
                            __global int* numberOfChilds,
                            __global Event* currentList,
                            __global Event* schedulerList,
                            __constant Model* models,
                            __global int* indexDiagraph,
                            __global int* diagraph,
                            int bufferSize,
                            int clock) {
    __global Event *event;
    unsigned int idx = get_global_id(0);
    if(idx >= MAX_EVENTS)
        return;
    if(!currentList[idx].dirty && currentList[idx].time <= clock) { //We don't have a window yet
        for(int j=0; j< numberOfChilds[currentList[idx].origId] ; j++) { //For each NOT child, lets process the Event
            switch (models[diagraph[indexDiagraph[currentList[idx].origId] + j]].type) { //On GPU, models is a buffer that holds only the model's type
                case NOT_GATE:
                    event = &schedulerList[idx];
                    schedulerList[idx].dirty = false;
                    //NOTE: origId, time and output are not declared anywhere in the posted snippet
                    event->origId = origId;
                    event->time = time;
                    event->out = output;
                    break;
                case OBSERVER:
                    break;
                default:
                    break;
            }
        }
    }
}
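One way to narrow this down without a second OpenCL implementation is to check the index tables on the host before uploading them. The thread does not confirm the cause, but an out-of-range index in the nested lookup models[diagraph[indexDiagraph[origId] + j]] could fault on a __global buffer and still appear to work on a __constant one. The sketch below is plain C, not OpenCL. The function name count_bad_lookups, its parameters, and the assumption that numberOfChilds and indexDiagraph each hold one entry per model are all hypothetical and not taken from the posted code.

```c
#include <stddef.h>

/* Hypothetical host-side check (not part of the posted kernel).
 * Walks the same lookups the kernel performs for each event and counts
 * every index that would fall outside its table.
 * Assumes numberOfChilds and indexDiagraph have numModels entries each. */
static int count_bad_lookups(const int *origIds, size_t numEvents,
                             const int *numberOfChilds,
                             const int *indexDiagraph, size_t numModels,
                             const int *diagraph, size_t diagraphLen)
{
    int bad = 0;
    for (size_t e = 0; e < numEvents; e++) {
        int id = origIds[e];
        /* origId indexes numberOfChilds and indexDiagraph */
        if (id < 0 || (size_t)id >= numModels) {
            bad++;
            continue;
        }
        for (int j = 0; j < numberOfChilds[id]; j++) {
            /* position inside diagraph, as computed in the kernel's switch */
            long slot = (long)indexDiagraph[id] + j;
            if (slot < 0 || (size_t)slot >= diagraphLen) {
                bad++;
                continue;
            }
            /* the value read from diagraph is then used to index models */
            int model = diagraph[slot];
            if (model < 0 || (size_t)model >= numModels)
                bad++;
        }
    }
    return bad;
}
```

Calling it with the origId of every event in currentList and the same arrays that are copied to the device should return 0. Any other result would point at the host data rather than the driver.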