Problem running on a GTX 560


I built my code on an AMD Athlon X2 and it works fine.

But when I run it on a GTX 560, compilation works fine and then an execution error occurs that I can't track down.

In theory, if the code runs on an AMD CPU, the same code should run on the GTX 560.

Has anyone been through this? Does anyone know how to solve it?


Just because the code works with AMD’s OpenCL implementation does not mean it will work with Nvidia’s one. They support different versions of OpenCL - 1.2 for AMD vs. 1.0 for Nvidia.

What error is being generated? If it’s a segfault then the culprit could be that you are using some OpenCL 1.2 functions that Nvidia doesn’t supply. The same problem hit me when I started playing with the AMD APP SDK.


Thanks for your help.

The error message is CL_OUT_OF_HOST_MEMORY.

All the functions I have used in my code are from OpenCL 1.0.

Here is what is happening: I grabbed an old version of my code that originally worked fine.

But when I insert a blank space anywhere in my code, this error occurs.
For instance,
int a; //Works fine.
int __a; //CL_OUT_OF_HOST_MEMORY occurs.

Read each ‘_’ as a blank space.
The forum removed the blank spaces I typed.

This sounds like a bug in Nvidia’s compiler. You may want to try a few different driver versions to see when this bug crept in.

That said, make sure that you are checking the error code from every single OpenCL function call to make sure that the error isn’t happening somewhere earlier in your code. Be sure to rule out bugs in your code first.
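
As a rough illustration of checking every call, here is a minimal sketch of a status-checking helper. The names cl_error_name and cl_ok are hypothetical, not part of OpenCL. The numeric values match the Khronos cl.h header, and real host code would include <CL/cl.h> and use the CL_* constants rather than literals.

```c
#include <stdio.h>

/* Hypothetical helper: maps a few OpenCL status codes to their names.
   The numeric values are the ones defined in the Khronos cl.h header. */
static const char *cl_error_name(int err)
{
    switch (err) {
    case 0:   return "CL_SUCCESS";
    case -1:  return "CL_DEVICE_NOT_FOUND";
    case -4:  return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    default:  return "unknown OpenCL error";
    }
}

/* Hypothetical helper: returns 1 on success; otherwise reports which
   call failed on stderr and returns 0. */
static int cl_ok(int err, const char *call)
{
    if (err == 0)
        return 1;
    fprintf(stderr, "%s failed: %s (%d)\n", call, cl_error_name(err), err);
    return 0;
}
```

Wrapping each call, for example `if (!cl_ok(err, "clBuildProgram")) return 1;`, makes the first failing call visible instead of a later one that only inherits the problem.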

Thanks for your help,

I tried with an ATI GPU and the same error doesn’t occur, but now I have a problem updating a struct inside a function.

I have this function: void b_bisect( struct _intervalo *destino1, struct _intervalo *destino2, struct _intervalo *origem), which is in a file separate from the kernel.

Inside the kernel I have the call b_bisect(&caixa,&caixa2,caixa);

Inside b_bisect some operations modify fields of the structs, but after returning to the kernel, caixa and caixa2 still hold their old values; b_bisect has not modified them.

This occurs only on the GPU; on the CPU it runs fine, even though I use the same AMD SDK for both the CPU and the GPU.

Many thanks,


If b_bisect is declared in a different file to that of your kernel function then you must read in both and pass both to the OpenCL compiler as part of the same program. If you are doing that already then I’d expect that function call to work.
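
A minimal sketch of the loading step, assuming the helper and the kernel live in two separate text files. The function name load_text is hypothetical, and the thread never names the actual files.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a whole file into a NUL-terminated buffer.
   Returns NULL on failure; the caller frees the result. */
static char *load_text(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long size = ftell(f);
    if (size < 0) { fclose(f); return NULL; }
    rewind(f);

    char *text = malloc((size_t)size + 1);
    if (!text) { fclose(f); return NULL; }

    size_t got = fread(text, 1, (size_t)size, f);
    fclose(f);
    text[got] = '\0';   /* terminate so the length argument can be NULL */
    return text;
}
```

With both files loaded, the two strings can go into one array, helper source first, and be passed in a single clCreateProgramWithSource call with count set to 2. The compiler then sees b_bisect before the kernel that calls it.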

Are you checking every OpenCL call in the host code to see if there is an error being generated while compiling the kernel or executing it?

Compilation produces no errors, and the kernel goes to the GPU to run.
The only error checking I do is on the value returned by clEnqueueNDRangeKernel.

On an ATI GPU, no execution error occurs, but the code doesn’t behave the same on the CPU and the GPU: on the GPU these variables are not updated.


I found the problem, but I don’t have any idea how to solve it.

In this statement:
caixa[indice_maior].superior = medio;
In the first iteration, the value of indice_maior is 0.

After this line I put:
printf("C1X 2)final = %.2f - medio = %.2f\nCaixa com indice maior = %.2f", caixa[0].superior, medio, caixa[indice_maior].superior);

caixa[0].superior prints 4, which is the old value.
caixa[indice_maior].superior prints 0, which is the correct value.

But indice_maior has the value 0. These appear to be different positions in the array, although indice_maior is 0 and should point to the same position.

How do you know that indice_maior is 0? Hopefully you’ve hard coded indice_maior=0; for debugging.

As a general suggestion, comment out all the code in your kernel. Then add back each line, each time rerunning your program to see which line causes the error on the GTX560 and to check at what point the results generated by the AMD GPU and CPU differ.

Just another point: structs on the CPU and GPU may have different sizes, as fields can be padded differently. In the definition of the struct, do you explicitly define the alignment for the fields, or force the compiler to always pack the fields in the structure by using __attribute__ ((packed))?
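
To see the effect of padding on the host side, here is a small sketch for GCC or Clang. The two structs are hypothetical, since the real fields of struct _intervalo never appear in this thread; only the field name superior is borrowed from the posts.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical layout with default alignment: the compiler is free to
   insert padding after 'flag' so that 'superior' is aligned. */
struct padded_example {
    char  flag;
    float superior;
};

/* Same fields, packed: no padding between members (GCC/Clang syntax). */
struct packed_example {
    char  flag;
    float superior;
} __attribute__((packed));

/* Prints size and field offset for both layouts so they can be compared
   with what the device-side code assumes. */
static void print_layouts(void)
{
    printf("default: size=%zu, offset of superior=%zu\n",
           sizeof(struct padded_example),
           offsetof(struct padded_example, superior));
    printf("packed:  size=%zu, offset of superior=%zu\n",
           sizeof(struct packed_example),
           offsetof(struct packed_example, superior));
}
```

If the host and the device disagree on these numbers, an array of such structs is indexed with different strides on each side. Declaring the struct with the same packed or aligned attribute in both the host code and the kernel source is one way to keep the layouts in step.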

Thank you very much for your help,

I did what you said.

I commented out segments of my code to find the error.

In truth, the problem was a combination of things that led to the erroneous behavior.