Hi folks; it’s been a while but I thought I’d post my final problem here and see if it resonates with anyone.

I have a kernel that runs beautifully on NVIDIA hardware, which I am now vectorizing for an AMD Radeon HD 5870.

I’m doing some trig, so to eliminate branching I have turned things like

if ( T < 0.f ) T += 360.f;

into

T += ( T < 0.f ) * 360.f;

( This is of course necessary so as to process all 4 elements of the vector in one go; if the branch were used, it would have to be evaluated separately for each element of the vector. )

… all cool, and the logic is good; it works for scalar floats. (And it doesn’t seem to hurt performance, even though it’s doing 8 complex calculations for 8 different conditions; I am happily surprised…)
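For anyone who wants to check the scalar logic outside a kernel, here is a minimal plain-C sketch of the two forms, applied one lane at a time to a 4-element array. It is a host-side model only, not OpenCL kernel code, and the function names wrap_branch and wrap_branchless are made up for illustration.

```c
#include <stddef.h>

/* Branching form: one test per lane. */
static void wrap_branch(float t[4]) {
    for (size_t i = 0; i < 4; ++i) {
        if (t[i] < 0.f) {
            t[i] += 360.f;
        }
    }
}

/* Branch-free form: in scalar C a comparison evaluates to 1 or 0,
   so the multiply adds either 360 or nothing. */
static void wrap_branchless(float t[4]) {
    for (size_t i = 0; i < 4; ++i) {
        t[i] += (float)(t[i] < 0.f) * 360.f;
    }
}
```

Both functions leave non-negative lanes untouched and add 360 to negative ones, which is the behaviour the scalar kernel relies on.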

However, when you’re using float4s, the result of a comparison is different. Instead of getting +1 back for a true comparison, you get -1 (all bits set) in each element where the comparison holds, and 0 elsewhere. So, in order to use it in a calculation as above, it’s necessary to somehow change that -1 to a +1 for the equation to yield the right result.
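To make the -1 behaviour concrete, here is a rough plain-C model of a float4 comparison, along with two ways of using the result that need no separate sign flip. This is a sketch, not kernel code: the helper names are hypothetical, and the OpenCL lines in the comments are what I would assume the vector equivalents to be (convert_float4 and select are standard OpenCL C built-ins). I have not tested them on the 5870.

```c
#include <stddef.h>

/* Models a float4 "<": each lane is -1 where true, 0 where false. */
static void less_than_mask(const float t[4], float bound, int mask[4]) {
    for (size_t i = 0; i < 4; ++i) {
        mask[i] = (t[i] < bound) ? -1 : 0;
    }
}

/* Subtract instead of add: -(-1 * 360) adds 360 on the true lanes.
   Assumed OpenCL form: T -= convert_float4(T < 0.f) * 360.f; */
static void wrap_subtract_mask(float t[4]) {
    int mask[4];
    less_than_mask(t, 0.f, mask);
    for (size_t i = 0; i < 4; ++i) {
        t[i] -= (float)mask[i] * 360.f;
    }
}

/* Lane-wise pick between two values, with no arithmetic on the mask.
   Assumed OpenCL form:
   T += select((float4)(0.f), (float4)(360.f), T < 0.f); */
static void wrap_select(float t[4]) {
    int mask[4];
    less_than_mask(t, 0.f, mask);
    for (size_t i = 0; i < 4; ++i) {
        t[i] += mask[i] ? 360.f : 0.f;
    }
}
```

Either form gives the same per-lane result as the scalar branch, and neither depends on turning -1 into +1 first.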

Making that sign change is what hangs the 5870.

I have #defined FLOGIC(x) to take e.g. ( T < 0.f ) and change the sign of the result, and have tried it in a number of ways:

#define FLOGIC(x) (float4) -(x);

#define FLOGIC(x) (float4) (x) * -1.f;

#define FLOGIC(x) (float4) ( x * -1 );

#define FLOGIC(x) (float4) ( abs( x ) );

#define FLOGIC(x) fabs( (float4) x );
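One more variant that sidesteps the sign question entirely is to treat the -1 as a bit mask rather than as a number: ANDing the bit pattern of 360.f with all-ones keeps it, and ANDing with zero yields 0.f. Below is a minimal plain-C model of that idea, one lane at a time. The helper names are hypothetical, memcpy stands in for OpenCL's as_int4 / as_float4 reinterpretation, and whether the vector form behaves any better on this driver is untested.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as an integer and back
   (the role as_int / as_float play in OpenCL C). */
static int32_t float_bits(float f) {
    int32_t b;
    memcpy(&b, &f, sizeof b);
    return b;
}

static float bits_float(int32_t b) {
    float f;
    memcpy(&f, &b, sizeof f);
    return f;
}

/* Mask lane is -1 (all bits set) or 0, so the AND yields 360.f or 0.f.
   Assumed OpenCL form:
   T += as_float4(as_int4((float4)(360.f)) & (T < 0.f)); */
static void wrap_bitmask(float t[4]) {
    for (size_t i = 0; i < 4; ++i) {
        int32_t mask = (t[i] < 0.f) ? -1 : 0;
        t[i] += bits_float(float_bits(360.f) & mask);
    }
}
```

This is presumably why the vector comparison returns all bits set for true in the first place: the result is meant to be used as a mask.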

. . . if I don’t do this, if I use the result of the logical comparison as originally depicted way up above, then the kernel compiles and runs beautifully except for the fact that the -1 ruins all the calculations it touches and the results are useless.

. . . if I *DO* do this, if I attempt any of the above-described methods to reverse the sign of the float4 logical comparison, the kernel never comes back. (Same deal if I use a function instead of a #define. I can do anything in that function or #define +except+ change the sign without terminally messing things up.)

It sails through clBuildProgram and clCreateKernel, enqueues fine, and then clFinish hangs the whole machine. [ Mac Pro ]

The cursor still follows the mouse around but the clock is frozen and so is everything else, requiring a hard boot.

Does anybody have any ideas?

Thanks

Dave

p.s. what fun to be here in opencl’s early days, huh?