I have some functions that work with float8. During optimization it turned out that math on native-length vectors, such as float4, runs faster. I don't want to rewrite all the code I have; I just want to split each float8 function into two float4 function calls.
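void f_native(float4 a) {
    // do something in vector4 math
}
void f(float8 a) {
    float4* ta = &a;   // reinterpret the float8 as two float4 halves (this is the line that gets flagged)
    f_native(ta[0]);
    f_native(ta[1]);
}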
The Nvidia SDK issues warnings about this code. Is there a proper way to do this conversion without expensive performance overhead?
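
For reference, here is a minimal sketch of one way the split could be written without the pointer cast. It assumes the code is OpenCL C (the post does not say so explicitly), where an 8-component vector exposes its halves through the built-in `.lo` and `.hi` component selectors. The function bodies are placeholders taken from the snippet above, and whether this removes the warning or costs anything on a particular compiler is not verified here.

```c
// Sketch only: assumes OpenCL C, where float8 has .lo and .hi selectors.
void f_native(float4 a)
{
    // do something in vector4 math
}

void f(float8 a)
{
    // a.lo holds components s0..s3 and a.hi holds s4..s7; both are float4,
    // so no pointer reinterpretation is needed.
    f_native(a.lo);
    f_native(a.hi);
}
```

If the two halves later need to be joined again, OpenCL C also allows building a float8 from two float4 values with a vector literal such as `(float8)(lo, hi)`.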