normalize( float4 ) precision errors

I am using OpenGL float Pixel Buffer Objects (PBOs) to compute normals, which is a fairly simple operation. When I read the data back to the CPU to validate the result, I see normalization errors: the 3D vectors have lengths ranging from 0.5 to 1.5 even though I applied the normalize( float4 ) function to them.

Here is how I link my PBO to OpenCL:

//  Create cl_mem object from PBO
m_clmemNormalMapOut = clCreateFromGLBuffer( OpenCLManager::getInstance()->getContext(), CL_MEM_WRITE_ONLY, gluintPBODestID, &m_ciErrNum );

//  Before the kernel execution :
m_ciErrNum = clEnqueueAcquireGLObjects( OpenCLManager::getInstance()->getCommandQueue(), 1, &m_clmemNormalMapOut, 0,0,0);


// Run kernel
m_ciErrNum = clEnqueueNDRangeKernel( OpenCLManager::getInstance()->getCommandQueue(), ckKernel, 2, NULL, szWorkSize, NULL, 0,0,0 );

//  After execution
m_ciErrNum = clEnqueueReleaseGLObjects( OpenCLManager::getInstance()->getCommandQueue(), 1, &m_clmemNormalMapOut, 0, 0, 0 );

m_ciErrNum = clFinish( OpenCLManager::getInstance()->getCommandQueue() );

The OpenCL code looks like this (I use a 4-component PBO so that I can copy it into an RGBA32F texture):

//  The kernel parameter for the PBO is
__global float * out_fTexNormalMap

//  Compute normal with cross product and input data
float4 v4Normal;

v4Normal.w = 0.0f;
v4Normal = normalize( v4Normal );

//	Write final vector
unsigned int uiID = 4 * (x + y * width);

out_fTexNormalMap[uiID + 0] = v4Normal.x;
out_fTexNormalMap[uiID + 1] = v4Normal.y;
out_fTexNormalMap[uiID + 2] = v4Normal.z;
out_fTexNormalMap[uiID + 3] = 0.0f;

Before using it in another computation step, I read the data back from the PBO to the CPU this way:

glBindBuffer( GL_PIXEL_PACK_BUFFER, in_uiPBOID );
GLfloat* ptr = (GLfloat*)glMapBuffer( GL_PIXEL_PACK_BUFFER, GL_READ_ONLY );

memcpy( m_pkfNormalMap, ptr, uiPixelWidth * uiPixelHeight * 4 * sizeof(GLfloat) );

glUnmapBuffer( GL_PIXEL_PACK_BUFFER );
glBindBuffer( GL_PIXEL_PACK_BUFFER, 0 );

I then scan every normal to make sure its length is close to 1.0, but a lot of them have wrong values.
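The scan described above could be sketched as follows. This is only an illustration, not the original poster's code: the function name countBadNormals and the tolerance value are assumptions, and the layout (4 floats per pixel, indexed as 4 * (x + y * width)) is taken from the kernel shown earlier.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts normals whose length differs from 1.0 by more
// than `tolerance`. Assumes 4 floats per pixel (x, y, z, w), matching the
// kernel's write index 4 * (x + y * width).
std::size_t countBadNormals(const std::vector<float>& data,
                            std::size_t width, std::size_t height,
                            float tolerance = 1e-3f)
{
    assert(data.size() == width * height * 4);

    std::size_t bad = 0;
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const std::size_t id = 4 * (x + y * width);
            const float nx = data[id + 0];
            const float ny = data[id + 1];
            const float nz = data[id + 2];
            const float len = std::sqrt(nx * nx + ny * ny + nz * nz);

            // Written as a negated "<=" so that a NaN length counts as bad.
            if (!(std::fabs(len - 1.0f) <= tolerance)) {
                ++bad;
            }
        }
    }
    return bad;
}
```

Using the same index expression on the CPU as in the kernel makes it easier to spot a mismatch between how the data is written and how it is read.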

The code is pretty simple, so I wonder why the length is not 1.0. I know float precision could produce lengths such as 1.05 or 0.985, but 0.5 to 1.5 seems far too much.

Am I missing something?
Thanks for the help.

The values returned by the normalize() function, like those of all the other built-in functions, are strictly checked for accuracy in the OpenCL Conformance Tests.

Have you confirmed that the inputs to the kernel are what you think they are and that they do not contain extreme cases such as denormalized floats, infinities, NaN, etc?
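One way to run that check on the host is to classify every float in a buffer copied back from the device. The sketch below is an assumption about how this might be done, not part of the original code: FloatAudit and auditFloats are hypothetical names, and only std::fpclassify from the standard library is relied on.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical tally of the special values mentioned above.
struct FloatAudit {
    std::size_t nan = 0;
    std::size_t inf = 0;
    std::size_t subnormal = 0;  // denormalized floats
};

// Classifies each of the `count` floats starting at `values`.
FloatAudit auditFloats(const float* values, std::size_t count)
{
    assert(values != nullptr || count == 0);

    FloatAudit audit;
    for (std::size_t i = 0; i < count; ++i) {
        switch (std::fpclassify(values[i])) {
            case FP_NAN:       ++audit.nan;       break;
            case FP_INFINITE:  ++audit.inf;       break;
            case FP_SUBNORMAL: ++audit.subnormal; break;
            default:           break;  // normal values and zeros
        }
    }
    return audit;
}
```

If all three counters are zero for the kernel inputs, extreme values can be ruled out and attention can move to how the data is written and read back.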

It was due to incorrect index ordering when reading back the data.
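The thread does not say which index was wrong, so the following is a purely hypothetical illustration of the effect: reading a 4-floats-per-pixel buffer with a stride of 3 picks components from neighbouring normals, and the resulting length is no longer 1.0 even though every stored normal is correctly normalized. The function name normalLength is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: length of the 3D vector found at pixel `pixel`
// when the buffer is read with `stride` floats per pixel.
float normalLength(const float* data, std::size_t pixel, std::size_t stride)
{
    assert(data != nullptr);

    const float* p = data + pixel * stride;
    return std::sqrt(p[0] * p[0] + p[1] * p[1] + p[2] * p[2]);
}
```

For a buffer holding the two normals (1, 0, 0, 0) and (0, 0.6, 0.8, 0), normalLength(buf, 1, 4) is approximately 1.0, while normalLength(buf, 1, 3) reads (0, 0, 0.6) and gives approximately 0.6.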

Everything is fine…