Problem with normalized image coordinates


I’m learning OpenCL (I come from the GLSL world) and I’m experimenting with normalized image coordinates (that is what I’m used to, so I’d like to know how they work in OpenCL), but I’m running into unpleasant side effects.
In my CL kernel I’m applying a simple Gaussian filter. I didn’t notice the problem with small convolutions, but when I tried bigger ones (21x21 and up, just to play with) I realized that breaking bands appear that didn’t happen with non-normalized coordinates.
I’m totally lost about this. I’ve revised the code many times, with more than one pair of eyes, and I don’t know whether the error is in the normalization code, the convolution code, or the image samplers… so I would appreciate any clue about this.

Here is the kernel:

__kernel void filter(__read_only image2d_t imageIn, __write_only image2d_t imageOut, sampler_t sampler, __global float* filter, int sizeSampler)
{
    int2 ids = (int2)(get_global_id(0), get_global_id(1));

    int2 imageSize = (int2)(get_image_width(imageIn), get_image_height(imageIn));

    if (ids.x >= imageSize.x || ids.y >= imageSize.y)
        return;

    // Normalize coordinates
    float2 coords = (float2)((float)ids.x / (imageSize.x - 1),
                             (float)ids.y / (imageSize.y - 1));
    // Compute the right pixel size
    float2 pixelSize = (float2)(1.0f / imageSize.x,
                                1.0f / imageSize.y);

    // Compute the number of steps I will have to move around the convolution
    float position = sizeSampler / 2;

    // Initial position of the convolution (bottom left)
    float2 startPos = {coords.x - position * pixelSize.x,
                       coords.y - position * pixelSize.y};
    // End position of the convolution (top right)
    float2 endPos = {coords.x + position * pixelSize.x,
                     coords.y + position * pixelSize.y};

    // Filter image
    int weight = 0;
    float4 pixel = (float4)(0.0f, 0.0f, 0.0f, 0.0f);
    for (float2 finalCoords = startPos; finalCoords.y <= endPos.y; finalCoords.y += pixelSize.y)
        for (finalCoords.x = startPos.x; finalCoords.x <= endPos.x; finalCoords.x += pixelSize.x)
            pixel += read_imagef(imageIn, sampler, finalCoords) * filter[weight++];

    float4 pixelOut = (float4)(pixel.xyz, 1.0f);
    write_imagef(imageOut, ids, pixelOut);
}

Also, here is the output image, which reveals the defects. You can see the breaks and the image duplication (caused by the blending the convolution introduces) at the left part of the image:

I revised the code many times, with more than one pair of eyes. I don’t know whether the error is in the normalization code, the convolution code, or the image samplers

You haven’t given us the input image you used nor the output you expected, so we have to do some guessing.

Generally speaking, when a piece of code doesn’t work, try simplifying it while maintaining the bug. Eventually the reason for the bug will become apparent.

Try changing the image to something simpler so that you can measure the distances in pixels of the defects you see. Try a smaller image. Try a larger image. Do the defects scale with the size of the image? Etc.

Verify that the inputs to your kernel are what you think they are. For instance, the filter weights may have been loaded incorrectly. What kind of sampler are you using? I don’t see your convolution code doing any special operations on the edges, so you are relying on the sampler for coordinate wrapping.

If your implementation supports a printf extension, you can use that as well.

Unrelated to your bug: try using vector operations in your kernels for readability. “float2 coords = ids / (imageSize - 1.0f);” should work fine.

We tried a lot of tests. With bigger convolutions, the defect is bigger. With bigger images, the defect is bigger (for instance, with a 512x512 source image, the break happens exactly at the middle of the image).
Another test we did: after changing the code to use unnormalized image coordinates, everything works fine; no errors, no breaks, no weird blendings…

To answer some of your questions: the sampler we are using is clamp_to_edge | normalized_image_coords | nearest_filter.
We measured the distances between the cracks but we weren’t able to relate them to the problem.

About vector operations, well, I’m aware. This code is a dirty piece of code after some hours of debugging and making sure that even the most basic operations were correct.

I really suspect the problem is related to some kind of timing during the convolution. I can’t think of any other issue that could cause such a big pixel displacement during the convolution blending.

About the original image, it’s here:

And if you are serious, I can post the code that generates the Gaussian kernel. I really thought that this is not the problem, because the same kernel, accessing the image with unnormalized texture coordinates, works perfectly.

P.S.: Sorry for the big size of the images, but if I resize them, I will obscure the problem.

Have you tried running the same program on different implementations of OpenCL? If all implementations give you the same output then that points at a bug in the application.

Have you tried using a Dirac delta instead of a Gaussian filter? Do you get the results you would expect?

I’ve not played with normalised coords, but looping on floating-point values rings alarm bells for me. My guess is that finalCoords.x will sometimes end up being > endPos.x rather than == on the last iteration, adding skew to your filter lookup.

I would probably loop on the kernel coordinates and calculate the image coordinates the same way as you do already (or even on the fly, as in general such calculations can be effectively scheduled between memory accesses).

e.g. perhaps as simple as:

int y = 0;
for (float2 finalCoords = startPos; y < kernelHeight; finalCoords.y += pixelSize.y, y++) {
    // ... same for x, then the read_imagef() and filter lookup ...
}

My guess is that finalCoords.x will sometimes end up being > endPos.x rather than == on the last iteration, adding skew to your filter lookup.

That cannot happen because of the way the for loop is written, right? It looks like this:

    for (finalCoords.x = startPos.x; finalCoords.x <= endPos.x; finalCoords.x += pixelSize.x)

However, you bring up a very good point: the number of iterations in these nested loops is not well defined due to floating-point arithmetic. They may be doing more or fewer iterations than intended, causing the kernel to read the wrong values and possibly even out of bounds of the “filter” array.

I agree with the workaround you suggest.

I’ve found the problem. You two were right. The problem was caused by using floating-point increments and comparisons. I checked the loop control variables and, in many cases, the number of iterations is not correct. I think it’s due to rounding errors.

We have changed the loops to use integer increments and now it works fine, so, as notzed wisely pointed out, I was sometimes accessing the filter lookup incorrectly.

This is a very beginner programmer error (and I’m surely not a beginner in this field). I can remember my first programming lessons: “never loop on floating point”, so this is really my fault. But beyond that, this is the first time that, even after finding the solution, I still can’t see the relation to the symptoms of the problem.

I can’t explain why the wrong image looks like it does, with the cracks, nor the behaviour of the problem with different image and convolution sizes. Also, I’ve found more weird things, like writing (1, 0, 1, 1) and getting a totally white image.

Of course, when I find the explanation, I’ll post it.

Well, I meant that rather than x == endPos.x on what should be the last iteration, it will be just over - and so what should be the last iteration will be skipped, throwing the kernel index out.

Anyway, glad to hear it’s fixed, and surprised nobody else spotted it :wink: