Problem with image object and clEnqueueReadImage

Hello, I’m new to this forum, so hello to everyone.
I have a problem with a buffer when I try to use image objects.
I’m trying to write an OpenCL example that takes an image and returns a copy of the input image.
This is the kernel code:


__kernel void PROVA1(__read_only image2d_t imageIn, __write_only image2d_t imageOut) {
    const int nImageWidth = get_global_size(0);
    const int nImageHeight = get_global_size(1);
    const int xOut = get_global_id(0);
    const int yOut = get_global_id(1);
    const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP | CLK_FILTER_NEAREST;
    float4 pixel;
    pixel = read_imagef(imageIn, sampler, (int2)(xOut, yOut));
    write_imagef(imageOut, (int2)(xOut, yOut), pixel);
}

I create the image objects with this code:


const char* cImageFile = "StoneRGB.bmp";
cl_uint uiImageWidth = 1920;
cl_uint uiImageHeight = 1080;	
unsigned int* uiInput = NULL;	
BYTE* uiOutput = NULL;

uiInput = (unsigned int *)display1.bitmap.GetBits(); /* already declared above; assign only */
display1.bitmap.ReleaseBits();

size_t szBuffBytes=uiImageWidth * uiImageHeight * sizeof (unsigned int);
size_t szOutBytes=uiImageWidth * uiImageHeight * sizeof (BYTE);
cl_image_format image_format;
image_format.image_channel_order = CL_RGBA;
image_format.image_channel_data_type = CL_UNSIGNED_INT8;
cl_mem_flags flagsIn=CL_MEM_READ_ONLY;
cl_mem_flags flagsOut=CL_MEM_READ_WRITE;
input_image = clCreateImage2D(GPUContext, flagsIn, &image_format, uiImageWidth, uiImageHeight, 0, NULL, &ciErr);
output_image = clCreateImage2D(GPUContext, flagsOut, &image_format, uiImageWidth, uiImageHeight, 0, NULL, &ciErr);
ciErr = clSetKernelArg(OpenCLKernel, 0, sizeof(cl_mem), &input_image);
ciErr = clSetKernelArg(OpenCLKernel, 1, sizeof(cl_mem), &output_image);
int iBlockDimX = 16;
int iBlockDimY = 4;
szLocalWorkSize[0] = iBlockDimX;
szLocalWorkSize[1] = iBlockDimY;
szGlobalWorkSize[0] = shrRoundUp((int)szLocalWorkSize[0], uiImageWidth); 
szGlobalWorkSize[1] = shrRoundUp((int)szLocalWorkSize[1], uiImageHeight);
const size_t szTexOrigin[3] = {0, 0, 0};
const size_t szTexRegion[3] = {uiImageWidth, uiImageHeight, 1};	
ciErr = clEnqueueWriteImage(GPUCommandQueue, input_image, CL_TRUE, szTexOrigin, szTexRegion, szBuffBytes, 0, uiInput , 0, NULL, NULL);
ciErr = clEnqueueNDRangeKernel(GPUCommandQueue, OpenCLKernel, 2, NULL, szTexRegion, szLocalWorkSize, 0, NULL, NULL);
ciErr = clEnqueueReadImage(GPUCommandQueue, output_image, CL_TRUE, szTexOrigin, szTexRegion, szOutBytes, 0, uiOutput, 0, NULL, NULL);
clFinish(GPUCommandQueue);

The problem happens when the program reaches clEnqueueReadImage, and this is the error:

Unhandled exception in testOpenCl.exe (NVCUDA.DLL): 0xC0000005: Access Violation

If I try this

ciErr = clEnqueueReadImage(GPUCommandQueue, output_image, CL_TRUE, szTexOrigin, szTexRegion, 0, 0, uiOutput, 0, NULL, NULL);

the program finishes, but I don’t get the desired result.

Help me guys

uiOutput is NULL; you never allocate memory for it.

In addition, szOutBytes is computed incorrectly. The image format is CL_RGBA with a channel data type of CL_UNSIGNED_INT8, which means that each pixel takes 4 * 8 = 32 bits, i.e. 4 bytes. You should be doing something like this:

size_t szOutBytes=uiImageWidth * uiImageHeight * sizeof (cl_uchar) * 4;
uiOutput = (BYTE*)malloc(szOutBytes);
if(!uiOutput) throw_error();
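
As a quick check of that arithmetic, here is a minimal sketch in plain C that needs no OpenCL headers. The helper names rgba8_pixel_bytes and rgba8_image_bytes are made up for illustration and are not part of the OpenCL API; the sketch assumes a tightly packed CL_RGBA / CL_UNSIGNED_INT8 image.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration only; not part of OpenCL. */

/* CL_RGBA has 4 channels and CL_UNSIGNED_INT8 stores each channel
   in 1 byte, so one pixel occupies 4 bytes. */
static size_t rgba8_pixel_bytes(void)
{
    const size_t channels = 4;
    const size_t bytes_per_channel = 1;
    return channels * bytes_per_channel;
}

/* Host bytes needed for a tightly packed width x height image. */
static size_t rgba8_image_bytes(size_t width, size_t height)
{
    return width * height * rgba8_pixel_bytes();
}
```

For the 1920 x 1080 image in the question this gives 8,294,400 bytes, four times what uiImageWidth * uiImageHeight * sizeof (BYTE) yields.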

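Also, the arguments passed to clEnqueueReadImage are wrong. The sixth argument is row_pitch, the length of one row in bytes, not the size of the whole buffer.

It should be: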

ciErr = clEnqueueReadImage(GPUCommandQueue, output_image, CL_TRUE, szTexOrigin, szTexRegion, 0, 0, uiOutput, 0, NULL, NULL);

Note that by passing a row_pitch of zero you are telling the CL implementation to treat the host buffer as tightly packed, so the row pitch is taken to be the region width times the pixel size. See section 5.3.3 of the OpenCL 1.1 spec.
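
To make the row-pitch rule concrete, here is a small sketch in plain C, again without OpenCL headers. Both helpers are hypothetical names introduced for illustration. The second one assumes that each row is written at a multiple of the pitch from the start of the host buffer, which is my reading of how row_pitch is used, so treat it as an estimate rather than a statement of what any particular driver does.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration only; not part of OpenCL. */

/* A row_pitch of 0 means "tightly packed": the pitch becomes
   region width * element size. Otherwise the caller's value is used. */
static size_t effective_row_pitch(size_t row_pitch, size_t region_width,
                                  size_t element_bytes)
{
    return row_pitch == 0 ? region_width * element_bytes : row_pitch;
}

/* Smallest host buffer that can hold a 2D region, ASSUMING row r starts
   at byte offset r * pitch and holds region_width * element_bytes bytes. */
static size_t host_bytes_needed(size_t row_pitch, size_t region_width,
                                size_t region_height, size_t element_bytes)
{
    size_t pitch = effective_row_pitch(row_pitch, region_width, element_bytes);
    if (region_width == 0 || region_height == 0)
        return 0;
    return pitch * (region_height - 1) + region_width * element_bytes;
}
```

Under that assumption, a pitch of 0 for the 1920 x 1080 RGBA8 image needs exactly 8,294,400 host bytes, while the original call, which passed the 2,073,600-byte szOutBytes as the pitch, would spread the 1080 rows over roughly 2.2 GB of host memory.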

Thank you very much, I got past the problem. I no longer read the result with clEnqueueReadImage; I use clEnqueueReadBuffer instead.