Memory leak problem in nVidia driver


I am facing a memory leak problem in the nVidia driver.

Windows7 64bit
Driver ver.296.70

The memory leak happens when clReleaseMemObject() is called
after clCreateBuffer().

Here is sample code to verify that the memory leak appears.

#include "CL/cl.h"
#include <stdio.h>
#include <string.h>
#include <assert.h>

int main(void)
{
    cl_int ret;
    cl_platform_id platforms[16];
    cl_uint numPlatforms;
    ret = clGetPlatformIDs(16, platforms, &numPlatforms);
    assert(ret == CL_SUCCESS);
    assert(numPlatforms < 16);

    int index;
    for (index = 0; index < (int)numPlatforms; index++) {
        char name[256];
        size_t retSize;
        ret = clGetPlatformInfo(platforms[index], CL_PLATFORM_NAME, sizeof name, name, &retSize);
        assert(ret == CL_SUCCESS);
        if (strcmp(name, "NVIDIA CUDA") == 0) {
            break;
        }
    }

    cl_platform_id platform = platforms[index];
    cl_device_id device;
    cl_uint numDevices;
    ret = clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, &numDevices);
    assert(ret == CL_SUCCESS);

    cl_context_properties props[] = {CL_CONTEXT_PLATFORM, (cl_context_properties)platform, 0};
    cl_context context = clCreateContext(props, 1, &device, NULL, NULL, &ret);
    assert(ret == CL_SUCCESS);

    const int COUNT_MAX = 10000000;
    for (int count = 0; count < COUNT_MAX; count++) {
        if (count % 10000 == 0) printf("count=%d\n", count);

        cl_mem clmemory = clCreateBuffer(context, CL_MEM_READ_WRITE, 128*128*sizeof(int), NULL, &ret);
        assert(ret == CL_SUCCESS);

        ret = clReleaseMemObject(clmemory);
        assert(ret == CL_SUCCESS);
    }

    ret = clReleaseContext(context);
    assert(ret == CL_SUCCESS);

    return 0;
}

I found another thread, "Possible Memory leak in nVidia driver", which describes a similar problem.
Referring to that thread, I tried calling clReleaseEvent() before clReleaseMemObject(), but that did not solve the problem.

How can I avoid this memory leak?
Does anyone have an alternative?


How do you know it’s leaking? On my system (Linux, 295.40, CUDA 4.2.1) the loop finishes and nvidia-smi shows that all memory is released after the process is finished.

The tools I used are Task Manager and UMDH.

The problem occurs only on Windows, not on Linux. I also checked on Linux and found no leak there, as you mentioned.

Of course all memory is released after the process finishes.
The problem is that the amount of memory in use keeps increasing while the process is running.

It's not really fixing the root of your problem, but I can only suggest creating the buffers upfront and re-using them over the course of your program.
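The "create upfront, re-use" idea above could be sketched as a small buffer pool. This is a hypothetical illustration, not code from the thread: malloc/free stand in for clCreateBuffer/clReleaseMemObject so it runs without an OpenCL driver, and all names (BufferPool, pool_acquire, etc.) are invented for the sketch. In the real program each slot would hold a cl_mem created once before the hot loop and released once at shutdown.

```c
#include <stdlib.h>
#include <assert.h>

#define POOL_SIZE 4

/* A fixed set of buffers created once; the hot loop borrows and returns
 * them instead of creating/releasing driver objects millions of times. */
typedef struct {
    void *buf[POOL_SIZE];    /* would be cl_mem in the real code */
    int   in_use[POOL_SIZE];
} BufferPool;

static void pool_init(BufferPool *p, size_t bytes)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        p->buf[i] = malloc(bytes);   /* stand-in for clCreateBuffer(...) */
        p->in_use[i] = 0;
    }
}

/* Hand out the first free slot, or NULL if the pool is exhausted. */
static void *pool_acquire(BufferPool *p)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!p->in_use[i]) { p->in_use[i] = 1; return p->buf[i]; }
    }
    return NULL;
}

static void pool_release(BufferPool *p, void *b)
{
    for (int i = 0; i < POOL_SIZE; i++)
        if (p->buf[i] == b) { p->in_use[i] = 0; return; }
}

static void pool_destroy(BufferPool *p)
{
    for (int i = 0; i < POOL_SIZE; i++)
        free(p->buf[i]);     /* stand-in for clReleaseMemObject(...), once */
}

/* The equivalent of the original test loop: borrow/return, no allocation. */
static void hot_loop(BufferPool *p, int iters)
{
    for (int count = 0; count < iters; count++) {
        void *b = pool_acquire(p);
        assert(b != NULL);
        pool_release(p, b);
    }
}
```

With this structure the driver sees exactly POOL_SIZE clCreateBuffer calls and POOL_SIZE clReleaseMemObject calls for the whole run, which sidesteps the per-iteration growth described above.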

Thank you for your suggestion. I wish I could do so.

Due to various circumstances, I have some constraints on how the program can be written.
One of them concerns changing the buffer size.
OpenCL does not have an API to resize a buffer,
so my program releases memory objects and creates new ones many times.
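Since there is no resize API, one common mitigation is a grow-only strategy: keep one buffer and reallocate only when the requested size exceeds the current capacity, so shrinking or re-growing within capacity costs no driver calls at all. The sketch below is an assumption-laden illustration (GrowBuffer, gb_resize and friends are invented names, and malloc/free stand in for clCreateBuffer/clReleaseMemObject); whether it helps depends on how often sizes actually grow in your workload.

```c
#include <stdlib.h>

/* One buffer whose allocation only ever grows; "resizing" downward just
 * records a smaller logical size and reuses the existing allocation. */
typedef struct {
    void  *mem;       /* would be cl_mem in the real code */
    size_t capacity;  /* bytes actually allocated */
    size_t size;      /* bytes logically in use */
} GrowBuffer;

static void gb_init(GrowBuffer *g)
{
    g->mem = NULL;
    g->capacity = 0;
    g->size = 0;
}

/* Returns 1 if a real (re)allocation happened, 0 if the buffer was reused. */
static int gb_resize(GrowBuffer *g, size_t bytes)
{
    if (bytes <= g->capacity) {      /* shrink or same: no driver call */
        g->size = bytes;
        return 0;
    }
    free(g->mem);                    /* stand-in for clReleaseMemObject */
    g->mem = malloc(bytes);          /* stand-in for clCreateBuffer */
    g->capacity = bytes;
    g->size = bytes;
    return 1;
}

static void gb_destroy(GrowBuffer *g)
{
    free(g->mem);
    gb_init(g);
}
```

This reduces create/release churn to the rare growth events, which in turn limits how often the leaky path in the driver is exercised.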

Of course, I also tried creating the buffers with the maximum size in advance,
but read_image and write_image do not work well when the image size differs
from the size of the calculation area. This is because the program uses a sampler
with CLK_NORMALIZED_COORDS_TRUE together with image objects created by
clCreateBuffer and clCreateImage2D, and with normalized coordinates the [0,1]
range maps to the full image, so padding the image changes the sampling.

Is there any other way to avoid this problem?