vector datatypes

I am trying to get a simple example program to compile and run using the float4 datatype and the compiler says the float4 datatype is unknown. I am using Linux Ubuntu 11.04 and the g++ compiler. Simple example:

#include <CL/opencl.h>

float4 a, b, c;

int main()
{
    c = a + b;
}


How does one get the vector datatypes to run?

float4 is an OpenCL C type, so it is only available inside your kernels. In standard C you cannot use it as such unless you define your own float4 vector type. But even then, you are not able to “overload” the + operator…

OpenCL is actually made of two components: an API called “OpenCL” and a programming language called “OpenCL C”. The API is used to allocate resources like buffers and images. The language is used to describe the algorithms that run on the compute device.

OpenCL applications are typically written in standard C or C++ and they contain calls to the OpenCL API. Since the application is written in C/C++, it does not have any of the vector data types of OpenCL C. That’s why you are seeing compilation errors.

I recommend looking at some simple examples online to get an idea of what your application will look like and how to load source code written in OpenCL C into your app.

I see example kernels that have pointers to float4 as arguments such as

kernel void foo( global float4* arg )

Where is the float4 memory allocated? It must already be allocated before the kernel is called and set up as an argument in the OpenCL API code, which is typically a .cpp file. If there is a simple example, I would appreciate it.

The memory is allocated on the host. Because clCreateBuffer() only takes a size in bytes, it is up to you how to lay out the actual data. So, you’d allocate enough memory to store your float4 vectors

float *vectors = (float *)malloc(4 * number_of_vectors * sizeof(float));

and assign the vector elements accordingly

/* first vector */
vectors[0 + 0] = vectors[0 + 1] = vectors[0 + 2] = vectors[0 + 3] = 0.0f;

You may also need the buffer to be aligned, so you could use posix_memalign in place of malloc above to ensure that each float4 sits on a 16-byte boundary.