Nearest neighbour (Eucliedean distance)

Hello, I’m using OpenCL to find the nearest neighbour between two set of 3D points.

Nearest Neighbour: For each point(x,y,z) in the DataSet I have to find the nearest one in the model. Squared distance = (Ax-Bx)^2 + (Ay-By)^2 + (Az-Bz)^2

So I think that every point of the dataset need to have read access to ALL the points of the model. Is this the only way or are there other approaches?

Here what I’ve done so far:

struct point {
	int x;
	int y;
	int z;

__kernel void 
nearest_neighbour(__global struct point *model,
	__global struct point *dataset,
	__global int *nearest,
	const unsigned int model_size)
		int g_dataset_id = get_global_id(0);
		int dmin = -1;
		int d, dx, dy, dz;
		/* Ottimizzato per memoria locale */
		int local_x, local_y, local_z;

		local_x = dataset[g_dataset_id].x;
		local_y = dataset[g_dataset_id].y;
		local_z = dataset[g_dataset_id].z;
		/* Fine ottimizzazione */

		for (int i=0; i<model_size; ++i) {			
			dx = model[i].x - local_x;
			dx = dx * dx;
			dy = model[i].y - local_y;
			dy = dy * dy;
			dz = model[i].z - local_z;
			dz = dz * dz;
			d = dx + dy + dz;
			if(d < dmin || dmin == -1)
				nearest[g_dataset_id] = i;
				dmin = d;

The code seems to work, but I’m sure that it can be optimized.
The optimization is important because this kernel will be part of a bigger program.

Any advice?

Thanks :slight_smile:

P.S. I know that there are other (better) methods to find nearest neighbour, like kd-tree, but for now I would like to do the easy one.

On one hand you tell us “I’m sure it can be optimized” and “Optimization is important”, but on the other hand you don’t want to hear about well proven methods like kd-trees.

It sounds a bit like somebody with an implementation of bubble sort that says “How can I make it fast? I’ve heard of quicksort but I just want my bubble sort to be fast”.

Yeah - I agree. Any good programmer who has been around the block a few times will tell you this. In any optimization process, you FIRST switch to the most efficient algorithm available and THEN start to lean on the individual lines of code to see if you can squeeze out a little more.

Optimizing a crappy algorithm is a waste of time because sooner or later you’re going to replace it - and then all of that earlier effort is wasted.

– Steve

Optimizing a crappy algorithm is a waste of time because sooner or later you’re going to replace it - and then all of that earlier effort is wasted.

i’m new to opencl, so my first purpose is to learn, and before implement the kd-tree I would like to know if I’m doing ok the opencl part.

Also another good reason for implement first the naive nearest neighbour is that in the future I can compare the performance of this one with kd-tree or other algorithms.

Remember that i’m posting in the “Beginner” category.

Anyway if you tell me that my implementation is “perfect” and can’t be further optimized I can start to think about kd-tree and opencl :wink:

Hi, I’d be interested to know if you optimized your algorith and what changed - alternatively did you take the advice of the experienced crown and start with an existing algorith?

(Of course, Euclid never imaged distances on a sphere…)

I’m working on a new version based on kd-tree.

Anyway I’ve got a good answer on how to use local memory with the old algorithm: