A VERY VERY simple COMPLETE compute shader example program!

bbb878bbb · March 26, 2020, 9:04pm

Can somebody please provide VERY SIMPLE compute shader code (add two numbers in an array and pass back the value)?

Every time I look online people always give some complicated long example, and then it is just code snippets. Please provide the WHOLE C++ CODE that I can download, compile then run that adds two values from an input buffer of two GL_FLOATs and passes back the sum.

Please make the code as SIMPLE and SMALL as possible!

You folks who are very knowledgeable in OpenGL could probably type the whole program up in a few minutes!

Thanks!
Ben

tomg · March 28, 2020, 3:31pm

I am a bit of a beginner with this stuff too. I have concluded that any OpenGL program has to have a degree of complication, especially when applying textures with a shader. But I found the OGLdev tutorial quite useful: google ogldev atspace

PS If you find a good source for the shader language, I would like to hear about it.

Unless I misunderstand your question: are you trying to use the graphics card to do math? For that basic idea, NVIDIA CUDA works like this:

// from NVIDIA even-easier-introduction-cuda/

#include <iostream>
#include <math.h>

// CUDA kernel function to add the elements of two arrays on the GPU
__global__
void add(int n, float *x, float *y)
{
  for (int i = 0; i < n; i++)
      y[i] = x[i] + y[i];
}

int main(void)
{
  int N = 1<<20; // 1M elements
  std::cout << "N: " << N << std::endl;

  //float *x = new float[N];
  //float *y = new float[N];

  // Allocate Unified Memory -- accessible from CPU or GPU
  float *x, *y;
  cudaMallocManaged(&x, N*sizeof(float));
  cudaMallocManaged(&y, N*sizeof(float));

  // initialize x and y arrays on the host
  for (int i = 0; i < N; i++) {
    x[i] = 1.0f;
    y[i] = 2.0f;
  }

  // Run kernel on 1M elements on the CPU
  //add(N, x, y);
  
  // Run kernel on 1M elements on the GPU (one block of one thread, so still serial)
  add<<<1, 1>>>(N, x, y);
  // Wait for GPU to finish before accessing on host
  cudaDeviceSynchronize();
  
  // Check for errors (all values should be 3.0f)
  float maxError = 0.0f;
  for (int i = 0; i < N; i++)
    maxError = fmax(maxError, fabs(y[i]-3.0f));
  std::cout << "Max error: " << maxError << std::endl;

  // Free memory
  cudaFree(x);
  cudaFree(y);
  //delete [] x;
  //delete [] y;

  return 0;
}

bbb878bbb · March 28, 2020, 9:31pm

Thanks! I was looking more for straight OpenGL shader language (compute shader) code.

A good website for learning OpenGL is: google learnopengl

numzero · March 28, 2020, 9:48pm

Using compute shaders isn’t hard, but setting up OpenGL requires some boilerplate. Do you have that? Do you have the code to compile conventional shaders? If so, it’s little more than calling glDispatchCompute using a shader like this:

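#version 430
// One invocation per work group; glDispatchCompute(1, 1, 1) runs main() once.
layout(local_size_x = 1) in;
// Inputs: uniforms identified by explicit location.
layout(location = 0) uniform float a;
layout(location = 1) uniform float b;
// Output: a shader storage buffer attached to binding point 0.
layout(std430, binding = 0) buffer result {
	writeonly restrict float c;
};
void main() {
	c = a + b;
}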

dustinjamescondon · April 26, 2020, 1:39am

I’m looking for the same thing as the topic-creator. Thank you for your example! Could you explain how one would access the result c from the CPU side of things? I’m brand new to OpenGL and have been reading for the past few days for the main purpose of doing non-graphics related computation on the GPU. From your example I am a bit confused about the difference between a binding and a location. Any explanation would be appreciated.