Help with GPU to CPU alignment

Hi,
I have a GLSL compute shader that uses some SSBOs, but I am struggling to read back the content of a buffer because of data alignment. The shader updates a neural network layer for an open-source project (still in very early development). Any help would be appreciated.
Here are the GLSL parts (the array sizes may vary; size 10 is used here as an example):

struct Neighbor {
  bool is_used;
  uint index_x;
  uint index_y;
  vec4 weight; // weight of the neighbor connection
};

struct HiddenNeuron {
  uint index_x;
  uint index_y;
  vec4 weights[10][10];
  Neighbor neighbors[4];
};

layout(std430, binding = 3) buffer HiddenLayer1 {
  HiddenNeuron neurons[10][10];
  vec4 values[10][10];
  vec4 errors[10][10];
  float activation_alpha;
  uint activation_function;
  uint size_x;
  uint size_y;
}
hiddenLayer1Buffer;

And here is how I currently try to read the data back. The values I get are aberrant, so I guess it’s an alignment issue.

  builder_.mapBufferMemory(bufferHiddenLayer);

  uint8_t *bufferData = reinterpret_cast<uint8_t *>(bufferHiddenLayer.data);
  uint offset = 0;

  // Read neurons
  for (size_t y = 0; y < hiddenLayer->size_y; ++y) {
    for (size_t x = 0; x < hiddenLayer->size_x; ++x) {
      auto &dstNeuron = hiddenLayer->neurons[y][x];
      // check index_x and index_y
      uint index_x = 0;
      std::memcpy(&index_x, bufferData + offset, sizeof(uint32_t));
      offset += sizeof(uint32_t);

      uint index_y = 0;
      std::memcpy(&index_y, bufferData + offset, sizeof(uint32_t));
      offset += sizeof(uint32_t);

      if (dstNeuron.index_x != index_x || dstNeuron.index_y != index_y) {
        throw VulkanControllerException("Invalid data buffer memory");
      }

      // get weights
      for (int i = 0; i < dstNeuron.weights.rows; ++i) {
        for (int j = 0; j < dstNeuron.weights.cols; ++j) {
          for (int k = 0; k < 4; ++k) {
            float value = 0.0f;
            std::memcpy(&value, bufferData + offset, sizeof(float));
            offset += sizeof(float);
            dstNeuron.weights.at<cv::Vec4f>(i, j)[k] = value;
          }
        }
      }
      // get neighbors
      for (int i = 0; i < MAX_NEIGHBORS; i++) {
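        // Read into a 4-byte integer: copying sizeof(uint32_t) bytes into a
        // C++ bool would write past the end of the bool.
        uint32_t isUsedRaw = 0;
        std::memcpy(&isUsedRaw, bufferData + offset, sizeof(uint32_t));
        offset += sizeof(uint32_t);
        const bool isUsed = (isUsedRaw != 0);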

        uint index_x = 0;
        std::memcpy(&index_x, bufferData + offset, sizeof(uint32_t));
        offset += sizeof(uint32_t);

        uint index_y = 0;
        std::memcpy(&index_y, bufferData + offset, sizeof(uint32_t));
        offset += sizeof(uint32_t);

        for (int k = 0; k < 4; k++) {
          float w = 0.0f;
          std::memcpy(&w, bufferData + offset, sizeof(float));
          offset += sizeof(float);
          if (isUsed) {
            dstNeuron.neighbors[i].weight[k] = w;
          }
        }
      }
    }
  }

  builder_.unmapBufferMemory(bufferHiddenLayer);

Regards,

What is the C++ definition of the data structures? If it’s the same as the GLSL ones, then I can already see two alignment problems. HiddenNeuron::weights will be 16-byte aligned in GLSL but only 4-byte aligned (probably?) in C++. Similarly, Neighbor::weight will be 16-byte aligned in GLSL but only 4-byte aligned in C++.

You need to insert some manual padding or alignas declarations in your C++ struct to match the GLSL std430 layout.
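
As a minimal sketch of what that padding could look like, assuming the std430 layout and the example array size of 10 from the shader above: the type names ending in Std430 and the pad members are hypothetical and not taken from the project. The static_asserts encode the offsets that the std430 rules give for these structs, so a mismatch shows up at compile time rather than as aberrant data.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical host-side mirror of a GLSL vec4 (16-byte aligned in std430).
struct alignas(16) Vec4Std430 {
  float x, y, z, w;
};

// Mirror of the GLSL Neighbor struct.
struct NeighborStd430 {
  std::uint32_t is_used;  // a GLSL bool occupies 4 bytes in a buffer block
  std::uint32_t index_x;
  std::uint32_t index_y;
  std::uint32_t pad0;     // explicit padding: weight must start at offset 16
  Vec4Std430 weight;
};

// Mirror of the GLSL HiddenNeuron struct.
struct HiddenNeuronStd430 {
  std::uint32_t index_x;
  std::uint32_t index_y;
  std::uint32_t pad0[2];  // explicit padding: weights must start at offset 16
  Vec4Std430 weights[10][10];
  NeighborStd430 neighbors[4];
};

// Mirror of the HiddenLayer1 buffer block.
struct HiddenLayer1Std430 {
  HiddenNeuronStd430 neurons[10][10];
  Vec4Std430 values[10][10];
  Vec4Std430 errors[10][10];
  float activation_alpha;
  std::uint32_t activation_function;
  std::uint32_t size_x;
  std::uint32_t size_y;
};

static_assert(offsetof(NeighborStd430, weight) == 16, "Neighbor::weight offset");
static_assert(sizeof(NeighborStd430) == 32, "Neighbor array stride");
static_assert(offsetof(HiddenNeuronStd430, weights) == 16, "weights offset");
static_assert(offsetof(HiddenNeuronStd430, neighbors) == 1616, "neighbors offset");
static_assert(sizeof(HiddenNeuronStd430) == 1744, "HiddenNeuron array stride");
static_assert(offsetof(HiddenLayer1Std430, values) == 174400, "values offset");
static_assert(offsetof(HiddenLayer1Std430, errors) == 176000, "errors offset");
static_assert(offsetof(HiddenLayer1Std430, activation_alpha) == 177600,
              "activation_alpha offset");
```

With alignas(16) on the vec4 type the compiler would insert the gaps anyway; the explicit pad members only make them visible. If the array sizes really vary at run time, fixed structs like these no longer apply and the offsets have to be computed instead.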

Thanks for the reply, there is indeed a mismatch between the GLSL layout and my C++ alignment. Thanks for the padding and alignas advice. I’m still struggling, but I understand the issue better now :)
My current code is in the obewan/SIPAI repository on GitHub, on the 52-vulkan-refactoring branch, especially libs/libsipai/src/VulkanController.cpp and its _readBackHiddenLayer1() method. I have added some align(16), but after reading the isUsed boolean, neigh_index_x and neigh_index_y are aberrant, so I throw the VulkanControllerException.
I guess the bool is not aligned to 16 bits.
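
For reference, in a std430 block a GLSL bool occupies 4 bytes, like a uint, while a C++ bool is usually 1 byte, and the vec4 after the two indices starts at the next multiple of 16. Below is a minimal sketch of reading one Neighbor from raw bytes under those assumptions. NeighborHost, readNeighbor and the k...Offset constants are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical host-side result type; not taken from the project sources.
struct NeighborHost {
  bool is_used;
  std::uint32_t index_x;
  std::uint32_t index_y;
  std::array<float, 4> weight;
};

// std430 offsets of the members of the GLSL Neighbor struct.
constexpr std::size_t kIsUsedOffset = 0;
constexpr std::size_t kIndexXOffset = 4;
constexpr std::size_t kIndexYOffset = 8;
constexpr std::size_t kWeightOffset = 16;    // vec4 is 16-byte aligned
constexpr std::size_t kNeighborStride = 32;  // size of one array element

// Decode one Neighbor starting at base, which must point to at least
// kNeighborStride readable bytes.
inline NeighborHost readNeighbor(const std::uint8_t* base) {
  assert(base != nullptr);
  NeighborHost n{};
  // Read the bool as a 4-byte integer, then convert it.
  std::uint32_t rawIsUsed = 0;
  std::memcpy(&rawIsUsed, base + kIsUsedOffset, sizeof(rawIsUsed));
  n.is_used = (rawIsUsed != 0);
  std::memcpy(&n.index_x, base + kIndexXOffset, sizeof(n.index_x));
  std::memcpy(&n.index_y, base + kIndexYOffset, sizeof(n.index_y));
  // Bytes 12..15 are padding and are skipped.
  std::memcpy(n.weight.data(), base + kWeightOffset, sizeof(float) * 4);
  return n;
}
```

Consecutive neighbors would then be read at base + i * kNeighborStride instead of advancing a running offset by the sum of the member sizes.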

I found the solution: it was indeed a matter of adding padding and offsets in the C++ code to match the SPIR-V assembly. RenderDoc can show the assembly with its offsets, as well as the offset of each element in a buffer, which saved my project :) Many thanks to the RenderDoc author(s); it is a great tool.
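
For anyone hitting the same problem, here is a minimal sketch of the "padding and offset" idea, not the actual code from the project: a small cursor that rounds the running offset up to the required alignment before each read. Std430Reader and alignUp are hypothetical names, and the alignments used (4 bytes for scalars, 16 for vec4) assume the std430 layout of the structs in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Round offset up to the next multiple of alignment (a power of two).
constexpr std::size_t alignUp(std::size_t offset, std::size_t alignment) {
  return (offset + alignment - 1) & ~(alignment - 1);
}

// Hypothetical cursor over a mapped buffer that applies the alignment of
// each type before reading it.
class Std430Reader {
 public:
  Std430Reader(const std::uint8_t* data, std::size_t size)
      : data_(data), size_(size) {}

  std::uint32_t readUint() { return read<std::uint32_t>(4); }
  float readFloat() { return read<float>(4); }
  // A GLSL bool is stored as a 4-byte value in a buffer block.
  bool readBool() { return readUint() != 0; }
  // A vec4 is 16-byte aligned, so skip any padding in front of it.
  void readVec4(float out[4]) {
    align(16);
    for (int k = 0; k < 4; ++k) out[k] = read<float>(4);
  }
  // Call with the struct alignment after each struct to skip tail padding.
  void align(std::size_t alignment) { offset_ = alignUp(offset_, alignment); }
  std::size_t offset() const { return offset_; }

 private:
  template <typename T>
  T read(std::size_t alignment) {
    align(alignment);
    assert(offset_ + sizeof(T) <= size_);  // stay inside the mapped range
    T value;
    std::memcpy(&value, data_ + offset_, sizeof(T));
    offset_ += sizeof(T);
    return value;
  }

  const std::uint8_t* data_;
  std::size_t size_;
  std::size_t offset_ = 0;
};
```

Reading a HiddenNeuron with this cursor (two uints, 100 vec4 weights, then four neighbors of bool, uint, uint, vec4) ends at offset 1744, which matches the std430 size of the struct. The offsets reported by RenderDoc remain the reference to check against.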