Early evaluators for a portable compute IR (ONNX → CPU/GPU via OpenCL)

Hi all,
I am looking for a few technically-oriented early evaluators for a small experimental IR project focused on portable compute for ML workloads.

The current MVP supports importing a subset of ONNX (Linear/MatMul, Softmax, Add, ReLU, LayerNorm), lowering it to a custom IR called AXIR (serialized as JSON), and executing on:

  • CPU (NumPy backend)
  • GPU (OpenCL backend)
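To give a feel for what a JSON-serialized graph IR at this level looks like, here is an illustrative sketch of a single dense layer. The field names and structure here are my own invention for exposition, not AXIR's actual schema:

```json
{
  "graph": {
    "inputs":  [{"name": "x",  "dtype": "f32", "shape": [1, 784]}],
    "params":  [{"name": "w0", "dtype": "f32", "shape": [784, 128]},
                {"name": "b0", "dtype": "f32", "shape": [128]}],
    "nodes": [
      {"op": "MatMul", "inputs": ["x", "w0"],  "outputs": ["h0"]},
      {"op": "Add",    "inputs": ["h0", "b0"], "outputs": ["h1"]},
      {"op": "ReLU",   "inputs": ["h1"],       "outputs": ["y"]}
    ],
    "outputs": ["y"]
  }
}
```

A flat, explicitly-shaped node list like this is what makes lowering to both a NumPy interpreter and an OpenCL kernel dispatcher straightforward.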

The evaluators would:
(1) load an ONNX model (small MLP/CNN),
(2) run it via AXIR → CPU & AXIR → OpenCL,
(3) compare outputs (np.allclose with suitable tolerances),
(4) report correctness gaps, ops coverage, or lowering issues.
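The parity check in steps (2)–(3) can be sketched as follows. The two `run_*` functions below are placeholders standing in for the AXIR → CPU and AXIR → OpenCL paths (the project's real API is not shown in this post); the point is the comparison pattern, including the tolerances you'd want once a float32 GPU path is compared against float64 CPU math:

```python
import numpy as np

def run_cpu(x):
    # Placeholder for the AXIR -> CPU (NumPy) path: a toy MatMul + ReLU.
    return np.maximum(x @ np.eye(x.shape[-1]), 0.0)

def run_opencl(x):
    # Placeholder for the AXIR -> OpenCL path. Casting to float32 mimics the
    # precision difference a real GPU backend typically introduces.
    x32 = x.astype(np.float32)
    return np.maximum(x32 @ np.eye(x.shape[-1], dtype=np.float32), 0.0)

def check_parity(x, rtol=1e-4, atol=1e-5):
    ref, out = run_cpu(x), run_opencl(x)
    ok = np.allclose(ref, out, rtol=rtol, atol=atol)
    if not ok:
        # Report the worst element to make correctness gaps easy to triage.
        diff = np.abs(ref - out)
        idx = np.unravel_index(diff.argmax(), diff.shape)
        print(f"max abs diff {diff.max():.3e} at index {idx}")
    return ok

x = np.random.default_rng(0).standard_normal((4, 8))
print(check_parity(x))  # float32-vs-float64 rounding stays well inside rtol=1e-4 -> True
```

Reporting the location and magnitude of the worst mismatch (rather than just a pass/fail) tends to make lowering bugs much faster to localize.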

The goal at this stage is correctness parity and lowering feedback, not performance.

If you are familiar with ONNX, MLIR, lowering pipelines, or GPU backends and want to try it out, or have feedback on the IR design choices, feel free to reply here or DM me.

Thanks in advance.