Removing pointers from SPIRV

I’m writing CFG lowering passes to try to remove all physical pointers from SPIRV. One thing I’m seeing is out-of-bounds AccessChain ops as a result of the mem2reg pass:

The ranged-for is notionally:

float* begin = array;
float* end = array + 3;
while(begin != end) {
    float& x = *begin;
    x = 1;
    ++begin;
}

What’s going on is that the mem2reg pass kills the storage for x and replaces all uses of x with its initializer *begin, which is a PHI node of an OpAccessChain (the original decay from array to a pointer) and an OpPtrAccessChain (++begin). The pointer-removal pass then splits each pointer-yielding PHI op into its base and index scalarized halves, so that OpPtrAccessChain can be replaced with OpAccessChain. That gets rid of the pointers in the control flow, and it also transforms the *begin value of x into the %22 = OpInBoundsAccessChain.
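To make the base/index split concrete, here is a rough C++ model of the same loop after the pointer has been scalarized. This is only a sketch: SplitPtr and fill_split are hypothetical names invented for illustration, not part of any real pass, and the array length of 3 is taken from the example above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a pointer after scalarization: the base object stays
// fixed and only the element index flows through the loop-carried value.
struct SplitPtr {
    float (&base)[3];   // stands in for the original array variable
    std::size_t index;  // the part that OpPtrAccessChain used to advance
};

inline void fill_split(float (&array)[3]) {
    SplitPtr begin{array, 0};      // was: float* begin = array;
    const SplitPtr end{array, 3};  // was: float* end = array + 3;

    // Both halves share one base, so comparing indices replaces the
    // pointer comparison.
    while (begin.index != end.index) {
        // The element is only named after the exit test, so the index
        // is always in range here.
        assert(begin.index < 3);
        float& x = begin.base[begin.index];  // was: float& x = *begin;
        x = 1.0f;
        ++begin.index;                       // was: ++begin;
    }
}
```

Note that in this model the element access sits after the loop test. The problem described next comes from the access being formed before that test.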

While this is now valid Logical addressing code, it does contain an out-of-bounds InBoundsAccessChain before the conditional branch. I’m probably going to add a pass to do some dominator tree work that drags AccessChain instructions closer to their users to avoid this. I should add that I’m not using LLVM; all these passes are homegrown, with the intent of allowing Logical SPIRV to run C++ code.

Question 1: Is one-past-the-end OpInBoundsAccessChain into a non-struct composite type valid? It is valid in LLVM, but the SPIRV docs are silent on that.

Question 2: Are there any code gen algorithms or literature or anything to help remove pointers from an IR? I can’t find anything. What I’m trying to do is totally reasonable, but I don’t see any prior art on lowering C++ to shaders. SPIRV would be the ideal mechanism for accomplishing this, but the only frontends I’ve seen are shader langs that lack these challenging high-level constructs. Writing all these passes in isolation is very hard.

Good questions.

Q1: It looks like you’re targeting Shader (graphics APIs). In this case, I would advise that executing an OpInBoundsAccessChain whose index takes it out of bounds of the underlying object (at any level of the hierarchy) is undefined behaviour. Sorry this is not evident.

Q2: For prior art in compiling C/C++ to graphics-flavoured SPIR-V, see Clspv: https://github.com/google/clspv. Most of its flow uses LLVM IR, but with restrictions and assumptions, and then emits SPIR-V GLCompute shaders.

The SPIRV-Tools project also has a lot of SPIR-V transforms as part of its “optimizer” suite. See the source tree at https://github.com/KhronosGroup/SPIRV-Tools/tree/master/source/opt. That is mostly for graphics-flavoured SPIR-V (Shader), though it assumes the input is already valid.

In the case you have presented, I agree that sinking the access chain into the loop body should avoid the problem. In general, it seems sensible that you can rely on the idea that whenever the access itself is valid, then generating the pointer should also be valid.
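As a rough illustration of why sinking helps, the sketch below models the access chain as a function that just logs which indices it is asked to form, then compares the two schedules. ChainLog, fill_hoisted and fill_sunk are hypothetical names made up for this example. It is plain C++ standing in for the SPIR-V, not output from any real pass.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for OpInBoundsAccessChain on a 3-element array:
// it records every index it is asked to form so schedules can be compared.
struct ChainLog {
    std::vector<std::size_t> formed;
    std::size_t access_chain(std::size_t index) {
        formed.push_back(index);
        return index;
    }
};

// Access chain formed in the loop header, before the exit test.
inline ChainLog fill_hoisted(std::array<float, 3>& a) {
    ChainLog log;
    std::size_t i = 0;
    for (;;) {
        std::size_t slot = log.access_chain(i);  // also formed when i == 3
        if (i == a.size()) break;                // exit test comes too late
        a[slot] = 1.0f;
        ++i;
    }
    return log;
}

// Access chain sunk next to its only user, after the exit test.
inline ChainLog fill_sunk(std::array<float, 3>& a) {
    ChainLog log;
    std::size_t i = 0;
    while (i != a.size()) {
        std::size_t slot = log.access_chain(i);
        assert(slot < a.size());  // every formed index is in bounds
        a[slot] = 1.0f;
        ++i;
    }
    return log;
}
```

Both functions write the same three elements. The only difference is that the hoisted schedule also forms index 3 on the final trip through the header, which is the one-past-the-end chain in question.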