SPIR-V Storage Buffer Array problem?

While trying to optimize a shader, I’ve run into a problem where I suddenly only get 0 values out of a storage buffer.

I used to have this structure:


layout(push_constant) uniform gPushConstants
{
	uint perInstanceIndex;
};

struct PerInstance
{
	mat4 world;
	mat4 worldIT;
};

layout(std430, row_major, set = 0, binding = 1) buffer PerFrameInstanceBuffer
{
	PerInstance perInstance[];
};

And then in the shader:

vec4 wsPosition = vec4(iPosition, 1.0) * perInstance[perInstanceIndex].world;

This is bound to a buffer object that holds a LARGE number of matrices. Because part of each matrix is always (0, 0, 0, 1) (I use row major, so world has (0,0,0,1) as its 3rd row and worldIT has (0,0,0,1) as its 3rd column), I thought I would reduce memory usage by storing only three vec4s per matrix, as below. Note that this switches the world matrix from row major to column major, and I account for that. I have also tested with an identity matrix, but that doesn’t help.

struct PerInstance
{
	vec4 worldCol[3];
	vec4 worldIT[3];
};

layout(set = 0, binding = 1) buffer PerFrameInstanceBuffer
{
	PerInstance perInstance[];
};
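
As a sketch, here is the same declaration annotated with the byte offsets it should produce. The numbers assume std430 packing, which as far as I know is the default for buffer blocks when compiling GLSL for Vulkan; the qualifier is written out explicitly here only to remove the guesswork. They agree with the Offset 48 and ArrayStride 96 decorations in the SPIR-V dump further down.

```glsl
// Sketch only: the packed declaration with std430 spelled out explicitly.
// Offsets assume std430 rules: a vec4 is 16 bytes and vec4[3] has a 16-byte stride.
struct PerInstance
{
	vec4 worldCol[3];	// bytes  0..47
	vec4 worldIT[3];	// bytes 48..95
};

layout(std430, set = 0, binding = 1) buffer PerFrameInstanceBuffer
{
	PerInstance perInstance[];	// element stride: 96 bytes (128 with two mat4)
};
```

The CPU-side struct that fills the buffer has to use the same 96-byte stride, otherwise every element after the first is read from the wrong place.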

And then rebuilding the matrix in the shader (for the sake of simplicity, let’s only do world):


mat4 world;
world[0] = perInstance[perInstanceIndex].worldCol[0];
world[1] = perInstance[perInstanceIndex].worldCol[1];
world[2] = perInstance[perInstanceIndex].worldCol[2];
world[3] = vec4(0,0,0,1);
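
GLSL defines `v * M` as the vector whose i-th component is `dot(v, M[i])`, so a matrix rebuilt this way is only ever used as three dot products plus a constant w. A minimal sketch of a helper that skips the temporary mat4 (transformPoint is a hypothetical name, not something from the original shader):

```glsl
// Sketch only: transformPoint is a hypothetical helper.
// Equivalent to vec4(p, 1.0) * world, where world[0..2] are c0..c2
// and world[3] is vec4(0, 0, 0, 1).
vec4 transformPoint(vec3 p, vec4 c0, vec4 c1, vec4 c2)
{
	vec4 v = vec4(p, 1.0);
	// The w component is dot(v, vec4(0, 0, 0, 1)), which is always 1.0 here.
	return vec4(dot(v, c0), dot(v, c1), dot(v, c2), 1.0);
}
```

It would be called as transformPoint(iPosition, perInstance[perInstanceIndex].worldCol[0], perInstance[perInstanceIndex].worldCol[1], perInstance[perInstanceIndex].worldCol[2]).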

However, when I do this, nothing works… I have RenderDoc’ed the code and can see that the data is uploaded correctly, but I can’t debug the shader, because only D3D11 shader debugging seems to be supported currently.

Now in order to see which values come through, I’ve rewired the shader to use an identity matrix and to output to the color channel instead:


mat4 world;
world[0] = vec4(1,0,0,0);
world[1] = vec4(0,1,0,0);
world[2] = vec4(0,0,1,0);
world[3] = vec4(0,0,0,1);

oColor = perInstance[0].worldCol[0];

But the output is always black, no matter what values I upload to the buffer (and I confirmed with RenderDoc that the values are there).
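
One further check that might narrow this down, as a sketch: a runtime-sized array in a buffer block supports .length(), and its value is derived from the size of the buffer range bound to the descriptor. Writing that and the push-constant index to a debug output would show whether the shader sees any elements at all. The output oDebug and its location are hypothetical; the block declarations are the packed ones from above.

```glsl
#version 450

// Sketch only: oDebug and its location are hypothetical names.
layout(push_constant) uniform gPushConstants
{
	uint perInstanceIndex;
};

struct PerInstance
{
	vec4 worldCol[3];
	vec4 worldIT[3];
};

layout(std430, set = 0, binding = 1) buffer PerFrameInstanceBuffer
{
	PerInstance perInstance[];
};

layout(location = 0) in vec3 iPosition;
layout(location = 0) out vec3 oDebug;

void main()
{
	// Number of PerInstance elements visible through the bound range.
	int count = perInstance.length();

	// red:   1 if the shader sees at least one element
	// green: 1 if the push-constant index is inside the array
	// blue:  x component of the first stored vector (0 if the array is empty)
	oDebug = vec3(
		count > 0 ? 1.0 : 0.0,
		int(perInstanceIndex) < count ? 1.0 : 0.0,
		count > 0 ? perInstance[0].worldCol[0].x : 0.0);

	gl_Position = vec4(iPosition, 1.0);
}
```

If red comes out as 0, the problem is more likely in the descriptor binding (offset, range, set or binding number) than in the shader.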

Then I tried reading through the SPIR-V to see if I could figure out what’s going on, but the “human readable” format isn’t easy to untangle: it’s easy enough to read, just not to comprehend. I’ve attached it as a zip, since it exceeded the 19 KB attachment size limit.

I tried copy/pasting some sample code together, but it seems that will take a couple of hours (creating the window, vkInstance, vkDevice, swapchain, etc.), so I’m hoping someone has the SPIR-V skills to easily see what’s wrong.

I was wondering whether anyone has had issues with this kind of setup and, if so, whether there are any ideas or suggestions on how to fix or work around it.

Thank you in advance.

PS: Copy/paste of the SPIR-V output, since attaching the ZIP apparently doesn’t work:

mesh.vert
// Module Version 10000
// Generated by (magic number): 80007
// Id’s are bound by 128

                          Capability Shader
           1:             ExtInstImport  "GLSL.std.450"
                          MemoryModel Logical GLSL450
                          EntryPoint Vertex 4  "main" 64 76 88 89 98 100 110 121 123
                          Source GLSL 450
                          SourceExtension  "GL_ARB_separate_shader_objects"
                          Name 4  "main"
                          Name 10  "world"
                          Name 17  "PerInstance"
                          MemberName 17(PerInstance) 0  "worldCol"
                          MemberName 17(PerInstance) 1  "worldIT"
                          Name 19  "PerFrameInstanceBuffer"
                          MemberName 19(PerFrameInstanceBuffer) 0  "perInstance"
                          Name 21  ""
                          Name 22  "gPushConstants"
                          MemberName 22(gPushConstants) 0  "perInstanceIndex"
                          Name 24  ""
                          Name 44  "worldIT"
                          Name 61  "wsPosition"
                          Name 64  "iPosition"
                          Name 74  "gl_PerVertex"
                          MemberName 74(gl_PerVertex) 0  "gl_Position"
                          MemberName 74(gl_PerVertex) 1  "gl_PointSize"
                          MemberName 74(gl_PerVertex) 2  "gl_ClipDistance"
                          MemberName 74(gl_PerVertex) 3  "gl_CullDistance"
                          Name 76  ""
                          Name 78  "gPerViewConstantBuffer"
                          MemberName 78(gPerViewConstantBuffer) 0  "viewProjection"
                          Name 80  ""
                          Name 88  "oNormal"
                          Name 89  "iNormal"
                          Name 98  "oTangent"
                          Name 100  "iTangent"
                          Name 110  "oBinormal"
                          Name 121  "oTexcoord"
                          Name 123  "iTexcoord"
                          Name 125  "PerFrameConstantBuffer"
                          MemberName 125(PerFrameConstantBuffer) 0  "time"
                          Name 127  ""
                          Decorate 15 ArrayStride 16
                          Decorate 16 ArrayStride 16
                          MemberDecorate 17(PerInstance) 0 Offset 0
                          MemberDecorate 17(PerInstance) 1 Offset 48
                          Decorate 18 ArrayStride 96
                          MemberDecorate 19(PerFrameInstanceBuffer) 0 Offset 0
                          Decorate 19(PerFrameInstanceBuffer) BufferBlock
                          Decorate 21 DescriptorSet 0
                          Decorate 21 Binding 1
                          MemberDecorate 22(gPushConstants) 0 Offset 0
                          Decorate 22(gPushConstants) Block
                          Decorate 64(iPosition) Location 0
                          MemberDecorate 74(gl_PerVertex) 0 BuiltIn Position
                          MemberDecorate 74(gl_PerVertex) 1 BuiltIn PointSize
                          MemberDecorate 74(gl_PerVertex) 2 BuiltIn ClipDistance
                          MemberDecorate 74(gl_PerVertex) 3 BuiltIn CullDistance
                          Decorate 74(gl_PerVertex) Block
                          MemberDecorate 78(gPerViewConstantBuffer) 0 RowMajor
                          MemberDecorate 78(gPerViewConstantBuffer) 0 Offset 0
                          MemberDecorate 78(gPerViewConstantBuffer) 0 MatrixStride 16
                          Decorate 78(gPerViewConstantBuffer) Block
                          Decorate 80 DescriptorSet 1
                          Decorate 80 Binding 0
                          Decorate 88(oNormal) Location 2
                          Decorate 89(iNormal) Location 2
                          Decorate 98(oTangent) Location 0
                          Decorate 100(iTangent) Location 1
                          Decorate 110(oBinormal) Location 1
                          Decorate 121(oTexcoord) Location 3
                          Decorate 123(iTexcoord) Location 3
                          MemberDecorate 125(PerFrameConstantBuffer) 0 Offset 0
                          Decorate 125(PerFrameConstantBuffer) Block
                          Decorate 127 DescriptorSet 0
                          Decorate 127 Binding 0
           2:             TypeVoid
           3:             TypeFunction 2
           6:             TypeFloat 32
           7:             TypeVector 6(float) 4
           8:             TypeMatrix 7(fvec4) 4
           9:             TypePointer Function 8
          11:             TypeInt 32 1
          12:     11(int) Constant 0
          13:             TypeInt 32 0
          14:     13(int) Constant 3
          15:             TypeArray 7(fvec4) 14
          16:             TypeArray 7(fvec4) 14

17(PerInstance): TypeStruct 15 16
18: TypeRuntimeArray 17(PerInstance)
19(PerFrameInstanceBuffer): TypeStruct 18
20: TypePointer Uniform 19(PerFrameInstanceBuffer)
21: 20(ptr) Variable Uniform
22(gPushConstants): TypeStruct 13(int)
23: TypePointer PushConstant 22(gPushConstants)
24: 23(ptr) Variable PushConstant
25: TypePointer PushConstant 13(int)
28: TypePointer Uniform 7(fvec4)
31: TypePointer Function 7(fvec4)
33: 11(int) Constant 1
34: 6(float) Constant 0
35: 6(float) Constant 1065353216
36: 7(fvec4) ConstantComposite 34 35 34 34
38: 11(int) Constant 2
39: 7(fvec4) ConstantComposite 34 34 35 34
41: 11(int) Constant 3
42: 7(fvec4) ConstantComposite 34 34 34 35
62: TypeVector 6(float) 3
63: TypePointer Input 62(fvec3)
64(iPosition): 63(ptr) Variable Input
72: 13(int) Constant 1
73: TypeArray 6(float) 72
74(gl_PerVertex): TypeStruct 7(fvec4) 6(float) 73 73
75: TypePointer Output 74(gl_PerVertex)
76: 75(ptr) Variable Output
78(gPerViewConstantBuffer): TypeStruct 8
79: TypePointer Uniform 78(gPerViewConstantBuffer)
80: 79(ptr) Variable Uniform
81: TypePointer Uniform 8
85: TypePointer Output 7(fvec4)
87: TypePointer Output 62(fvec3)
88(oNormal): 87(ptr) Variable Output
89(iNormal): 63(ptr) Variable Input
98(oTangent): 87(ptr) Variable Output
99: TypePointer Input 7(fvec4)
100(iTangent): 99(ptr) Variable Input
110(oBinormal): 87(ptr) Variable Output
115: TypePointer Input 6(float)
119: TypeVector 6(float) 2
120: TypePointer Output 119(fvec2)
121(oTexcoord): 120(ptr) Variable Output
122: TypePointer Input 119(fvec2)
123(iTexcoord): 122(ptr) Variable Input
125(PerFrameConstantBuffer): TypeStruct 6(float)
126: TypePointer Uniform 125(PerFrameConstantBuffer)
127: 126(ptr) Variable Uniform
4(main): 2 Function None 3
5: Label
10(world): 9(ptr) Variable Function
44(worldIT): 9(ptr) Variable Function
61(wsPosition): 31(ptr) Variable Function
26: 25(ptr) AccessChain 24 12
27: 13(int) Load 26
29: 28(ptr) AccessChain 21 12 27 12 12
30: 7(fvec4) Load 29
32: 31(ptr) AccessChain 10(world) 12
Store 32 30
37: 31(ptr) AccessChain 10(world) 33
Store 37 36
40: 31(ptr) AccessChain 10(world) 38
Store 40 39
43: 31(ptr) AccessChain 10(world) 41
Store 43 42
45: 25(ptr) AccessChain 24 12
46: 13(int) Load 45
47: 28(ptr) AccessChain 21 12 46 33 12
48: 7(fvec4) Load 47
49: 31(ptr) AccessChain 44(worldIT) 12
Store 49 48
50: 25(ptr) AccessChain 24 12
51: 13(int) Load 50
52: 28(ptr) AccessChain 21 12 51 33 33
53: 7(fvec4) Load 52
54: 31(ptr) AccessChain 44(worldIT) 33
Store 54 53
55: 25(ptr) AccessChain 24 12
56: 13(int) Load 55
57: 28(ptr) AccessChain 21 12 56 33 38
58: 7(fvec4) Load 57
59: 31(ptr) AccessChain 44(worldIT) 38
Store 59 58
60: 31(ptr) AccessChain 44(worldIT) 41
Store 60 42
65: 62(fvec3) Load 64(iPosition)
66: 6(float) CompositeExtract 65 0
67: 6(float) CompositeExtract 65 1
68: 6(float) CompositeExtract 65 2
69: 7(fvec4) CompositeConstruct 66 67 68 35
70: 8 Load 10(world)
71: 7(fvec4) VectorTimesMatrix 69 70
Store 61(wsPosition) 71
77: 7(fvec4) Load 61(wsPosition)
82: 81(ptr) AccessChain 80 12
83: 8 Load 82
84: 7(fvec4) VectorTimesMatrix 77 83
86: 85(ptr) AccessChain 76 12
Store 86 84
90: 62(fvec3) Load 89(iNormal)
91: 6(float) CompositeExtract 90 0
92: 6(float) CompositeExtract 90 1
93: 6(float) CompositeExtract 90 2
94: 7(fvec4) CompositeConstruct 91 92 93 34
95: 8 Load 44(worldIT)
96: 7(fvec4) VectorTimesMatrix 94 95
97: 62(fvec3) VectorShuffle 96 96 0 1 2
Store 88(oNormal) 97
101: 7(fvec4) Load 100(iTangent)
102: 62(fvec3) VectorShuffle 101 101 0 1 2
103: 6(float) CompositeExtract 102 0
104: 6(float) CompositeExtract 102 1
105: 6(float) CompositeExtract 102 2
106: 7(fvec4) CompositeConstruct 103 104 105 34
107: 8 Load 10(world)
108: 7(fvec4) VectorTimesMatrix 106 107
109: 62(fvec3) VectorShuffle 108 108 0 1 2
Store 98(oTangent) 109
111: 62(fvec3) Load 88(oNormal)
112: 62(fvec3) Load 98(oTangent)
113: 62(fvec3) ExtInst 1(GLSL.std.450) 68(Cross) 111 112
114: 62(fvec3) ExtInst 1(GLSL.std.450) 69(Normalize) 113
116: 115(ptr) AccessChain 100(iTangent) 14
117: 6(float) Load 116
118: 62(fvec3) VectorTimesScalar 114 117
Store 110(oBinormal) 118
124: 119(fvec2) Load 123(iTexcoord)
Store 121(oTexcoord) 124
Return
FunctionEnd
mesh.vert
// Module Version 10000
// Generated by (magic number): 80007
// Id’s are bound by 130

                          Capability Shader
           1:             ExtInstImport  "GLSL.std.450"
                          MemoryModel Logical GLSL450
                          EntryPoint Vertex 4  "main" 61 73 85 86 95 97 107 118 120
                          Source GLSL 450
                          SourceExtension  "GL_ARB_separate_shader_objects"
                          Name 4  "main"
                          Name 10  "world"
                          Name 27  "worldIT"
                          Name 32  "PerInstance"
                          MemberName 32(PerInstance) 0  "worldCol"
                          MemberName 32(PerInstance) 1  "worldIT"
                          Name 34  "PerFrameInstanceBuffer"
                          MemberName 34(PerFrameInstanceBuffer) 0  "perInstance"
                          Name 36  ""
                          Name 37  "gPushConstants"
                          MemberName 37(gPushConstants) 0  "perInstanceIndex"
                          Name 39  ""
                          Name 58  "wsPosition"
                          Name 61  "iPosition"
                          Name 71  "gl_PerVertex"
                          MemberName 71(gl_PerVertex) 0  "gl_Position"
                          MemberName 71(gl_PerVertex) 1  "gl_PointSize"
                          MemberName 71(gl_PerVertex) 2  "gl_ClipDistance"
                          MemberName 71(gl_PerVertex) 3  "gl_CullDistance"
                          Name 73  ""
                          Name 75  "gPerViewConstantBuffer"
                          MemberName 75(gPerViewConstantBuffer) 0  "viewProjection"
                          Name 77  ""
                          Name 85  "oNormal"
                          Name 86  "iNormal"
                          Name 95  "oTangent"
                          Name 97  "iTangent"
                          Name 107  "oBinormal"
                          Name 118  "oTexcoord"
                          Name 120  "iTexcoord"
                          Name 127  "PerFrameConstantBuffer"
                          MemberName 127(PerFrameConstantBuffer) 0  "time"
                          Name 129  ""
                          Decorate 30 ArrayStride 16
                          Decorate 31 ArrayStride 16
                          MemberDecorate 32(PerInstance) 0 Offset 0
                          MemberDecorate 32(PerInstance) 1 Offset 48
                          Decorate 33 ArrayStride 96
                          MemberDecorate 34(PerFrameInstanceBuffer) 0 Offset 0
                          Decorate 34(PerFrameInstanceBuffer) BufferBlock
                          Decorate 36 DescriptorSet 0
                          Decorate 36 Binding 1
                          MemberDecorate 37(gPushConstants) 0 Offset 0
                          Decorate 37(gPushConstants) Block
                          Decorate 61(iPosition) Location 0
                          MemberDecorate 71(gl_PerVertex) 0 BuiltIn Position
                          MemberDecorate 71(gl_PerVertex) 1 BuiltIn PointSize
                          MemberDecorate 71(gl_PerVertex) 2 BuiltIn ClipDistance
                          MemberDecorate 71(gl_PerVertex) 3 BuiltIn CullDistance
                          Decorate 71(gl_PerVertex) Block
                          MemberDecorate 75(gPerViewConstantBuffer) 0 RowMajor
                          MemberDecorate 75(gPerViewConstantBuffer) 0 Offset 0
                          MemberDecorate 75(gPerViewConstantBuffer) 0 MatrixStride 16
                          Decorate 75(gPerViewConstantBuffer) Block
                          Decorate 77 DescriptorSet 1
                          Decorate 77 Binding 0
                          Decorate 85(oNormal) Location 2
                          Decorate 86(iNormal) Location 2
                          Decorate 95(oTangent) Location 0
                          Decorate 97(iTangent) Location 1
                          Decorate 107(oBinormal) Location 1
                          Decorate 118(oTexcoord) Location 3
                          Decorate 120(iTexcoord) Location 3
                          MemberDecorate 127(PerFrameConstantBuffer) 0 Offset 0
                          Decorate 127(PerFrameConstantBuffer) Block
                          Decorate 129 DescriptorSet 0
                          Decorate 129 Binding 0
           2:             TypeVoid
           3:             TypeFunction 2
           6:             TypeFloat 32
           7:             TypeVector 6(float) 4
           8:             TypeMatrix 7(fvec4) 4
           9:             TypePointer Function 8
          11:             TypeInt 32 1
          12:     11(int) Constant 0
          13:    6(float) Constant 1065353216
          14:    6(float) Constant 0
          15:    7(fvec4) ConstantComposite 13 14 14 14
          16:             TypePointer Function 7(fvec4)
          18:     11(int) Constant 1
          19:    7(fvec4) ConstantComposite 14 13 14 14
          21:     11(int) Constant 2
          22:    7(fvec4) ConstantComposite 14 14 13 14
          24:     11(int) Constant 3
          25:    7(fvec4) ConstantComposite 14 14 14 13
          28:             TypeInt 32 0
          29:     28(int) Constant 3
          30:             TypeArray 7(fvec4) 29
          31:             TypeArray 7(fvec4) 29

32(PerInstance): TypeStruct 30 31
33: TypeRuntimeArray 32(PerInstance)
34(PerFrameInstanceBuffer): TypeStruct 33
35: TypePointer Uniform 34(PerFrameInstanceBuffer)
36: 35(ptr) Variable Uniform
37(gPushConstants): TypeStruct 28(int)
38: TypePointer PushConstant 37(gPushConstants)
39: 38(ptr) Variable PushConstant
40: TypePointer PushConstant 28(int)
43: TypePointer Uniform 7(fvec4)
59: TypeVector 6(float) 3
60: TypePointer Input 59(fvec3)
61(iPosition): 60(ptr) Variable Input
69: 28(int) Constant 1
70: TypeArray 6(float) 69
71(gl_PerVertex): TypeStruct 7(fvec4) 6(float) 70 70
72: TypePointer Output 71(gl_PerVertex)
73: 72(ptr) Variable Output
75(gPerViewConstantBuffer): TypeStruct 8
76: TypePointer Uniform 75(gPerViewConstantBuffer)
77: 76(ptr) Variable Uniform
78: TypePointer Uniform 8
82: TypePointer Output 7(fvec4)
84: TypePointer Output 59(fvec3)
85(oNormal): 84(ptr) Variable Output
86(iNormal): 60(ptr) Variable Input
95(oTangent): 84(ptr) Variable Output
96: TypePointer Input 7(fvec4)
97(iTangent): 96(ptr) Variable Input
107(oBinormal): 84(ptr) Variable Output
112: TypePointer Input 6(float)
116: TypeVector 6(float) 2
117: TypePointer Output 116(fvec2)
118(oTexcoord): 117(ptr) Variable Output
119: TypePointer Input 116(fvec2)
120(iTexcoord): 119(ptr) Variable Input
127(PerFrameConstantBuffer): TypeStruct 6(float)
128: TypePointer Uniform 127(PerFrameConstantBuffer)
129: 128(ptr) Variable Uniform
4(main): 2 Function None 3
5: Label
10(world): 9(ptr) Variable Function
27(worldIT): 9(ptr) Variable Function
58(wsPosition): 16(ptr) Variable Function
17: 16(ptr) AccessChain 10(world) 12
Store 17 15
20: 16(ptr) AccessChain 10(world) 18
Store 20 19
23: 16(ptr) AccessChain 10(world) 21
Store 23 22
26: 16(ptr) AccessChain 10(world) 24
Store 26 25
41: 40(ptr) AccessChain 39 12
42: 28(int) Load 41
44: 43(ptr) AccessChain 36 12 42 18 12
45: 7(fvec4) Load 44
46: 16(ptr) AccessChain 27(worldIT) 12
Store 46 45
47: 40(ptr) AccessChain 39 12
48: 28(int) Load 47
49: 43(ptr) AccessChain 36 12 48 18 18
50: 7(fvec4) Load 49
51: 16(ptr) AccessChain 27(worldIT) 18
Store 51 50
52: 40(ptr) AccessChain 39 12
53: 28(int) Load 52
54: 43(ptr) AccessChain 36 12 53 18 21
55: 7(fvec4) Load 54
56: 16(ptr) AccessChain 27(worldIT) 21
Store 56 55
57: 16(ptr) AccessChain 27(worldIT) 24
Store 57 25
62: 59(fvec3) Load 61(iPosition)
63: 6(float) CompositeExtract 62 0
64: 6(float) CompositeExtract 62 1
65: 6(float) CompositeExtract 62 2
66: 7(fvec4) CompositeConstruct 63 64 65 13
67: 8 Load 10(world)
68: 7(fvec4) VectorTimesMatrix 66 67
Store 58(wsPosition) 68
74: 7(fvec4) Load 58(wsPosition)
79: 78(ptr) AccessChain 77 12
80: 8 Load 79
81: 7(fvec4) VectorTimesMatrix 74 80
83: 82(ptr) AccessChain 73 12
Store 83 81
87: 59(fvec3) Load 86(iNormal)
88: 6(float) CompositeExtract 87 0
89: 6(float) CompositeExtract 87 1
90: 6(float) CompositeExtract 87 2
91: 7(fvec4) CompositeConstruct 88 89 90 14
92: 8 Load 27(worldIT)
93: 7(fvec4) VectorTimesMatrix 91 92
94: 59(fvec3) VectorShuffle 93 93 0 1 2
Store 85(oNormal) 94
98: 7(fvec4) Load 97(iTangent)
99: 59(fvec3) VectorShuffle 98 98 0 1 2
100: 6(float) CompositeExtract 99 0
101: 6(float) CompositeExtract 99 1
102: 6(float) CompositeExtract 99 2
103: 7(fvec4) CompositeConstruct 100 101 102 14
104: 8 Load 10(world)
105: 7(fvec4) VectorTimesMatrix 103 104
106: 59(fvec3) VectorShuffle 105 105 0 1 2
Store 95(oTangent) 106
108: 59(fvec3) Load 85(oNormal)
109: 59(fvec3) Load 95(oTangent)
110: 59(fvec3) ExtInst 1(GLSL.std.450) 68(Cross) 108 109
111: 59(fvec3) ExtInst 1(GLSL.std.450) 69(Normalize) 110
113: 112(ptr) AccessChain 97(iTangent) 29
114: 6(float) Load 113
115: 59(fvec3) VectorTimesScalar 111 114
Store 107(oBinormal) 115
121: 116(fvec2) Load 120(iTexcoord)
Store 118(oTexcoord) 121
122: 40(ptr) AccessChain 39 12
123: 28(int) Load 122
124: 43(ptr) AccessChain 36 12 123 12 12
125: 7(fvec4) Load 124
126: 59(fvec3) VectorShuffle 125 125 0 1 2
Store 95(oTangent) 126
Return
FunctionEnd

So after going back to the original mat4 code… I tried outputting that to the color like so:

oColor = perInstance[perInstanceIndex].world[0].xyz;

And for the identity matrix that shows me bright red, which is expected since the first row/column is (1, 0, 0, 0). However doing this:

oColor = perInstance[perInstanceIndex].world[0][0].xxx;

That should give (1, 1, 1), but it returns (0, 0, 0).
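
To separate the swizzle on a scalar from the buffer read itself, the same value can be broadcast with a constructor instead. A minimal sketch, assuming oColor is a vec3 output (which the .xyz assignment above suggests); its location here is hypothetical, and the block declarations are the original mat4 version from the top of the post.

```glsl
#version 450

// Sketch only: the location of oColor is an assumption.
layout(push_constant) uniform gPushConstants
{
	uint perInstanceIndex;
};

struct PerInstance
{
	mat4 world;
	mat4 worldIT;
};

layout(std430, row_major, set = 0, binding = 1) buffer PerFrameInstanceBuffer
{
	PerInstance perInstance[];
};

layout(location = 0) in vec3 iPosition;
layout(location = 0) out vec3 oColor;

void main()
{
	// Read the single float first, then broadcast it with a constructor
	// instead of swizzling a scalar.
	float m00 = perInstance[perInstanceIndex].world[0][0];
	oColor = vec3(m00);

	gl_Position = vec4(iPosition, 1.0);
}
```

If this version is also black while world[0].xyz shows red, the scalar swizzle is probably not the culprit, and the single-component load is the thing to look at.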