Problems with cascaded shadow maps

After some long programming sessions, my mobile graphics engine is now compatible not only with OpenGL ES but also with Vulkan. I followed some Vulkan tutorials, but mainly the great Sascha Willems framework and examples. At this point forward rendering, sorted transparencies and GPU skinning all work pretty well.

The next step was to add a depth render pass that can be used for several purposes, such as shadow mapping. I decided to implement it the same way Sascha Willems did in his ShadowMappingCascade example.

The depth render pass, glm math matrices, scene resources, shaders, light animation, camera parameters and cascade debugging are basically the same. I don’t flip the Y coordinate of the geometry; instead I use a negative viewport height together with the VK_KHR_MAINTENANCE1_EXTENSION_NAME extension, as Sascha Willems explains in “Flipping the Vulkan viewport”.
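For readers unfamiliar with the viewport flip mentioned above, here is a minimal sketch of the idea. The `Viewport` struct is only a stand-in with the same fields as `VkViewport` so the snippet compiles without the Vulkan headers, and `makeFlippedViewport` is a hypothetical helper name, not something taken from the engine or from the Sascha Willems samples.

```cpp
// Stand-in with the same fields as VkViewport (assumption for illustration only),
// so the sketch compiles without the Vulkan headers.
struct Viewport {
    float x;
    float y;
    float width;
    float height;
    float minDepth;
    float maxDepth;
};

// Hypothetical helper: builds a Y-flipped viewport for a framebuffer of the given size.
// The origin moves to the bottom edge (y = fbHeight) and the height becomes negative,
// which is only valid when VK_KHR_maintenance1 (or Vulkan 1.1) is available.
constexpr Viewport makeFlippedViewport(float fbWidth, float fbHeight) {
    return Viewport{0.0f, fbHeight, fbWidth, -fbHeight, 0.0f, 1.0f};
}
```

In a real renderer the resulting values would be copied into a `VkViewport` and passed to `vkCmdSetViewport`.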

Unfortunately this produces some rendering issues, and I have spent a lot of time and effort without finding the solution.

Here is a list of the problems I’ve experienced:

a) Four cascades are generated in the shadow render pass using a depth attachment. The first cascade looks fine, but the following cascades seem to have problems with inverted z-values, and the corresponding framebuffers don’t clear their depth values correctly.

Depth pass preparation:

...
attachmentDescription.loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
attachmentDescription.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
...

Depth pass rendering sets clear values like this:

VkClearValue clearValues[1];
clearValues[0].depthStencil = { 1.0f, 0 };
...

b) The split behavior is incorrect: I always see the last cascade (debugged with a yellow color). If I change the condition used to choose the cascade index in the fragment shader (inViewPos.z > ubo.vCascadeSplits[i]), the fragments are colored, from near to far, blue, green and red respectively, which is just the opposite of Sascha’s example. I’m using a right-handed coordinate system. I tried inverting the z coordinate of the geometry, but it doesn’t fix the problem.

Scene fragment shader

...
// Get cascade index for the current fragment's view position
uint cascadeIndex = 0;
for(uint i = 0; i < SHADOW_MAP_CASCADE_COUNT - 1; ++i) 
{
   if(inViewPos.z > ubo.cascadeSplits[i]) 
   {
      cascadeIndex = i + 1;
   }
}
...
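Whether the comparison should be `<` or `>` depends on the sign convention of the split values. The sketch below mirrors the shader loop on the CPU under one stated assumption: view space is right-handed with the camera looking down -Z, so both the fragment's view-space z and the stored split depths are negative. `selectCascade` and `kCascadeCount` are illustrative names, not part of the engine.

```cpp
#include <array>
#include <cstdint>

constexpr std::uint32_t kCascadeCount = 4;

// Assumption: right-handed view space, camera looking down -Z, so visible fragments
// have negative z and the split depths are stored as negative values as well.
inline std::uint32_t selectCascade(float viewZ,
                                   const std::array<float, kCascadeCount>& splits) {
    std::uint32_t cascadeIndex = 0;
    for (std::uint32_t i = 0; i < kCascadeCount - 1; ++i) {
        // A fragment farther away than split i (more negative z) belongs to a later cascade.
        if (viewZ < splits[i]) {
            cascadeIndex = i + 1;
        }
    }
    return cascadeIndex;
}
```

If the splits were stored as positive distances and compared against a positive depth, the same loop would use `>` instead. Mixing the two conventions tends to select either the first or the last cascade for every fragment.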

c) The shadow projection is wrong. Multiplying the world position by the cascade’s light view-projection matrix doesn’t generate a correct texture coordinate for sampling the shadow texture of the current cascade. The matrix is the same one used in the depth pass to generate the cascade, and it seems more or less correct. #define GLM_FORCE_ZERO_TO_ONE is declared.

The fragment’s world position is transformed to the cascade’s light space in the fragment shader as follows:

...
vec4 shadowCoord = (biasMat * ubo.cascadeViewProjMat[cascadeIndex]) * vec4(inPos, 1.0);
shadowCoord = shadowCoord / shadowCoord.w;
...
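As a sanity check for this step, the effect of a bias matrix of this kind can be written out by hand. The sketch assumes the bias only remaps x and y from [-1, 1] to [0, 1] and leaves depth untouched, which in turn assumes the projection already produces depth in [0, 1]. Note that GLM spells that switch `GLM_FORCE_DEPTH_ZERO_TO_ONE`. `Vec3` and `ndcToShadowCoord` are hypothetical names used only here.

```cpp
// Minimal 3-component vector, used only for this sketch.
struct Vec3 {
    float x;
    float y;
    float z;
};

// Maps light-space coordinates after the perspective divide (x, y in [-1, 1])
// to shadow-map texture coordinates (x, y in [0, 1]).
// Assumption: depth is already in [0, 1], so z is passed through unchanged.
constexpr Vec3 ndcToShadowCoord(Vec3 ndc) {
    return Vec3{ndc.x * 0.5f + 0.5f, ndc.y * 0.5f + 0.5f, ndc.z};
}
```

Feeding a few known world positions through the cascade matrix on the CPU and then through this mapping makes it easy to see whether the coordinates land inside [0, 1] before suspecting the shader.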

The screenshot below illustrates the listed problems:

Has anyone experienced similar issues?

Thanks in advance

Hey there,

Just to get a better picture of the problem:

Are you using the Vulkan validation layers?
Have you tried using a debugger like RenderDoc?
Are you using the Vulkan Memory Allocator, or do you have another solution for memory management?

Best regards,
Johannes

Hi Johannes,

Thanks for your reply.

I’m using the validation layers below, and no warnings or error messages appear when running the app:


VK_LAYER_GOOGLE_unique_objects
VK_LAYER_LUNARG_core_validation
VK_LAYER_LUNARG_object_tracker
VK_LAYER_GOOGLE_threading
VK_LAYER_LUNARG_parameter_validation

My development platform is macOS + Android Studio. Some time ago I installed RenderDoc to have a powerful graphics debugging tool. To be honest, it was not friendly: capturing the app’s frames did not work on the macOS side, even though the connection with the mobile app was launched under RenderDoc fine, and I gave up. It would probably be worth digging deeper into the configuration to use it. Some log warnings when using RenderDoc:

RDOC 046060: [23:00:46] streamio.cpp( 415) - Warning - Error reading from socket
RDOC 046060: [23:00:46] remote_server.cpp(1194) - Warning - Didn’t get proper handshake
RDOC 046060: [23:00:50] target_control.cpp( 831) - Log - Used API: Vulkan (Not presenting & supported)
RDOC 046060: [23:00:54] streamio.cpp( 415) - Warning - Error reading from socket
RDOC 046060: [23:00:54] remote_server.cpp(1194) - Warning - Didn’t get proper handshake
RDOC 046060: [23:01:02] streamio.cpp( 415) - Warning - Error reading from socket
RDOC 046060: [23:01:02] remote_server.cpp(1194) - Warning - Didn’t get proper handshake

Core PID 46060: [23:03:16] android_tools.cpp(328) - Log - COMMAND: /Users/Raul/Library/Android/sdk/platform-tools/adb ‘-s t8dupn5pskl7onwc shell getprop ro.build.version.sdk’

About memory allocation: I’m not using a sophisticated memory allocator. In my case it is the same as Sascha Willems’ depth.image memory allocation:


namespace InitializersVk
{

	inline VkMemoryAllocateInfo memoryAllocateInfo()
	{
		VkMemoryAllocateInfo memAllocInfo {};
		memAllocInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
		return memAllocInfo;
	}

} // namespace InitializersVk

// Memory allocation for depth.image
vkCreateImage(vulkanDevice, &imageInfo, nullptr, &depth.image);
VkMemoryAllocateInfo memAlloc = VR::InitializersVk::memoryAllocateInfo();
VkMemoryRequirements memReqs;
vkGetImageMemoryRequirements(vulkanDevice, depth.image, &memReqs);
memAlloc.allocationSize = memReqs.size;
memAlloc.memoryTypeIndex = ((CRenderVk *)CRenderGeneric::pGet())->GetDevice()->getMemoryType(memReqs.memoryTypeBits, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
vkAllocateMemory(vulkanDevice, &memAlloc, nullptr, &depth.mem);
vkBindImageMemory(vulkanDevice, depth.image, depth.mem, 0);

Best

I ported the engine to be compatible with MoltenVK so that I could use the Xcode frame capture tool, which lets me inspect and profile the GPU for my applications in a stable way.

In my Vulkan renderer, the data for each uniform element comes from a PassInfo class. It is written to the uniform device memory using vkMapMemory and vkUnmapMemory, with the data sizes and offsets needed to transfer it correctly from CPU to GPU. I was wondering why some data types were transferred correctly and others apparently not. I discovered the following:

The uniform buffer in my scene fragment shader was:

layout (set = 0, binding = 2) uniform UBO
{
mat4 mModelViewProj;
mat4 mWorldModelView;
mat4 mWorld;
float vCascadeSplits[SHADOW_MAP_CASCADE_COUNT];
mat4 vCascadeViewProjMat[SHADOW_MAP_CASCADE_COUNT];
} ubo;

For each uniform element I mapped the data with the following sizes and offsets (in bytes):

SHADOW_MAP_CASCADE_COUNT = 4
Sizes:
64, 64, 64, 16, 256
Offsets:
0, 64, 128, 192, 208

The first three matrices were right, but from the float array of SHADOW_MAP_CASCADE_COUNT cascade splits onward the data was not laid out correctly. When I inspected it in the GPU frame capture tool, I was surprised to see that every float was allocated as a vec4. Each float array element sat in the first component of its vec4, the other components held matrix data, and all subsequent uniform data was incorrect because of this unexpected layout.
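The layout seen in the frame capture can be reproduced on the CPU with a mirror struct. This is a sketch under the std140 rule that each element of a float array occupies a 16-byte slot. `Mat4`, `Std140Float` and `SceneUboStd140` are hypothetical names, not types from the engine.

```cpp
#include <cstddef>

constexpr std::size_t kCascadeCount = 4;

// 16 floats = 64 bytes, the size of a GLSL mat4 of floats.
struct Mat4 {
    float m[16];
};

// Under std140 every element of a float array is padded to a 16-byte slot.
struct alignas(16) Std140Float {
    float value;
};

// Hypothetical CPU-side mirror of the original uniform block as the GPU sees it.
struct SceneUboStd140 {
    Mat4 mModelViewProj;                        // offset 0
    Mat4 mWorldModelView;                       // offset 64
    Mat4 mWorld;                                // offset 128
    Std140Float vCascadeSplits[kCascadeCount];  // offset 192, 4 * 16 = 64 bytes
    Mat4 vCascadeViewProjMat[kCascadeCount];    // offset 256, not 208
};

static_assert(sizeof(Std140Float) == 16, "float array stride under std140");
static_assert(offsetof(SceneUboStd140, vCascadeSplits) == 192, "splits offset");
static_assert(offsetof(SceneUboStd140, vCascadeViewProjMat) == 256, "cascade matrices offset");
static_assert(sizeof(SceneUboStd140) == 512, "total block size");
```

With the sizes and offsets listed above, the cascade matrices were written 48 bytes too early (208 instead of 256), which matches matrix data showing up inside the padded split slots.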

I fixed the problem by declaring the cascade splits as vec4 vCascadeSplits:

layout (set = 0, binding = 2) uniform UBO
{
mat4 mModelViewProj;
mat4 mWorldModelView;
mat4 mWorld;
vec4 vCascadeSplits;
mat4 vCascadeViewProjMat[SHADOW_MAP_CASCADE_COUNT];
} ubo;
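With the splits packed into a single vec4, a tightly packed C++ struct lines up with the block again, assuming exactly four cascades. The sketch below uses hypothetical names (`Mat4`, `SceneUboPacked`) and checks the offsets at compile time.

```cpp
#include <cstddef>

constexpr std::size_t kCascadeCount = 4;

// 16 floats = 64 bytes, the size of a GLSL mat4 of floats.
struct Mat4 {
    float m[16];
};

// Hypothetical CPU-side mirror of the block with the splits packed into one vec4.
// Assumption: exactly four cascades, so all splits fit in a single vec4.
struct SceneUboPacked {
    Mat4 mModelViewProj;                      // offset 0
    Mat4 mWorldModelView;                     // offset 64
    Mat4 mWorld;                              // offset 128
    float vCascadeSplits[4];                  // offset 192, one vec4 = 16 bytes
    Mat4 vCascadeViewProjMat[kCascadeCount];  // offset 208
};

static_assert(offsetof(SceneUboPacked, vCascadeSplits) == 192, "splits offset");
static_assert(offsetof(SceneUboPacked, vCascadeViewProjMat) == 208, "cascade matrices offset");
static_assert(sizeof(SceneUboPacked) == 464, "total block size");
```

In GLSL the components of a vec4 can still be indexed, so `ubo.vCascadeSplits[i]` keeps working in the cascade selection loop.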

Now the cascaded shadow maps implementation looks pretty good.

You may be interested in:

which allows you to use the std430 layout on UBOs. Without it, I believe the default in Vulkan GLSL is std140. From GL_KHR_vulkan_glsl:

For more on these block layouts, see:

Dark_Photon, thanks for your quick reply; it clarifies the problem I had. It is explained quite well in the block layout link you provided:

std140 : An array of floats in such a block will not be the equivalent to an array of floats in C/C++. The array stride (the bytes between array elements) is always rounded up to the size of a vec4 (ie: 16-bytes). So arrays will only match their C/C++ definitions if the type is a multiple of 16 bytes…

std430 : This layout works like std140, except with a few optimizations in the alignment and strides for arrays and structs of scalars and vector elements. Specifically, they are no longer rounded up to a multiple of 16 bytes. So an array of floats will match with a C++ array of floats.

Now I have a clear idea of what my issue was.