Finding the bottleneck in a simple ported application?

The following is a description of the problem that prompted this question, but I’ve solved it (silly mistake) and I’m more interested in discussing how to approach profiling / improving performance in Vk projects in general:

The other day I ported a simple OpenGL application so that it can also run on Vulkan. To test performance, I set it up to render the same single mesh, with 67,907 triangles per mesh, 1,600 times in an array across the screen, without using any instancing (I originally wrote “indexing” here). I’m using simple Push Constants to push three transformation mat4s on Vulkan, and regular uniforms in OpenGL. The shaders are identical apart from slight adaptations to make them compatible with Vulkan (e.g. adding locations for out parameters on the VS, creating a push_constants structure for uniforms, etc.).
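To put rough numbers on that test scene, here is a small sketch of the arithmetic. The mesh and draw counts come from the post; the 128-byte figure is the minimum that the Vulkan spec requires for `maxPushConstantsSize`. The function and constant names are made up for illustration.

```cpp
#include <cstdint>

// Figures taken from the post above.
constexpr std::uint64_t kTrianglesPerMesh = 67907;
constexpr std::uint64_t kMeshCopies = 1600;

// Triangles submitted per frame when every copy is drawn with its own draw call.
constexpr std::uint64_t trianglesPerFrame() {
    return kTrianglesPerMesh * kMeshCopies;
}

// A mat4 of 32-bit floats occupies 16 * 4 = 64 bytes.
constexpr std::uint32_t kMat4Bytes = static_cast<std::uint32_t>(16 * sizeof(float));

// Push-constant bytes needed for a given number of mat4s.
constexpr std::uint32_t pushConstantBytes(std::uint32_t matrixCount) {
    return matrixCount * kMat4Bytes;
}

// Smallest VkPhysicalDeviceLimits::maxPushConstantsSize a conforming device may report.
constexpr std::uint32_t kGuaranteedPushConstantBytes = 128;

// True when the push-constant block fits on every conforming device.
constexpr bool fitsGuaranteedLimit(std::uint32_t matrixCount) {
    return pushConstantBytes(matrixCount) <= kGuaranteedPushConstantBytes;
}
```

With these inputs the scene submits 108,651,200 triangles per frame, and three mat4s need 192 bytes of push constants. That is more than the guaranteed 128 bytes, so the setup depends on the device reporting a larger limit; checking `maxPushConstantsSize` at startup would confirm that.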

The OpenGL version runs at 20 FPS on an NVIDIA RTX 2070. The Vulkan version runs this same scene at 0.66 FPS. I’ve run both through Nsight and I can’t seem to find exactly what the bottleneck is - all I know is that vkCmdDrawIndexed takes a lot longer than glDrawElements:

[Two screenshots were attached here; the images are not available.]

What’s the best way to work out the kinks in weird behavior like this? Not for this case in particular (…mainly because just while writing this I realized I’m using host coherent memory, and after fixing that and using device-local it runs slightly faster… -_-), but in general - how do I approach an issue like this when trying to optimize my Vulkan code?
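For the memory mix-up mentioned above, a minimal sketch of how a memory type with the wanted property flags is usually picked. To keep it compilable without the Vulkan headers, the flag constants below are stand-ins that carry the same numeric values as the real `VK_MEMORY_PROPERTY_*` bits, and `findMemoryType` is a hypothetical helper, not part of the Vulkan API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins with the same values as VkMemoryPropertyFlagBits.
constexpr std::uint32_t kDeviceLocal  = 0x1; // VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
constexpr std::uint32_t kHostVisible  = 0x2; // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
constexpr std::uint32_t kHostCoherent = 0x4; // VK_MEMORY_PROPERTY_HOST_COHERENT_BIT

// typeBits:  stands in for VkMemoryRequirements::memoryTypeBits of the buffer.
// typeFlags: stands in for the propertyFlags of each entry in
//            VkPhysicalDeviceMemoryProperties::memoryTypes, in index order.
// required:  the property flags the allocation must have.
// Returns the first matching memory type index, or -1 if none matches.
int findMemoryType(std::uint32_t typeBits,
                   const std::vector<std::uint32_t>& typeFlags,
                   std::uint32_t required) {
    for (std::size_t i = 0; i < typeFlags.size() && i < 32; ++i) {
        const bool allowed = (typeBits & (1u << i)) != 0;
        const bool hasFlags = (typeFlags[i] & required) == required;
        if (allowed && hasFlags) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

For static vertex and index data, the request would be `kDeviceLocal`; since that memory is usually not mappable from the CPU, the data is typically uploaded through a host-visible staging buffer and a copy command.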

Either you’re using indexed rendering or you’re not. Or by “without using any indexing”, you meant something other than indexed rendering. So what exactly is going on here?

Also, if you’re rendering “the same single mesh”, why are you binding 1600 different VAOs in OpenGL and binding 1600 different vertex and index buffers in Vulkan?

Or by “without using any indexing”, you meant something other than indexed rendering.

I meant instancing, sorry!

why are you binding 1600 different VAOs in OpenGL and binding 1600 different vertex and index buffers in Vulkan

The bindings are just because of poor rendering code that binds repeatedly for each model in the frame. This is something I’ll fix later for this particular project.
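One way the repeated binds could be avoided is a small cache that remembers what was bound last. This is only a sketch: `BufferHandle`, `BindCache` and `countBindsForRepeatedMesh` are hypothetical names, and the handle type is a plain integer standing in for something like `VkBuffer`.

```cpp
#include <cstdint>

// Hypothetical stand-in for a buffer handle such as VkBuffer.
using BufferHandle = std::uint64_t;

// Remembers the last vertex/index buffer pair and reports whether a new bind is needed.
class BindCache {
public:
    // Returns true when the caller should actually record the bind commands.
    bool needsBind(BufferHandle vertexBuffer, BufferHandle indexBuffer) {
        if (valid_ && vertexBuffer == vertex_ && indexBuffer == index_) {
            return false; // same buffers as the previous draw, skip the bind
        }
        vertex_ = vertexBuffer;
        index_ = indexBuffer;
        valid_ = true;
        return true;
    }

    // Forget the cached state, e.g. when recording starts on a new command buffer.
    void reset() { valid_ = false; }

private:
    BufferHandle vertex_ = 0;
    BufferHandle index_ = 0;
    bool valid_ = false;
};

// Counts the binds issued when the same mesh (buffers 1 and 2) is drawn drawCount times.
int countBindsForRepeatedMesh(int drawCount) {
    BindCache cache;
    int binds = 0;
    for (int i = 0; i < drawCount; ++i) {
        if (cache.needsBind(1, 2)) {
            ++binds;
        }
    }
    return binds;
}
```

For the 1,600 copies of a single mesh described above, this would record one bind instead of 1,600. The draw loop would call `needsBind` before each draw and only record the vertex and index buffer binds when it returns true.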

I’m more curious about what to look for when I need to improve my Vk performance, in general - I’ve solved the specific issue that drove me to create this post and it was just a dumb oversight on my part.