Debugging performance issues

I’m pretty new to OpenXR and VR development in general. I’m currently trying to integrate OpenXR support into a modelling/rendering application for “VR scene inspection” kinds of features. The app is a native desktop C++ one with a Vulkan backend, and I also have a Quest 3 device. I run the app on it via Quest Link or Steam Link, switching the active OpenXR runtime as needed.

Following the spec and the OpenXR tutorial I managed to make a working integration, however it seems to be suffering from some performance issues. Enabling the Oculus Debug Tool (when using Meta’s OpenXR runtime) and looking at all the timings and graphs makes me feel a bit overwhelmed, since I don’t know where to look first, how to interpret everything, or what to prioritize (not to mention SteamVR’s graphs, which look even more intimidating :) ).

So I’m looking for documentation/articles/guides/best practices that I have missed, in order to pinpoint and solve these kinds of problems. Since it’s a native app I don’t have any Unity or Unreal tools available.

To be more specific, for example, one thing I notice is the “App missed submit count” counter grows like crazy. I can find some info explaining that it “Increments each time the application fails to submit a new set of layers before the compositor is executed and before each V-Sync (Vertical Synchronization)” but, in practice, what could be the possible steps towards solving it? Can this also be considered acceptable/normal to some degree?

Another example: profiling with Visual Studio 2019 shows that a lot of time is spent in xrWaitFrame(). I realize that it blocks the app “in order to synchronize application frame submissions with the display”, which explains the overhead. But is it actually a problem, considering that the rendering is done after xrBeginFrame()? I don’t understand the benefit of moving xrWaitFrame() to another thread (as the OpenXR tutorial advises, although it doesn’t implement it), since we’ll have to wait anyway before calling xrBeginFrame(). I also don’t know how “mandatory” it is for improved performance.

Thanks in advance for your time, I realize that I don’t understand plenty of things fully yet, so any pointers or directions are more than welcome.

There seems to be a lack of good performance documentation. I haven’t really seen anything good at least.

The key to profiling VR performance is inspecting what happens between the calls to xrWaitFrame and making sure it all takes less time than a single frame. If the runtime detects that the application isn’t keeping up with the frame rate, it may start throttling it, and you will see much longer wait times on xrWaitFrame. One thing to check is that the application isn’t using vsync when presenting to desktop windows. Since the VR and monitor frames aren’t synchronized and may have different frame rates, vsync may cause the app to block for long periods.
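As a starting point for the "inspect what happens between the calls to xrWaitFrame" idea above, here is a minimal sketch of a per-frame timer. `FrameStats` and `nowNs` are hypothetical helpers, not part of OpenXR, and the OpenXR calls appear only as comments so the sketch compiles without the SDK. It assumes the budget is taken from something like `XrFrameState::predictedDisplayPeriod` (for example roughly 11.1 ms at 90 Hz). Note that it only measures CPU-side time; GPU time would need Vulkan timestamp queries or a vendor tool.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Monotonic timestamp in nanoseconds.
inline std::int64_t nowNs() {
    using namespace std::chrono;
    return duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count();
}

// Hypothetical helper (not an OpenXR type): per-frame CPU timing statistics.
struct FrameStats {
    std::int64_t budgetNs;         // frame period, e.g. XrFrameState::predictedDisplayPeriod
    std::uint64_t frames = 0;      // frames recorded so far
    std::uint64_t overBudget = 0;  // frames whose CPU work exceeded the budget
    std::int64_t totalWaitNs = 0;  // total time spent blocked in xrWaitFrame
    std::int64_t worstWorkNs = 0;  // longest CPU work time seen

    explicit FrameStats(std::int64_t budget) : budgetNs(budget) { assert(budget > 0); }

    // waitNs: time blocked inside xrWaitFrame.
    // workNs: time from xrWaitFrame returning to xrEndFrame returning.
    void record(std::int64_t waitNs, std::int64_t workNs) {
        ++frames;
        totalWaitNs += waitNs;
        if (workNs > worstWorkNs) worstWorkNs = workNs;
        if (workNs > budgetNs) ++overBudget;
    }

    double averageWaitMs() const {
        return frames ? (totalWaitNs / 1e6) / static_cast<double>(frames) : 0.0;
    }
};

// Intended use inside the frame loop (assumed structure, shown as comments):
//
//   std::int64_t t0 = nowNs();
//   xrWaitFrame(session, &waitInfo, &frameState);
//   std::int64_t t1 = nowNs();
//   xrBeginFrame(session, &beginInfo);
//   /* update scene, record and submit Vulkan work */
//   xrEndFrame(session, &endInfo);
//   std::int64_t t2 = nowNs();
//   stats.record(t1 - t0, t2 - t1);
```

If `overBudget` keeps growing, the CPU side of the frame is probably too slow. If the work time is comfortably under budget but the average wait is still long, the GPU or something blocking outside this loop (such as a vsynced desktop present) is a more likely suspect.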

It gets easier to profile with a call tree timeline. I don’t know if there is any available though. Visual Studio has a plugin called Concurrency Visualizer that shows a timeline of thread workload. It might help determine how the runtime interacts with the application threads and show where it gets blocked.

AFAIK the point of running xrWaitFrame in a different thread is pipelining. If I’ve understood it right, when you call it from the main “game” thread, that thread can wait on it and start processing the next frame as soon as the render thread has started rendering the current one, instead of sitting idle until rendering finishes.
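A rough sketch of that pipelining arrangement, under my own assumptions: the "game" thread calls xrWaitFrame and simulates, then hands the frame to a render thread that calls xrBeginFrame and xrEndFrame. `FramePacket`, `FrameMailbox` and `runPipelined` are hypothetical names for illustration, and the OpenXR calls are left as comments so this compiles without the SDK. It is not how the tutorial or any particular engine does it.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical data handed from the game thread to the render thread.
struct FramePacket {
    std::int64_t predictedDisplayTime = 0;  // would come from XrFrameState
    int frameIndex = 0;
};

// Single-slot mailbox: push blocks while the previous frame is still unclaimed.
class FrameMailbox {
public:
    void push(const FramePacket& p) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !full_; });
        slot_ = p;
        full_ = true;
        cv_.notify_all();
    }
    // Returns false once close() has been called and no frame is pending.
    bool pop(FramePacket& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return full_ || closed_; });
        if (!full_) return false;
        out = slot_;
        full_ = false;
        cv_.notify_all();
        return true;
    }
    void close() {
        std::lock_guard<std::mutex> lock(m_);
        closed_ = true;
        cv_.notify_all();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    FramePacket slot_;
    bool full_ = false;
    bool closed_ = false;
};

// Runs frameCount frames and returns the order in which they were "rendered".
std::vector<int> runPipelined(int frameCount) {
    FrameMailbox mailbox;
    std::vector<int> rendered;  // written only by the render thread, read after join

    std::thread renderThread([&mailbox, &rendered] {
        FramePacket p;
        while (mailbox.pop(p)) {
            // xrBeginFrame(session, &beginInfo);
            // ... record and submit Vulkan work for frame p ...
            // xrEndFrame(session, &endInfo);  // displayTime = p.predictedDisplayTime
            rendered.push_back(p.frameIndex);
        }
    });

    for (int i = 0; i < frameCount; ++i) {
        // xrWaitFrame(session, &waitInfo, &frameState);
        // ... simulate / update the scene for frame i ...
        FramePacket p;
        p.frameIndex = i;
        mailbox.push(p);  // game thread is now free to wait for frame i + 1
    }
    mailbox.close();
    renderThread.join();
    return rendered;
}
```

Calling `runPipelined(5)` should hand frames 0 to 4 to the render thread in order. In a real app the game thread's next xrWaitFrame would overlap with the render thread's work on the previous frame, which is where the time is saved; whether that is worth the added complexity depends on how much CPU work the update step actually does.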