Where in the pipeline does NVidia's "Low Latency Mode" kick in?

Dark_Photon · October 26, 2020, 1:55pm

This setting definitely affects OpenGL programs, but I can’t speak to DX.

As to whether the engine would set this up or not… There isn’t a GL/WGL/GLX API (AFAIK) that controls this behavior. It lives in the realm of “driver settings” state, so for an engine to set this up, it would have to be monkeying with the driver settings. While the setting is accessible from the NVIDIA Control Panel (NVCP), you can also get to it via NVAPI. See PRERENDERLIMIT_ID.

You don’t. It’s created implicitly for you in the driver when you create a GL context.

Look at how Vulkan manages GPU interaction for clues.

A stock OpenGL app is provided some visibility into this driver behavior (up until the swap chain insertion) via Timer Queries and Sync Objects.

The Max Pre-rendered Frames setting basically sets the max number of queued SwapBuffers() calls that are allowed to exist in this queue at once. This is the “length of the rope” the app gives the GPU (with the GPU driver trailing along behind the app). If the GPU (back-end driver) gets too far behind, the app has to wait for it to catch up before moving on. That “wait” happens in the front-end driver, and causes the app CPU submission thread to block.
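To make the “length of the rope” idea concrete, here is a minimal sketch of a toy timing model. It is not driver code and calls no GL or NVAPI function. The function name `steady_state_latency_ms`, its parameters, and the exact blocking rule (the swap call for frame i returns only once frame i - max_queued has finished on the GPU) are all assumptions made for illustration; the real driver’s bookkeeping may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model (hypothetical, not a driver API): returns the latency in ms of the
// last simulated frame, measured from the moment the app starts building the
// frame until the GPU finishes rendering it.
//   max_queued : assumed meaning of "Max Pre-rendered Frames" (must be >= 1)
//   cpu_ms     : CPU time the app spends building one frame
//   gpu_ms     : GPU time needed to render one frame
//   frames     : number of frames to simulate
long long steady_state_latency_ms(int max_queued, long long cpu_ms,
                                  long long gpu_ms, int frames) {
    assert(max_queued >= 1 && frames >= 1);
    std::vector<long long> gpu_done(frames, 0);  // GPU completion time per frame
    long long frame_start = 0;  // when the app starts building the current frame
    long long latency = 0;
    for (int i = 0; i < frames; ++i) {
        // The app finishes its CPU work and calls the (simulated) swap.
        const long long cpu_ready = frame_start + cpu_ms;
        // The GPU starts this frame once it is submitted and the previous one is done.
        const long long gpu_start =
            (i > 0) ? std::max(cpu_ready, gpu_done[i - 1]) : cpu_ready;
        gpu_done[i] = gpu_start + gpu_ms;
        latency = gpu_done[i] - frame_start;
        // Assumed blocking rule: the swap call returns only after frame
        // (i - max_queued) has completed, so the app can never run further ahead.
        long long swap_returns = cpu_ready;
        if (i >= max_queued) {
            swap_returns = std::max(cpu_ready, gpu_done[i - max_queued]);
        }
        frame_start = swap_returns;  // the next frame begins when the swap returns
    }
    return latency;
}
```

In this model, a GPU-bound app (2 ms of CPU work, 10 ms of GPU work per frame) settles at 20 ms of latency with a limit of 1 and 40 ms with a limit of 3, because a longer queue lets the app start frames further ahead of the GPU. A CPU-bound app (10 ms CPU, 2 ms GPU) never hits the limit, so the setting makes no difference there.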

glFinish() should result in the app being put to sleep until all commands in this “Pre-render queue” have been executed by the GPU/driver. Basically after returning from this, the queue should be empty.