Swapbuffers priority

macarter · April 14, 2006, 8:52am

We have a multi-threaded Windows XP application with the drawing thread running at a real-time priority level. With Nvidia drivers, the buffer swap will not occur until a CPU is available that is not running a real-time thread. This is not true with ATI drivers. It therefore appears that the Nvidia buffer swap function does not run at or above the priority level of the calling thread, but at some reduced priority level. I have tested with the 77.77, 81.98, and 84.21 driver versions, with and without vsync enabled, on a GeForce 7800 GTX, with no difference in behavior. Can anyone confirm our observations? Is there a way to boost the buffer swap priority? Do the Nvidia Linux drivers also have this characteristic?

ZbuffeR · April 14, 2006, 9:15am

:eek: To me, realtime priority should be reserved for processes that have finite processing needs, not something with a render loop.

Do you really need realtime priority? Is very high not enough?

Under Linux, the window manager has even lower priority:
“You can lock yourself out of the system by placing a cpu-heavy process in a realtime priority.”
… and only root can do that.

Anyway, I am not from Nvidia, so I won’t be able to help more…

ccbrianf · April 17, 2006, 3:05pm

Originally posted by ZbuffeR:
:eek: To me, realtime priority should be reserved for processes that have finite processing needs, not something with a render loop.

Do you really need realtime priority? Is very high not enough?
Yes! Trust macarter, he knows what he is doing. In our business, a missed frame means a lost customer.

Chuck0 · April 17, 2006, 3:26pm

Originally posted by ccbrianf:
Yes! Trust macarter, he knows what he is doing. In our business, a missed frame means a lost customer.
Well then, you must have lost all Nvidia customers, since they lose quite a few frames.
I mean, setting a thread to realtime on a system that is only running your application is overkill…
What do you gain by doing this, especially if you are working on multi-core or multi-processor systems? If the operating system dares to schedule another system process instead of yours, this will hardly cost you any frames.

Overmind · April 17, 2006, 3:37pm

In our business, a missed frame means a lost customer.
And what does this have to do with setting the rendering thread to realtime priority? Just setting the thread to the highest priority available is not going to help with application performance; it is most likely counterproductive (as you already noticed…).

def · April 18, 2006, 7:17am

You would have to somehow change the priority of the driver thread…
But I have been working in the broadcast video business for a while, and there was never a need or practical reason to run at anything but standard priority (using dual-CPU machines with WinXP).

ccbrianf · April 18, 2006, 8:41am

Originally posted by Chuck0:
I mean, setting a thread to realtime on a system that is only running your application is overkill…

A system is never ONLY running your application. There are LOTS of processes running in an otherwise idle system. All of them can and do interfere with an application running without realtime priority.

What do you gain by doing this, especially if you are working on multi-core or multi-processor systems?

An application can have more than just a rendering thread! And, in order for your theory to work, you must have one core per possibly concurrently running thread. My idle XP system has 20+ processes “running”. Not all of them need to run at the same time, but that’s still a lot of cores to assure there is no competition with a rendering application.

If the operating system dares to schedule another system process instead of yours, this will hardly cost you any frames.

I disagree. It will cost many over the course of an application run.

ccbrianf · April 18, 2006, 8:43am

Originally posted by Overmind:
In our business, a missed frame means a lost customer.
And what does this have to do with setting the rendering thread to realtime priority? Just setting the thread to the highest priority available is not going to help with application performance; it is most likely counterproductive (as you already noticed…).
The point is not to increase rendering application performance (obviously priority can’t make code more efficient), but to keep other applications from interfering with the rendering application.

ccbrianf · April 18, 2006, 8:45am

Originally posted by def:
You would have to somehow change the priority of the driver thread…
But I have been working in the broadcast video business for a while, and there was never a need or practical reason to run at anything but standard priority (using dual-CPU machines with WinXP).
Are you rendering smooth motion video without stepping using this technique?

Overmind · April 18, 2006, 9:31am

Originally posted by ccbrianf:
The point is not to increase rendering application performance (obviously priority can’t make code more efficient), but to keep other applications from interfering with the rendering application.
Of course I was talking about realtime performance, not rendering performance. Sorry if I wasn’t clear about that.

Under normal circumstances, the highest non-realtime priority is enough for what you want. By normal circumstances I mean that no application is running that abuses thread priority. And if you want soft realtime performance, you had better make sure no such application is running on the same machine; otherwise you won’t have a chance with realtime threads either.

Threads with realtime priority have only a very limited special use. Realtime priority is definitely not meant for a rendering loop, or any other permanently running loop, but only for short tasks that need low latency. Switching the rendering task to realtime definitely falls into the category “fine if it works, but expect it to break with the next driver release, service pack, …”.

The problem with realtime priority is that you’re giving your application a priority that’s not only higher than the other applications, but also higher than most operating system services. And this can lead to all sorts of bad effects if you don’t voluntarily limit your own CPU use. The swapbuffers deadlock you experienced is a typical example of this kind of problem.

ccbrianf · April 18, 2006, 10:18am

Originally posted by Overmind:
Under normal circumstances, the highest non-realtime priority is enough for what you want. With normal circumstances I mean that no application is running that abuses thread priority. And if you want soft realtime performance, you better make sure no such application is running on the same machine, otherwise you won’t have a chance with realtime threads either.

Agreed.

Threads with realtime priority have only a very limited special use, it is definitely not meant for a rendering loop, or any other permanently running loop, but only for short tasks with low latency.

When a rendering loop is scheduled and coded properly, it is a short task that requires low latency ;-), once per frame.

Switching the rendering task to realtime definitely falls into the category “fine if it works, but expect it to break with the next driver release, service pack, …”.

Yes. Soft realtime programmers are all too familiar with a (Solaris, etc.) kernel patch, driver update, etc. breaking realtime behavior. That still doesn’t mean the expectation and usage are incorrect. It just means it is not a quality testing priority.

The problem with realtime priority is that you’re giving your application a priority that’s not only higher than the other applications, but also higher than most operating system services. And this can lead to all sorts of bad effects if you don’t voluntarily limit your own CPU use.

Yep. And the application has to be designed with this in mind.

The swapbuffers deadlock you experienced is a typical example for this kind of problem.
Unfortunately so :-(. One would still expect that swapbuffers should accommodate at least a high-priority timeshare application. It appears to not even do that correctly. The swap appears to happen at HIGH_PRIORITY_CLASS, THREAD_PRIORITY_NORMAL.
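
For readers less familiar with Windows scheduling, here is a rough sketch of how a priority class and a thread priority level combine into a base priority, following the table in Microsoft's "Scheduling Priorities" documentation. The names `PriorityClass`, `ThreadLevel`, and `basePriority` are illustrative stand-ins, not the Win32 constants or any real API, and the sketch leaves out the extra levels that exist only in the realtime class. Under those assumptions, it suggests why a swap running at HIGH_PRIORITY_CLASS, THREAD_PRIORITY_NORMAL (base priority 13) would have to wait behind any runnable thread in the realtime class (base priorities 16 to 31).

```cpp
#include <cassert>

// Illustrative stand-ins for the Win32 priority classes and thread priority
// levels; these are NOT the real Windows constants.
enum class PriorityClass { Idle, BelowNormal, Normal, AboveNormal, High, Realtime };
enum class ThreadLevel { Idle, Lowest, BelowNormal, Normal, AboveNormal, Highest, TimeCritical };

// Returns the base priority (1..31) for a class/level pair, per the
// documented Windows scheduling table.
int basePriority(PriorityClass cls, ThreadLevel lvl) {
    const bool realtime = (cls == PriorityClass::Realtime);

    // The two extreme levels are fixed values rather than offsets.
    if (lvl == ThreadLevel::Idle) return realtime ? 16 : 1;
    if (lvl == ThreadLevel::TimeCritical) return realtime ? 31 : 15;

    int normal = 0;  // base priority at the Normal level for this class
    switch (cls) {
        case PriorityClass::Idle:        normal = 4;  break;
        case PriorityClass::BelowNormal: normal = 6;  break;
        case PriorityClass::Normal:      normal = 8;  break;
        case PriorityClass::AboveNormal: normal = 10; break;
        case PriorityClass::High:        normal = 13; break;
        case PriorityClass::Realtime:    normal = 24; break;
    }

    // Lowest..Highest map to offsets -2..+2 around the class's Normal level.
    const int offset = static_cast<int>(lvl) - static_cast<int>(ThreadLevel::Normal);
    return normal + offset;
}
```

Even the highest level in the high class stays below the lowest realtime base priority, so raising the thread level alone would not let such a thread preempt a realtime one.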

def · April 19, 2006, 2:23am

Originally posted by ccbrianf:
Are you rendering smooth motion video without stepping using this technique?
Yes I do, but to be honest, there are no other applications running and we disable quite a few unneeded services in Windows to make it as sleek as possible.

Overmind · April 19, 2006, 4:09am

When a rendering loop is scheduled and coded properly, it is a short task that requires low latency ;-), once per frame.
Then code the rendering loop so that it renders the frame, signals a lower-priority thread that does the swapbuffers call, and waits until it gets the “I’m done” signal back.

If you include the swapbuffers in the realtime rendering loop, it may never yield the CPU. I guess the driver has some work to do on swapbuffers that should not interrupt realtime applications (in the general case), while swapbuffers does some form of busy wait to reduce latency (which is required for vsync). With the communication to a low priority thread, you’re basically explicitly telling the driver “you may interrupt me now”.
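
The hand-off described above could be sketched roughly as follows, using portable C++ threads rather than the Win32 API. Everything here is an assumption for illustration: `SwapHandoff` and `runFrames` are hypothetical names, the "render" and "swap" log entries stand in for the real drawing and SwapBuffers calls, and thread priorities are not set at all since that part is platform specific. Whether a given driver accepts a swap issued from a thread other than the one that rendered the frame is not something this sketch settles.

```cpp
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical hand-off between a render thread and a separate swap thread.
class SwapHandoff {
public:
    // Render thread: ask for a swap, then block until the "I'm done" signal.
    void requestSwapAndWait() {
        std::unique_lock<std::mutex> lock(mutex_);
        swapRequested_ = true;
        cv_.notify_all();
        cv_.wait(lock, [this] { return !swapRequested_; });
    }

    // Swap thread: block until a swap is requested; returns false on shutdown.
    bool waitForRequest() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return swapRequested_ || stopping_; });
        return swapRequested_;
    }

    // Swap thread: report that the swap has finished.
    void signalDone() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            swapRequested_ = false;
        }
        cv_.notify_all();
    }

    // Render thread: tell the swap thread to exit its loop.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool swapRequested_ = false;
    bool stopping_ = false;
};

// Runs the given number of frames and returns the order of events.
std::vector<std::string> runFrames(int frames) {
    SwapHandoff handoff;
    // The hand-off guarantees only one thread touches the log at a time.
    std::vector<std::string> log;

    std::thread swapper([&] {
        while (handoff.waitForRequest()) {
            log.push_back("swap");  // placeholder for the real swap call
            handoff.signalDone();
        }
    });

    for (int i = 0; i < frames; ++i) {
        log.push_back("render");    // placeholder for drawing the frame
        handoff.requestSwapAndWait();
    }

    handoff.stop();
    swapper.join();
    return log;
}
```

Calling `runFrames(3)` yields alternating "render" and "swap" entries. The render thread sleeps on the condition variable while the swap runs, which is the point at which a lower-priority swap thread gets a chance to be scheduled.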

Unfortunately so :-(. One would still expect that swapbuffers should accommodate at least a high-priority timeshare application.
I’d expect swapbuffers to have a high priority, but not necessarily the highest. No device driver should have realtime priority by default; this priority is reserved for applications that need to have a priority that is higher than anything else.

ccbrianf · April 19, 2006, 9:25am

Originally posted by Overmind:
Then code the rendering loop such that it renders the frame and then sends a signal to a lower priority thread that does the swapbuffer and let it wait until it gets the “I’m done” signal back. If you include the swapbuffers in the realtime rendering loop, it may never yield the CPU.

It does wait. That’s not the problem.

There are other realtime threads in the application with lower realtime priorities than the rendering loop. Those other threads are interfering with the high priority realtime rendering thread’s swapbuffers.

I guess the driver has some work to do on swapbuffers that should not interrupt realtime applications (in the general case),

That may be the case, but I think that may be giving too much credit to the design. It’s quite possible that this situation was just not considered. We’d like it to at least be configurable, or to run at the highest thread priority that calls swapbuffers.

while swapbuffers does some form of busy wait to reduce latency (which is required for vsync).

Not in our application. To reduce latency, we don’t let frames pile up enough to spin in swapbuffers. Our frame scheduler is fairly sophisticated.

Unfortunately so :-(. One would still expect that swapbuffers should accommodate at least a high-priority timeshare application.
I’d expect swapbuffers to have a high priority, but not necessarily the highest. No device driver should have realtime priority by default; this priority is reserved for applications that need to have a priority that is higher than anything else.
Agreed, but see above. It should have the highest priority of the calling threads it is servicing.

ccbrianf · April 19, 2006, 9:26am

Originally posted by def:
Yes I do, but to be honest, there are no other applications running and we disable quite a few unneeded services in Windows to make it as sleek as possible.
You wouldn’t care to share a pointer to this information (tweaks, disabled services, etc.), would you?

def · April 19, 2006, 10:37am

Originally posted by ccbrianf:
You wouldn’t care to share a pointer to this information (tweaks, disabled services, etc.), would you?
Nothing special really, just going through all the services and deciding what is actually needed. We have a memory footprint of about 95 MB for Windows XP; without networking we would be even lower.
But I am not saying we had problems without disabling…

ccbrianf · April 19, 2006, 1:09pm

Originally posted by def:
Nothing special really, just going through all the services and deciding what is actually needed. We have a memory footprint of about 95 MB for Windows XP; without networking we would be even lower.
But I am not saying we had problems without disabling…
We’re somewhat Windows illiterate here (darn UNIX realtime programmers), so I was hoping for an easy out. Oh, well… Thanks.

knackered · April 19, 2006, 1:54pm

I’ve done realtime video playing too, and didn’t need to touch the thread priority. If you’re missing frames, I would suggest your sequencing is wrong. (By missing frames I mean not rendering a frame at consistent monitor refreshes.)
Maybe you’re doing something like this:-

void renderLoop()
{
    if (timeToRenderFrame)
    {
        uploadTexture();  // upload runs first...
        swap();           // ...so the swap is delayed by however long the upload takes
    }
}

…when really you should be doing this:-

void renderLoop()
{
    if (timeToRenderFrame)
    {
        swap();           // present the already prepared frame right away
        uploadTexture();  // then upload the texture for the next frame
    }
}

Don’t take this too literally, it’s only an example of what I mean by a sequencing mistake.

evanGLizr · April 19, 2006, 2:39pm

Well, frankly speaking, if you are using REALTIME_PRIORITY, you get what you deserve:

When manipulating priorities, be very careful to ensure that a high-priority thread does not consume all of the available CPU time. A thread with a base priority level above 11 interferes with the normal operation of the operating system. Using REALTIME_PRIORITY_CLASS may cause disk caches to not flush, hang the mouse, and so on.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/setthreadpriority.asp

Windows is not a realtime system. If this is the only problem you’ve found, consider yourself lucky, and expect many more due to priority inversion.

knackered · April 19, 2006, 3:40pm

He obviously hasn’t run it on a single-processor machine.