The video output bandwidth, i.e. for moving pixel data from the front buffer to the display, is probably separate from the rendering bandwidth, i.e. writing pixels to the back buffer. At least that’s what I would expect from good, high performance double buffer hardware.
On the cards I have tried, configuring the desktop for a simple vertical or horizontal span over two identical monitors gives comparable performance to a single display, while using the NVidia DualView feature can drop performance to about 50%, at least at higher resolutions. My guess is that the DualView driver renders the screen image to a separate pixel buffer and then copies the pixel data to the actual display buffer.
Regarding the teapot vs sphere performance:
If you have a look at my code, you can see that I was very lazy when I wrote the teapot rendering code. I just cut and pasted the GLUT code, which uses glEvalMesh() to render quads from a bicubic patch description. Unfortunately, the API entry glEvalMesh() is not hardware accelerated at all on most existing OpenGL implementations, so the teapot model is CPU limited. I tried creating a display list for the teapot, but the display list captures the glEvalMesh() call as such; it doesn’t expand it into separate hardware-accelerated triangles.
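For reference, this is roughly what that evaluator path looks like. It is a simplified sketch of a single bicubic patch, not the actual code from the zip, and the ctrlpoints array is just a placeholder:

[code]
#include <GL/gl.h>

/* Sketch of the evaluator-based path that the GLUT teapot code uses.
   'ctrlpoints' stands in for one 4x4 grid of bicubic Bezier control
   points; the real teapot is built from 32 such patches. */
GLfloat ctrlpoints[4][4][3];   /* filled in elsewhere */

void draw_patch(void)
{
    /* Hand the patch description to the evaluator */
    glMap2f(GL_MAP2_VERTEX_3,
            0.0f, 1.0f,  3, 4,    /* u range, stride, order */
            0.0f, 1.0f, 12, 4,    /* v range, stride, order */
            &ctrlpoints[0][0][0]);
    glEnable(GL_MAP2_VERTEX_3);
    glEnable(GL_AUTO_NORMAL);

    /* Tessellate the patch into a 10x10 grid of quads. On most
       implementations this evaluation is done on the CPU, which is
       why the teapot is CPU limited. A display list records the
       glEvalMesh2() call itself instead of the expanded triangles. */
    glMapGrid2f(10, 0.0f, 1.0f, 10, 0.0f, 1.0f);
    glEvalMesh2(GL_FILL, 0, 10, 0, 10);
}
[/code]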
Originally posted by Aeluned:
[b]I don’t have the technical explanation for this,
but I’ve found that disabling the secondary display improves performance.
Naturally there must be some overhead in configuring the video card to output to dual displays.[/b]
I’ve seen the same thing. With Doom 3 and some OpenGL demos from nVidia, like Dawn or Dusk, the frame rate on my system is really low, but if I disable the secondary display it’s multiplied by 10.
Haven’t seen the same thing with DirectX applications.
Originally posted by StefanG:
[b]The video output bandwidth, i.e. for moving pixel data from the front buffer to the display, is probably separate from the rendering bandwidth, i.e. writing pixels to the back buffer. At least that’s what I would expect from good, high performance double buffer hardware.
On the cards I have tried, configuring the desktop for a simple vertical or horizontal span over two identical monitors gives comparable performance to a single display, while using the NVidia DualView feature can drop performance to about 50%, at least at higher resolutions. My guess is that the DualView driver renders the screen image to a separate pixel buffer and then copies the pixel data to the actual display buffer.
[/b]
“high performance double buffer hardware” went away several years ago. Video cards now have unified memory that stores the front and back buffers AND textures (along with other stuff). Thus, the display refresh may impact rendering performance if the GPU is memory bandwidth limited.
I tried running in horizontal span mode and the sphere render rate went up to 1314 fps! That is better than with a single display.
This is really weird. I hope it is just a driver bug so that it can be fixed. Dualview and Horizontal Span both display two screens at 1280x1024 (in my case). Dualview renders to two independent windows of 1280x1024 each, while Horizontal Span renders to a single 2560x1024 window. The same number of total pixels, just different window settings. I would think that a fragment program limited application should run at about the same rate in either mode. But it doesn’t, so your theory of nVidia doing a pixel copy seems to make sense, although I can’t imagine why they would do that.
The noise shader code in the zip file has been updated somewhat, in response to a comment. I made a silly mistake and sampled the texture right at the edge between texels, which gave some visual glitches. The correct half-texel offset is in there now.
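To make that concrete, here is a minimal sketch of the idea (a hypothetical helper, not code from the zip):

[code]
#include <GL/gl.h>

/* Hypothetical helper, not from the actual shader: the texture
   coordinate of the center of texel 'i' along an axis with 'size'
   texels. A lookup at i / size lands exactly on the border between
   two texels and can pick the wrong one; adding half a texel,
   (i + 0.5) / size, samples the texel center instead. */
GLfloat texel_center(GLint i, GLint size)
{
    return ((GLfloat)i + 0.5f) / (GLfloat)size;
}
[/code]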
Also, the gradient texture used only the least significant bits and was basically black in RGB, which is a potential problem if texture compression kicks in and fudges small texel-to-texel differences. I scaled the values up to make them more robust. Nobody has reported a problem with this, but it seemed safer.
I also rewrote the comment header in the fragment shader to include a reference to your encouraging benchmarks.
Originally posted by hdg:
“high performance double buffer hardware” went away several years ago. Video cards now have unified memory that stores the front and back buffers AND textures (along with other stuff). Thus, the display refresh may impact rendering performance if the GPU is memory bandwidth limited.
It should not affect things much, since current memory bandwidth is in the tens of gigabytes per second.
I think 3dlabs still sells hardware with separate framebuffer and texture memory (and the texture memory is fully virtualized, so they page in only what’s needed).
Anyway, the point is good: the Radeon 9700 Pro came out over two years ago, and had 20 GB/s memory bandwidth at the time. It’s doubled since then, more or less (GF 6800 Ultra is rated at 35 GB/s).
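As a rough sanity check on how much of that the display refresh actually eats, here is a quick back-of-the-envelope calculation (assuming 32-bit pixels and a 60 Hz refresh over the 2560x1024 span mentioned above; those numbers are assumptions, not measurements):

[code]
#include <stdio.h>

/* Back-of-the-envelope scanout bandwidth. 32-bit pixels and 60 Hz
   are assumptions for illustration, not measured numbers. */
int main(void)
{
    const double width           = 2560.0;  /* two 1280x1024 displays */
    const double height          = 1024.0;
    const double bytes_per_pixel = 4.0;
    const double refresh_hz      = 60.0;

    double scanout = width * height * bytes_per_pixel * refresh_hz;
    printf("display refresh: %.2f GB/s\n", scanout / 1e9);
    /* Prints about 0.63 GB/s, i.e. only a few percent of the
       20-35 GB/s that rendering has to share the memory with. */
    return 0;
}
[/code]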