Just because you’ve sent a frame of data doesn’t mean the card is finished with it. Most likely, much of it is still sitting in a command buffer somewhere while the GPU works through an earlier portion. As the GPU frees up, the driver feeds it more. This buffering is what lets the card run asynchronously from the CPU.
Your call to glReadPixels forces the driver to wait until the scene is fully rendered before it reads the pixels; otherwise you would get back a half-rendered scene.
Try doing some other useful work on the CPU first, then call glReadPixels later, when the GPU is more likely to have finished.
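To make the ordering concrete, here is a rough sketch of what I mean. The function names (submit_scene, do_cpu_work, read_back, run_frame) are made up for illustration, and the bodies are stand-ins that only record the call order so the snippet compiles without a GL context. In your program they would wrap your own draw calls, whatever CPU work you have, and the actual glReadPixels call.

```c
#include <stddef.h>
#include <string.h>

/* Records the order of the steps; purely for illustration. */
static char call_log[8];

static void log_step(char c) {
    size_t n = strlen(call_log);
    if (n + 1 < sizeof call_log) {
        call_log[n] = c;
        call_log[n + 1] = '\0';
    }
}

/* Hypothetical stand-in: real code would issue the draw calls here and
   could follow them with glFlush() so the GPU starts on them right away. */
static void submit_scene(void) { log_step('S'); }

/* Hypothetical stand-in: any CPU-side work that does not need the pixels. */
static void do_cpu_work(void) { log_step('W'); }

/* Hypothetical stand-in: real code would call
   glReadPixels(x, y, width, height, format, type, pixels) here.
   The call still waits if rendering is not done, but probably for less time. */
static void read_back(void) { log_step('R'); }

/* One frame: queue the rendering, overlap CPU work with it,
   and only then ask for the pixels. */
const char *run_frame(void) {
    call_log[0] = '\0';
    submit_scene();
    do_cpu_work();
    read_back();
    return call_log;
}
```

The only point is that the readback comes last. How much it helps depends on how much CPU work you actually have to slot in between.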
Oh, so the “with a shader” part was what was apparently causing the slowdown? So glReadPixels works (relatively) OK if you’re not using a shader?
Hmmm. That’s an odd thing. Though, I would point out that your drivers for the 6800 are probably still betas (or perhaps not even designed for the 6800 if you’re using public nVidia drivers). As such, they could have all kinds of odd issues.