Capture the screen with high performance

Mikael_Grev · September 15, 2009, 1:35pm

Hello there!

I’m a newbie at OpenGL (but an oldbie when it comes to programming) so I need a little help.

I want to capture the screen for use in a VNC-like app. I’m a performance freak, so the highest possible performance is what I aim for (I get 20 or 30 ms to capture 1920x1200 with a simple glReadPixels, but I need better). I have read all the posts in this forum that mention capture, though I still need a bit more information.

I need to capture the screen, flip it vertically and scale it. The flip is because the next step needs (0, 0) to be top left, and the scale is because, I guess, it is fast to do while the pixels are still in VRAM.

Is this the fastest route:

Init:

Create two textures (is power-of-two still faster? It wastes a lot of VRAM…)

Every frame:

Copy the front buffer to texture#1 using glCopyTexSubImage2D().
Set some kind of transform that scales and flips the image and copy the texture to texture#2 (or a PBO?)
glReadPixels to DRAM (or is it faster to map the buffer and memcpy it?)

Anything else?

I know I can performance test the cases, but there’s a lot of different hardware out there, as well as driver versions, and no matter how much I read I can’t beat the real-world experience you guys have… (For instance, it seems that NVIDIA is much faster than ATI when it comes to VRAM->DRAM transfers.)

Any pointers, comments or suggestions are much appreciated. Code even more so, if you have it.

Btw, a quick question: can memory be copied VRAM->DRAM using DMA, or is that just for VRAM->VRAM? DMA is good since it frees up the CPU for other stuff. It won’t reduce the latency but it will improve the throughput (both of which are important in a VNC app).

Cheers,
Mikael Grev

system · September 15, 2009, 2:49pm

One of the important things is to match the format of the back buffer, and this format is usually GL_BGRA.
1920 x 1200 x 32 bpp = 9,216,000 bytes, or about 8.79 MiB per frame.
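
To make the arithmetic above easy to repeat for other resolutions, here is a minimal sketch in C. The function names frame_bytes and readback_mib_per_s are made up for illustration, and the sketch assumes tightly packed rows (no padding) at 4 bytes per pixel.

```c
#include <stddef.h>

/* Bytes needed for one tightly packed frame (assumes no row padding). */
static size_t frame_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Sustained readback rate in MiB/s if one frame of `bytes` bytes
 * is transferred every `ms` milliseconds. */
static double readback_mib_per_s(size_t bytes, double ms)
{
    double mib = (double)bytes / (1024.0 * 1024.0);
    return mib / (ms / 1000.0);
}
```

For 1920x1200 at 32 bpp, frame_bytes(1920, 1200, 4) gives 9,216,000 bytes, so the 20 to 30 ms quoted in the first post corresponds to somewhere around 290 to 440 MiB/s of readback.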

It would be better to just play back the same GL commands on the other computer. That will probably cost less network traffic.

_x57 · September 16, 2009, 5:30am

Despite V-man’s reasonable point …

Yes, the VRAM->DRAM transfer can be asynchronous: use a PBO to first do a quick VRAM->VRAM copy, and then map the PBO to read it from the CPU. There is an NVIDIA whitepaper describing some details that helped me … just google for “fast texture transfers”…
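
The PBO readback described above could look roughly like the following. This is only a sketch under stated assumptions, not code from the whitepaper: the names capture_init and capture_frame are hypothetical, a GL context with pixel buffer object support (OpenGL 2.1 or ARB_pixel_buffer_object) must already be current, and the frame is assumed to be tightly packed BGRA. It alternates between two PBOs so that the copy for the current frame can run while the previous frame is mapped.

```c
#include <stddef.h>
#include <string.h>
#ifdef __APPLE__
#include <OpenGL/gl.h>
#include <OpenGL/glext.h>
#else
#define GL_GLEXT_PROTOTYPES 1
#include <GL/gl.h>
#include <GL/glext.h>
#endif

enum { PBO_COUNT = 2 };           /* assumption: two buffers, used alternately */
static GLuint g_pbo[PBO_COUNT];
static unsigned g_frame;

/* Hypothetical helper: allocate the pack buffers once. */
void capture_init(int width, int height)
{
    glGenBuffers(PBO_COUNT, g_pbo);
    for (int i = 0; i < PBO_COUNT; ++i) {
        glBindBuffer(GL_PIXEL_PACK_BUFFER, g_pbo[i]);
        glBufferData(GL_PIXEL_PACK_BUFFER,
                     (GLsizeiptr)width * height * 4, NULL, GL_STREAM_READ);
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
}

/* Hypothetical helper: start the readback of the current frame and copy
 * out the previous one. Returns 1 if dst was filled, 0 otherwise. */
int capture_frame(int width, int height, void *dst)
{
    int filled = 0;
    GLuint write_pbo = g_pbo[g_frame % PBO_COUNT];
    GLuint read_pbo  = g_pbo[(g_frame + 1) % PBO_COUNT];

    glReadBuffer(GL_FRONT);
    glPixelStorei(GL_PACK_ALIGNMENT, 4);

    /* With a pack PBO bound, the last argument is a byte offset into the
     * buffer, and the driver may return before the copy has finished. */
    glBindBuffer(GL_PIXEL_PACK_BUFFER, write_pbo);
    glReadPixels(0, 0, width, height,
                 GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, (void *)0);

    if (g_frame > 0) {
        /* Map the buffer filled on the previous call; this is the point
         * where the CPU may have to wait for that earlier transfer. */
        glBindBuffer(GL_PIXEL_PACK_BUFFER, read_pbo);
        const void *src = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
        if (src) {
            memcpy(dst, src, (size_t)width * height * 4);
            glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
            filled = 1;
        }
    }

    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
    ++g_frame;
    return filled;
}
```

The trade-off is one frame of extra latency in exchange for throughput: the image copied out is always the one requested on the previous call. Whether GL_BGRA with GL_UNSIGNED_INT_8_8_8_8_REV is actually the fast path depends on the driver and framebuffer format, so it is worth measuring on the target hardware.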

Also, you could probably use pixel transfer operations to do the scale/flip on the GPU, but I am not an expert with this.

Mikael_Grev · September 16, 2009, 12:33pm

Thanks guys,

I really would like to just transfer the GL commands but unfortunately I only work on pixels and OpenGL will only be used on Mac and possibly Linux.

A follow-up: is it possible to copy from the front buffer to a texture/PBO with scale + flip y in one go? Or do I need to go via an intermediate texture?

Cheers,
Mikael

system · September 16, 2009, 2:18pm

Call glReadBuffer(GL_FRONT) to read from the front buffer, but calling glReadPixels will not do scaling or flipping.
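
Since glReadPixels returns rows bottom-up, one option (if the flip is not done on the GPU) is to fold it into the copy that has to happen anyway when moving the pixels into the application's own buffer. A minimal CPU-side sketch follows; copy_flipped_rows is a hypothetical helper name, and the source and destination buffers are assumed not to overlap.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy a bottom-up image (the row order glReadPixels produces) into a
 * top-down buffer. Strides are in bytes; row_bytes is the number of
 * bytes of pixel data per row (width * bytes per pixel).
 * src and dst must not overlap.
 */
static void copy_flipped_rows(unsigned char *dst, size_t dst_stride,
                              const unsigned char *src, size_t src_stride,
                              size_t row_bytes, size_t height)
{
    for (size_t y = 0; y < height; ++y) {
        /* Destination row y comes from source row (height - 1 - y). */
        const unsigned char *src_row = src + (height - 1 - y) * src_stride;
        memcpy(dst + y * dst_stride, src_row, row_bytes);
    }
}
```

This costs no extra pass over the data compared with a plain memcpy of the whole frame, but it does not help with scaling.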

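Scaling can be handled by calling glViewport, glEnable(GL_SCISSOR_TEST) and glScissor before rendering the scene.
Flipping can be done with an appropriate call to glScalef on the modelview matrix, for example glScalef(1, -1, 1). Note that a negative scale reverses the polygon winding, so face culling has to be adjusted to match.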

There is also GL_EXT_framebuffer_blit, which can do scaling and flipping.
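
For the blit route, a rough sketch could look like this. It assumes a current context that exposes GL_EXT_framebuffer_object and GL_EXT_framebuffer_blit and a default framebuffer that is not multisampled; the helper names create_capture_fbo and capture_scaled_flipped are hypothetical. The flip comes from passing the destination rectangle with its Y coordinates swapped, and the scale from making the destination rectangle smaller than the source.

```c
#include <stddef.h>
#ifdef __APPLE__
#include <OpenGL/gl.h>
#include <OpenGL/glext.h>
#else
#define GL_GLEXT_PROTOTYPES 1
#include <GL/gl.h>
#include <GL/glext.h>
#endif

/* Hypothetical helper: create an FBO backed by a dst_w x dst_h RGBA8
 * renderbuffer. Returns 0 on failure. */
GLuint create_capture_fbo(int dst_w, int dst_h, GLuint *out_rb)
{
    GLuint fbo = 0, rb = 0;

    glGenRenderbuffersEXT(1, &rb);
    glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, rb);
    glRenderbufferStorageEXT(GL_RENDERBUFFER_EXT, GL_RGBA8, dst_w, dst_h);

    glGenFramebuffersEXT(1, &fbo);
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo);
    glFramebufferRenderbufferEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,
                                 GL_RENDERBUFFER_EXT, rb);
    GLenum status = glCheckFramebufferStatusEXT(GL_FRAMEBUFFER_EXT);
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, 0);

    if (status != GL_FRAMEBUFFER_COMPLETE_EXT) {
        glDeleteFramebuffersEXT(1, &fbo);
        glDeleteRenderbuffersEXT(1, &rb);
        return 0;
    }
    *out_rb = rb;
    return fbo;
}

/* Hypothetical helper: blit the front buffer into the FBO, scaled and
 * flipped vertically, then read the small image back into dst. */
void capture_scaled_flipped(GLuint fbo, int src_w, int src_h,
                            int dst_w, int dst_h, void *dst)
{
    /* Source: front buffer of the default framebuffer. */
    glBindFramebufferEXT(GL_READ_FRAMEBUFFER_EXT, 0);
    glReadBuffer(GL_FRONT);
    /* Destination: the off-screen FBO. */
    glBindFramebufferEXT(GL_DRAW_FRAMEBUFFER_EXT, fbo);

    /* dstY0 > dstY1 mirrors the image vertically; GL_LINEAR filters
     * while scaling from src_w x src_h to dst_w x dst_h. */
    glBlitFramebufferEXT(0, 0, src_w, src_h,
                         0, dst_h, dst_w, 0,
                         GL_COLOR_BUFFER_BIT, GL_LINEAR);

    /* Read the already scaled and flipped pixels from the FBO. */
    glBindFramebufferEXT(GL_READ_FRAMEBUFFER_EXT, fbo);
    glPixelStorei(GL_PACK_ALIGNMENT, 4);
    glReadPixels(0, 0, dst_w, dst_h,
                 GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, dst);

    /* Restore the default framebuffer for both targets. */
    glBindFramebufferEXT(GL_READ_FRAMEBUFFER_EXT, 0);
    glBindFramebufferEXT(GL_DRAW_FRAMEBUFFER_EXT, 0);
}
```

Because the scale happens before the readback, only dst_w x dst_h pixels cross the bus. The final glReadPixels could also target a bound pack PBO instead of client memory if an asynchronous transfer is wanted.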