I’m trying to capture a screenshot from an OpenGL game.
I already have the function working with glReadPixels, but it is really slow. This makes the game lag, so I was wondering if I could start a thread to take the screenshot. What functions should I use (wglSwapBuffers? glCopyContext/glCreateContext?), and would it prevent the lag?
Can you quantify “slow” in milliseconds?
You can’t use another thread in an easy way or without using a large amount of memory. But let’s start from the beginning.
Reading from the GPU is a slow operation because you have to wait for all the commands in the queue to finish before you can read the framebuffer. The framebuffer is also quite big, so you have to transfer a large amount of data from video memory to client memory. It’s slow, but not that slow, and you usually don’t grab a screenshot every frame. What do you do after you read the data? If you save the image to disk in a compressed format, that could really slow down your application; the data handling should probably be on another thread (once you have the data you don’t need OpenGL commands anymore).
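To illustrate the last point, here is a minimal sketch of handing the pixel data to a background task once glReadPixels has returned. The names writePixels and saveScreenshotAsync are hypothetical, and the sketch just dumps raw bytes; a real version would run its image encoder in the same place. No OpenGL calls are made on the worker, so it does not need a context.

```cpp
#include <cstdint>
#include <fstream>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: writes the raw pixel bytes to disk.
// A real implementation would compress/encode the image here.
bool writePixels(const std::string& path, const std::vector<std::uint8_t>& pixels) {
    std::ofstream file(path, std::ios::binary);
    if (!file) {
        return false;
    }
    file.write(reinterpret_cast<const char*>(pixels.data()),
               static_cast<std::streamsize>(pixels.size()));
    return file.good();
}

// Moves the pixel buffer into a background task so the render thread
// can continue immediately. The future reports whether the write succeeded.
std::future<bool> saveScreenshotAsync(std::string path, std::vector<std::uint8_t> pixels) {
    return std::async(std::launch::async,
                      [path = std::move(path), pixels = std::move(pixels)] {
                          return writePixels(path, pixels);
                      });
}
```

Keep the returned future around (for example in a member variable) and check it later: a future returned by std::async blocks in its destructor until the task finishes, so discarding it right away would bring the stall back.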
glReadPixels is relatively slow, but it is not that bad these days. Make sure you use the right download format (typically BGRA) and make sure you use pixel buffer objects. Downloading an image at, let’s say, 720p takes about 3 ms these days on Nvidia.
Just using pixel buffer objects already makes it at least 2x faster (the drivers can use very efficient memory for this), even if you just follow the read with a glMapBuffer.
What you could do is to perform the glReadPixels from the main thread (the one which calls *SwapBuffers). Right after the read create a GL sync object and watch for that sync object (e.g. by polling if you don’t spawn an extra thread). Once the sync object enters the signalled state you know that the download is ready and you can call glMapBuffer without blocking. You could also just watch for the sync object from another thread (it would need a context though).
Hi, thank you for your answers.
Rosario Leonardi, it was taking about 1 second.
RoderickC, yes, I wasn’t doing it right:
char *image = new char[w * h * 3];
glReadPixels( 0, 0, w, h, 0x80E0, GL_UNSIGNED_BYTE, image );
I changed it to:
GLubyte *image = new GLubyte [nSize];
glReadPixels( 0, 0, w, h, GL_RGB, GL_UNSIGNED_BYTE, image );
and it’s working fine now.
Typically, using GL_RGB won’t give you optimal performance, because the backbuffer (even if you have just a “24 bit” backbuffer) is actually 32 bits per pixel: 8 R, 8 G, 8 B and 8 unused. So if you use GL_RGB, your glReadPixels will need to read the full 32 bits, then copy them to another 24-bit buffer, and because CPUs don’t handle 24-bit values natively it may even need to copy one byte at a time.
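To make that extra work concrete, here is a rough sketch of the kind of repacking a 32-bit to 24-bit conversion implies. It is only an illustration: what a given driver actually does internally is not known, the B, G, R, unused byte order of the source is an assumption, and repackBgrxToRgb is a made-up name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only: converts tightly packed 32-bit pixels (assumed byte
// order B, G, R, unused) into tightly packed 24-bit R, G, B pixels.
// Every pixel is touched once and moved byte by byte.
std::vector<std::uint8_t> repackBgrxToRgb(const std::vector<std::uint8_t>& bgrx) {
    const std::size_t pixelCount = bgrx.size() / 4;
    std::vector<std::uint8_t> rgb(pixelCount * 3);
    for (std::size_t i = 0; i < pixelCount; ++i) {
        rgb[i * 3 + 0] = bgrx[i * 4 + 2];  // R
        rgb[i * 3 + 1] = bgrx[i * 4 + 1];  // G
        rgb[i * 3 + 2] = bgrx[i * 4 + 0];  // B
        // The fourth source byte (unused/alpha) is dropped.
    }
    return rgb;
}
```

For a 1280x720 frame that loop runs 921,600 times on top of the readback itself, which is work a 32-bit destination format can avoid.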
In other words, you want to be using GL_BGRA as your format and GL_UNSIGNED_INT_8_8_8_8_REV as your type. GL_BGRA avoids an in-software swizzle from the native backbuffer layout to your in-memory layout, and is also good for uploading to a texture or writing out as a TGA. GL_UNSIGNED_INT_8_8_8_8_REV provides a hint to OpenGL to transfer 4 bytes at a time instead of one byte at a time - although this normally won’t be the actual bottleneck, it won’t hurt either.
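As an example of why BGRA is convenient for TGA output, here is a small sketch that wraps a BGRA buffer in an uncompressed 32-bit TGA without touching the pixel bytes. The function name encodeTga32 is hypothetical. It assumes a little-endian machine (so GL_UNSIGNED_INT_8_8_8_8_REV ends up as B, G, R, A in memory) and tightly packed rows, which holds for 4-byte pixels under the default pack alignment.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Builds an uncompressed 32-bit TGA image in memory from BGRA pixels.
// TGA stores pixels as B, G, R, A and defaults to a bottom-left origin,
// which matches the bottom-up row order glReadPixels returns, so the
// pixel data is appended unchanged after the 18-byte header.
std::vector<std::uint8_t> encodeTga32(std::uint16_t width, std::uint16_t height,
                                      const std::vector<std::uint8_t>& bgra) {
    const std::size_t expected = static_cast<std::size_t>(width) * height * 4;
    if (bgra.size() != expected) {
        throw std::invalid_argument("pixel buffer size does not match width * height * 4");
    }
    std::vector<std::uint8_t> out(18, 0);              // header, zero-initialised
    out[2] = 2;                                        // image type: uncompressed true-colour
    out[12] = static_cast<std::uint8_t>(width & 0xFF); // width, little-endian
    out[13] = static_cast<std::uint8_t>(width >> 8);
    out[14] = static_cast<std::uint8_t>(height & 0xFF); // height, little-endian
    out[15] = static_cast<std::uint8_t>(height >> 8);
    out[16] = 32;                                      // bits per pixel
    out[17] = 0x08;                                    // 8 alpha bits, bottom-left origin
    out.insert(out.end(), bgra.begin(), bgra.end());
    return out;
}
```

The returned bytes can be written to a .tga file as-is; since no per-pixel conversion is involved, the cost is essentially one copy of the buffer.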
Only if those don’t operate at sufficient speed should you be considering use of a PBO (pixel buffer object) or extra threads.
Ignore anything anyone might say about PCIe bandwidth from VRAM to system memory, by the way; the full bandwidth may well be there, but the read still has to stall the pipeline before the transfer can start.