So a friend of mine was asking for a good way to capture real-time rendered scenes to an AVI or some other video file, and was having an almost impossible time with it using any conventional video-capture program out there. The programs would capture at about 4 frames per second, no more. I attribute this to the fact that rendering the scene in the first place is hard enough on the system resources, let alone simultaneously writing each and every frame to the hard drive, plus audio.
So I started trying to think up a better way to capture the rendering, and here’s an idea that I came up with, but I’m turning to you all to see if it is feasible/possible:
Instead of using the standard rendering device, use a virtual rendering device that writes to the hard drive instead of the monitor. The device would also need to record each and every frame (up to some limit) without dropping any of them, so there would have to be some way to make sure it captures every frame no matter how long rendering takes (instead of rendering 30 frames per second, it might end up rendering 30 frames per 5 seconds or something), while making sure the video it records plays the frames back at the desired frame rate.
In my mind, it doesn’t sound like it would be that hard to do, but I’d just like to see what you all think about it.
The biggest problem, it seems to me, is that I don’t have the source code for said real-time rendered scene, so I would have to develop the device to work within a pre-existing code framework and functionality.
What do you think?
To do a correct job of capturing the data being rendered, rather than the screen, you’d need to copy all the texture data being created/bound, and all the vertex array data. This is likely to be larger than just capturing the frames themselves.
I would do something like GLTrace, which can insert itself and intercept calls to GL. Then I'd put something into wglSwapBuffers() which reads back the pixels and queues an asynchronous (overlapped) write to disk; alternatively, marks the data as ready and wakes up a real-time writing thread. The card can then go on working on the next frame while the previous one is being written to disk.
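The "queue the write, keep rendering" part can be sketched without any GL at all. Below is a minimal writer-thread sketch; the class name FrameWriter and the file layout are made up for illustration, and a real hook would call glReadPixels inside the intercepted wglSwapBuffers() and hand the buffer to push():

```cpp
// Sketch: the render loop hands each read-back frame to a background
// thread, so disk I/O overlaps with rendering instead of stalling it.
#include <condition_variable>
#include <cstdio>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

class FrameWriter {
public:
    explicit FrameWriter(const char* path)
        : out_(std::fopen(path, "wb")), worker_(&FrameWriter::run, this) {}

    ~FrameWriter() {
        {
            std::lock_guard<std::mutex> lock(m_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join();          // drains the queue before closing
        if (out_) std::fclose(out_);
    }

    // Called from the render loop; returns immediately.
    void push(std::vector<unsigned char> frame) {
        {
            std::lock_guard<std::mutex> lock(m_);
            queue_.push_back(std::move(frame));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::vector<unsigned char> frame;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return done_ || !queue_.empty(); });
                if (queue_.empty()) return;  // done_ set and queue drained
                frame = std::move(queue_.front());
                queue_.pop_front();
            }
            // Disk I/O happens off the render thread.
            if (out_) std::fwrite(frame.data(), 1, frame.size(), out_);
        }
    }

    std::FILE* out_;
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<std::vector<unsigned char>> queue_;
    bool done_ = false;
    std::thread worker_;  // declared last so it starts after everything else
};
```

On Windows you'd likely replace the stdio call with overlapped WriteFile for true async I/O; the queue-plus-thread shape stays the same.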
Note that at 30 fps, 640x480x32-bit is about 35 MB/s, so you'd need a pretty cutting-edge RAID array to actually write that to disk. Also, calling glReadPixels() means the GPU will be stalled, waiting for the frame to complete, before you can get the pixels. Anyway, on beefy hardware, this might work. You can always compress the video later, off-line.
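For anyone checking the bandwidth figure, the arithmetic is just width x height x bytes-per-pixel x fps (function name is mine, not from any API):

```cpp
// Bytes per second for uncompressed frames at a given size and rate.
// 640*480*4*30 == 36,864,000 bytes/s, i.e. roughly 35 MB/s.
long long captureBytesPerSecond(long long w, long long h,
                                long long bytesPerPixel, long long fps) {
    return w * h * bytesPerPixel * fps;
}
```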
Another way to do it is to use the TV/S-video out on your computer and record to tape. Then use a separate video capture program to read it back in to your computer. A FireWire card is only $49, assuming you already have a DV camera with S-video in. The quality will be reasonably good, though not pixel perfect. Because TV is interlaced, the data throughput is more like 20 MB/s, and the DV compression takes that down to about 4 MB/s, which most hard disks can deal with these days.
I would imagine it would be far faster to compress each frame in realtime before writing to disk, rather than writing that huge amount of data to disk each frame.
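To illustrate the per-frame compression step, here's a toy run-length encoder. A real capture tool would use a proper fast codec, not RLE; this is only meant to show where compression would sit in the pipeline (compress in memory, then write the smaller buffer):

```cpp
// Toy RLE: emits (run length, value) byte pairs. Works well on frames
// with large flat regions; a real codec would do far better.
#include <cstdint>
#include <vector>

std::vector<std::uint8_t> rleCompress(const std::vector<std::uint8_t>& in) {
    std::vector<std::uint8_t> out;
    for (std::size_t i = 0; i < in.size();) {
        std::uint8_t value = in[i];
        std::size_t run = 1;
        while (i + run < in.size() && in[i + run] == value && run < 255) ++run;
        out.push_back(static_cast<std::uint8_t>(run));
        out.push_back(value);
        i += run;
    }
    return out;
}

std::vector<std::uint8_t> rleDecompress(const std::vector<std::uint8_t>& in) {
    std::vector<std::uint8_t> out;
    for (std::size_t i = 0; i + 1 < in.size(); i += 2)
        out.insert(out.end(), in[i], in[i + 1]);  // expand each run
    return out;
}
```

Note the trade-off the poster above is pointing at: compression spends CPU to save disk bandwidth, which only helps if the CPU isn't already the bottleneck.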
Actually, an expansion of jwatte's GLTrace idea: why not capture all the GL calls directly and record them instead? Then, after the run has finished, you've got all the time in the world to reconstruct each frame from the GL calls and write the final frames to disk.
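The record-and-replay idea in miniature: log each intercepted call with its arguments at runtime, then re-execute the log offline. The single "call" here (setPixel) and the struct names are invented for illustration; a real GL trace would log every entry point plus the data it references:

```cpp
// Record calls cheaply at runtime; replay them offline to rebuild frames.
#include <cstdint>
#include <vector>

struct Call { int x, y; std::uint32_t color; };  // one logged "API call"

struct Recorder {
    std::vector<Call> log;
    void setPixel(int x, int y, std::uint32_t c) { log.push_back({x, y, c}); }
};

// Offline replay: run the logged calls against a framebuffer at leisure.
std::vector<std::uint32_t> replay(const std::vector<Call>& log, int w, int h) {
    std::vector<std::uint32_t> fb(static_cast<std::size_t>(w) * h, 0);
    for (const Call& c : log)
        fb[static_cast<std::size_t>(c.y) * w + c.x] = c.color;
    return fb;
}
```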
This is just an extension of my own method of recording just the scene interactions (joystick/mouse movements, AI events etc.) and re-playing the scene after runtime and recording ‘offline’ so to speak.
The whole point was that he wanted the CPU free to run the application at full speed, and not take the CPU hit of de-interlacing and compression.
Also, as I already mentioned in my post, if you record the full set of commands with data (this includes texture data), you're likely to have to write even more data to disk per frame. To estimate the numbers:
80,000 tris per frame == 120,000 verts per frame. 1 vert == 24 bytes, 1 index == 2 bytes. Sum total of mesh data: 24*120,000 + 3*80,000*2 == 2,880,000 + 480,000 == 3,360,000 bytes. And we haven't started recording texture uploads yet. Meanwhile, 640*480*4 is only 1,228,800 (921,600 if you drop the alpha).
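The comparison above, as a checkable calculation (names and the struct are mine):

```cpp
// Per-frame mesh-data bytes vs. per-frame framebuffer bytes, using the
// figures quoted above: 80k tris / 120k unique verts, 24-byte verts,
// 2-byte indices, and a 640x480 RGBA framebuffer.
struct FrameCost { long long meshBytes, pixelBytes; };

FrameCost perFrameCost() {
    const long long tris = 80000, verts = 120000;
    const long long vertSize = 24, indexSize = 2;
    const long long meshBytes = verts * vertSize + tris * 3 * indexSize;  // 3,360,000
    const long long pixelBytes = 640LL * 480 * 4;                         // 1,228,800
    return {meshBytes, pixelBytes};
}
```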
I stand by the recommendation of doing real-time capture to an external device (which could be another machine, if you have one handy).