Reading the framebuffer - fast

You can get the ATI OpenGL headers at:
http://www.ati.com/na/pages/resource_centre/dev_rel/sdk/RadeonSDK/Html/Info/Prog3D.html

I sent them an e-mail about the project to see if they were interested. I have not got a reply yet…

Originally posted by cix>foo:
[b]and while I’m here, where do I get the headers and extension specs for ATI’s drivers, anyone?

Cas [/b]

Unless OpenGL has some mechanism to access the framebuffer directly, there is no way to read it “fast”. The readback has to be done in software, unless NVIDIA or ATI want to add some special hardware to copy framebuffer contents to system memory. Even in that case, the transfer from video memory to system memory would be the bottleneck.

I am sure that this is one of the biggest limitations of OpenGL. If you really want to read back the framebuffer or perform some operations on it, use DirectX instead… or wait for OpenGL 2.5 or something. I hope the ARB is looking at this, as it is one of the most important needs of OpenGL…

-Sundar

Hi,
5d’s Cyborg system uses a Wildcat for OpenGL rendering and DVS’s SDstation for realtime
video output. As far as I know, DVS (www.dvs.de) offer an API for Win/Linux/IRIX, and
the quality of the SDI board itself is excellent. A bit pricey though (~5K UK pounds).

5d do exactly what you want to do with their Cyborg. They use Win2k, OpenGL and a DVS
board. You can see on the broadcast monitor what is going on on your computer monitor, with realtime updates.

farid.

cix, what kind of memcopy are you doing? 32-bit?

I think that you can get a speed increase if you do 64-bit transfers (cast to double precision float vectors). Isn’t there a “block move” instruction in the x86 instruction set? I think the Motorola 68040 had a MOVE16 instruction (which moves a 16-byte line, i.e. four 32-bit words). [Come to think of it, even on the 68000 you could load or store up to 16 32-bit registers with one MOVEM instruction]
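A minimal sketch in C of the 64-bit transfer idea, for illustration only. The function name `copy_words64` is made up here, and it moves the data as 64-bit integers rather than doubles. Whether it beats the library `memcpy` on a given machine is something you would have to benchmark; many `memcpy` implementations already do this or better.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from any library): copy n bytes from src to dst,
   moving 64 bits per iteration where possible. Buffers must not overlap. */
void copy_words64(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(dst != NULL && src != NULL);

    /* Main loop: one 64-bit load and one 64-bit store per pass.
       The fixed-size memcpy calls avoid alignment and aliasing problems
       that a plain pointer cast to uint64_t* could cause; compilers
       normally turn each one into a single move. */
    while (n >= 8) {
        uint64_t w;
        memcpy(&w, s, 8);
        memcpy(d, &w, 8);
        s += 8;
        d += 8;
        n -= 8;
    }

    /* Tail: copy the remaining 0..7 bytes one at a time. */
    while (n--)
        *d++ = *s++;
}
```

Usage would be something like `copy_words64(system_buffer, mapped_pixels, width * height * 4);`, where both buffer names are placeholders.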

Otherwise SIMD instructions could be even faster (?), since they use 128-bit registers. I’m not sure; I have not benchmarked these things.
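For what it’s worth, here is a rough sketch of what a 128-bit copy could look like in C with SSE2 intrinsics. It assumes an x86 compiler that defines `__SSE2__` and provides `<emmintrin.h>`; on anything else it simply falls back to `memcpy`. The name `copy_simd128` is invented for this example, and no speed claim is made: it would need benchmarking against the plain `memcpy`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics: _mm_loadu_si128, _mm_storeu_si128 */
#endif

/* Hypothetical helper: copy n bytes using 128-bit registers where available.
   Buffers must not overlap. Unaligned loads/stores are used so the caller
   does not have to guarantee 16-byte alignment. */
void copy_simd128(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(dst != NULL && src != NULL);

#if defined(__SSE2__)
    /* Move 16 bytes per iteration through one XMM register. */
    while (n >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_storeu_si128((__m128i *)d, v);
        s += 16;
        d += 16;
        n -= 16;
    }
#endif

    /* Remaining bytes (or everything, when SSE2 is not available). */
    memcpy(d, s, n);
}
```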

On the 68000, you didn’t really copy blocks with the 32-bit move like:

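loop:
move.l (a0)+,(a1)+      ; copy one 32-bit longword, post-incrementing both pointers
dbf d0,loop             ; decrement d0 and loop until it reaches -1 (d0+1 passes)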

Instead, you did:

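loop:
movem.l (a6)+,d1-d7/a0-a4   ; load 12 longwords (48 bytes) from the source, a6 advances by 48
movem.l d1-d7/a0-a4,(a5)    ; store the same 12 registers at the destination
lea 48(a5),a5               ; advance the destination pointer by 48 bytes
dbf d0,loop                 ; decrement d0 and loop until it reaches -1 (d0+1 passes)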
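For anyone who does not read 68k, the same idea can be sketched in C: load a whole block into locals, then store the whole block, so the loop overhead is paid once per 48 bytes instead of once per 4. This is only an illustration; `copy_blocks48` is a made-up name, and a modern compiler may well rearrange or vectorise the loop anyway. Unlike `dbf`, which runs d0+1 times, this loop runs exactly `blocks` times.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy `blocks` blocks of 12 longwords (48 bytes each).
   Mirrors the movem loop: 12 loads, 12 stores, then advance both pointers. */
void copy_blocks48(uint32_t *dst, const uint32_t *src, size_t blocks)
{
    assert(dst != NULL && src != NULL);

    while (blocks--) {
        /* "movem.l (a6)+,d1-d7/a0-a4": load 12 longwords into locals. */
        uint32_t r0 = src[0], r1 = src[1], r2  = src[2],  r3  = src[3];
        uint32_t r4 = src[4], r5 = src[5], r6  = src[6],  r7  = src[7];
        uint32_t r8 = src[8], r9 = src[9], r10 = src[10], r11 = src[11];

        /* "movem.l d1-d7/a0-a4,(a5)": store the 12 longwords. */
        dst[0] = r0; dst[1] = r1; dst[2]  = r2;  dst[3]  = r3;
        dst[4] = r4; dst[5] = r5; dst[6]  = r6;  dst[7]  = r7;
        dst[8] = r8; dst[9] = r9; dst[10] = r10; dst[11] = r11;

        /* "(a6)+" and "lea 48(a5),a5": move on to the next block. */
        src += 12;
        dst += 12;
    }
}
```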

If you were clever, you could even use the a7 register (well, that was the stack so you needed to be careful with the ssp and interrupts if you were in supervisor mode !)…

Actually, on the Atari ST and Amigas, you used to pre-generate the code with loads of movem to save on the dbf… On the Atari Falcon 030, you could leave the dbf and keep the loop in the cache.

These were great days !

Regards.

Eric

Originally posted by Eric:
[b]

[quote]

loop:
movem.l (a6)+,d1-d7/a0-a4;
movem.l d1-d7/a0-a4,(a5);
lea 48(a5),a5;
dbf d0,loop;

[/b][/QUOTE]

Yes, that was what I meant. It could also be used for clearing memory - I think it was even faster than the hardware blitter on the Amiga 500.

[b]

These were great days ! [/b]

…when you could actually understand assembly language, 16 32-bit GPRs, flat memory addressing, memory mapped I/O, etc, etc! Someone should be punished for the x86 ISA (not just us coders).