Reading the framebuffer - fast

I went back to double-check some specs on the ATI Radeon. It has a built-in video overlay system (genlock) with multiple video output options, so in real time you can render your 3D graphics over an incoming video signal, with both PAL and NTSC support.

They also have an SDK for video editing and playback which looks like it could give you a good starting point.

Originally posted by cix>foo:
[b]The Wildcat’s sadly useless as it costs £2,500 and doesn’t have a separate linear key output, so we’d probably just be better off using a cheapo Geforce and doing the framebuffer copy to a genlockable PCI card. Which is what we’re attempting.

Cas [/b]

Unfortunately we can’t take an input; we’re only allowed output. This ain’t cable ya know

Cas

From this post I am lost as to what you are trying to do.
If you want to do something with a genlock-type feature, normally it is to mix two different video signals together.
If you don’t need a video input, then there is no reason to use two video cards, since you are just outputting a video image.

Originally posted by cix>foo:
[b]Unfortunately we can’t take an input; we’re only allowed output. This ain’t cable ya know

Cas [/b]

Here is the information on the ATI video-processing features (genlock, etc.):

http://www.ati.com/na/pages/resource_centre/dev_rel/atirdv.pdf

nexusone
TV is a bit different. It’s not a matter of syncing two video sources, but of outputting your video in sync with a timing pulse that is common to all station devices. The sync is for frame timing, so you don’t get roll, and for colour, so that your colours match.

cix>foo
Have you tried a video overlay card and a 3D card, doing what the old 3DFX cards used to do? Take the output from the 3D card and plug it into the input on the overlay card. It’ll be analogue but it might work.

I understand how video sync works!
What I am now confused by is what he is trying to do.

If you want sync, alpha or any other signal, you could access it right off the video processing chip on the ATI.

On the ATI chipset, say, they use a second chip to perform the PAL/NTSC/SECAM conversion. Just grab the signal before this chip. I don’t have the specs, but from what I have seen the chip lists RGBA and YUV, so you could grab these at the chip. You can also get the H-sync and V-sync there.

Another feature of the ATI card’s genlock is the color filtering: it processes both sources’ color information to make sure the output of the combined sources is correct.

Also, since all the video timing information can be programmed into the ATI’s video hardware, I don’t see why an external sync source would be a problem.

Originally posted by henryj:
[b]nexusone
TV is a bit different. It’s not a matter of syncing two video sources, but of outputting your video in sync with a timing pulse that is common to all station devices. The sync is for frame timing, so you don’t get roll, and for colour, so that your colours match.

cix>foo
Have you tried a video overlay card and a 3D card, doing what the old 3DFX cards used to do? Take the output from the 3D card and plug it into the input on the overlay card. It’ll be analogue but it might work.[/b]

cix,

I understand why you’d think the ATI consumer input converters aren’t broadcast quality, but they’d be good enough to hitch a ride on house blackburst, no? Assuming the consumer TV out converters are at all useable, that is.

Set a high black level on your GL image, then go crazy on outboard mixing gear as much as you want :)
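To make that concrete, here is a minimal sketch of what “a high black level” could mean in GL terms; the actual level (roughly 16/255, i.e. nominal broadcast black) is an assumption on my part, not something from the thread:

```c
#include <GL/gl.h>

/* Clear to a slightly lifted black instead of pure 0,0,0 so a downstream
 * luminance key never has to dig into the noise floor.  The exact level
 * (16/255, nominal "broadcast black") is an assumption. */
void clear_to_lifted_black(void)
{
    glClearColor(16.0f / 255.0f, 16.0f / 255.0f, 16.0f / 255.0f, 0.0f);
    glClear(GL_COLOR_BUFFER_BIT);
}
```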

Originally posted by cix>foo:
[b]The Wildcat’s sadly useless as it costs £2,500 and doesn’t have a separate linear key output, so we’d probably just be better off using a cheapo Geforce and doing the framebuffer copy to a genlockable PCI card. Which is what we’re attempting.

Cas [/b]

Could you please recommend any cheap video cards for this?

– Niels

Quite an interesting topic this one eh?
We’re attempting to use a Matrox Digisuite for the output. It’s got SDI and YUV output, a separate linear key from the alpha buffer, and it’s genlockable. It’s not a great solution because we’re going to lose maybe 10ms of rendering time copying the framebuffer about. It may not work yet - I start on that tomorrow. Buggers charged us £700 for the SDK as well, which is just bloody criminal of Matrox. We only want one function call.

As for ATI - well, anyone who knows exactly how to put a straight old 50Hz genlock sync into it, get YUV out of it so we don’t have to scan convert, and get a separate output which is (preferably) the framebuffer alpha but otherwise could be the dualhead display output, without a degree in electronics, should mail me directly. If you make it work you will have earned yourself a lot of money; trouble is, you have maybe 2 weeks to get a working board to me in Guildford.

We tried a luminance key but, well, black is black and when you get down to the dark reds the mixing goes into noise. Besides we need a linear key so we can make bits more transparent than others, not a luminance key.

Anyone who doesn’t know exactly what we use genlock for in broadcast, know this: we use it to sync the entire studio, not just one card and an incoming picture. The entire studio goes through one big Sony digital mixer. We supply two graphic feeds: one for “supers” - that’s the transparent overlays we need the keys for really, such as captions; and one for fullforms, which take up the whole screen.

There’s no way they’d entrust our crappy PCs with mixing the output of a live programme; our remit is just to produce graphics and transparency keys, and that’s it.

Cas

oh yeah; if I tried to put the TV out from the ATI on air I’d probably be shot by the producer

BTW the Digisuite costs about £3500 for the SDI version I think. In other words, waaaay too expensive. But we’re running out of choices.

Cas

What about the ATI S-Video output or the Digital video output?

You say you have talked to Matrox; how about ATI? They support open source by providing tech data to the people writing Linux drivers for their cards.

I bet you could get their SDK without any charge; I am sure they are keen to push into other areas.

Originally posted by cix>foo:
[b]oh yeah; if I tried to put the TV out from the ATI on air I’d probably be shot by the producer

BTW the Digisuite costs about £3500 for the SDI version I think. In other words, waaaay too expensive. But we’re running out of choices.

Cas [/b]

I completely agree: don’t use readPixels and don’t use memcpy. What you want to do is set up a 32-bit memory move, doing 32 bits per move instead of 8. That in itself is a huge bump in speed. I used to have a 32-bit memcpy embedded in my old 16-bit DOS apps. Those were fun days. Basically, set up (forgive me if this is wrong) EDI and ESI as destination and source, set up for forward copying, and rep mov the data. Oh yeah, I believe it’s ECX that gets the size (number of 32-bit longs). Something like that; I don’t remember the exact x86 assembly, but you can find the reference on Intel’s web site.
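For what it’s worth, a minimal sketch of the copy being described, assuming GCC-style inline assembly on x86; as the next post points out, a decent memcpy() boils down to the same instruction anyway:

```c
#include <stddef.h>

/* The 32-bit-wide copy described above: ESI = source, EDI = destination,
 * ECX = number of dwords, direction flag cleared for a forward copy,
 * then "rep movsd". */
static void copy_dwords(void *dst, const void *src, size_t ndwords)
{
    __asm__ __volatile__(
        "cld\n\t"       /* forward copy */
        "rep movsd"     /* move ECX dwords from [ESI] to [EDI] */
        : "+D"(dst), "+S"(src), "+c"(ndwords)
        :
        : "memory", "cc");
}
```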

Devulon,

The implementation of memcpy() uses rep movsd, which is an “accelerated” 32-bit move.

Unfortunately, it will pollute your cache if you’re working with cacheable memory, but when it comes to frame buffers, it’s pretty much as efficient as you can get. Intel has special hardware in their chips to make it Do The Right Thing ™ in that case.
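For completeness, one way to sidestep that cache pollution is to copy with non-temporal (streaming) stores; this is a suggestion of mine rather than something measured in this thread, and it assumes an SSE2-capable CPU plus 16-byte-aligned buffers whose size is a multiple of 64 bytes:

```c
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stddef.h>

/* Copy with non-temporal stores so the destination data never lands in
 * the cache.  Assumes 16-byte-aligned src/dst and a byte count that is
 * a multiple of 64. */
static void stream_copy(void *dst, const void *src, size_t bytes)
{
    __m128i       *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;
    size_t         i, n = bytes / 16;

    for (i = 0; i + 3 < n; i += 4) {
        _mm_stream_si128(d + i,     _mm_load_si128(s + i));
        _mm_stream_si128(d + i + 1, _mm_load_si128(s + i + 1));
        _mm_stream_si128(d + i + 2, _mm_load_si128(s + i + 2));
        _mm_stream_si128(d + i + 3, _mm_load_si128(s + i + 3));
    }
    _mm_sfence();   /* make the streamed writes globally visible */
}
```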

I believe the main problem is that cix has very specific output signal needs, and there aren’t any cheap cards around that fulfill these needs. In general, the difference between “pro” and “consumer” gear is OFTEN mostly in the connectors, and the level of care taken to implement things like buffer drivers and stuff. Witness XLR vs 3.5-millimeter plugs for sound cards as an obvious example.

He thinks $3500 is too expensive for a Digisuite? Well, that would be true, if there’s actually some piece of hardware that does the same thing, cheaper. Haven’t seen it yet :)

Bad news for frame buffer copies; it ain’t fast enough: 17fps to the Digisuite and that’s with no rendering…

Good news for all else concerned: hidden away in Matrox’s product lineup is the CG2000. This has every conceivably useful output, costs not too much, and plugs directly into a G450 using a funny little ribbon cable.

More news on this when we get our hands on one, probably Monday. Stay tuned.

I hope Matrox’s drivers aren’t still ****.

Cas

I may be way off here (not sure how the CPU/PCI/AGP buses cooperate/conflict), but wouldn’t it be possible to parallelize the copying, so that reading and writing can run more or less simultaneously?

Idea 1: 1 CPU, 2 threads
Thread 1 - Read from GL card
Thread 2 - Write to display card

If we have two frame buffers in RAM (the “in-between” buffer), it would be possible for the GL card to write to one buffer while the other thread reads from the other (old) buffer - that is, if you can afford one frame of delay. (A rough sketch of this follows after the possible problems below.)

Possible problem 1: don’t know if the CPU is free while doing glReadPixels (maybe it is on the Radeon, but not on the GeForce?). Can maybe be solved with a dual CPU board?

Possible problem 2: the buses may be choked already by only reading or writing, so doing both at the same time is not possible.
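A rough pthreads sketch of the two-thread, two-buffer idea above; write_to_output_card() is a hypothetical stand-in for whatever the output card’s SDK actually provides, and the 720x576 RGBA frame format is an assumption:

```c
#include <pthread.h>
#include <GL/gl.h>

#define W 720
#define H 576

/* Hypothetical stand-in for the output card's SDK call. */
extern void write_to_output_card(const unsigned char *frame);

static unsigned char buf[2][W * H * 4];         /* two in-between frames    */
static pthread_mutex_t lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ready = PTHREAD_COND_INITIALIZER;
static int finished = -1;                       /* index of last full frame */

/* Thread 1 (render thread): after drawing, read the GL frame buffer into
 * the slot that is not currently being written out; alternate slots each
 * frame. */
void read_frame(int slot)
{
    glReadPixels(0, 0, W, H, GL_RGBA, GL_UNSIGNED_BYTE, buf[slot]);
    pthread_mutex_lock(&lock);
    finished = slot;
    pthread_cond_signal(&ready);
    pthread_mutex_unlock(&lock);
}

/* Thread 2 (writer thread): wait for a finished frame, hand it to the
 * output card, repeat.  This is where the one frame of delay comes from. */
void *writer_thread(void *arg)
{
    (void)arg;
    for (;;) {
        int slot;
        pthread_mutex_lock(&lock);
        while (finished < 0)
            pthread_cond_wait(&ready, &lock);
        slot = finished;
        finished = -1;
        pthread_mutex_unlock(&lock);
        write_to_output_card(buf[slot]);
    }
    return NULL;
}
```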

Cas,

Unless you’re doing some strange packing or other pixel transferism, you should get good perf. Can you email me exactly what you’re doing?

Thanks -
Cass
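For context, a sketch of the kind of “plain” read-back I assume Cass means: default pixel-transfer state and a packing that matches the buffer, since any non-default scale/bias or odd GL_PACK_* setting can push the driver onto a slow software path. The RGBA/unsigned-byte choice here is an assumption:

```c
#include <GL/gl.h>

/* Read the back buffer with default pixel-transfer state.  The scale and
 * bias calls just show the defaults explicitly; anything other than 1.0
 * and 0.0 tends to force a slow path. */
void plain_readback(int width, int height, void *pixels)
{
    glPixelStorei(GL_PACK_ALIGNMENT, 4);
    glPixelStorei(GL_PACK_ROW_LENGTH, 0);
    glPixelTransferf(GL_RED_SCALE, 1.0f);
    glPixelTransferf(GL_RED_BIAS,  0.0f);
    glReadBuffer(GL_BACK);
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
}
```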

Marcus,

The bottleneck is not the CPU. The bottleneck is the various pathways between devices in the system, i.e. the PCI bus (video capture), the AGP bus (graphics card) and, most importantly, the memory bus.
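For a rough sense of scale (my arithmetic, not the poster’s): a 720 x 576 frame at 32 bits per pixel is 720 x 576 x 4 ≈ 1.66 MB, so 25 fps needs roughly 41 MB/s and 50 fps roughly 83 MB/s in each direction, against a theoretical ~133 MB/s for a 32-bit/33 MHz PCI bus before any other traffic on it.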

Cix,

When you got 17 fps, what kind of system was that on? I.e., what buses were the cards using, what chipset and BIOS, and what memory? (See my previous post for what a reasonable target might be.)

jw, I’m afraid I don’t know what BIOS is in the machine off the top of my head (it’s 130 miles away) except that it’s a brand new modern system in a swishy black case

Reading into AGP RAM from the GF3 was very fast but then of course incredibly slow to copy out to the Digisuite; reading into system RAM, the framerate plummets to 17fps. Reading directly into the Digisuite’s framebuffer gives us 17fps as well.

Also looking at the possibility of finding the magic genlocking connectors on the ATI Radeon 8500 All In Wonder thing. Perhaps just feeding it an input signal will do? Then at least we can use dualhead and a pair of cheapo scan converters.

Cas

and while I’m here, where do I get the headers and extension specs for ATI’s drivers, anyone?

Cas

The ATI web site has a developer section; you can get public specs from there. To get actually useful drivers you have to be in the developer program, though; you can register on the site. I’ve found them to be friendly and helpful, but they have a lot to do, so response latencies are high.