Originally posted by jwatte:
If it’s like a texture access, it’s not guaranteed to just read the pixel you’re currently writing. You may want to read other pixels out of the frame buffer than the one you’re going to write. Thus, you have a generalized read-after-write hazard.
If you allow reading of the frame buffer pixel that your fragment contributes to, you still have the problem of multisampling.
If you only allow reading the corresponding framebuffer color value for exactly the fragment you’re processing (I don’t know if that’s possible in all antialiasing applications, but let’s pretend it is) then you still have inefficiency, because framebuffers typically live in DRAM-like memories, which do not like being turned around between reading and writing, and may add many cycles of latency each time you change direction.
Which leads me to my original speculation that you would have to render in some tiled fashion, keeping the entire tile (be it horizontal, vertical, or square) in some close, fast SRAM that has better access characteristics. And, even so, reads from other parts of the framebuffer would still be less efficient (thus, my comment about locality). Although, because you’re tiling, you know which parts can be cached and which parts need hazard resolution, and you’d probably serialize switching between destination tiles to make that not a problem.
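To put rough numbers on the turnaround argument quoted above, here is a minimal Python sketch. It is a toy model, not a description of any real chip: the tile size, the one-op-per-pixel DRAM stream, and the function names are all assumptions made for illustration. It only counts how often the memory stream flips between reading and writing.

```python
from collections import defaultdict

TILE = 8  # hypothetical tile edge in pixels


def count_turnarounds(ops):
    """Count how often the op stream switches between read ("R") and write ("W")."""
    return sum(1 for prev, cur in zip(ops, ops[1:]) if prev != cur)


def immediate_ops(fragments):
    """Every fragment reads its destination pixel from DRAM, then writes it back."""
    ops = []
    for _ in fragments:
        ops += ["R", "W"]
    return ops


def tiled_ops(fragments):
    """Bin fragments by tile; each touched tile is read in one burst,
    blended in (assumed) on-chip SRAM, then written back in one burst."""
    bins = defaultdict(list)
    for x, y in fragments:
        bins[(x // TILE, y // TILE)].append((x, y))
    ops = []
    for _tile in sorted(bins):
        ops += ["R"] * (TILE * TILE)  # load the whole tile
        ops += ["W"] * (TILE * TILE)  # store the whole tile
    return ops


# A 16x8 pixel rectangle drawn twice (two overlapping layers): 256 fragments, 2 tiles.
fragments = [(x, y) for _ in range(2) for y in range(8) for x in range(16)]
print(count_turnarounds(immediate_ops(fragments)))
print(count_turnarounds(tiled_ops(fragments)))
```

In this model the immediate stream changes direction 511 times and the tiled stream 3 times. How much each change costs in cycles depends on the actual memory, which the sketch does not try to model.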
I never talked about general texture access. I only meant the fragment directly under the new fragment, as in blending.
And yes, there is a read-modify-write to do, just as in blending. It does not have to do much more than blending does, so where is the real problem? Yes, blending is slower, but it does not “kill” performance. It hurts performance, but the additional features blending gave us are worth that small speed drop. I don’t see why reading the framebuffer in the fragment program should cost any more than that. The only difference is that you have to read earlier, so if several fragments for the same pixel are in flight at the same time, you get a stall. Somehow I don’t think that has to happen very often, with a little bit of clever rendering.
At least, I know the Radeon renders in tiles, in 8x8 or 16x16 chunks or so. The first drivers had some bugs in the fragment programs which made those tiles visible. And I think that if you render such a tile in one go (one line of the tile at once across all 8 pipelines, with the 8 pixels following each other through the pipeline), you should not get stalls. Tile-based rendering is the future anyway; we know that already. Hairdryer-style cooling isn’t cool. It just shows a lack of better-designed hardware…
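A rough way to check the stall argument is to count how often a fragment lands on a pixel that is still in flight. The Python sketch below is a deliberately simplified model: the pipeline depth of 8 is an assumption, and a stall is counted without actually delaying the later fragments. The name `count_hazards` is made up for this example.

```python
from collections import deque


def count_hazards(fragments, depth):
    """Count fragments whose destination pixel was also targeted by one of the
    previous `depth` fragments, i.e. whose early read would see stale data."""
    in_flight = deque(maxlen=depth)  # pixels of the last `depth` fragments
    hazards = 0
    for pixel in fragments:
        if pixel in in_flight:
            hazards += 1
        in_flight.append(pixel)
    return hazards


DEPTH = 8  # assumed number of fragments in flight at once

# One 8x8 tile, scanned line by line.
tile = [(x, y) for y in range(8) for x in range(8)]

# Two layers drawn one after the other over the whole tile.
layer_after_layer = tile + tile

# Two layers interleaved so both fragments for a pixel arrive back to back.
interleaved = [pixel for pixel in tile for _ in range(2)]

print(count_hazards(layer_after_layer, DEPTH))
print(count_hazards(interleaved, DEPTH))
```

With these inputs the layer-after-layer order produces no hazards, because a pixel only comes back 64 fragments later, while the interleaved order produces one hazard per pixel. That is the "clever rendering" point: the order of submission matters more than the read itself.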