DOOM III screenshots

What does the CPU intensiveness of shadow volumes have to do with portal-based visibility? Unless you mean that you can easily cull away non-seen shadows, which I think is about the best optimization.

No, not cull away “non-seen shadows” but not create shadows for “non-seen surfaces” in the first place ( except if the shadows would cast into the view; this is determined before creating the shadows ). This can be done with other visibility determination techniques, but with portals you have point-to-area visibility, whereas with something like a PVS you have leaf-to-leaf. I use a BSP tree for preprocessing and a “special” one for the realtime part.

I have two levels of optimization, one is surface local ( SSE ) and one is global ( portals ). The global gives you the benefit of extracting the silhouette from a given set of surfaces, while the local gives you the benefit of optimized SSE ( so I get the best of both worlds ). The optimizations also depend on the current viewpoint and the light position relative to the viewer.

In addition, you’ll be able to reduce fill consumption far easier and to a higher degree with portals ( this is quite logical but I’ll leave the details up to you ).
Actually, I plan on tessellating big planar surfaces a bit more which should reduce the fill consumption even more.

Edit: a few more additions

[This message has been edited by PH (edited 10-19-2002).]

Suppose you insert degenerate sliver quads between all triangles, and move verts based on whether their faces are facing the light or not, then there’s no CPU work involved. The fill rate should be the same as for “optimized” meshes. Thus, what you’re paying is vertex transform.

Luckily, this vertex format is only for shadow volumes, and thus needs no U/V information, tangent basis, or anything like that. You can go down to 12 bytes per vert. Which is lucky, because you’ll see something like a 4x increase in vert count :) Again, luckily, the vertex shader can be fairly short, as all you need to do is get into perspective view, dot the normal with the light (in perspective view again) and move the vert if the dot is negative (maybe even proportional to the amount of negativity). Call it twelve instructions (I haven’t counted). 400 MHz / 12 means 33 million verts per second per vertex pipe – you won’t be transform bound. 2 GB/s / 12 bytes/vert means, what, 166 million verts per second through AGP 8x? You won’t be transfer bound, either.
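The per-vertex decision that shader would make can be mocked up on the CPU like this. A minimal sketch only: the struct, function name, and fixed extrusion distance are mine, not from any actual demo or shader.

```cpp
#include <cassert>
#include <cmath>

// CPU mock of the per-vertex test described above: a vert whose (face) normal
// points away from the light gets pushed away from it; front-facing verts stay
// put, so the degenerate sliver quads between faces stretch into volume sides.
struct Vec3 { float x, y, z; };

static Vec3 ExtrudeIfBackFacing(Vec3 pos, Vec3 normal, Vec3 lightPos, float dist)
{
    Vec3 toLight = { lightPos.x - pos.x, lightPos.y - pos.y, lightPos.z - pos.z };
    float d = normal.x * toLight.x + normal.y * toLight.y + normal.z * toLight.z;
    if (d >= 0.0f)
        return pos;                             // facing the light: leave in place
    float len = std::sqrt(toLight.x * toLight.x +
                          toLight.y * toLight.y +
                          toLight.z * toLight.z);
    // Push the vert directly away from the light by a fixed distance.
    Vec3 out = { pos.x - toLight.x / len * dist,
                 pos.y - toLight.y / len * dist,
                 pos.z - toLight.z / len * dist };
    return out;
}
```

In the real thing this runs in the vertex shader on the already-duplicated mesh, so there is zero per-frame CPU work, which is the whole point.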

Honestly, if you’re targeting a shader card (which I hear that DOOM is NOT) then I think all this beam tree stuff to efficiently generate shadow volumes on the CPU is just not necessary.

Toss in two-sided stencil, scissoring of your volume projected to the screen, and maybe the infinite far plane trick (don’t know if it’s actually necessary) and you should have a reasonable stencil shadow renderer in two weeks or less :) (That includes automatically processing the geometry in the exporter.) This engine should give you “exact” results just like the CPU based beam trees, as opposed to the “re-use the original mesh and pull verts based on vert normals” trick – on the other hand, that latter trick will have fewer visible artifacts where vert normals don’t coincide with face normals and diffuse lighting gets masked off too sharply by the stencil.

The portal method I briefly outlined, is not related to beam trees. The BSP tree I use is only for implementing efficient point location queries, so you know what area the viewer/lights are in ( and thus which portals to consider ). That part could probably be replaced with something else. The point-to-area visibility gives you a minimal set of surfaces from a given point. You should still be able to use the GPU approach on those ( I have never implemented GPU-based stencil shadows but I definitely think it’s a good approach ).
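A point-location query over such a BSP is just a walk down the splitting planes until you land in a leaf naming an area. This is a hypothetical sketch of the idea, not PH’s actual data structures; the plane/node layout and leaf encoding are my own assumptions.

```cpp
#include <cassert>

// Plane stored as n·p + d = 0.
struct Plane { float nx, ny, nz, d; };

struct BspNode {
    Plane split;
    int frontChild, backChild;  // >= 0: child node index; < 0: ~areaIndex (leaf)
};

// Walk the tree from 'root' and return the index of the area containing the point.
int LocateArea(const BspNode* nodes, int root, float x, float y, float z)
{
    int idx = root;
    while (idx >= 0) {
        const BspNode& n = nodes[idx];
        float dist = n.split.nx * x + n.split.ny * y + n.split.nz * z + n.split.d;
        idx = (dist >= 0.0f) ? n.frontChild : n.backChild;
    }
    return ~idx;                // decode the leaf's area index
}
```

With the viewer’s and each light’s area known, only the portals of those areas need to be considered for the point-to-area visibility.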

I agree; I was commenting on the original approach taken in Doom III (according to those public interviews).

Anyway, any modern engine should (and probably does) draw depth buffer first (maybe even with color writes disabled). That way, it doesn’t matter (as much) whether you get the “optimal” set of surfaces when you go back to re-render for lighting, although if you can get more culling cheap, that’s always good.


The infinite far plane trick you mention, do you mean the Pinf matrix ? If so, you can use NV_depth_clamp instead ( GF3/4 only so far ). I still haven’t tried it even though it’s a small code change/addition.

I forgot to mention, I use a single connectivity mesh for each entity ( the main world being one ), which is then pruned ( based on which surfaces are visible ) before a silhouette is extracted. I do this for efficiency reasons. I tried considering each surface in isolation but the fill rate consumed was a bit too high. I might as well have a look at GPU shadows now and see if I can fit them in.

Originally posted by jwatte:

Honestly, if you’re targeting a shader card (which I hear that DOOM is NOT) then I think all this beam tree stuff to efficiently generate shadow volumes on the CPU is just not necessary.

One problem if you do vertex shader shadow volumes (like the nvidia demo): you always draw the front and back caps, even when they are not necessary, so it costs fillrate.

Furthermore, for complex scenes with many shadows, beam trees are very useful for saving fillrate.
For example, you can have a shadow volume that is totally enclosed by a bigger one.
With a beam tree, you can detect this and skip drawing the enclosed volume.
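As a toy illustration of that enclosed-volume test: if one volume’s bounds lie entirely inside another’s, the inner volume can’t change the final stencil result and can be skipped. Real beam trees clip against the volumes themselves; the axis-aligned boxes here are just a hypothetical, conservative stand-in.

```cpp
#include <cassert>

struct Aabb { float min[3], max[3]; };

// True if 'inner' lies completely inside 'outer' on all three axes.
static bool Contains(const Aabb& outer, const Aabb& inner)
{
    for (int i = 0; i < 3; ++i)
        if (inner.min[i] < outer.min[i] || inner.max[i] > outer.max[i])
            return false;
    return true;
}
```

Note the box test is only safe in one direction: containment of the boxes implies you may skip the inner volume, but non-containment proves nothing.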


With scissoring, you save on the fill rate when the caps aren’t necessary.

By sorting near-to-far from the light, you can stencil test your stencil shadows, and thus avoid fill write cost for things completely in shadow.

Sure, you CAN spend all that CPU and optimize out the last drops of blood. But it seems to me that you’ll get within spitting distance by just trusting the GPU, assuming you encourage it to do the right thing by stroking it just so.

By sorting near-to-far from the light, you can stencil test your stencil shadows, and thus avoid fill write cost for things completely in shadow.

That’s a very interesting idea. Have you tried it out ? It would really work as far as I can see…brilliant idea actually.
So you can save the stencil write, but how much does that affect performance ? Aren’t the reads the most expensive part ?

Yeah, that’s what I meant; I guess I just said it funny. I too am using portals to decrease the number of objects and shadows I have to deal with. Instead of having a specific world mesh, we just treat everything as regular meshes and build the world out of many vertex array meshes. These can each have their own texture etc., so we save on redundant meshes. To determine which sector the lights/camera are in, I just save where they were last and see if they passed through a portal, so I don’t have a BSP of any kind. I’m currently implementing the visibility determination of objects, lights and shadows, so I can’t really comment on how well it works yet. Because everything is a mesh, there’s no need for different ways to generate shadow volumes. For your global shadows, you’re saying you find which surfaces of the whole world you need to deal with and just generate the shadows from those? Isn’t that kind of excessive? If your world is highly detailed or highly tessellated, you’d basically be determining the visibility of every triangle, right? Or maybe you group them together to help with this?
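The sector tracking described above (remember the last sector, detect portal crossings) could look roughly like this. The `Portal` layout and the front-to-back sign convention are assumptions, and a real version would also clip the motion segment against the portal polygon itself, not just its plane.

```cpp
#include <cassert>

// Portal reduced to its plane (n·p + d = 0) plus the sector it leads to.
struct Portal { float nx, ny, nz, d; int neighbour; };

// If the motion from oldPos to newPos crosses a portal plane front-to-back,
// the viewer entered the neighbouring sector; otherwise it stayed put.
static int UpdateSector(int current, const Portal* portals, int count,
                        const float oldPos[3], const float newPos[3])
{
    for (int i = 0; i < count; ++i) {
        const Portal& p = portals[i];
        float dOld = p.nx * oldPos[0] + p.ny * oldPos[1] + p.nz * oldPos[2] + p.d;
        float dNew = p.nx * newPos[0] + p.ny * newPos[1] + p.nz * newPos[2] + p.d;
        if (dOld >= 0.0f && dNew < 0.0f)       // crossed front-to-back
            return p.neighbour;
    }
    return current;                            // still in the same sector
}
```

The appeal over a BSP walk is that only the current sector’s handful of portals is tested each frame, at the cost of having to handle teleports or large per-frame motions specially.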

I read somewhere, maybe his .plan or an interview, I don’t remember where, that Carmack had tried vertex shader shadows and found they gave him almost identical performance. He said that he’s staying with CPU-made volumes because of an optimization they can do in certain cases. I could only guess this might be skipping the caps for the shadows the camera is definitely not in. After I read that I decided not to bother testing them if they did the same as what I already coded, but now you’ve changed my mind. I’ll give them a try.

The talk about beam trees in Doom doesn’t really have to do with silhouette extraction. My understanding is that he’s using them only for static shadows + static lights. That really means he does the extraction in preprocessing and optimizes the volume with beam trees. Of course you still want to know whether you have to draw it or not, regardless of whether it’s static or dynamic.

I don’t quite understand what you mean by stencil test the stencil volumes. How do you do this? Could you possibly give a basic run down of the algorithm?


Instead of having a specific world mesh we just treat everything as regular meshes

That’s not such a good idea; who’s “we”, by the way ( just curious )? Here’s why it’s not a good idea:

I treat all surfaces identically ( they’re really just meshes, each capable of having its own shader ) but if I extract the silhouette on a per-surface basis, more fill is used. Imagine a long corridor with several wall sections. These are logically connected, and if you extract the silhouette from them as a group, the generated volumes will be more efficient. If you extract the silhouette for each wall section by only looking at the surface in isolation, you get extruded polygons between the walls ( very inefficient ). I also noticed a few artifacts in certain cases when doing that: pixels that were “double tested”.
I don’t have an explicit world mesh, but an explicit per-entity connectivity mesh just for shadows. Rendering the surfaces is done with entirely different data ( with matching vertex position of course ).
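The fill difference comes from where the silhouette edges fall. This tiny sketch (the data layout is mine, purely illustrative) counts silhouette edges, i.e. the edges that get extruded into volume sides: with connectivity, only edges where a lit and an unlit triangle meet (or lit boundary edges) count, whereas a triangle treated in isolation contributes all three of its edges.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// tris: vertex index triples; facing[t]: does triangle t face the light?
// Assumes a manifold-or-boundary mesh (at most two triangles per edge).
static int SilhouetteEdgeCount(const std::vector<std::array<int, 3>>& tris,
                               const std::vector<bool>& facing)
{
    // Count, per undirected edge, how many lit triangles use it.
    std::map<std::pair<int, int>, int> litUsers;
    for (std::size_t t = 0; t < tris.size(); ++t) {
        for (int e = 0; e < 3; ++e) {
            int a = tris[t][e], b = tris[t][(e + 1) % 3];
            litUsers[std::make_pair(std::min(a, b), std::max(a, b))] +=
                facing[t] ? 1 : 0;
        }
    }
    // Silhouette: exactly one lit triangle on the edge (the other side is
    // either an unlit triangle or open boundary).
    int count = 0;
    for (const auto& kv : litUsers)
        if (kv.second == 1)
            ++count;
    return count;
}
```

Two lit triangles sharing an edge give 4 silhouette edges when connected (just the perimeter), but 3 + 3 = 6 extruded edges when each is processed in isolation, including the redundant pair along the shared edge; that redundant pair is exactly the wasted fill (and the “double tested” pixels) between the corridor walls.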

Regarding beam trees: I still think it’s a very good optimization for static surface/light pairs. It’s a lot of work to create beam trees robustly ( mine isn’t 100% robust at the moment, but it generates 100% efficient volumes ).

The trick that jwatte mentions would require treating surfaces in isolation ( as I see it ), so I’m not certain it will be a win. It might not even work ( you can’t efficiently sort a deformable mesh, so I can’t see how the sorting would work ).

Anyway, that is my current approach, and it has been the most effective one I have tried so far. I still have some other things to implement, as my approach is not optimal ( it’s just an additional level of culling ).


Hehe, yeah, I kind of interchange “I” and “we”. It’s me and, I think his handle here is Mazer. I’m working on the shadows and visibility system, so sometimes I use “I”.

Anyway, you have a very good point. I thought before about disconnecting the rendering data from the connectivity data, but I couldn’t figure out what I’d do for the visibility. How do you treat the visibility of a continuous surface? I currently treat each mesh as one object with a bounding volume and test whether it’s in whatever frustum I’m testing. What exactly do you use if everything is connected for world data? You also mention per-entity shadow connectivity. Since the world geometry would all be connected and closed, what constitutes an entity? I suppose you could use the same per-object visibility and just test those tris for facing, but use the big connected structure to make the volume. I’d really have to think about it more. BTW, I have seen some of those artifacts, but I thought they were more because our data was bad. I don’t have the vis system completely coded yet; a big part’s still in my head, so it’s hard to judge right now what’s an algorithm problem and what’s just a bug.

And just out of curiosity do you use the vertex shader shadows or cpu?

[This message has been edited by zeroprey (edited 10-20-2002).]

Actually, I think that depth-from-the-light thing won’t work right, or at least any more efficiently than just rendering the volume for each mesh. The idea was to work with different bits of the stencil for different depths, but you run out too quickly. Or you could “bake” the volume into the highest bit once you’re done with it for one mesh, but that requires another full-screen fill. So forget I said anything about that :)

Anyway, vertex shader volume extrusion (without color or depth writes) after first painting depth (without color writes) with appropriate scissoring really ought to get you far enough, if you’re into the stencil shadow thing. I’m still partial to shadow maps, though.

My question is: what would this engine be like if he used perspective shadow mapping, which is, in most cases, a good method for indoor scenes where most of the lights are spots?
The advantage of perspective shadow mapping is that it avoids the shadow volume computation and rendering, adapts the depth map resolution to the power of the GPU while preserving very good shadow edge quality, and allows high definition models while the technique of bump-mapped detail is still available…
What do you think of that?



This link doesn’t work!

Arrghh! I want to see these screenshots too!

Maybe he got whacked on copyright. They were scanned from a magazine.

Some particularly good high res versions of the available shots can be found here:

BTW, I’d guess the soldier and scientist are high resolution meshes because this is the intro sequence with a controlled load, and they’re showing off with slow, controlled closeups; there appear to be fewer polys in the in-game monsters, but it’s difficult to judge accurately.

[This message has been edited by dorbie (edited 10-21-2002).]

Let’s see how long it takes until these are whacked then:


Perspective shadow maps don’t work with a single 2D shadow map if the light is inside the view frustum. Unless you can use cube maps, they are best seen as a solution for shadows cast by directional lights.

For this reason I want floating point cube maps!