Point-lights shadow mapping

I’d like to gather some experience on simulating point lights with shadow mapping.

I’m working on a tech demo with a high-polycount castle that contains a lot of corridors and small point lights, up to a few hundred in the level. All of them are potentially dynamic, but for any given frame most will be static. I will implement a fairly smart caching system for shadow maps, with adaptive resolution depending on view distance, but that’s not my problem at the moment.

All these lights will be point/omni lights with a small radius - generally a few meters (think torches/candles).

Generating the shadow maps (whatever the technique) is no problem, as the results will be cached. So I’m only concerned with the speed of the “rendering” pass, where the shadow maps are projected onto the scene.

I’ve already implemented it using six perfectly matching 90° spotlights, and it works, but I “only” get 50 fps in a 30k-poly room on a Radeon 9700. The culprit is the 6 rendering passes (one per spotlight), which push the polycount actually rendered up to 150k (with light frustum culling enabled), and the polycount in my castle will be much higher.

Therefore I need a way to reduce the rendering to one pass per point light instead of 6.

I’ve seen that Humus has implemented it nicely in his “shadows that rock” demo, and I think I will probably go that way. He’s using a cube map and doing the calculations in a vertex and a fragment program, but I’ve been less than impressed by the performance of his demo on my Radeon 9700: around 30 fps. His scenes are very simple, but he’s definitely fill-rate limited. He’s using 2 point lights with shadow maps updated every frame, which will be cached in my system…

Does anybody know of a better method for point-light shadow mapping?


Have you considered stenciled shadows? I know the knee-jerk reaction might be one of skepticism and incredulity; but depending on how aggressive your visibility scheme is, this might be comparable to maps. All the static volumes in the world can be precomputed, and stored in the scene-graph with everything else. That is not to say the torches can’t flicker or change color, they just won’t be able to move much. Some careful preprocessing could make this very doable.

It might be interesting to implement both methods, and compare. I’m currently in the process of implementing the SSV technique, and so far, so good. It fits very nicely with my lighting model. Granted, it’s a lot of geometry, but it looks marvelous.

Hmm, no, that would be insane… I have a lot of geometry. My castle is made of individual bricks :slight_smile:

I display an average of 150k polys per frame - after backface, frustum and occlusion culling. That’s for a single pass.



I’m using dual paraboloid shadow maps for omni lights, but it’s hard to say anything exact about the performance, as I don’t really have anything to compare it to. The potential win, compared to cubemaps, is that since you can use depth textures, you might save some fillrate compared to encoding depth in color channels. If your geometry is as dense as you say, tessellation isn’t going to be a problem. You’ll have to use vertex programs, though.

I’ve also found that 8-bit shadow maps are sufficient for lights with a small radius. If you use them, you can just store the depth in a regular cubemap’s alpha channel, and get rid of all the fragment and vertex programs.
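
To put a rough number on the 8-bit claim, here is a minimal Python sketch. It assumes depth is stored as linear radial distance divided by the light radius, which the post does not spell out; `quantize_depth` and `depth_step` are hypothetical helper names, not part of any API.

```python
def quantize_depth(distance, radius, bits=8):
    """Map a radial distance in [0, radius] to an integer depth code.

    Assumption: linear encoding, clamped at the light radius.
    """
    levels = (1 << bits) - 1
    normalized = min(max(distance / radius, 0.0), 1.0)
    return round(normalized * levels)


def depth_step(radius, bits=8):
    """World-space distance covered by one depth code."""
    return radius / ((1 << bits) - 1)


if __name__ == "__main__":
    # A torch with a 3 m radius, as an illustrative value.
    print(f"step: {depth_step(3.0) * 100:.2f} cm")
```

For a 3 m light the step comes out a little under 1.2 cm, so anything thinner than that cannot be resolved reliably, and the step grows linearly with the radius.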


JustHanging, thanks for the advice, but I have my doubts about DPSMs. I don’t think tessellation would be a problem, but as I understand it, it’d require a pretty complex vertex shader to do the projection, plus rendering would still require 2 passes.

Here is an example of a simple scene: Castle room

That’s for a single room and there will be like 10-20 of these rooms displayed on screen. There’s no texturing nor per-pixel lighting in this screenshot, it’s all geometry.

On the alpha channel trick: how do you perform the depth comparison without a pixel shader? I’m looking for an ARB-only solution.


My castle is made of individual bricks
Why? That’s entirely unnecessary. Why not just use an incredibly high-res texture (e.g. 2048x2048 or better), coupled with some detail texturing? That produces pretty results that are virtually identical to yours, but without the added rendering cost.

Because that’s specifically the theme of my demo: high polycounts. I know it can be done with per-pixel lighting and good textures, but that has been done 100 times. I’m aiming at a different thing here… :slight_smile:


If the lights have such a small radius, you could probably manage with 8 bits for depth, which would mean you could put all the data for a single light in a luminance cube map. That would only require a single texture lookup (plus comparison instructions) per light for the rendering pass. If you’re not doing any other fancy fragment shader stuff, you could probably fit lots of lights in the fragment shader, which means you would need fewer passes and thus submit less geometry.

For a castle type scene, I would guess occlusion culling could be very effective (if you’re triangle limited like in this case) but that seems somewhat OT for this thread and would be slightly more complex to implement.

You could pre-render the shadow map for all lights statically, and only change it if any object within the light radius changes (or the light moves more than X).

Edit: you should only re-calculate the maps for lights that need re-calc AND will affect the rendered scene. But you already knew that :slight_smile: Good occlusion will save you here.
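
The rule in the two paragraphs above can be sketched roughly as follows, in Python. `CachedLight`, `needs_rerender` and the 5 cm default threshold are all hypothetical; the post only says "more than X". Whether a light affects the view is assumed to come from your own occlusion system.

```python
import math
from dataclasses import dataclass


@dataclass
class CachedLight:
    position: tuple              # current light position
    baked_position: tuple        # position when the map was last rendered
    occluders_changed: bool = False  # an object inside the radius changed


def needs_rerender(light, affects_view, move_threshold=0.05):
    """Re-render only if the map is stale AND the light matters this frame."""
    moved = math.dist(light.position, light.baked_position) > move_threshold
    stale = light.occluders_changed or moved
    return affects_view and stale
```

A stale map on a light that is occluded this frame simply stays stale until the light becomes relevant again.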

All of that (occlusion culling, shadow map caching) is done, as stated in my original post. So I’m only looking for advice on performing the rendering pass with point-light shadow maps quickly. But thanks for the advice :slight_smile:


I perform the paraboloid mapping on the CPU and cache the results for visible lights, so I can render them without vertex programs. I also clip the results at the plane where the paraboloid maps meet, so that I don’t have to draw the same geometry twice. This causes a slowdown for moving lights, but greatly improves the average case. It might not work for high-poly scenes, though.
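
For reference, a CPU-side paraboloid projection might look roughly like the Python sketch below. It uses the usual dual-paraboloid formulation (divide x and y by 1 + |z| of the normalized direction) and assumes the two hemispheres split along the light-space z axis; `paraboloid_project` is a hypothetical name, and the clipping at the seam described above is not shown.

```python
import math


def paraboloid_project(point, radius):
    """Project a light-space point into one of two paraboloid maps.

    Returns (hemisphere, s, t, depth) with s, t in [0, 1] and depth as
    radial distance normalized by the light radius (clamped to 1).
    """
    x, y, z = point
    dist = math.sqrt(x * x + y * y + z * z)
    if dist == 0.0:
        raise ValueError("point coincides with the light position")
    nx, ny, nz = x / dist, y / dist, z / dist
    hemisphere = "front" if nz >= 0.0 else "back"
    denom = 1.0 + abs(nz)          # same formula for both halves
    u, v = nx / denom, ny / denom  # in [-1, 1] within each hemisphere
    return hemisphere, 0.5 * u + 0.5, 0.5 * v + 0.5, min(dist / radius, 1.0)
```

Because the mapping is non-linear, only vertices are projected exactly; long edges bend, which is why tessellation density matters for this technique.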

You can perform an 8-bit depth comparison by subtracting the light-space depth from the shadow map value (in alpha) and alpha-testing out fragments with zero alpha. With omnidirectional lights you should use radial distance, which can be computed as a combination of a 2D and a 1D texture (or one 3D texture, probably slower). Unfortunately, 8-bit maps might not be sufficient for thin features like that candle holder.
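
The arithmetic behind that comparison can be emulated on the CPU as in this Python sketch. It works on 8-bit integer codes and adds a small `bias`, which is an assumption of mine to avoid self-shadowing and is not mentioned in the post; `lit_by_alpha_test` is a hypothetical name.

```python
def lit_by_alpha_test(stored_depth, fragment_depth, bias=1):
    """Emulate subtract-then-alpha-test with 8-bit values.

    alpha = clamp(stored + bias - fragment, 0, 255); fragments whose
    alpha is zero are rejected, i.e. treated as shadowed.
    """
    alpha = max(0, min(255, stored_depth + bias - fragment_depth))
    return alpha > 0
```

With `bias=0`, a surface compared against its own stored depth yields zero alpha and shadows itself, which is the artifact the bias is there to hide.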


Originally posted by Ysaneya:
JustHanging, thanks for the advice, but I have my doubts about DPSMs. I don’t think tessellation would be a problem, but as I understand it, it’d require a pretty complex vertex shader to do the projection, plus rendering would still require 2 passes.
Unless you can come up with a 360-degree FOV projection, you’re always going to need at least two passes.

If you have mostly static lights (candles and torches as you say) and it’s mostly the scenery that’s moving (e.g. doors swinging open, characters walking around), you’re probably better off with a cube map than with a DPSM. You only need to update the cube faces that can “see” the moving objects in this case.
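
One way to decide which cube faces a moving object can touch is sketched below in Python. It assumes the object is approximated by a bounding sphere given relative to the light, and it tests that sphere against the four side planes of each 90° face frustum; the test is conservative, so it may occasionally flag a face the object does not actually reach. `faces_touched` is a hypothetical name.

```python
import math

AXES = "XYZ"


def faces_touched(center, radius):
    """Return the cube faces whose 90-degree frustum the sphere may overlap.

    center is the sphere center in light space (light at the origin).
    """
    touched = []
    for axis in range(3):
        for sign in (1, -1):
            major = sign * center[axis]
            others = [center[i] for i in range(3) if i != axis]
            # Side planes are major - o >= 0 and major + o >= 0; the
            # smaller of the two signed distances is (major - |o|) / sqrt(2).
            if all((major - abs(o)) / math.sqrt(2) >= -radius for o in others):
                touched.append(("+" if sign > 0 else "-") + AXES[axis])
    return touched
```

A character walking past a torch would then typically dirty one or two faces per frame rather than all six.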

One other thing. You’ve probably already done this, but you haven’t mentioned it here so I’ll point it out just to make sure: you can cull away all geometry outside the light’s bounding sphere.
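
At object granularity, that cull can be as simple as a sphere-versus-sphere test. This is a minimal Python sketch assuming each object carries a bounding sphere; `object_in_light_range` is a hypothetical name.

```python
def object_in_light_range(obj_center, obj_radius, light_pos, light_radius):
    """True if the object's bounding sphere overlaps the light's sphere."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(obj_center, light_pos))
    reach = obj_radius + light_radius
    # Compare squared values to avoid a square root per object.
    return dist_sq <= reach * reach
```

Objects that fail the test can be skipped for that light's pass entirely.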

– Tom

you can cull away all geometry outside the light’s bounding sphere.

Yup, that’s the first and most obvious optimization to do.

However, I don’t see why you always need two passes per light to do the rendering. The wording is getting a bit confusing, so here’s what I understand:

Assuming all shadow maps are up-to-date and ready to be used:

  • you need one pass to render the scene to the ZBuffer and/or ambient lighting.
  • then for each light:
    • 6 passes for the 6 spotlights trick (that’s what I’m using now)
    • or 2 passes for DPSMs (using 2 shadow maps, one per hemisphere)
    • or 1 pass for a cube-map shadow map, but that requires doing all the maths in the pixel shader
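
Taking the list above at face value, the number of scene passes per frame works out as in this small Python sketch. The per-light counts come straight from the list; the technique keys and `total_passes` are hypothetical names, and culling (which shrinks each pass) is ignored.

```python
# Rendering passes per light, as laid out in the list above.
PASSES_PER_LIGHT = {
    "six_spotlights": 6,
    "dual_paraboloid": 2,
    "cube_map": 1,
}


def total_passes(num_lights, technique):
    """One depth/ambient pass plus the per-light passes."""
    return 1 + num_lights * PASSES_PER_LIGHT[technique]


if __name__ == "__main__":
    for name in PASSES_PER_LIGHT:
        print(name, total_passes(10, name))
```

With 10 visible lights that is 61, 21 and 11 passes respectively, before any per-pass culling.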


Ah sorry, when you said six passes I assumed you were talking about the cost of generating the shadow map, not the cost of rendering the lit and shadowed scene. You should always be able to do that with a single pass per light regardless of which approach you use, by binding all required shadow maps simultaneously.

Of course binding six shadow maps simultaneously may be stretching it a bit, so in that case you’d probably be better off with a single cube map.

I guess it depends on how good your culling is when using the six spotlights trick. Theoretically, that technique doesn’t “multipass” anything – it affects different geometry for each of the six lights. If your culling is aggressive and fast enough to avoid significant overdraw in this scenario (without becoming a bottleneck itself), the cheaper pixel shader could make it beat the cube map approach.

– Tom

Hmm, that is true (I can bind the 6 shadow maps in a single pass), but consider this:

  • you also need one clip map per spotlight in order to get rid of the back-projection problem
  • anti-aliasing your shadow maps (PCF) for a better quality requires that you sample the shadow map N times…
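
To illustrate why PCF multiplies the sampling cost, here is a minimal CPU-side sketch in Python. It assumes the shadow map is a 2D list of depths and that texel coordinates are clamped at the border; `pcf_lit_fraction` is a hypothetical name, and real hardware or shader implementations differ in how they weight the taps.

```python
def pcf_lit_fraction(shadow_map, x, y, fragment_depth, radius=1):
    """Average the depth-test result over a (2*radius+1)^2 neighbourhood.

    Each tap is a separate shadow map read, so radius=1 costs 9 reads.
    """
    height = len(shadow_map)
    width = len(shadow_map[0])
    lit = 0
    taps = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            sx = min(max(x + dx, 0), width - 1)   # clamp to the map edge
            sy = min(max(y + dy, 0), height - 1)
            if fragment_depth <= shadow_map[sy][sx]:
                lit += 1
            taps += 1
    return lit / taps
```

The key point is that the comparison happens per tap and the results are averaged; averaging the depths first and comparing once would not give soft edges.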

I don’t think I will have a lot of texture units left to do all of that. And I’m not even talking about per-pixel lighting and fancy effects…

My culling is quite efficient, but it is only done at the object level, and objects can be quite complex. For example, that candle holder in my screenshot is around 20k polys and is seen by 2, if not 3, spotlights. Overdraw is generally quite low, as occlusion and backface culling really help here.


You’d only need one texture to kill the backprojections, although you would need to read it six times. Better yet, you could use KIL and skip the texture reads altogether. You wouldn’t burn even half of your sixteen available texture units :slight_smile:

I agree though that all this would make for a damn expensive shader. It’s bound to be a lot more expensive than using a cube map.

But this is why I pointed out that the six spotlights technique isn’t technically a multipass technique at all. If your culling can separate the geometry seen by the six spotlights well enough that the overlap becomes negligible, you can do each spotlight using a single shadow map and only render everything once, using a cheap, straightforward pixel shader. Whether or not this is faster than using a cube map depends on how good your culling is. If it’s done with a granularity of 20K polys as you say, I wouldn’t expect much of it.

– Tom

I haven’t used cubemaps for shadowing at all, but even so here’s my contribution:

Texture killing on the card that I’m using is very slow. Using saturate works wonders, but breaks when you get too close (the coefficient is between 0 ), but that’s the fastest thing that I’ve heard of for backprojection (a basic comparison to zero also seems to be slow). The problem with close stuff can be fixed somewhat by multiplying the output by a big number (x8 can be built into instructions, I think…).

As for other stuff, could you subtract the depth recorded in a single floating-point cubemap and then kill based on that (or perhaps use the saturate technique)? It’s true that it wouldn’t take advantage of shadow mapping hardware, but I don’t think it would be too bad. Wait… no… if you want bilinear filtering you’d have to do that yourself… ok, that would be too expensive, never mind.

Does anyone know why cubemaps cannot be used as shadow maps? Seems like a kind of silly limitation to me…