Generating Normal Maps From 2D Images

As far as I’m aware, there’s only the “treat the image as a height map” approach to generating normal maps.
Is anyone aware of any way to recover a more accurate normal map from an image, if it could be assumed that the image had simple lighting - say, a single directional light?
Each pixel in a greyscale image could be said to be the dot product of the surface normal at that point and the light vector, couldn’t it?
So, we know the dot product, and we know the light vector, how could we find the surface normal?

You can’t, because you’d run into symmetrical situations. For example, a light direction vector (1,0,0) and two surface normals (0,1,1) and (0,1,-1) would result in the same colour value.
So the only chance is to track contours or something like that. But then you need to make assumptions about the structure that the 2D map represents.
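The ambiguity is easy to check numerically. Here’s a quick Python sketch (pure standard library, my own illustration) using the exact vectors from the example above, plus a second pair to show the problem isn’t specific to zero intensity:

```python
import math

def normalize(v):
    m = math.sqrt(sum(c * c for c in v))
    return tuple(c / m for c in v)

def lambert(n, l):
    """Lambertian pixel intensity: clamped dot product of unit normal and light direction."""
    return max(0.0, sum(a * b for a, b in zip(n, l)))

# The symmetric pair from the post: both normals give the same pixel value.
light = (1.0, 0.0, 0.0)
n1 = normalize((0.0, 1.0, 1.0))
n2 = normalize((0.0, 1.0, -1.0))
print(lambert(n1, light), lambert(n2, light))  # 0.0 0.0

# The same ambiguity at a non-zero intensity:
light2 = (0.0, 0.0, 1.0)
m1 = normalize((1.0, 0.0, 1.0))
m2 = normalize((-1.0, 0.0, 1.0))
print(lambert(m1, light2), lambert(m2, light2))  # both ~0.707
```

Any normal rotated about the light axis produces the same intensity, so a single image can’t pin the normal down.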


I can get as far as the ‘elevation’ of the surface normal, but the ‘bearing’ could be one of an infinite number. Even this artificial surface normal is giving more promising results than the ‘height map’ method (where you only get non-identity normals where 4 neighbouring pixels have different values).
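The “elevation” part mentioned above fits in a couple of lines. This is a minimal sketch assuming the simplest possible model (a hypothetical setup: light pointing straight down the z axis, pure Lambertian intensity I = N·L = cos θ):

```python
import math

def elevation_from_intensity(intensity):
    """Angle between the surface normal and the light, in radians,
    assuming intensity = cos(angle). The bearing (rotation about
    the light axis) remains unknown."""
    return math.acos(max(0.0, min(1.0, intensity)))

# A fully lit pixel faces the light; a half-intensity pixel is tilted ~60 degrees.
print(math.degrees(elevation_from_intensity(1.0)))  # 0.0
print(math.degrees(elevation_from_intensity(0.5)))  # ~60
```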

So it’s down to “how do you find the bearing of a ‘surface’ at a point in a 2d image?”.
Tracing contours is one possibility - trace outwards from the pixel being considered until you find a wrinkle, and use the bearing vector of that wrinkle.
I’m brainstorming here…
I’m working with a picture of some rocks taken from above. If it works for that, then it will work for similar types of pictures.

This is a situation of going from more information to less information.
Kind of like saying, the result is 50, there were 2 numbers added, what are the 2 numbers?

The dot product is the same.

The only information you can get is the angle the surface makes with the light direction but a surface can have an infinite number of normals to satisfy the equation.

Look up “Shape From Shading” on Google.

It’s a pretty well understood field. You should be able to quickly determine the limits based on the amount of information that you have.

You might also want to look up “Normal Estimation”. I remember some research in psychology to determine how good people are at guessing normals. They might have some useful tools already implemented and available that would help you manually enter normals.

Originally posted by V-man:
This is a situation of going from more information to less information.
Kind of like saying, the result is 50, there were 2 numbers added, what are the 2 numbers?

Err, that’s the purpose of algebra, isn’t it?
X + Y = 50; one solution is X = 30, Y = 20.
Granted, there are infinite solutions to this particular equation (infinite bearing vectors), but it’s not beyond recovery - the information is there; it’s getting at it that’s the problem.

VikingCoder, thanks for the tips - I shall look into it.

To start with, the information isn’t there. In a dot product, a three- or four-component vector is transformed into a scalar. It’s like taking three apples with engravings on them and making apple sauce out of them. Can you still retrieve the engravings?

Even if it did work, you’d have to do much more work to get the right results, and because of the multiplications that occur in a dot product you’d lose precision anyway.

And for height maps, one could create an editor that exports floating-point images instead of the fixed-point 8-bit images usually used. This enables high-precision height maps.

I just can’t see why you would want to use a technique like the one you’re brainstorming about; it seems quite awkward to me :\ And even if the dot-product approach worked, it would be very hard to see what the result looks like while creating the textures. You could use a highly tessellated mesh and then make the map from it, but creating such a mesh is hard work. If you’re using laser scanning, for instance, you’ll get height maps that should be even more accurate, assuming you export floating-point (32-bit, perhaps) height maps, or fixed-point with at least 16-bit precision per channel.



Well, I can clearly see the point of deriving normal maps from real images. You could get surfaces that look like photographed surfaces but respond realistically to light. I mean, look at the normal maps created by artists: even though they manage to satisfy the Phong lighting equation, they’re still infinitely far from realistic. In many cases the result looks worse (more plastic-like) than with only a diffuse texture. At least artists know how to paint those…

Of course, this problem isn’t limited to retrieving the normal maps; you need to recover the specular parameters and stuff like that too. Although this information theoretically isn’t there, I think the IBR guys somehow manage to find it anyway. Often you need several photos and controlled lighting conditions, though. Some related material can be found at


Cough What about laser-scanned floating-point height maps from sculpted models? Then, using a 3x3 field to compute the surface normals, you get basically the same result, although perhaps with some softening due to the averaging of the normals. You don’t have to paint the details if you don’t want to (assuming you have the right tools =)


Originally posted by PixelDuck:
Cough What about laser scanned floating point height maps from sculpted models? And then using a 3x3 field to compute the surface normals

Are you sure those laser scans are for height maps?
I’m pretty sure they capture the x,y,z and they have the laser system rotate around the sculpture.

I’ve seen laser scanned sculptures. And you can see them from all directions.

But if someone has written an algorithm that does this well enough, then I’m interested.

The way I see it, laser scanning creates coordinates relative to the cylindrical rotation “plane” (actually a curved continuum), and the models are then simplified from those scans to create the actual models. Why couldn’t the height maps be created from the higher-resolution model? That’s basically what I mean: the data found within the simplified polygons is verified, and a height map is created from the geometry left inside this area, against the plane formed by the simplified triangle. Hmm… I have to say, I was a bit hasty when I said that laser scanning could easily create the data. But I guess the method I mentioned MIGHT work; I don’t know, I don’t have the tools to test it :\


Another problem, once you have the normal map generated from a picture, is removing the hard-coded shading from the picture.
It’s a two-fold problem: the more shading in the picture, the better your normal map will be, but your bump mapping will look bad anyway with the hard shading baked into the diffuse map.


However, the biggest mistake artists tend to make is to remove all baked lighting from the diffuse map; after that, the result looks really horrible. Remember, the point is to look good, not to show off your fancy bump mapping. A single normal map just can’t reproduce all the lighting on a surface. You can remove all directional lighting, but special attention must be paid to the ambient lighting in order to get good results. But then, looking at the recent Tenebrae shots, you (or your artists) already know that


I’ve read all I can understand (which isn’t much) about “shape from shading”.
I’ve pursued my own line of thinking:-

  • Find ridge normals using the standard heightmap technique (normals from differing heights of immediate neighbouring pixels).
  • Repeatedly pass over the resulting normal map, spreading the normals across flat areas. Sort of a modified box filter.
    I must say, this results in much better approximations of the actual surface than the standard method (which just gives you ridges).
    More experimentation…
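The two steps above could be sketched roughly like this (my own toy Python illustration, not the poster’s actual code; the z scale of 2 in the ridge step and the 4-neighbour averaging in the spread step are arbitrary choices):

```python
import math

def normalize(v):
    m = math.sqrt(sum(c * c for c in v)) or 1.0
    return tuple(c / m for c in v)

def ridge_normals(height):
    """Standard height-map normals from immediate neighbour differences.
    Flat areas come out as the straight-up normal (0, 0, 1)."""
    h, w = len(height), len(height[0])
    normals = [[(0.0, 0.0, 1.0)] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dx = height[y][x + 1] - height[y][x - 1]
            dy = height[y + 1][x] - height[y - 1][x]
            normals[y][x] = normalize((-dx, -dy, 2.0))  # 2.0 = bump strength
    return normals

def spread(normals, passes=1):
    """Modified box filter: average each normal with its 4 neighbours and
    renormalize, so ridge normals bleed outward across flat areas."""
    h, w = len(normals), len(normals[0])
    for _ in range(passes):
        out = [row[:] for row in normals]
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                acc = [0.0, 0.0, 0.0]
                for ny, nx in ((y, x), (y - 1, x), (y + 1, x),
                               (y, x - 1), (y, x + 1)):
                    for i in range(3):
                        acc[i] += normals[ny][nx][i]
                out[y][x] = normalize(tuple(acc))
        normals = out
    return normals
```

Running more passes spreads the tilt further from each ridge, which is the “better approximation than ridges only” effect described above.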

I searched around the web and saw some of the results. Some people have algorithms that require manual help to get right (selecting regions to process) but that’s only if you have a scene. If you have a lonely object, the chances of getting it right …

I would have liked to try out a demo program but didn’t find anything.

It’s an interesting thread and I would like to have something to play with.
knackered, can you upload?

As for bumpmapping: shouldn’t you remove the wrinkles from the diffuse? What do you need them for once you have a normal map?
If adding ambient is needed, then you can add another stage that does previous + ambient

You can always use an “ambient cubemap” to add more detail.

By ambient I mean light reflected from the environment, not a constant colour added to the lighting equation.

The idea is to modulate the diffuse (and gloss) map with an accessibility map. The accessibility map represents the amount of ambient light reaching the texel. So if you had, say, a hundred light rays coming from different directions, the accessibility value would be proportional to the number of rays that aren’t occluded by nearby geometry or by the bumps in the bump map. The bottoms of your wrinkles should therefore be darker in the diffuse map. Some people use global illumination software to calculate the accessibility map, others just a modified version of the height map.
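That ray-counting idea can be approximated directly on a height map. Here is a rough sketch (my own toy version; the 45-degree occlusion cutoff and the ray count are arbitrary assumptions): march outward from each texel in a handful of directions and count a ray as blocked when some texel along it rises above a 45-degree line from the starting texel.

```python
import math

def accessibility(height, rays=16, max_dist=4):
    """Per-texel fraction of sample rays (in [0, 1]) not blocked by nearby
    height-map texels. A ray is blocked when a texel along it rises above
    a 45-degree line from the starting texel (an arbitrary cutoff slope)."""
    h, w = len(height), len(height[0])
    out = [[1.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            open_rays = 0
            for r in range(rays):
                a = 2.0 * math.pi * r / rays
                dx, dy = math.cos(a), math.sin(a)
                blocked = False
                for step in range(1, max_dist + 1):
                    sx = int(round(x + dx * step))
                    sy = int(round(y + dy * step))
                    if 0 <= sx < w and 0 <= sy < h:
                        if height[sy][sx] - height[y][x] > step:
                            blocked = True
                            break
                if not blocked:
                    open_rays += 1
            out[y][x] = open_rays / rays
    return out
```

Multiplying this into the diffuse (and gloss) map darkens the bottoms of wrinkles, as described above.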

What is an ambient cubemap?


This starts to sound quite interesting

Ambient cubemap? I should guess it’s a cube map that represents the surrounding ambience from some direction. Actually, the idea of ambience (as you said, JustHanging) as used in real-time software isn’t good :\ I was planning on implementing some modification of the ambient-occlusion method from ray tracing (that is, an implementation outside of ray tracing), but I don’t have the time for it at the moment. I think I’ll look into it later on.


Good to see some more Finnish people here

My ideas for improving real-time ambient lighting have been along the lines of dynamically creating and moving fill lights, based on some heuristics or even loosely on some global illumination model. I don’t know how that would look, but it might work, at least for typical scenes. Fill lights wouldn’t have shadows, so you could have quite many.

Anyway, this is getting quite far from the original subject, sorry knackered. Make sure you show us some of your results!


You might want to make the ambient term direction-sensitive. The first way to do that is through an ambient cube map, using NORMAL_MAP texgen. However, at that point, you might as well bake ambient + diffuse into the cube map.

A nice way of getting a better solution for this is the spherical harmonics approach per-texel (google for that for some good references).
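For reference, the lowest-order version of that idea fits in a few lines. This is a minimal sketch (my own illustration, assuming the standard first four real spherical-harmonic basis functions): project sampled incoming light onto the basis, then evaluate the result per texel with that texel’s normal:

```python
import math

# Standard real SH constants: band 0 and band 1.
C0, C1 = 0.282095, 0.488603

def sh_project(samples):
    """Project light samples onto the first four SH coefficients.
    samples: list of ((x, y, z) unit direction, intensity) pairs,
    assumed to be distributed uniformly over the sphere."""
    coeffs = [0.0, 0.0, 0.0, 0.0]
    for (x, y, z), li in samples:
        basis = (C0, C1 * y, C1 * z, C1 * x)
        for i in range(4):
            coeffs[i] += li * basis[i]
    # Weight by the sphere's solid angle (4*pi) over the sample count.
    n = len(samples)
    return [c * 4.0 * math.pi / n for c in coeffs]

def sh_eval(coeffs, normal):
    """Reconstruct the stored light arriving along `normal` (per texel)."""
    x, y, z = normal
    return coeffs[0] * C0 + coeffs[1] * C1 * y + coeffs[2] * C1 * z + coeffs[3] * C1 * x
```

Storing four coefficients per texel (instead of one normal) lets the ambient term vary with direction, which is the point of the per-texel approach.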

I’ve just been experimenting with a little idea… OK, it doesn’t solve the normal-map generation problem, I haven’t got that far, but…

One of the problems is that the lighting information is embedded inside the texture. From a given texture we’d like to generate a diffuse map ( with no lighting information ) and a normal map.

For the first, my idea was to convert the RGB color to HSL and to set the luminance to a constant. Then convert it back to RGB and use that for the diffuse map. I tested it and it seems to work pretty well.

The diffuse map looks extremely ugly, but when modulated back with the luminance you almost get the original texture. So the results should look good if I can find a way to generate the normal map from a luminance texture :slight_smile:
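That trick is easy to prototype with Python’s standard colorsys module (note that colorsys calls the colour space HLS rather than HSL; this is just a toy sketch of the idea, not the poster’s code):

```python
import colorsys

def flatten_luminance(rgb_pixels, target_l=0.5):
    """Strip baked-in lighting by clamping every pixel's lightness to a
    constant: RGB -> HLS, replace L, HLS -> RGB."""
    out = []
    for r, g, b in rgb_pixels:
        h, l, s = colorsys.rgb_to_hls(r, g, b)
        out.append(colorsys.hls_to_rgb(h, target_l, s))
    return out

def luminance(rgb):
    """The lightness channel the diffuse map was stripped of."""
    return colorsys.rgb_to_hls(*rgb)[1]

# A lit and a shadowed sample of the same red surface (same hue and
# saturation, different lightness) flatten to nearly the same colour:
lit, shadowed = (0.8, 0.2, 0.2), (0.4, 0.1, 0.1)
flat = flatten_luminance([lit, shadowed])
```

Modulating the flattened map back with the stored luminance should give approximately the original texture, as described.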