Memory optimization

I’m displaying huge point clouds.
Let’s say I’m working with a cloud of 10,000,000 points.

Each vertex is 3 floats (12 bytes), so the vertices alone take 120,000,000 bytes.

When I send normals too, it comes to 240,000,000 bytes.

I’m sending this data with VBOs, but I’d like to optimize it to use less video memory.

One idea would be to encode each normal in two bytes and then decode it in a vertex shader.
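A minimal sketch of that idea (all names here are hypothetical, not an existing API): pack the normal’s two spherical angles into one byte each on the CPU; the C++ decode below mirrors what the vertex shader would compute to reconstruct a unit vector.

```cpp
#include <cstdint>
#include <cmath>

const float kPi = 3.14159265358979f;

// Two-byte packed normal: theta in [0, pi] -> byte 0, phi in [-pi, pi] -> byte 1.
struct PackedNormal { uint8_t theta, phi; };

// Encode a unit normal (nx, ny, nz) into two bytes via spherical angles.
PackedNormal encodeNormal(float nx, float ny, float nz) {
    float theta = std::acos(nz);        // polar angle, [0, pi]
    float phi   = std::atan2(ny, nx);   // azimuth, [-pi, pi]
    PackedNormal p;
    p.theta = (uint8_t)std::lround(theta / kPi * 255.0f);
    p.phi   = (uint8_t)std::lround((phi + kPi) / (2.0f * kPi) * 255.0f);
    return p;
}

// Decode back to a unit vector (this is the work the vertex shader would do).
void decodeNormal(PackedNormal p, float& nx, float& ny, float& nz) {
    float theta = p.theta / 255.0f * kPi;
    float phi   = p.phi / 255.0f * 2.0f * kPi - kPi;
    nx = std::sin(theta) * std::cos(phi);
    ny = std::sin(theta) * std::sin(phi);
    nz = std::cos(theta);
}
```

The worst-case angular error of this encoding is roughly half a quantization step per angle, which is usually invisible for lighting purposes.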

Do you have any comments on that, or any other ideas that could save me memory?

Or, if someone has already done tests on bigger clouds (like 50,000,000 points), I welcome any ideas.


Chunk up your set of points into multiple partitions. Render each partition separately. For each partition, have a full precision origin point and a scale. All the points in the partition could then be encoded using one byte per component (relative to the origin). If the precision is not enough, then subdivide, until the error is below some acceptable threshold.

Or, you can group all the points by normals. Then, you just set the normal vector, and render all the points using this normal, etc…

And, of course you can combine the two techniques.

Thanks very much for these ideas.

I like the idea of using a precise origin and a scale, and then encoding each component in fewer bytes.
You said that if the precision is not enough, one can subdivide again. But the actual precision is the screen-space error, so if the camera moves, do I need to recompute all my encoded points to satisfy the new position?
Or can I assume that, as long as the camera is not very close to my points, the world-space error is a good estimate of the actual error?
I could use the optimization when far from the cloud. And when the camera is close, I could cull the chunks that are not visible at all and render the visible ones with a traditional method.

For the normal-grouping method, I think that makes many chunks. If I have 10,000,000 points and I encode my normals in two bytes, that creates almost 65,000 chunks of about 150 points each. Is that efficient? I might instead group points according to the first byte of the encoded normal, giving 256 chunks of about 40,000 points each.

By the way, do you know of any articles about these problems?

Just subdivide until you find the sweet spot of the performance/accuracy ratio. If you want, you could special-case when the camera is very close, but keep in mind that floats have errors too; they don’t have infinite resolution. So it might not make sense to go below some threshold.

As for the normals, you could render groups of vertices, where the first byte of the normal is the same, and pass the second byte with each vertex. This way you’d only have 256 batches.

Unfortunately, I don’t know any papers on this subject. These techniques are not very scientific :slight_smile:

EDIT: I’ve just realized that you’ve said the same about normal grouping, hehe :smiley:

If you can sacrifice a bit of precision, you can scale your vertices by some number (say the largest coordinate value, or something similar) and express each component as a fraction of that number. That way you can store your vertices in 3 shorts, thereby halving memory usage. This is particularly useful if your algorithm generates values on integral boundaries, in which case you can skip the scaling part and just keep those integral values in short types.

I have learnt that most of the time, where you can sacrifice a little bit of floating-point precision, you can actually get away with integers. I would not recommend using bytes for normal storage, since their performance is very bad on ATi and nVidia hardware (or that is what my tests have proven time and again).
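A minimal sketch of that short quantization (names hypothetical; assumes the cloud has at least one nonzero coordinate), with the scale kept at full precision so the reconstruction can be folded into the modelview matrix or a vertex shader:

```cpp
#include <cstdint>
#include <cmath>
#include <vector>

// Vertices stored as signed 16-bit fractions of a single full-precision scale.
struct ShortCloud {
    float scale;                 // largest |coordinate| in the cloud
    std::vector<int16_t> comps;  // 3 shorts per vertex, each in [-32767, 32767]
};

// Scale every component by the largest absolute coordinate and round to a short.
ShortCloud quantizeToShorts(const std::vector<float>& verts) {
    ShortCloud out;
    out.scale = 0.0f;
    for (float v : verts)
        if (std::fabs(v) > out.scale) out.scale = std::fabs(v);
    for (float v : verts)                        // assumes out.scale != 0
        out.comps.push_back((int16_t)std::lround(v / out.scale * 32767.0f));
    return out;
}

// Reconstruct one component: multiply by scale / 32767 (no per-vertex CPU work
// is needed at draw time if this factor lives in the transform).
float dequantize(const ShortCloud& c, size_t i) {
    return c.comps[i] / 32767.0f * c.scale;
}
```

The worst-case error is about scale/65534 per component, compared with the half-size storage of 6 bytes per vertex instead of 12.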