Handling lots of translucent 2D sprites

I have an engine that’s quite mature at this point, but I have found a really bothersome bottleneck in my fragment shader:

if (texture(sampler1, vTexCoo).w == 0.0) { discard; }

It achieves the wanted results, but after reading about it and running my app without it, I now understand that it’s an expense I can’t afford.

The scene is rendered in layers, from front to back with an incremental stencil value that is used to cull obscured pixels. I thought this was a good thing, but now I realize I’d probably have the same performance rendering it all unstenciled back to front.

My theory is that even with this “layer” approach, the rendering process still can’t make clever stencil-based optimizations, since it can’t know in advance whether the fragment “below” will in fact write to the stencil.

The solutions I’ve read all advocate separating translucent sprites from opaque ones and rendering them separately with different functions and shaders. This won’t work for me, as I’d say 70% of all my sprites are translucent.

There must be another solution out there. Since I already have this “layer” approach up and running, and since each layer very seldom has overlapping sprites, would it be possible to “reinforce” this layered concept in the pipeline? Perhaps by calling glFlush or glFinish between layers, so that the next layer doesn’t have to wonder whether stencil bits will be written, because they already have been? But that will probably stall things even more.

Any ideas?

EDIT: Maybe render each layer to a separate FBO and finally render these on top of each other? Hearing your opinion on this could save me a couple of hours.

I’ve now tested pre-analyzing all the sprites: I trimmed away all the excess alpha, found the fully opaque ones, and rendered them separately without the discard statement. I got a small boost, but not enough to justify the effort. I then proceeded to find the biggest opaque square in each sprite and rendered that without discard. This resulted in roughly twice the primitives, but no visible boost.
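For anyone wanting to try the same pre-analysis, here is a minimal CPU-side sketch of the three steps described above. It is not the engine's actual code: the Sprite and Rect types, the function names, and the minAlpha threshold are all hypothetical, and it assumes tightly packed RGBA8 pixel data. The square search uses the standard dynamic-programming approach for the largest all-ones square, which may differ from whatever method was really used.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sprite container: tightly packed RGBA8, row-major.
struct Sprite {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> rgba;  // width * height * 4 bytes

    std::uint8_t alpha(int x, int y) const {
        return rgba[(static_cast<std::size_t>(y) * width + x) * 4 + 3];
    }
};

// Half-open rectangle [x0, x1) x [y0, y1); empty when x1 <= x0 or y1 <= y0.
struct Rect {
    int x0, y0, x1, y1;
    bool empty() const { return x1 <= x0 || y1 <= y0; }
};

// Smallest rectangle containing every texel with alpha > 0 ("trimming excess alpha").
Rect trimBounds(const Sprite& s) {
    Rect r{s.width, s.height, 0, 0};
    for (int y = 0; y < s.height; ++y) {
        for (int x = 0; x < s.width; ++x) {
            if (s.alpha(x, y) > 0) {
                r.x0 = std::min(r.x0, x);
                r.y0 = std::min(r.y0, y);
                r.x1 = std::max(r.x1, x + 1);
                r.y1 = std::max(r.y1, y + 1);
            }
        }
    }
    return r.empty() ? Rect{0, 0, 0, 0} : r;
}

// True when every texel inside r reaches minAlpha, so that quad could skip discard.
bool isFullyOpaque(const Sprite& s, const Rect& r, std::uint8_t minAlpha = 255) {
    if (r.empty()) return false;
    for (int y = r.y0; y < r.y1; ++y)
        for (int x = r.x0; x < r.x1; ++x)
            if (s.alpha(x, y) < minAlpha) return false;
    return true;
}

// Largest axis-aligned square whose texels all reach minAlpha.
// side(x, y) = 1 + min(up, left, up-left) for opaque texels, else 0.
Rect largestOpaqueSquare(const Sprite& s, std::uint8_t minAlpha = 255) {
    std::vector<int> prev(s.width + 1, 0), cur(s.width + 1, 0);
    Rect best{0, 0, 0, 0};
    int bestSide = 0;
    for (int y = 0; y < s.height; ++y) {
        for (int x = 0; x < s.width; ++x) {
            if (s.alpha(x, y) >= minAlpha) {
                cur[x + 1] = 1 + std::min({prev[x], prev[x + 1], cur[x]});
                if (cur[x + 1] > bestSide) {
                    bestSide = cur[x + 1];
                    best = Rect{x + 1 - bestSide, y + 1 - bestSide, x + 1, y + 1};
                }
            } else {
                cur[x + 1] = 0;
            }
        }
        std::swap(prev, cur);
    }
    return best;
}
```

Since the shader only discards at alpha == 0, passing minAlpha = 1 would classify a sprite by "never discarded" rather than "fully opaque"; which of the two matters depends on how blending is set up.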

I guess I’ll just have to suck it up. It’s annoying, since logically I should be able to get that sweet early depth optimization as everything is sorted.

Another idea I have is creating a 1-bit texture (alpha on/off), if that’s possible, and rendering all sprites with that first, using discard and depth tests/writes. Then I would render the actual sprites with the depth test only. Perhaps this will allow the driver to do something clever, but I doubt it.
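The CPU side of that idea could look roughly like the sketch below, which packs a sprite's alpha channel into one bit per texel using the same alpha == 0 test as the shader. The AlphaMask type and buildAlphaMask function are hypothetical names, and tightly packed RGBA8 input is assumed. Whether the GPU can sample a true 1-bit format, or whether the mask would have to be expanded again on upload, is a separate question this sketch does not address.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 1-bit coverage mask: one bit per texel, each row padded to whole bytes.
struct AlphaMask {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> bits;

    std::size_t stride() const { return static_cast<std::size_t>(width + 7) / 8; }

    // True when the texel at (x, y) would survive the discard test.
    bool covered(int x, int y) const {
        return ((bits[y * stride() + x / 8] >> (x % 8)) & 1u) != 0;
    }
};

// Sets a bit for every texel whose alpha is non-zero (i.e. not discarded).
AlphaMask buildAlphaMask(const std::uint8_t* rgba, int width, int height) {
    AlphaMask m;
    m.width = width;
    m.height = height;
    m.bits.assign(m.stride() * height, 0);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::uint8_t a = rgba[(static_cast<std::size_t>(y) * width + x) * 4 + 3];
            if (a != 0) {
                m.bits[y * m.stride() + x / 8] |= static_cast<std::uint8_t>(1u << (x % 8));
            }
        }
    }
    return m;
}
```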
