Case statements vs if else in GLSL shaders

mdenning · April 28, 2024, 10:37pm

Hi, do case statements offer any performance benefit over if/else statements in GLSL shaders? I am currently working on a 3D renderer and I am trying to see how efficiently I can get it to run. One way I am trying to do this is to reduce the number of draw calls by batching together drawing information. As I understand it, a draw call only applies to a single shader. Thus if I have three 3D objects that each use a different shader, I would need 3 separate draw calls. One way around this would be to combine the shaders into 1 large shader (I think they call this an uber shader). However, I read that branching with if/else statements can be expensive and might not be worth it in GLSL code. GLSL seems to have switch statements as an option for branching. As I understand it, switch statements are much more efficient than if/else statements (or at least they are in languages like C), however I am not sure if that is the case in GLSL. Does anyone know the answer to this? Or would it be hardware dependent?

Dark_Photon · April 29, 2024, 1:27pm

So you really have two implied questions here:

1. Is branching in shaders expensive / bad?
2. Is branching via switch statement any better/worse than with if/else statements?

Perf is always going to depend on the implementation (graphics driver in this case). That said…

The answer is you should profile and see, on your graphics driver on your hardware. Don’t trust the stuff you read on the internet (particularly on this subject) as much of it is old info or disinformation repeated gossip-style by folks that have never actually tested this stuff and don’t know.

For #1, often no. It depends on how you use branching. If you’re branching on a constant expression, it’s typically free as the GPU will never even see the branch, or the expression upon which the branch was based. If you’re branching on a uniform expression it can be very cheap (the generated GPU code may not even use branching but predication instead), but you need to consider the cost of the additional code+registers you’re roping into your shader binary and the potential loss of occupancy (less parallelism to hide memory access latency). If you’re branching based on thread-divergent criteria you need to be more careful, and still consider the occupancy issue.
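
To make the three cases concrete, here is a minimal fragment shader sketch with one branch of each kind. All names (`kApplyGamma`, `uMaterialMode`, `uAlbedo`, `vUV`) are hypothetical and not from this thread, and what the driver actually generates for each branch is implementation dependent, so treat the comments as the typical expectation rather than a guarantee.

```glsl
#version 330 core

// Hypothetical inputs, chosen only for illustration.
uniform int       uMaterialMode;  // same value for every invocation in a draw call
uniform sampler2D uAlbedo;

in  vec2 vUV;
out vec4 fragColor;

// Compile-time constant: the condition below is a constant expression.
const bool kApplyGamma = true;

void main()
{
    // Sample before any per-fragment branching so the implicit
    // derivatives used by texture() are well defined.
    vec4 color = texture(uAlbedo, vUV);

    // 1. Constant expression: the compiler can resolve this at compile
    //    time and drop the untaken side entirely.
    if (kApplyGamma) {
        color.rgb = pow(color.rgb, vec3(1.0 / 2.2));
    }

    // 2. Uniform expression: every invocation in the draw call takes the
    //    same path, so there is no divergence. The code for both paths
    //    still lives in the shader binary and can raise register use.
    if (uMaterialMode == 1) {
        color.rgb *= 0.5;
    }

    // 3. Thread-divergent expression: the condition depends on
    //    per-fragment data, so neighboring invocations may disagree
    //    and the hardware may end up executing both sides.
    if (color.a < 0.5) {
        color.rgb = vec3(0.0);
    }

    fragColor = color;
}
```

Only profiling on the target driver and hardware will show whether any of these branches costs anything measurable in a given shader.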

For #2, I’ve never read anything calling out a big difference between the two. Either can be completely discarded (constant expression), or (if not) evaluated using predication or explicit conditional branching depending on what the graphics driver feels is most efficient in this case.
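
As a sketch of why the two forms tend to be treated alike, here is the same hypothetical shading selector written once with `switch` and once with an if/else chain. The names (`uShadingModel`, `shadeWithSwitch`, `shadeWithIfElse`, and the three `shade*` helpers) are made up for illustration. The two functions return the same value for every input; whether the driver compiles them to identical GPU code is an assumption to verify by profiling, not something the GLSL specification promises.

```glsl
#version 330 core

uniform int uShadingModel;  // hypothetical per-draw selector

in  vec3 vNormal;
out vec4 fragColor;

// Three stand-in "materials" that an uber shader might combine.
vec3 shadeFlat()           { return vec3(0.8); }
vec3 shadeLambert(vec3 n)  { return vec3(max(dot(n, vec3(0.0, 0.0, 1.0)), 0.0)); }
vec3 shadeNormals(vec3 n)  { return n * 0.5 + 0.5; }

// Selection written with switch (available since GLSL 1.30).
vec3 shadeWithSwitch(int model, vec3 n)
{
    vec3 c = vec3(0.0);
    switch (model) {
        case 0:  c = shadeFlat();      break;
        case 1:  c = shadeLambert(n);  break;
        default: c = shadeNormals(n);  break;
    }
    return c;
}

// The same selection written as an if/else chain.
vec3 shadeWithIfElse(int model, vec3 n)
{
    if (model == 0) {
        return shadeFlat();
    } else if (model == 1) {
        return shadeLambert(n);
    } else {
        return shadeNormals(n);
    }
}

void main()
{
    vec3 n = normalize(vNormal);
    // Swap in shadeWithIfElse here and compare timings on the target GPU.
    fragColor = vec4(shadeWithSwitch(uShadingModel, n), 1.0);
}
```

Because `uShadingModel` is a uniform, either version is a uniform branch in the sense described above, which is usually the more important property than the choice of statement.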

GPU and CPU performance is much more complex than just considering the per-statement cost of operations. You have to consider the surrounding code and amount of parallelism permitted by that code as well.

Here are some related links on this topic you might be interested in:

https://twitter.com/andreintg/status/1261889662542581760
https://twitter.com/bgolus/status/1235254923819802626
https://twitter.com/iquilezles/status/12352742947765657600
https://tangentvector.wordpress.com/2013/04/12/a-digression-on-divergence/
https://bartwronski.com/2021/01/18/is-this-a-branch/
https://medium.com/@jasonbooth_86226/branching-on-a-gpu-18bfc83694f2
https://twitter.com/SebAaltonen/status/1635003231037452289
https://twitter.com/tom_forsyth/status/1634988037418676224

mdenning · April 29, 2024, 5:26pm

Thanks for the response and the resources. I had assumed that I would need to do some sort of if/else check on every point I drew in order to have branching. It looks like it isn’t that straightforward lol. I will need to read up on it more. Thanks again for all the resources.
