Branching in vertex shader

Hi!
I have a GeForce 8400 GS. I wrote a vertex shader with an ‘if’, along the lines of the following code:


v = a + b + c;
if( variable < 1.0 )
{
v += d + e + f + g + h + i + j + k;
}

Regardless of whether the ‘if’ condition is true or not, the performance is always the same: 275 FPS. If I comment out the body of the ‘if’, or simply remove that piece of code, performance increases to 350 FPS.
If the ‘if’ is not skipping anything, where is the branching, I ask?

The compiler is probably implementing the branch as a conditional assign, especially if ‘variable’ is actually a variable and not a uniform value. Implementing the branch this way will compute the new value of v, and only assign it to v if the condition is met. This is the approach used by many compilers for short clauses or target architectures where true branches are very expensive (per-value branching is generally very expensive in vectorized hardware like a GPU).
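
To make the conditional-assign idea concrete, here is a minimal CPU-side sketch in Python (not GLSL, and not what any particular driver emits). The function names and the `extra` list, which stands in for `d` through `k`, are made up for illustration. Both functions return the same value; the difference is that the second one always pays for the extra sum, which would match the constant 275 FPS observed above.

```python
def shade_with_branch(a, b, c, extra, variable):
    """Reference semantics: the extra terms are summed only when the test passes."""
    v = a + b + c
    if variable < 1.0:
        v += sum(extra)  # skipped entirely when the condition is false
    return v


def shade_predicated(a, b, c, extra, variable):
    """Conditional-assign form: the extra sum is always computed, then selected."""
    v = a + b + c
    candidate = v + sum(extra)   # this work happens regardless of the condition
    take = variable < 1.0        # the predicate
    return candidate if take else v  # only the assignment depends on the test


if __name__ == "__main__":
    extra = [1.0] * 8  # hypothetical stand-ins for d, e, f, g, h, i, j, k
    for variable in (0.5, 2.0):
        print(variable,
              shade_with_branch(1.0, 2.0, 3.0, extra, variable),
              shade_predicated(1.0, 2.0, 3.0, extra, variable))
```

The results are identical in both cases, so from the outside the only visible sign of predication is the timing.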

Right. If you want to google this, it’s called “predication”. Combine with “GPU” or “shader” to get the goodies.

Searching Google for the goodies, I found material like this:
http://arch.imu.edu.cn/student/zhangguan…%20Platform.pdf
http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter34.html

but I never found GLSL code examples showing how to program the “if”. I compute a variable in the vertex shader that depends on the vertex position: I test for collision with a solid, a cone. Everything inside it I want to render with better quality; everything outside, with poorer quality but faster.

As already explained above, branching is costly, especially on older hardware. That is how GPUs work.
So for small computations, you will not see a gain.

My graphics card supports at least Shader Model 3, so I expected to gain speed from the branch, all the more so because the computational cost of the code inside the body of the ‘if’ is, in my opinion, high. As I mentioned, the difference is between 275 and 350 FPS. (I’m sorry for my English.)
So, is there any real way to speed up the vertex shader?

The computational cost of the code inside the body of the ‘if’ is, in my opinion, high. As I mentioned, the difference is between 275 and 350 FPS. (I’m sorry for my English.)

First, the difference between 275fps and 350fps is less than 1 millisecond per frame. It is technically somewhere between a fifth and a quarter of the frame time, but in absolute numbers, it is rather small. It only looks like a large difference because you are running at very high framerates. Actual millisecond numbers are much more accurate predictors of performance.
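
For anyone who wants to check the arithmetic, here is a small Python sketch that converts the two framerates quoted in this thread into frame times. The helper name `frame_time_ms` is just for illustration.

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


slow = frame_time_ms(275.0)  # with the body of the 'if' present
fast = frame_time_ms(350.0)  # with the body commented out
saved = slow - fast

print(f"{slow:.2f} ms vs {fast:.2f} ms, saved {saved:.2f} ms "
      f"({saved / slow:.0%} of the slower frame)")
# prints: 3.64 ms vs 2.86 ms, saved 0.78 ms (21% of the slower frame)
```

The same 75 FPS gap at 60 FPS would be a far larger change in frame time, which is why milliseconds are the more useful unit.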

Second, precisely how much do you know about the internal workings of shaders?

I’m not being flippant; shader compilers are written by people who have every incentive to make shaders run as fast as possible. They know the details of the shader hardware far better than any of us. Therefore, if a particular shader is not running as fast as you think it ought to, there’s a good chance that your expectations are what is at fault rather than the shader optimizer.

Now this isn’t guaranteed to be true in all cases. But:

So, is there any real way to speed up the vertex shader?

Even if the optimizer is wrong, there’s not much you can do. The optimizer is going to have the final say in your shader code regardless. There’s a chance that if you switch to assembly shaders you’ll get a different optimizer. But unless you’re using non-cross-platform extensions, assembly shaders don’t even have looping (if I recall correctly).

The best you could do is to draw some objects with the complex shader and some with the simple shader.
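
As a rough illustration of that idea, here is a Python sketch of the CPU-side sorting step: test each object (or each chunk of a pre-split mesh) against the cone once, then draw each group with its own shader. Everything here is an assumption for illustration: the function names, the infinite-cone test against a single centre point, and the idea that the mesh can be cut into chunks at all. The granularity is per object or per chunk, not per vertex.

```python
import math


def inside_cone(point, apex, axis, half_angle_rad):
    """True if point lies inside an infinite cone; axis must be unit length."""
    d = [p - a for p, a in zip(point, apex)]
    dist = math.sqrt(sum(x * x for x in d))
    if dist == 0.0:
        return True  # the apex itself counts as inside
    # Cosine of the angle between the axis and the apex-to-point direction.
    cos_angle = sum(x * y for x, y in zip(d, axis)) / dist
    return cos_angle >= math.cos(half_angle_rad)


def split_batches(objects, apex, axis, half_angle_rad):
    """Sort (name, centre) pairs into a detailed batch and a cheap batch."""
    detailed, cheap = [], []
    for name, centre in objects:
        if inside_cone(centre, apex, axis, half_angle_rad):
            detailed.append(name)  # draw with the complex shader
        else:
            cheap.append(name)     # draw with the simple shader
    return detailed, cheap


if __name__ == "__main__":
    objects = [("near_axis", (0.0, 0.0, 5.0)),
               ("off_axis", (1.0, 0.0, 5.0)),
               ("sideways", (5.0, 0.0, 0.0))]
    print(split_batches(objects, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
                        math.radians(30.0)))
```

Because the decision is made before the draw call, the shader never needs the ‘if’ at all, so it does not matter how the compiler would have handled it.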

Second, precisely how much do you know about the internal workings of shaders?

I am still learning how the GPU works.

But I know that if I move the calculation to the fragment shader, the “if” does reduce the amount of code executed and I get a speedup (from 230 to 330 FPS). It is a rather poor solution, though, because a lot of values (50 floats) then have to be interpolated from the vertex shader to the fragment shader, so efficiency decreases.

The best you could do is to draw some objects with the complex shader and some with the simple shader.

This is the solution I thought of AT THE BEGINNING ;] precisely because of the parallel nature of the GPU. But I really need to be able to render one part of an object locally with higher quality and render the surrounding part of the same object with poorer quality but faster. This is for research purposes, purely experimental.

And what about the new NVIDIA GeForce GTX 480 / 460 with the Fermi architecture?
I have access to the card.

But I really need to be able to render one part of an object locally with higher quality and render the surrounding part of the same object with poorer quality but faster.

Thus far, you have described your problem as being about the performance of the solution. That is, you think it should be faster than it turns out to be. You haven’t said anything about the results of using the if statement.

Are you saying that, with the if statement there, the image or data you get back is wrong? Or are you just saying that it isn’t running as fast as you think it should be?

In the vertex shader: I see no difference in performance between executing the code inside the “if” and the code inside the “else”, even though the two have different costs.
In the fragment shader: I do see a difference in performance.

In the vertex shader: I see no difference in performance between executing the code inside the “if” and the code inside the “else”, even though the two have different costs.
In the fragment shader: I do see a difference in performance.

Right. But the GLSL compiler doesn’t guarantee that any particular if statement will be an actual conditional branch. If the vertex shader optimizer decides to execute the statement anyway, there’s not much you can do.

In short, you cannot rely on an if statement for performance savings.

In short, you cannot rely on an if statement for performance savings.

This is a concrete answer, thanks.
