Fixed pipeline vs programmable pipeline

Hello,

I’ve been fiddling around with profile/pipeline mixing to squeeze out some good results when no shaders are needed (e.g. text rendering, placeholders, 2D HUD content…) and wanted to check whether, performance-wise, the fixed pipeline is better than the programmable pipeline (with simple, texture-only shaders).

I’ve searched around and found this great article that performs the comparison tests and has a good conclusion, but it’s from 2004 and there have been many changes since.

Is using the fixed pipeline better in terms of rasterization and stage bypass? Meaning, is the fixed pipeline optimized to bypass some stages, and/or does it have some optimized rasterization that the programmable pipeline lacks?

I have also done some tests but wanted to know what’s inside the box: whether the process uses the same stage path as the programmable one, or has a special internal path to go directly to what it needs to do.

Thanks

This is 2021. There is no hardware fixed-functionality in those stages anymore. It’s all shaders. Whether written by you or the driver, it’s all going to do the same thing.

The only question is whether the driver might be able to write more efficient code or upload data more efficiently. While there are isolated cases of specific implementations offering specific benefits when using certain aspects of compatibility GL (and even those isolated cases are pretty marginal), on the whole, the answer is no.

Particularly with modern buffer streaming techniques, there’s very little that can be achieved performance-wise with compatibility GL compared to core.

And if you’re unsure about that, understand that whole new APIs have come out since then, and they have even fewer features resembling compatibility OpenGL. That should tell you how good those features are for performance.

On a modern driver the fixed pipeline is going to be emulated within the driver by shaders. Worst case is that if you set up a combination of fixed pipeline states, the driver is going to have to generate a new shader on the fly to emulate these states, compile and link that shader, and then push you through the programmable pipeline anyway.
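To make the "generate a shader on the fly" point concrete, here is a minimal CPU-side sketch in Python of what such emulation could look like. It is not taken from any real driver: `FixedFunctionState`, `generate_fragment_source` and `EmulatedDriver` are hypothetical names, the state is reduced to three toggles, and the texture environment is simplified to an RGBA texture with modulate or replace. Only the GLSL 1.20 built-ins in the generated source (`gl_Color`, `gl_TexCoord`, `gl_Fog`, `gl_FogFragCoord`) are real.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedFunctionState:
    """Hypothetical subset of the state a driver would have to track."""
    texture_2d: bool = False  # stands in for glEnable(GL_TEXTURE_2D)
    modulate: bool = True     # texture env mode: GL_MODULATE if True, GL_REPLACE if False
    fog: bool = False         # stands in for glEnable(GL_FOG), linear fog only


def generate_fragment_source(state):
    """Build GLSL 1.20 source that mimics the given fixed-function state."""
    lines = ["#version 120"]
    if state.texture_2d:
        lines.append("uniform sampler2D tex0;")
    lines += ["void main() {", "    vec4 c = gl_Color;"]
    if state.texture_2d:
        texel = "texture2D(tex0, gl_TexCoord[0].st)"
        lines.append(f"    c = c * {texel};" if state.modulate else f"    c = {texel};")
    if state.fog:
        lines.append("    float f = clamp((gl_Fog.end - gl_FogFragCoord) * gl_Fog.scale, 0.0, 1.0);")
        lines.append("    c.rgb = mix(gl_Fog.color.rgb, c.rgb, f);")
    lines += ["    gl_FragColor = c;", "}"]
    return "\n".join(lines)


class EmulatedDriver:
    """Caches one generated shader per distinct state combination."""

    def __init__(self):
        self.cache = {}
        self.compiles = 0  # counts the expensive generate/compile/link step

    def shader_for(self, state):
        if state not in self.cache:
            self.compiles += 1
            self.cache[state] = generate_fragment_source(state)
        return self.cache[state]
```

Calling `EmulatedDriver().shader_for(FixedFunctionState(texture_2d=True))` returns source equivalent to the simple texture-only shader you would have written yourself. The first use of each new state combination pays the generation cost; that is the worst case described above.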

You’ll also have to deal with the overhead of toggling between the two pipelines in your own program, and any state change tracking and filtering you might be doing is almost guaranteed to explode nicely in your face.

Moreover, the kind of use case you’re describing is probably the simplest possible: basic transforms, a single texture, maybe modulate by a colour and that’s it. This use case is also never going to be a bottleneck in any program (unless you’re doing something seriously wrong).

So all told I would suggest just using programmable for everything, from the perspective of keeping your own code paths simpler and more uniform, if nothing else.

2022?

When not even using compatibility profile? So the driver has to follow the same set of rules that a custom pipeline has, right?

That was exactly my question, but are you saying that the fixed pipeline also uses fragment shading, i.e. the texture is sampled pixel by pixel? Or is it geared more towards the vertex shader in order to save some operations?

Well, there are some things you can’t escape; the trick is how many times you do them. Since they are for the HUD, they can be done only once.

I was more intrigued by the sampling part than the geometry part, because the latter is the same everywhere, whether using a custom or the fixed pipeline.

You are telling me that using the classical glColor3f(...) and whatnot isn’t better anymore than color = vec3(...)

But glColor3f and friends aren’t the fixed pipeline.

That’s a common misunderstanding so I need to check with you before going any further - you are aware that you can use glColor3f and friends with shaders, yes?

… what? Textures were always “sampled pixel by pixel”. That’s how it has worked since we started having rasterizers. Fixed function or shader based, the rasterizer is still interpolating per-vertex values and using each interpolated value for a fragment to sample from a texture. The hardware is doing the same thing regardless of how you told it to do the operation.
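As a rough illustration of "interpolate per-vertex values, then sample per fragment", here is a minimal software sketch in Python. It assumes a 2D case (affine interpolation, no perspective correction), nearest-texel sampling, and no top-left fill rule; `edge` and `rasterize` are names made up for this example, not part of any API.

```python
def edge(a, b, p):
    """Signed area term for edge a->b and point p (basis of barycentric weights)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(verts, uvs, texture, width, height):
    """Return {(x, y): texel} for every pixel centre covered by the triangle."""
    v0, v1, v2 = verts
    area = edge(v0, v1, v2)
    if area == 0:
        return {}  # degenerate triangle covers nothing
    tex_h, tex_w = len(texture), len(texture[0])
    out = {}
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel centre
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 < 0 or w1 < 0 or w2 < 0:
                continue  # outside the triangle
            # Interpolate the per-vertex texture coordinates for this fragment.
            u = w0 * uvs[0][0] + w1 * uvs[1][0] + w2 * uvs[2][0]
            v = w0 * uvs[0][1] + w1 * uvs[1][1] + w2 * uvs[2][1]
            # Nearest-texel lookup, clamped to the texture edge.
            tx = min(int(u * tex_w), tex_w - 1)
            ty = min(int(v * tex_h), tex_h - 1)
            out[(x, y)] = texture[ty][tx]
    return out
```

Nothing in this loop cares whether the per-fragment work was described through texture environment state or through a fragment shader; one texture lookup per covered fragment happens either way.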

Isn’t it common practice to send the color through a uniform?
Regarding this article, it says the following:
“Here we use the glBegin/glEnd pair along with glVertex and glColor to specify a triangle in clip-space that results in the famous “color triangle” that we all know and love. In modern OpenGL these functions are missing, so what can we do?”
If you are referring to the compatibility profile, it is still a programmable pipeline, which is the opposite of what I was asking about.

When you say “friends”, are you referring to glMaterial and glLight…? Because these are fixed.

Not if it’s per-vertex.

Note that in the compatibility profile, you can read the legacy vertex attributes through the variables:

in vec4 gl_Color;
in vec4 gl_SecondaryColor;
in vec3 gl_Normal;
in vec4 gl_Vertex;
in vec4 gl_MultiTexCoord0;
in vec4 gl_MultiTexCoord1;
in vec4 gl_MultiTexCoord2;
in vec4 gl_MultiTexCoord3;
in vec4 gl_MultiTexCoord4;
in vec4 gl_MultiTexCoord5;
in vec4 gl_MultiTexCoord6;
in vec4 gl_MultiTexCoord7;
in float gl_FogCoord;

These contain the values which would be used if no vertex shader was present, i.e. those set with glColor, glNormal, glVertex, glTexCoord (or glMultiTexCoord), glFogCoord, or with the equivalent *Pointer functions when the appropriate array (GL_VERTEX_ARRAY etc) is enabled.

This allows you to replace the vertex part of the fixed-function pipeline with a shader without having to change the client code. Similarly, the matrices, lighting, material and fog settings, clip planes, and texture coordinate generation (glTexGen) parameters are available via uniforms. See section 7 of the GLSL specification for a complete list.

Obviously everything needs to be rasterized; to be clear, I’m not questioning that. What I’m curious about, and let me be clearer here, is whether the older hardware/driver/fixed pipeline of days of yore/non-programmable black box plotted the visual data faster by some method other than sampling each individual fragment one at a time, for instance by drawing/sampling/rasterizing/plotting the same elements in chunks (assuming a threaded GPU, obviously).

Yes, but this would assume a programmable pipeline as opposed to the fixed one (as in no custom shaders at all), which was what I initially asked about.

I think the idea here got mixed up, mostly because of the profiles vs pipelines.
Of course glColor can be accessed in the compatibility profile, but I was trying to leave that out of the way and go core, hence the color = vec3(...) rather than the gl_FragColor = ...

So let me get back on track here:

Fixed pipeline (no custom shaders used at all, just plain OpenGL API being called by the CPU):

  • New hardware lets the driver implement its shader behind the curtains
  • These implementations follow the exact same rules as any other shader would
  • There are no advantages in using this

There’s also a bunch of built-in uniforms in downlevel GLSL versions that give you access to the old matrix stack, glLight, glMaterial, glFog, etc stuff. Just grab an older GLSL spec (or an older copy of the Orange Book from your local 2nd hand store), set your #version accordingly, and you can use them.

But that’s beside the point. The point is that people often have a misunderstanding that shaders require vertex attrib arrays, which in turn require buffer objects, and that if you want to use one you need to use all of them. That’s not actually the case at all, and you can freely use buffer objects without shaders, or shaders without buffer objects, if you just target an older OpenGL and GLSL version. You can even use glBegin/glEnd code with shaders.

Just because you can doesn’t mean you should, of course, but it is a valid approach if you’re incrementally porting an older code base, for example.

Of course you can; glBegin/glEnd is one of the alternatives to VBOs… A usual combination is glBegin with glVertex and glVertexAttrib instead of glTexCoord.

What would it matter? “older hardware” and the like that isn’t capable of running shaders… isn’t capable of running shaders and wouldn’t implement GL 2.0, let alone 4.6.

And if it is capable of running shaders… why would there be hardware on the GPU to not run shaders?

Early GeForce 3s did have fixed-function vertex processing alongside shaders, but even the original Xbox GeForce 3-based GPU was smart enough to ditch that nonsense.

… what makes you think that shader-based GPUs can’t do that? Especially since that’s literally what a lot of them are doing right now.

It doesn’t really matter; it’s all outdated garbage that nobody should be using in 2022. Nobody should have been using it a decade ago, let alone now.

“Plain OpenGL API” is custom shaders. And it has been since 2009. Please stop acting like over a decade hasn’t passed since we shoved all that stuff into the “compatibility” box.

If by “new hardware” you mean “any GPU released since at least the Radeon 9700 (the one from 2002)”, then yes.

Probably. There’s no way to know for certain. No IHV is going to spend any particular hardware time specifically optimizing compatibility pathways.

Almost certainly not.

That’s really the bottom line here. There is no reason to believe that using compatibility stuff will improve performance in any significant way. There might be some corner cases, most likely on NVIDIA hardware (at one point, they had a really good display-list-based system for optimizing meshes, but who knows what it’s like today), but even that’s at best a guess.

Just use shaders. The end.

One thing which would have been a factor with older implementations is that mipmap selection was determined by a function which is affine in screen space (i.e. for any given triangle, texture scale factor would be a simple function of window coordinates), as the relationship between window coordinates and texture coordinates was restricted to a projective function.

That ceases to be the case once you introduce normal maps and environment maps; texture scaling can vary wildly even within very small regions of the frame. So the mipmap selection has to be calculated dynamically (which is why GLSL has dFdx and dFdy, textureLod, textureGrad etc).
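For reference, the per-fragment calculation looks roughly like the sketch below. It is a simplified Python model of the scale-factor formula in the OpenGL specification: it takes the screen-space derivatives of the normalized texture coordinates (what `dFdx`/`dFdy` would give you in GLSL), converts them to texel units, and takes log2 of the larger footprint. The function name `mip_level` and its parameters are made up for this example, and real hardware is allowed to approximate this.

```python
import math


def mip_level(duv_dx, duv_dy, tex_size, num_levels):
    """Approximate mipmap level of detail from texture-coordinate derivatives.

    duv_dx, duv_dy: (du, dv) change in normalized coordinates per pixel step
                    in window x and window y respectively.
    tex_size:       (width, height) of the base level in texels.
    num_levels:     number of mipmap levels available.
    """
    w, h = tex_size
    # Convert normalized derivatives to texel units per pixel.
    len_x = math.hypot(duv_dx[0] * w, duv_dx[1] * h)
    len_y = math.hypot(duv_dy[0] * w, duv_dy[1] * h)
    rho = max(len_x, len_y)
    if rho <= 1.0:
        return 0.0  # magnification: base level
    return min(math.log2(rho), float(num_levels - 1))
```

With a plain projective mapping, these derivatives follow from the triangle setup, so the result varies smoothly across the triangle. With a dependent lookup (say, coordinates read out of a normal map) the derivatives can be anything at any fragment, so the function has to be evaluated from per-fragment values.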

The fact that you don’t like the previous functionalities or are against them is irrelevant to the post itself. Whether I want to understand its fundamentals (even if they don’t apply anymore) is my choice and mine alone.
They exist in the API; they are present, so knowing them is a good way to better understand when to use or avoid them.

Explaining to someone that something is garbage and should not be touched “just because” is precisely what I want to avoid, because it doesn’t give the learner what they need to make an informed decision.

I’m asking for facts and reasons, not opinions or "because it is"s

Remember, not everyone has been at this for the same length of time. Don’t just say what year was what, because studying OpenGL history is not one of my main goals so far, but knowing its capabilities (good or bad ones) is. Assuming I started using shaders 2 years ago, I would say that I really need to know the what and why of the API, even the parts others dislike.

Again, I never said that I needed/wanted to use one or the other, only that I wanted to understand their differences; learning =/= using. Like I said initially: “I have also done some tests but wanted to know what’s inside the box”.

Thank you, makes sense, I was expecting something like simpler functions.

OK, here are the facts:

  • Nothing in GPU hardware has looked even remotely like fixed-function OpenGL for at least 15 years.

    • Though to be fair, it doesn’t exactly model modern OpenGL either, which is why Vulkan and the like exist.
  • Compatibility APIs are implemented by doing, more or less, what you would do in the current API. They may be using the underlying interfaces instead of the OpenGL abstractions, but that isn’t buying you performance. Especially since most of the compatibility interfaces (especially immediate mode) are just slow by design, requiring extremely inefficient means of copying data from your application to the GPU (ie: calling multiple functions for each vertex) compared to just doing a DMA.

  • Since most OpenGL applications these days use shaders, driver writers will spend the majority of their time optimizing for shader usage. And correspondingly less time optimizing compatibility API usage.

  • The current GL 4.6 conformance test doesn’t even test compatibility APIs. In fact, I don’t believe there has been a conformance test for compatibility functionality since the original conformance tests for version 1.1.

  • The compatibility APIs are compatibility options and were put into a special box in 2009 by the very people who design graphics hardware and write drivers for them.

  • The compatibility APIs are increasingly becoming less supported. They’re not even available for OpenGL implementations on some Linux open-source drivers.
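To put a rough number on the "multiple functions for each vertex" point, here is a back-of-envelope Python model of API call counts. It is only arithmetic under stated assumptions: three attribute calls per vertex in immediate mode (e.g. glColor, glTexCoord, glVertex), and one buffer upload plus one draw call per batch on the buffered path. Both function names are made up for this example, and the model says nothing about how expensive each call is.

```python
def immediate_mode_calls(num_vertices, attribs_per_vertex=3):
    """glBegin, then one call per attribute per vertex, then glEnd."""
    return 1 + num_vertices * attribs_per_vertex + 1


def buffered_calls(num_draws=1):
    """One bulk upload of the vertex data plus one draw call per batch."""
    return 1 + num_draws
```

For a HUD made of 1000 quads drawn as 6000 vertices, `immediate_mode_calls(6000)` gives 18002 calls per frame, against `buffered_calls()` giving 2, and the vertex count does not change the second number.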

If you want to conclude from these facts that the compatibility API has value, go ahead. But to me, these facts all add up to “do not use”.

We can’t tell you what’s “inside the box” because there are dozens of boxes. You can ask about a specific piece of hardware and driver, but outside of that, we can only provide general advice. And the general advice is “this is a bad box that we abandoned because it’s a bad box.”