Higher C++ interface for OpenGL

A more OO design (mostly removing binding where it’s not needed) may allow easier multithreading, no?
One thread can upload some textures while another renders some other textures at the same time.

Originally posted by TheSillyJester:
A more OO design (mostly removing binding where it’s not needed) may allow easier multithreading, no?
One thread can upload some textures while another renders some other textures at the same time.

Once again, your graphics chip isn’t multithreaded, and neither is your AGP. You may be able to execute two CPU threads truly simultaneously, but you’ll have to serialize for the graphics hardware. Doing these things “at the same time” isn’t possible.

Originally posted by Korval:
No, if glGen* had control over the “names”, then it could generate names as it saw fit. The names in question could be actual pointers, or something that converts into a pointer after one quick memory access, or a relatively short lookup.

Because GL is forced to accept any object name regardless, the implementation must have some way to map any arbitrary object to a pointer to the internal object. Rather than having a simple function or even a cast operation, it becomes a complex search operation.
You’re trading in robustness. It’s easy to shoot down a direct pointer conversion, if you’re so inclined. GL’s texture object model can’t be shot down.

Pointers also limit your ability to dynamically reallocate memory. A GL implementation might have a live context for hours, or even days, and see millions of object creations/destructions during that time.

There are only a few restrictions you can put on pointers for “validation” purposes. You can enforce an alignment, you can do memory range checks, but that’s all very limited. And …

You can’t safely move an object in memory if there are pointers still referencing it at its old location, and the GL implementation has no knowledge of client references. I.e. you must keep it there until it’s properly destroyed. You can’t do garbage collection anymore (an implementation might want to do that once per frame or so). You can badly fragment your memory. I wouldn’t want to impose these restrictions. Name lookup can be strictly O(log2 n) and is robust. Funneling pointers through the API doesn’t offer enough benefit for its drawbacks IMO.

I won’t complain if automatic object creation is removed; I never really needed it. But I’m unsure whether there’s a real benefit. I’m all against pointers, though.
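As an illustrative sketch of the name lookup mentioned above (ObjectTable and TextureObject are hypothetical names, not from any real driver), the robust O(log2 n) lookup could be as simple as a balanced-tree map:

#include <cstdint>
#include <map>

// Hypothetical internal object. The implementation is free to move or
// reallocate it during garbage collection, because clients only ever
// hold the integer name, never a pointer.
struct TextureObject {
    // ... image data, format, mipmap chain, etc.
};

class ObjectTable {
public:
    // Strictly O(log2 n): any arbitrary client-supplied name is either
    // resolved to the internal object or safely rejected.
    TextureObject* lookup(std::uint32_t name) const {
        auto it = objects_.find(name);
        return (it == objects_.end()) ? nullptr : it->second;
    }
private:
    std::map<std::uint32_t, TextureObject*> objects_;
};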

Kind of. I would actually prefer a debug and a release version of the implementation. In debug, it can do glError checks and so forth to make sure that the texture object name really exists. However, in release, it should not even bother; just produce undefined behavior/crashes. Granted, that’s somewhat wishful thinking, but it would provide a negligible speed increase in situations where bindable object state is constantly in flux (which is, admittedly, not that frequent).

Think about Win32 programming (if you’ve ever done any). HWNDs are handles to a window. You have to call a function to create a valid one. If you call a Win32 function with an invalid HWND, it will fail (in debug, with an error of some kind). You don’t call a “bindHWnd” function; you just use the current one. It doesn’t impose much overhead in terms of searching because the contents of a HWND are controlled by the OS. It can put whatever info in a HWND that it takes for search/validation times to be low.
Yes, I’ve done Win32 programming. I don’t think the comparison is fair, because typical Win32 programs aren’t particularly dynamic and, frankly, Win32 wasn’t designed for speed. Creating a dialog might get you a few hundred HWNDs (lots of controls and stuff). The vast majority of programs will keep them around and destroy the whole dialog at once; they won’t destroy and replace single controls over and over again.

HWND is in fact a plain pointer to a struct. All I’ve said above about pointers applies. It doesn’t matter much for a window manager, but IMO it sure does matter for a fairly low-level graphics API.

The real question is, if you had OpenGL to write all over again, from scratch (as a C-based API), with no consideration as to backwards compatibility, would you continue to use the current paradigm or would you switch to the one used by shader objects? I think, if the ARB had it to do over again, they’d go for the shader object (ie, object-based) version.
I’d choose the current paradigm for sure. Maybe ditch the automatic creation of objects. Otherwise I think it’s the perfect blend of performance and stability.

  1. Function overloading would cut down on a lot of names, now wouldn’t it?

Syntactic sugar, at best. Once again, it’d be nice, but then you lose everything that a C-based library gets you. Namely, portability.

  2. I see the pipeline as a stream, so << it is. It might sound strange to those who don’t program in C++ and the STL, but for those who do, I think it will make sense…

The pipeline may seem like a stream to you, but it isn’t. Especially when you start getting into vertex arrays and so forth.

  3. A lot of state management could be handled on the client side, and you wouldn’t pay the overhead of glGet functions, not to mention easier debugging

Which would make the implementation slower, since the state isn’t server-side where it needs to be.

  4. Be able to switch implementations directly:
You want lists? Create a stream with lists.
You want arrays? Create a buffered stream.

And neither interface is appropriate for performance rendering.

  5. I do not expect programmers in C to like the idea; I expect programmers in C++ to like the idea.

OK, then as a C++ programmer, I think it is a bad idea. BTW, I never use the iostream API; I prefer printf-esque functions. You don’t have to use all of C++ to be considered a C++ programmer.

However, I think that a low-level C++ wrapper would be very useful with the proposed 3Dlabs interface for pure OpenGL 2.0, as there is a lot of abstraction that can be added to make life easier without any performance drop.

But most of GL 2.0 will likely never see the light of day. The only real thing we’re likely to see out of GL 2.0 is glslang.

Pointers also limit your ability to dynamically reallocate memory. A GL implementation might have a live context for hours, or even days, and see millions of object creations/destructions during that time.

Not really. The pointer points to an object that contains whatever client-side data the implementation needs to refer to a server-side object. For a texture, it would be the location of the texture’s data. For a VBO, it would be the location and type of the buffer. And so on.

You can still move the server-side data around without having to change the actual client-side object.
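A sketch of what this seems to mean (GLTextureProxy and its fields are made-up names for illustration): the client-visible pointer refers to a small proxy whose fields can be rewritten when the server-side storage moves:

#include <cstdint>

// Hypothetical client-side proxy. A "name" would be a pointer to this
// struct; the pointer never changes, only the fields do.
struct GLTextureProxy {
    std::uint64_t serverAddress;   // where the texel data currently lives
    int           width, height;
    int           internalFormat;
};

// The driver can defragment or swap server memory by updating the
// proxy in place; every client-held pointer stays valid.
void moveTexture(GLTextureProxy& tex, std::uint64_t newAddress) {
    tex.serverAddress = newAddress;
}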

There are only a few restrictions you can put on pointers for “validation” purposes. You can enforce an alignment, you can do memory range checks, but that’s all very limited.

Or you can keep a table around in memory that contains a list of all allocated objects. Which is, basically, no different than it is now, except that the table isn’t used to convert a name into a pointer, but just for validation. So, in a hypothetical “release” mode, this table wouldn’t need to exist.
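For instance (a sketch under the same hypothetical GLTextureProxy assumption; g_liveObjects and isValidObject are invented names), the debug-only validation table could be compiled out entirely in release:

#include <set>

struct GLTextureProxy;  // the hypothetical proxy sketched above

#ifndef NDEBUG
// Debug builds track every live object purely for validation.
static std::set<const GLTextureProxy*> g_liveObjects;
#endif

bool isValidObject(const GLTextureProxy* obj) {
#ifndef NDEBUG
    return g_liveObjects.count(obj) != 0;  // debug: real check
#else
    (void)obj;  // release: trust the client, zero overhead
    return true;
#endif
}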

Creating a dialog might get you a few hundred HWNDs (lots of controls and stuff). The vast majority of programs will keep them around and destroy the whole dialog at once; they won’t destroy and replace single controls over and over again.

The same is true for OpenGL programs. You load a block of textures and VBOs. For a game, this might represent a level. Once the level is over, they all get destroyed. Even games that stream levels destroy their objects in blocks, rather than individually.

Originally posted by zeckensack:
Originally posted by TheSillyJester:
A more OO design (mostly removing binding where it’s not needed) may allow easier multithreading, no?
One thread can upload some textures while another renders some other textures at the same time.

Once again, your graphics chip isn’t multithreaded, and neither is your AGP. You may be able to execute two CPU threads truly simultaneously, but you’ll have to serialize for the graphics hardware. Doing these things “at the same time” isn’t possible.

I didn’t say it’s faster; the driver can serialize the way it wants (not sure there isn’t room for some optimizations), but it’s more convenient to program. And if there isn’t room on the GPU, putting a texture in CPU memory can be done asynchronously, right?

Thread synchronization is expensive. If you force drivers to synchronize, you will never get peak performance out of your hardware.

Anyway, regarding the initial question: what specific problem are you trying to solve?

Nobody should be pushing OpenGL state management into a number of different objects such that they would step on each other’s toes. Visiting your scene graph to extract geometry can be done using all kinds of C++ template pattern goodness already. How to turn geometry + state into an actual rendered image is best done with a separate strategy. At the bottom, there’s a renderer, but any new abstraction doesn’t really seem worthwhile at that point.

If you want a new, higher-level API, you probably want to define your own scene graph (extracting geometry + rendering strategy) and implement it on top of OpenGL, rather than just re-mapping all the current OpenGL bindings to a higher-overhead, still low-level API.

Overengineering for overengineering’s sake is seldom useful.

I’m with cass on this subject. Also, putting unnecessary overhead on top of your OpenGL functions is never a good thing. I don’t see where we need something on a higher level than glVertex4f(…), for example, anyway. One thing that has always bugged me is when someone creates an “OpenGL class” whose member functions are exactly the same as their OpenGL counterparts. Like this, for example:

class OGL
{
public:
void Vertex3f(GLfloat x, GLfloat y, GLfloat z);
void Vertex2f(GLfloat x, GLfloat y);
void Color3ub(GLubyte red, GLubyte green, GLubyte blue);
};

etc…

Then:
OGL myOpenGL;
myOpenGL.Vertex3f(blah);

I mean, wtf is the point of this? It’s more to type, for one thing. Then if you start doing weird things with operator overloading… well, say goodbye to decent performance once you do that. Forget about it. To me it seems like, unless I’m wrong, the point of OpenGL is to draw 3D graphics fast, not to make everything oversimplified while sacrificing half of my rendering speed.

Anyway, carry on.

-SirKnight


In 1997, when I was studying computer science, in the first few hours of the practical programming course where we were supposed to learn Smalltalk, I asked the professor whether the way Smalltalk operates doesn’t slow down performance. He nearly became angry and said that I should not think and talk so small-mindedly about this. That’s the way those people think: performance (in terms of really highly optimized performance in inner loops) is not an issue; good software design is MUCH more important.

In general, this is right, but OpenGL is a good example where optimized design and optimized speed are contradictions. So in order to get optimized frame rates, we have to live with the fact (and we should at least know that we do so) that we are not writing the best OO-designed software.

In this forum there was once a flame thread about the very bad Quake 2 source code, but does anyone who plays the game care about this?

An example of OO OpenGL (OOOpenGL, *g*) is the gl4java bindings; this goes like

GL myGL;

myGL.glBegin(…)

and so on.

Another issue (as I already said) is the state machine chaos; an object-oriented solution to this would be great, and functional overhead on state-changing calls would not cost much performance. But this is a logical contradiction, as an object structure encapsulating this would have to know about all OpenGL states that could be changed, including extensions, and including all that are yet to come (or it would have to be updated very, very often).

Jan

Hm, I am just thinking about something like a GLStateManager that keeps track of state changes and helps me with the source code chaos in my terrain engine…

It would have a default state, and when a part of the OpenGL scene sets some states, it would automatically reset them to the default state when this part is finished. It would look like this:

myGLStateManager.BeginSection();
ground->Draw();
myGLStateManager.EndSection();

myGLStateManager.BeginSection();
clouds->Draw();
myGLStateManager.EndSection();

and inside of ground->Draw and clouds->Draw, OpenGL states can be changed wildly without having to memorize what was done and having to reset it.

Maybe this is a good idea, at least for my program…

Jan

Originally posted by JanHH:
Hm, I am just thinking about something like a GLStateManager that keeps track of state changes and helps me with the source code chaos in my terrain engine…

It would have a default state, and when a part of the OpenGL scene sets some states, it would automatically reset them to the default state when this part is finished. It would look like this:

myGLStateManager.BeginSection();
ground->Draw();
myGLStateManager.EndSection();

myGLStateManager.BeginSection();
clouds->Draw();
myGLStateManager.EndSection();

and inside of ground->Draw and clouds->Draw, OpenGL states can be changed wildly without having to memorize what was done and having to reset it.

Maybe this is a good idea, at least for my program…

Jan

Ummm …

glPushAttrib(myGround->GetStateBitsModifiedByDraw());
myGround->Draw();
glPopAttrib();

That’s very brute force, just like your approach, but it can be done.

My gut tells me that the GL can Push/Pop state more efficiently than you can manage it externally, though that depends on implementation quality, obviously.

Changing state redundantly isn’t a good idea anyway. What you’re really looking for is a good scene graph that can render objects in a “minimum state change” order of sorts. Basically the same thing as the travelling salesman problem.
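In C++, the glPushAttrib/glPopAttrib pairing above maps naturally onto a small RAII guard (a sketch of my own, not something proposed in the thread; ScopedAttrib is an invented name):

#include <GL/gl.h>

// Pushes the requested attribute groups on construction and pops them
// when the scope ends, so Draw() can change state freely in between.
class ScopedAttrib {
public:
    explicit ScopedAttrib(GLbitfield mask) { glPushAttrib(mask); }
    ~ScopedAttrib() { glPopAttrib(); }
    ScopedAttrib(const ScopedAttrib&) = delete;
    ScopedAttrib& operator=(const ScopedAttrib&) = delete;
};

// Usage:
// {
//     ScopedAttrib guard(GL_DEPTH_BUFFER_BIT | GL_ENABLE_BIT);
//     ground->Draw();
// }   // state restored here automatically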

I think a simple state wrapper with early redundancy checks is a good idea anyway. During development you could also have it report the redundant state changes, so that, if possible, you can clear them up from the code too.

-Ilkka
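A minimal sketch of what Ilkka describes (StateCache is an invented name): filter out redundant enable/disable calls on the client side and, in development builds, report them so they can be removed at the source:

#include <cstdio>
#include <map>
#include <GL/gl.h>

class StateCache {
public:
    void setEnabled(GLenum cap, bool on) {
        auto it = known_.find(cap);
        if (it != known_.end() && it->second == on) {
#ifndef NDEBUG
            std::fprintf(stderr, "redundant state change: 0x%X\n", cap);
#endif
            return;  // skip the redundant GL call entirely
        }
        known_[cap] = on;
        on ? glEnable(cap) : glDisable(cap);
    }
private:
    std::map<GLenum, bool> known_;  // last value set through this wrapper
};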

Why do you keep saying that any wrapper will have an impact on performance?
OpenGL is fast when it comes to rendering, transforming, clipping and so forth. But are you certain that OpenGL calls are lighter than any wrapper function, which could well be inlined? What would be the performance impact?
No more than 1%, to be sure…

My gut tells me that the GL can Push/Pop state more efficiently than you can manage it externally, though that depends on implementation quality, obviously.

Have you forgotten that your CPUs run at 3 GHz?
I think of OpenGL calls as I/O calls, correct me if I am wrong please…
Doesn’t it pay to avoid some calls when possible?

OpenGL has such a simple API that I see no use in writing wrappers for it. Helper classes, yes, but not a wrapper.
So far I agree with Tom.

Originally posted by thAAAnos:
Doesn’t it pay to avoid some calls when possible?

Yes, but the way to do it IMHO should be by clever application design, not by putting an extra layer between the app and the GL.

– Tom

Originally posted by thAAAnos:
Why do you keep saying that any wrapper will have an impact on performance?
OpenGL is fast when it comes to rendering, transforming, clipping and so forth. But are you certain that OpenGL calls are lighter than any wrapper function, which could well be inlined? What would be the performance impact?
No more than 1%, to be sure…

Sure, a straight inline wrapping is just as fast as the pure API. No doubt about it. It only gets interesting if you layer something on top, instead of just wrapping it up. In particular if you add functionality that the GL itself can potentially do better.
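“Straight inline wrapping” here means something like this (gl::depthFunc is a made-up name for illustration):

#include <GL/gl.h>

namespace gl {
    // Forwarding wrapper; the compiler inlines it down to the bare GL
    // call, so the C++ layer itself costs nothing at runtime.
    inline void depthFunc(GLenum func) { glDepthFunc(func); }
}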

Have you forgotten that your CPUs run at 3 GHz?
No, I haven’t. I only said that the GL implementation may be able to do it faster, not by how much, nor did I claim that Jan’s idea would be “too slow”. I just felt it was reinventing the wheel.

I think of OpenGL calls as I/O calls, correct me if I am wrong please…
Doesn’t it pay to avoid some calls when possible?

I’d rather think that OpenGL hides the actual I/O calls from the programmer, but that’s only a matter of interpretation.
Sure it pays to avoid calls. That’s the reason I suggested letting the GL do it in the first place.

Eg
glPushAttrib(GL_DEPTH_BUFFER_BIT);
<…> //do something
glPopAttrib();

That pushes/pops state set through these:
glDepthFunc
glDepthMask
glClearDepth <- let’s ignore this one for a moment
glEnable/glDisable(GL_DEPTH_TEST);

These individual functions all need error checking on the parameters while PopAttrib doesn’t need to do this. Only “current state” is pushed and that’s guaranteed to be error free (almost; AFAIK WGL_ARB_make_current_read is the only exception where PopAttrib can possibly produce an error).

Also, there isn’t necessarily a direct mapping between these functions and hardware registers. Suppose your hardware has three registers for managing its z buffer/depth testing:

1) depth test function
2) z buffer reads
3) z buffer writes

With glDepthFunc(GL_ALWAYS) you never need to read the depth buffer, so you could turn the reads off in hardware (saving precious bandwidth).
With glDepthFunc(GL_EQUAL) you never need to write the depth buffer. You ‘usually’ control that with glDepthMask, so you see that distinct GL states can overlap and affect a single state in the hardware domain.

The opposite is also true: some GL states can affect multiple hardware registers. Take glDisable(GL_DEPTH_TEST): you never need to write (nor read) depth, so it would affect all three of our supposed hardware registers (setting the function to “always”). We do not even have a register for “enable/disable depth test”.

To figure out the hardware register values, we need to look at all three states together. This sort of interaction may be managed more efficiently by Push/Pop than by calling the individual entry points. And that was a simple example.
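A sketch of the interaction just described (the three-register hardware model is the hypothetical above; the struct and function names are invented):

#include <GL/gl.h>

struct DepthRegisters {
    GLenum testFunc;   // 1) depth test function
    bool   readsOn;    // 2) z buffer reads
    bool   writesOn;   // 3) z buffer writes
};

// All three GL states must be considered together to derive the
// hardware register values.
DepthRegisters resolve(bool testEnabled, GLenum depthFunc, bool depthMask) {
    DepthRegisters r;
    if (!testEnabled) {
        // glDisable(GL_DEPTH_TEST) touches all three registers.
        r.testFunc = GL_ALWAYS;
        r.readsOn  = false;
        r.writesOn = false;
    } else {
        r.testFunc = depthFunc;
        r.readsOn  = (depthFunc != GL_ALWAYS);  // GL_ALWAYS never reads
        // GL_EQUAL never needs to write: equal values change nothing.
        r.writesOn = depthMask && (depthFunc != GL_EQUAL);
    }
    return r;
}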

Texture name-to-object lookup is another thing that can possibly be faster to push/pop than to set through the API. Instead of redoing the “search for pointer to object data”, the implementation is free to “cheat” and push the pointer along with the texture name (but caution must be taken to handle the case where the object has moved in memory or has been destroyed).
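Illustratively (my sketch; the generation counter is one invented way to handle the caution noted above), a pushed attrib entry could carry the resolved pointer along with the name:

#include <cstdint>

struct TextureObject;  // hypothetical internal object, as sketched earlier

struct PushedTextureBinding {
    std::uint32_t  name;        // the public GL name
    TextureObject* cached;      // resolved pointer, pushed alongside
    std::uint64_t  generation;  // bumped whenever objects move or die
};
// On PopAttrib: if 'generation' still matches the object table's
// current generation, reuse 'cached' directly; otherwise fall back
// to a fresh name lookup.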

Etc and so forth

As a conclusion, some thoughts…

Maybe there should be a clear distinction between client-side and server-side API calls, and maybe not everything should be in the server. I don’t know how the hardware guys feel about that, but I have a feeling that they will end up having the program executed by the GPU.

I think that it pays to have a wrapper even if it does nothing more than simply call the API functions; it would be a start for a better interface and a different view of things, and it might evolve into an API in itself.

Maybe we need C for portability, but I don’t deem C++ programs written like C portable; i.e. using printf instead of iostream is, for me, not portable. C and C++ are different languages in my mind. And without a C++ wrapper/interface/API, I cannot program with OpenGL and feel safe that, no matter what future changes come to the API, I will not have to redesign my program to get the best performance available.

And in the end, if we care only about performance, then we had better forget about APIs and instead develop a GL language that gets compiled and executed on the GPU. I don’t know what this would do to design, though…

I didn’t know of the push/pop attrib commands. They of course make my idea rather superfluous. I need only two calls,

myGLStateManager.InitDefaultState()

and

myGLStateManager.ResetToDefaultState().

I think this would make my program less buggy and easier to develop, and it will not cost any performance (as state changes are not done in inner drawing loops).

Jan
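A minimal sketch of that reduced manager (the method names are Jan’s, the implementation is a guess; note that GL_ALL_ATTRIB_BITS does not cover client state, which would need glPushClientAttrib):

#include <GL/gl.h>

class GLStateManager {
public:
    // Capture the current (default) state once, up front.
    void InitDefaultState() { glPushAttrib(GL_ALL_ATTRIB_BITS); }

    // Throw away whatever was changed, then re-arm for the next reset.
    void ResetToDefaultState() {
        glPopAttrib();
        glPushAttrib(GL_ALL_ATTRIB_BITS);
    }
};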

>>>And in the end, if we care only about performance, then we had better forget about APIs and instead develop a GL language that gets compiled and executed on the GPU. I don’t know what this would do to design, though…<<<

You need an API.
No API means you can’t program for it.
The alternative is support at the language level.

That won’t happen, because APIs exist that do the job well. Of course, there is always someone who will go the distance and do crazy things.

The idea of running a program completely on the GPU has come up.

I think that in the very long term, this will become reality. And I think NVidia will be the first.
I think of them as the Microsoft of hardware (meaning they have the resources).