How does OpenGL store its states?

Originally posted by Michael Steinberg:
Hey, I think you’re exaggerating… After all, it’s about little states! You can simply use a char for a state. We aren’t living in the time where we have to use only two digits for a date!
You can have 1024 states, and they will still be only 1 KB…
And yours is slower then.

Were you responding to me? It’s not always such a clear-cut case as that. Even if we are dealing with 1024 states, yours would need a whole 1 KB to remain in L1 cache, while mine would only need 128 bytes (1024 bits). A Pentium 3 only has 16K of L1 data cache. Guess which version is more likely to stay put. A couple of shifts and bitwise operations could easily be faster than a cache miss. Of course, the extra instructions would also make the code more susceptible to an instruction cache miss, so it depends on the situation. If you have enough bits and/or you use them often enough, I could easily see my way being faster.
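
To make the size trade-off concrete, here is a minimal sketch of the two layouts being argued about. The type names and the choice of 32-bit words are assumptions for illustration, not code from either poster. It only shows the footprint and the extra shift and mask; which one is faster still depends on the access pattern.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical number of boolean states, matching the figure in the thread.
constexpr std::size_t kNumStates = 1024;

// One byte per state: plain indexing, 1 KB of storage.
struct ByteStates {
    unsigned char flags[kNumStates] = {};
    bool get(std::size_t i) const { return flags[i] != 0; }
    void set(std::size_t i, bool on) { flags[i] = on ? 1 : 0; }
};

// One bit per state: a shift and a mask per access, 128 bytes of storage.
struct BitStates {
    std::uint32_t words[kNumStates / 32] = {};
    bool get(std::size_t i) const {
        return ((words[i >> 5] >> (i & 31)) & 1u) != 0;
    }
    void set(std::size_t i, bool on) {
        const std::uint32_t mask = std::uint32_t{1} << (i & 31);
        if (on) words[i >> 5] |= mask;
        else    words[i >> 5] &= ~mask;
    }
};

static_assert(sizeof(ByteStates) == 1024, "one byte per state");
static_assert(sizeof(BitStates) == 128, "one bit per state");
```

The packed version is an eighth of the size, so it occupies fewer cache lines, at the cost of a few extra instructions per access.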

You gotta love this type of stuff. You can’t prove my way is slower, I can’t prove it’s faster. It depends on the situation.

Originally posted by ffish:
Seriously though, I guess the problem boils down to your second line: each state needs a unique value. The problem with the GL states is that there are so many of them that you can’t have a bit pattern that doesn’t overlap others.

Well, for boolean states, you can get 4 billion unique states. That should be plenty.

Also, as you mentioned, your solution works for bools but not multivalued states

And in OpenGL, you have different methods for getting boolean states (e.g. glIsEnabled) and multi-valued states (e.g. glGetIntegerv).
You could use this method as a wrapper for the boolean functions, and another system to wrap the multi-valued functions.

To tell you the truth, I don’t always use this method… it depends on what I’m doing. If I have structures that I need to copy around a lot, or if I have an object with 50 states, I use this technique. For things that only have 5 or fewer states, I often use plain boolean values. It really depends on the situation.

Oh yeah… these caches again… I’m not too much into hardware yet… so, sorry.

Still I think as long as you won’t inline these functions they will be slower. Function call overhead… Stacks etc…

Hey, ffish! I also have Game Programming Gems. Can you explain the trick behind this singleton pattern to me? What’s it all about?

Originally posted by LordKronos:
And in OpenGL, you have different methods for getting boolean states (e.g. glIsEnabled) and multi-valued states (e.g. glGetIntegerv).
You could use this method as a wrapper for the boolean functions, and another system to wrap the multi-valued functions.

glGet and glIsEnabled will surely kill your performance; you should build your own wrapping methods to keep track of state changes.

Chris
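
A minimal sketch of such a wrapper, assuming a shadow copy of the enable flags kept on the application side. CapCache, driverEnable and driverDisable are hypothetical names; the two stubs stand in for the real glEnable/glDisable so the sketch compiles without OpenGL headers.

```cpp
#include <cstdint>
#include <unordered_map>

using Cap = std::uint32_t;

// Stand-ins for the real driver entry points (assumption for illustration).
// The counter only exists so the effect of the cache can be observed.
static int g_driverCalls = 0;
inline void driverEnable(Cap)  { ++g_driverCalls; }
inline void driverDisable(Cap) { ++g_driverCalls; }

class CapCache {
public:
    void enable(Cap cap)  { set(cap, true); }
    void disable(Cap cap) { set(cap, false); }

    // Answered from the shadow copy; the driver is never queried.
    // Capabilities that were never touched are reported as disabled,
    // which is an assumption (a few real capabilities start enabled).
    bool isEnabled(Cap cap) const {
        const auto it = shadow_.find(cap);
        return it != shadow_.end() && it->second;
    }

private:
    void set(Cap cap, bool on) {
        const auto it = shadow_.find(cap);
        if (it != shadow_.end() && it->second == on) return;  // redundant call
        shadow_[cap] = on;
        if (on) driverEnable(cap);
        else    driverDisable(cap);
    }

    std::unordered_map<Cap, bool> shadow_;
};
```

The first call for a capability is always forwarded, because the cache does not yet know the driver's value; only repeats are filtered out.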

Oh and… is there any program which simulates some CPU? So that one could watch the cache’s contents like in a debugger.

Michael,

I don’t know if you have a German version, but in my book on pp36-40 there is a chapter on singletons. I use the method titled “An Even Better Way” and just use that to create subclasses. For example, I have a class called glStateMgr defined like this:

#ifndef GL_STATE_MGR_H
#define GL_STATE_MGR_H

#ifdef _WIN32
#include <windows.h>
#endif

// Surround OpenGL C stuff with an extern declaration.
#ifdef __cplusplus
extern "C" {
#endif
#include <GL/gl.h>
#ifdef __cplusplus
}
#endif

// Include the singleton base class header.
#include "singleton.h"

// OpenGL state manager class. It is much faster to manually keep track of
// state variables than to use glGet* functions (very slow) or to redundantly
// update state variables with their current value (slow).
class glStateMgr : Singleton {
public:
    // Constructor/destructor.
    glStateMgr();
    ~glStateMgr();

    // Turn on/off a capability.
    void glDisable(GLenum cap);
    void glEnable(GLenum cap);

    // Set the matrix to be manipulated.
    void glMatrixMode(GLenum mode);

    // Controls how color values in the fragment being processed (the source)
    // are combined with those already stored in the framebuffer (the destination).
    void glBlendFunc(GLenum sfactor, GLenum dfactor);

    // Sets the comparison function for the depth test.
    void glDepthFunc(GLenum func);

    // Sets the shading model.
    void glShadeModel(GLenum mode);

private:
    // Forward nested class declaration means that I don't have to #include
    // the container header until the .cpp file.
    class capsMap;

    // Capabilities on/off state for the glEnable/glDisable calls.
    capsMap& _capsMap;

    // Current depth comparison function.
    GLenum _gl_depth_func;

    // Current modifiable matrix.
    GLenum _gl_matrix_mode;

    // Current shade model.
    GLenum _gl_shade_model;

    // Current source blending factor.
    GLenum _gl_src_blend_func;
    // Current destination blending factor.
    GLenum _gl_dst_blend_func;
};

#endif // GL_STATE_MGR_H

This is an early prototype. It’s not nearly as optimised for space as LK’s but it’s easy to understand.

I use it to keep track of the states you can see, as well as a bunch of glEnable() states too. In this project I only need these states so far, but I will extend the class to include as much as possible eventually.

Hope that helps,
Toby

PS I think the only way to know about optimising for the CPU is through experience. Some excellent books that I have on low level optimisation are Inner Loops by Rick Booth and Michael Abrash’s Graphics Programming Black Book. Both are a little old but good reading. The best optimisation comes from choosing good algorithms first though.

Yeah, I read about it (there is no German version). But what’s the purpose? That’s what I didn’t understand.

Oh, it just ensures that there is only ever one object initialised of that type. I am just a fan of Design Patterns by the GoF and “correct software engineering” crap. It means that you can’t have, for example:

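// This won’t work: the singleton allows only one instance. (With the public
// constructor declared above, the failure is presumably at run time rather
// than at compile time.)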
glStateMgr *state1 = new glStateMgr();
glStateMgr *state2 = new glStateMgr();

state1->glEnable(GL_DEPTH_TEST);
// Second call is redundant.
state2->glEnable(GL_DEPTH_TEST);

I guess what I use it for is to remove any chance of, for example, calling glMatrixMode(GL_MODELVIEW); every time through the render function even though it is never changed to another matrix mode. For my current project it is possible that I may make stupid mistakes (hopefully not that stupid!) since I will have tens of files, tens of thousands of lines of code, and many different rendering functions depending on user-defined runtime variables.

I guess I just implemented it for the hell of it. I was reading that chapter and liked the idea of singleton classes.
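
For readers without the book, here is one common way to get the "only one instance" guarantee. It uses a function-local static and a private constructor, which is not necessarily the book's technique. StateManager and its members are hypothetical names, and the call counter stands in for the real glMatrixMode call so the sketch compiles without OpenGL headers.

```cpp
// Hypothetical single-instance state manager (not the class from the book).
class StateManager {
public:
    // The only way to reach the object; it is created on first use.
    static StateManager& instance() {
        static StateManager mgr;
        return mgr;
    }

    // Copying would create a second instance, so it is disabled.
    StateManager(const StateManager&) = delete;
    StateManager& operator=(const StateManager&) = delete;

    // Forward the call only when the mode actually changes.
    void setMatrixMode(unsigned mode) {
        if (mode == matrixMode_) return;   // redundant, skip the driver
        matrixMode_ = mode;
        ++forwardedCalls_;                 // real code: ::glMatrixMode(mode)
    }

    int forwardedCalls() const { return forwardedCalls_; }

private:
    StateManager() = default;              // "new StateManager()" will not compile

    unsigned matrixMode_ = 0x1700;         // value of GL_MODELVIEW, the initial mode
    int forwardedCalls_ = 0;
};
```

Because every rendering function goes through StateManager::instance(), they all share one shadow copy, so a redundant glMatrixMode(GL_MODELVIEW) from any of them is filtered out.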

So I don’t have to use that thing if I swear to myself that I’ll only be using one instance of a certain class?

LOL!

No you don’t. It’s just me being pedantic.

Originally posted by Michael Steinberg:
So I don’t have to use that thing if I swear to myself that I’ll only be using one instance of a certain class?

You definitely don’t need it if you can also swear that you never unnecessarily enable or disable any states or set any colors or functions twice in a row to the same values.

Chris

[This message has been edited by DaViper (edited 04-17-2001).]

I swear!

I can’t

Why?

I have different rendering functions, so I have to reset all my states every time I enter one of those functions.

Originally posted by Michael Steinberg:
Oh and… is there any program which simulates some cpu? So that one could watch the caches content like debugging.

Well, I don’t know of anything that can do this (it would be incredibly complex, and even more incredibly slow). However, with the Intel VTune profiler, you can measure things like cache misses, so this might tell you what you are looking for.

Another thought about my method. If the OpenGL constants are too spread out, any technique that uses the constant as an array index will have problems. Example: if one constant is “1” and another is “123456789”, you’re going to need an array of more than 123456789 elements.

Instead, you would need some system (like the switch statement I talked about earlier) to remap a constant into an array index.
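
A minimal sketch of that remapping, assuming only four capabilities are tracked. The constants are defined locally so the sketch compiles without OpenGL headers; their values mirror the usual gl.h values for GL_CULL_FACE, GL_LIGHTING, GL_DEPTH_TEST and GL_BLEND. capToIndex and DenseCapFlags are hypothetical names.

```cpp
#include <cstdint>

using Cap = std::uint32_t;

// Sparse constants, defined locally for the sketch.
constexpr Cap kCullFace  = 0x0B44;
constexpr Cap kLighting  = 0x0B50;
constexpr Cap kDepthTest = 0x0B71;
constexpr Cap kBlend     = 0x0BE2;

constexpr int kUnknownCap = -1;

// Remap a sparse constant to a dense bit index (0..3).
inline int capToIndex(Cap cap) {
    switch (cap) {
        case kCullFace:  return 0;
        case kLighting:  return 1;
        case kDepthTest: return 2;
        case kBlend:     return 3;
        default:         return kUnknownCap;
    }
}

class DenseCapFlags {
public:
    // Returns false if the capability is not tracked.
    bool set(Cap cap, bool on) {
        const int i = capToIndex(cap);
        if (i == kUnknownCap) return false;
        const std::uint32_t mask = std::uint32_t{1} << i;
        if (on) bits_ |= mask;
        else    bits_ &= ~mask;
        return true;
    }

    bool get(Cap cap) const {
        const int i = capToIndex(cap);
        return i != kUnknownCap && ((bits_ >> i) & 1u) != 0;
    }

private:
    std::uint32_t bits_ = 0;   // one bit per tracked capability
};
```

The storage stays at four bytes no matter how far apart the constants are; the cost is the switch on every access.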

P.S.
Or an even better option performance-wise is to make a higher-level manager (this is what I do). In other words, have functions that are only concerned with texture state, other functions that are only concerned with lighting state, and so on for matrix, arrays, etc.

The downside to this is that it isn’t quite as transparent as “myStateManager->glEnable(…)”, but if you start with this idea in mind, it works quite well.
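
As a rough illustration of the higher-level idea, here is a sketch of a manager that only knows about 2D texture state. TextureState and its members are hypothetical, not code from this thread. The comments mark where real code would call into OpenGL, and the counter stands in for those calls so the sketch compiles without OpenGL headers.

```cpp
#include <cstdint>

// Hypothetical texture-only manager.
class TextureState {
public:
    // Make sure 2D texturing is on and the given texture is bound.
    void use(std::uint32_t textureId) {
        if (!enabled_) {
            enabled_ = true;
            ++driverCalls_;      // real code: ::glEnable(GL_TEXTURE_2D)
        }
        if (bound_ != textureId) {
            bound_ = textureId;
            ++driverCalls_;      // real code: ::glBindTexture(GL_TEXTURE_2D, textureId)
        }
    }

    // Turn 2D texturing off; the binding is left alone.
    void disable() {
        if (enabled_) {
            enabled_ = false;
            ++driverCalls_;      // real code: ::glDisable(GL_TEXTURE_2D)
        }
    }

    int driverCalls() const { return driverCalls_; }

private:
    bool enabled_ = false;       // 2D texturing starts disabled
    std::uint32_t bound_ = 0;    // texture 0 is bound initially
    int driverCalls_ = 0;
};
```

The caller asks for what it wants ("draw with texture 7") and the manager decides which individual state changes are needed, which is why it is less transparent than a one-to-one glEnable wrapper.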

[This message has been edited by LordKronos (edited 04-17-2001).]

(Originally posted by ffish, I think; I somehow messed up the quote.)

// Some hardcoded feature value.
#define SOME_FEATURE 0xSOME_BIT_PATTERN

// Given some global state variable that
// holds the system state.
inline bool isFooEnabled()
{
    if ((SOME_FEATURE & global_state_variable) == SOME_FEATURE) return true;
    return false;
}

or even better:

inline GLenum isEnabled(GLenum state)
{
    // Non-zero if any bit of the requested state is set.
    return state & global_state_variable;
}

(end quote)

Well i’d write it that way:

#define IS_ENABLED( var, feat ) (((var) & (feat)) ? 1 : 0)
or
#define IS_ENABLED( var, feat ) ((((var) & (feat)) == (feat)) ? 1 : 0)

and then just

if( IS_ENABLED( your_var, YOUR_FANCY_FEATURE ) )
{…}

It’s easier than a lot of inline functions, I think (but that’s my personal view).
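
For comparison, the same two checks can be written as small constexpr functions. The names are hypothetical and this is only a sketch. Functions evaluate each argument once and are type-checked, and they avoid a precedence trap: == binds tighter than &, so in a macro the masking step needs its own parentheses.

```cpp
#include <cstdint>

// True if at least one bit of feat is set in var.
constexpr bool anyEnabled(std::uint32_t var, std::uint32_t feat) {
    return (var & feat) != 0;
}

// True only if every bit of feat is set in var.
constexpr bool allEnabled(std::uint32_t var, std::uint32_t feat) {
    return (var & feat) == feat;
}

// 0x6 and 0x3 share bit 1 only, so "any" holds and "all" does not.
static_assert(anyEnabled(0x6, 0x3), "bit 1 is shared");
static_assert(!allEnabled(0x6, 0x3), "bit 0 is missing");
static_assert(allEnabled(0x7, 0x3), "both bits are present");
```

The difference between the two only matters for features that span more than one bit; for single-bit flags they give the same answer.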

BTW: the same works for key-checking in window$ too:

#define KEYDOWN(key) ((GetAsyncKeyState(key) & 0x8000) ? 1: 0 )
#define KEYUP(key) ((GetAsyncKeyState(key) & 0x8000) ? 0: 1 )

[This message has been edited by CViper (edited 04-17-2001).]

You could use std::bitset<N> if you are using C++ for bit sets of arbitrary length.
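
A short sketch of the std::bitset variant, assuming dense flag indices from 0 to 1023. The size, the indices and the helper name are made up for illustration.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical number of flags; bitset needs the size at compile time.
constexpr std::size_t kNumFlags = 1024;
using FlagSet = std::bitset<kNumFlags>;

// Build a flag set with two example flags switched on.
inline FlagSet makeExampleFlags() {
    FlagSet flags;      // all bits start cleared
    flags.set(3);
    flags.set(700);
    return flags;
}
```

Individual flags are then read with flags.test(i) and cleared with flags.reset(i), so the shifts and masks stay inside the library.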

Either that or std::set<int>: the search time is logarithmic and you only use as much memory as the number of nodes you store.
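
A sketch of the std::set<int> variant, in which only the enabled states are stored. SparseCaps is a hypothetical name. Each stored value costs a tree node, so this pays off mainly when the constants are widely spread and few are enabled at once.

```cpp
#include <cstddef>
#include <set>

// Hypothetical wrapper: a state is enabled exactly when it is in the set.
class SparseCaps {
public:
    void enable(int cap)  { enabled_.insert(cap); }   // no effect if already present
    void disable(int cap) { enabled_.erase(cap); }    // no effect if absent
    bool isEnabled(int cap) const { return enabled_.count(cap) != 0; }
    std::size_t size() const { return enabled_.size(); }

private:
    std::set<int> enabled_;   // logarithmic lookup, one node per enabled state
};
```

Widely spaced values such as 1 and 123456789 cost two nodes here, rather than an array sized by the largest constant.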