My wish list for the next OpenGL versions: a response to the official feedback topic about OpenGL 3.2

Instead of duplicating my wish list here, the link above redirects to my reply in the official OpenGL 3.2 feedback thread.

I do agree that tessellation should eventually make its way into core, and that the binding stuff has to be rethought and unified. It’s not that it’s particularly bad; it just needs some work. Sometimes binding just means “use this”, but at other times it’s only the first of the multiple steps you have to take before you can start rendering.

But for the rest: no. Why?

Yep, the next version of DX is more or less the proverbial writing on the wall, at least where major features are concerned. You needn’t look much further than DX11 to see what’s in store for the next several GL3 increments (not that there isn’t room for originality or uniqueness here and there, as was clearly demonstrated in 3.2’s sync and way cool uniform blocks).

The binding system is bad. It can be done simpler.

Every extra bit of programming detail makes things harder.
You already use your brain a lot when writing code.
Sure, it’s simple. Yes, when you’re doing simple projects.
I don’t want to get frustrated in complicated projects because of this kind of thing.

I don’t want to have to remember yet another thing when I’m building complicated rendering engines. It slows me down and makes things unnecessarily complicated. Every bit of complexity is an extra burden on the brain.
And every burden that can be removed is one burden too many.

By unifying the different OpenGL versions I mean the APIs.
The specifications may be completely different and offer different functionality.
There is nothing wrong with that, as long as the APIs for the same functionality are the same.
I completely agree with Alfonse Reinheart on this:

Programmers work with the APIs, and if the APIs are the same, porting is very easy.

All the rest is really about being able to do major revisions without conflicting versions (as with OpenGL ES 1.0 and 2.0 currently), and about having backwards compatibility as an option.

Anyway, there are two more posts about my wish list:

A feature request about precision.

Bundling of pipelines/stream processors so that they act as a single wider, more precise pipeline/stream processor, enabling dynamic, programmable precision.

I have seen that with GLSL, shaders can be coupled (in series) to do multiple effects.

What if you could couple pipelines in parallel for enhanced precision?
Not for parallel processing, just for added precision.
E.g. couple eight pipelines, each with full 32-bit accuracy per component, into one combined pipeline with 8 × 32 bit = 256-bit accuracy.

With a good specification, this has the advantage of being very scalable.
If there is only one pipeline, it simply takes more time over the calculations and stores intermediate data in cache memory.
If pipelines are left over because of the size of the combined pipeline (e.g. 5 × 32 bit on 17 stream processors/pipelines leaves 17 mod 5 = 2 unused),
no problem; so be it.

This would allow dynamic, programmable precision, which is useful and welcome (essential?) in several areas.
Physics simulations, for instance.

But also in more mainstream applications.
Position calculations in games on huge maps without running into precision issues. (It will be slower than lower precision, but at least good animation and movement become possible.)

I got this idea after thinking about the precision problems in physics simulations, the fact that shaders can be chained one after another, and the fact that each pipeline has a certain precision while current graphics processors have a lot of them (ATI ships graphics cards with 800 stream processors/pipelines).

Position calculations in games on huge maps without running into precision issues.

Things like that almost never come up. In games with huge maps, the renderer will likely have culled objects that are that far away.

Further, if you’re rendering objects that far away, then your 24-bit depth buffer isn’t sufficient. And that is hardcoded; replacing it means losing performance features like Early-Z and Hi-Z/etc.

This seems more like something that should go into OpenCL as an option.

“I have seen that with GLSL, shaders can be coupled (in series) to do multiple effects.”

What are you talking about?? This is not possible and certainly will never be. You have to program such stuff yourself.

About the precision stuff: This is all OpenCL area. OpenGL/GLSL doesn’t (want to) know anything about stream processors, pipelines and all those nasty “details”.

If you want to do physics on the GPU, use OpenCL / Cuda, not GLSL, it is intended for such purposes.

Also, I am not sure one can use “more pipelines for more precision”. I don’t think the ALUs are designed to allow computations to be done that way.


My wish list would be: stop where you are until we have reliable drivers that support the current spec, at least from the major vendors. :slight_smile:

Stop? Please, don’t stop! (oooooh…eeeeeeee…ahhhhhh!)

:wink: :eek: :smiley: :cool:

Define sarcasm :slight_smile:

How to advance on the driver side is more important I guess.

BTW, what’s the current version Apple supports?

A guess, but I think he may be talking about huge arenas/worlds where float32 precision isn’t sufficient to represent world-space positions without insufferable precision errors.

The errors computing MODELVIEW in this case can often be avoided by doing your matrix math in float64 (double) on the CPU.

However, that still doesn’t mean you can go messing with world-space in your shader to the accuracy you need. You have to resort to local coordinate system tricks for that, since shaders can’t do float64.

Wasn’t being sarcastic. Damn the torpedoes… full speed ahead!


That’s the stuff I’m talking about.
Dark Photon is correct.
How would you do physics calculations that could generate an error, or unsolvably big problems, when precision isn’t high enough?
(It can happen; these kinds of physics models exist. Don’t tell me it can’t, because there can always be bugs in someone’s code.)

The program Celestia, a free and open-source space simulator, can simulate the universe (somewhat simplified) and renders a lot of stars. Many people want addons that place stars and planets in far-away galaxies, but they can’t because of accuracy issues: the stars would get stacked on top of each other. The same goes for spacecraft orbits; single precision is not good enough for these. And in some future cases, maybe even more precision than 64-bit floating point will be needed. It is also a problem with the telescope feature: viewing exoplanet stars and far-away bodies in our solar system is difficult.
For zooming with the telescope feature on a spacecraft located in a solar-system addon for Celestia, in a galaxy cluster 10^16 to 10^20 light years away, you definitely need more precision than any current datatype provides (even more than float128). :mad:


I try never to say:
“I don’t see a use now, so let’s assume it will never be needed.”

It’s like the people who say:
“If I can’t see it, it doesn’t exist!”
(Then they should be able to see atoms, but nobody can.)
People who say this are very short-sighted and egocentric.
They should be ashamed of even suggesting this sort of anti-progressive behaviour.
I don’t need this [censored] whining about how it’s not useful; get over it and realize other people might have other needs.

This forum is about discussing what could be improved or added in OpenGL,
not just about whining and adding the stuff that is missing compared to the newest DirectX version.

And with 64-bit datatypes the precision is merely better, not adjustable to everybody’s needs.
I don’t need it currently, but maybe someone will need it.
And it’s important to realize that!
I don’t know for sure, but neither can you be sure it is not needed.

Datatypes and parameters:
Here is a solution to precision problems, and also to encoding problems, in datatypes:
datatypes with parameters!

(These examples are just for illustration.)
integer:
int(32)   /* an integer with 32 bits, one bit reserved for the sign */
int()     /* an integer with a default size, could be 32-bit signed */
int(u,16) /* an unsigned integer with 16 bits */
int(512)  /* an integer with 512 bits */

float:
float(256) /* a float with 256 bits */

/* About strings, there is the encoding issue.
There are a lot of encodings, and you just don’t know which one the language is using under the hood, or you may want to force a certain encoding. String encoding parameters can add this kind of flexible behaviour. */

string(UTF8) /* a string with UTF8 encoding */
char(UTF8)   /* a character with UTF8 encoding */

These things also apply to OpenCL.
They solve the problem that it is sometimes unclear how many bits the compiler reserves for a datatype, so that exchanging source code produces different results on different computers, making debugging more difficult because more parameters are involved.

binding system once again

The binding system is totally useless.
Binding could be improved, or replaced by atomic operations.
Binding makes code larger and therefore harder to debug.
Binding takes up space while being completely unnecessary and replaceable with something better.
The binding system is bloat.

There is a problem with expectations: how people see 1.0 and 2.0 relate. The fixed-function pipeline should be noticeable in the name, for clarity and to avoid confusion among the general public. Ignoring this can harm OpenGL’s reputation.

How would you do physics calculations that could generate an error, or unsolvably big problems, when precision isn’t high enough?

Use OpenCL. If you’re doing physics calculations that are serious enough to need precision greater than a 32-bit float, you probably need a lot of things OpenCL provides that OpenGL does not.

Well, 10^20 light years is 9.46e35 metres. 2^128 is ~3.4e38, so 128-bit integers/fixed point have a fixed resolution of 2.8 mm over that distance. That’s quite a lot, considering that no data measured at such a distance can ever be that precise.

Anyway, you can’t just “bundle pipes” for more precision. Even the simplest example, a 2N-bit integer addition built from N-bit adders, needs to propagate the carry bit and would thus have to be performed in sequence. A wide multiplication requires several narrow multiplications and additions. Combining floats to form a wider float is a lot more complicated (and probably pointless if you have integers), as is any operation beyond multiplication and addition.

If you are willing to spend the time working out how to do it, you can do all that in sequence in one pipeline. No need to bundle them if you’re processing multiple vertices/fragments in parallel anyway.

With older cards I will only be able to use lower precision with OpenGL.

Combining a lot of pipes will make the combined pipe slower; that’s the way hardware works. Duh!
The point is that it would still be faster than doing it in steps in software. Every card will have a maximum, of course. But once this is present, even older cards could offer a lot of precision, making developers’ work easier in the future.
You’re saying that whether I bundle two 32-bit pipes or use a 64-bit datatype, the delay is theoretically the same.
The system software/hardware still has to do all of the necessary work in some way.

There is a large difference between software doing that in sequence and the hardware being able to do it.
Hardware acceleration will make a difference; it will be faster!
How much? I don’t know. It could be significant in some situations.
(Search for float64 operations-per-second benchmark comparisons on 32-bit and 64-bit OSes on 32-bit and 64-bit CPUs.)

A better illustration: right now, developers have to invent all sorts of algorithms to work around the precision problems,
for situations where older hardware can’t be ignored!

A lot of older graphics cards only have single precision, which makes it a lot more difficult to do big things on them.
It’s possible to write code that uses certain techniques to process things in steps and still be fast. Allowing bundling could avoid running into this problem a few more times in the future.
And it enables better performance, as described above in this post.

My point is that doing it in steps is slower than if the card can bundle its pipes, and on top of that it’s much more difficult to program the extra algorithms needed to handle older hardware in steps and stay fast.
How the internals work will be up to the IHVs and driver writers; how it behaves will be up to the OpenGL specification. OpenGL could also dictate a minimum and/or maximum for clarity.


I’m not talking about performance but about hardware complexity. You can’t simply connect two single precision FP ALUs to form one double precision FP ALU, that’s just not how it works. You seem to have no idea how expensive it would be to actually implement your suggestion.

(Search for float64 operations-per-second benchmark comparisons on 32-bit and 64-bit OSes on 32-bit and 64-bit CPUs.)

The execution speed of double precision operations on the FPU has absolutely nothing to do with the OS running in 32 bit or 64 bit mode.

It makes a difference on the application side.
I’m well aware that this suggestion is very complex to realize in hardware. This is an ideas list; I just let my imagination run freely.

In the end the OS does matter for application speed; I just included it for completeness. A different OS can also mean different drivers and a lot of other things that make a difference.

Textures, meanwhile, are very mature.
(Here comes my next suggestion.)
Something to put vector-based textures in memory:
not as bitmaps but as vectors, with the ability to do transformations and such on them. This belongs more in OpenVG, but it would probably be used a lot in OpenGL as well, and this is a list I started for OpenGL.
OpenGL support for vector textures, based on SVG 1.2.

(Eventually, when going on screen, the necessary parts would be rasterized, either into another texture or directly into the framebuffer.)


*GetString(enum name);

Is this usable everywhere?
If not, please allow it to be used everywhere in a program.