Since I have been running into these bugs as well, and AMD isn’t forthcoming about when (or if) they have fixed anything, nor have they released any beta drivers for a while, it is nice to have a tally of the current state of the OpenGL drivers. So when people wonder why something doesn’t work: it may not be your code, it could very well be driver bugs instead.
I’m continually fascinated by how far Intel have come in these tests. Their performance is getting close to competitive with other low-end GPUs (particularly in the mobile space) and when it comes to features I’d personally rather see “unsupported” than “fail”.
I’ve come to really like Intel for two reasons: they contribute heavily to the development and progression of the Linux graphics stack, and they have the most open hardware and software documentation out there. Another thing is, unlike AMD and NVIDIA, who have to care about many legacy applications and their respective clients, Intel decided not to implement ARB_compatibility - which is awesome.
Fun (or more or less depressing) facts about the big three here - in case you didn’t see it already.
[QUOTE=thokra;1259425] Another thing is, unlike AMD and NVIDIA, who have to care about many legacy applications and respective clients, Intel decided to not implement ARB_compatibility - which is awesome.
Yes, truly awesome. The old legacy apps don’t work on old Intel because it’s garbage and they don’t work on new Intel because it doesn’t have the compatibility profile.
Believe it or not, some developers are stuck with such software - and thanks to this crap we can’t do a gradual upgrade to modern features. It’s either all or nothing. So nothing it is and they stay as they are because a rewrite isn’t feasible.
You are aware that feature removal only takes effect if you explicitly request a 3.1+ core context when using Intel’s DRI driver, right? If you create a 3.0 context, you can still use every single feature OpenGL 2.1 and OpenGL 3.0 provide, plus a load of extensions.
Saying you can’t gradually port from legacy to 3.x is simply nonsense. Of course, if you want to use 3.1+ core features, you’ll have to get rid of all the removed stuff and port in one go, true. I’d first try to isolate everything you can port to GL 3.0 core features, do that, and let the rest follow when you have the time.
In principle, forcing users to port to GL3/4 core contexts would give vendors the opportunity to provide drivers that focus only on implementing the core profile and extensions, and to scrap their legacy code base. Oh well, we all know how it is instead.
You make it sound so easy but my guess is you never had a chance to look at the code involved when it comes to porting such old projects.
They normally come with a code base that has no optimization of the rendering flow, takes liberal advantage of the freedom immediate mode gives, and is nearly impossible to rewrite without addressing some fundamental design decisions first.
The code of the project I’m working on is inherently non-portable to core GL 3.x; it would necessitate a complete rewrite of our data management.
I can, however, port it to 4.x with persistently mapped buffers, but this can only be a gradual transition because the project is quite large. And with Intel not supporting a 4.x compatibility context, I’m stuck in a situation where, in order to keep everything working, all the old cruft needs to be retained throughout the entire transition. Try telling that to a boss who needs to be sold on ‘more efficiency’. The answer I got was a straight ‘no’: not worth the effort if we can’t clean up the code for months to come.
The old code, currently based at GL 3.0, is working fine, after all…
It still uses immediate mode, but thanks to GL 3.0 we were at least able to remove the fixed-function code. We haven’t done the matrices yet because it’s a waste of time until we get the big obstacle out of the way.
The main problem I am facing is that I can’t get rid of the immediate mode without using GL_ARB_buffer_storage’s persistently mapped buffers. Converting to core 3.x first and then upgrading is simply not an option for us.
And having to deal with hardware that does not allow both to coexist means I can’t do it gradually; it has to be done all at once. And that is plainly impossible.
Years ago I tried to replace immediate mode with a glBuffer(Sub)Data-based method. This translates badly. The buffer sizes are simply too small, and the only way to make it work is to reorganize all the data to accumulate larger amounts, which is way beyond the scope anyone is willing to go. And constant mapping/unmapping is even worse in this particular case.
That’s why the project never went further than immediate mode 3.0, because anything beyond that simply was not doable efficiently.
With a persistently mapped buffer I can keep everything as it is, and the code is actually faster, at least on NVIDIA. As for GL_ARB_buffer_storage, it is supported by recent drivers from all three major vendors. Fortunately, one thing I don’t have to think about here is old hardware.
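For reference, the approach described above - keeping the immediate-mode-style call flow but writing through a persistently mapped buffer - is usually built on a ring buffer split into per-frame regions. The following is a minimal sketch of the offset bookkeeping only (names like `RingBuffer` and `ring_alloc` are mine, not from any library); the actual GL calls, which need a live context, are indicated in comments:

```c
#include <stddef.h>

/* Hypothetical bookkeeping for a persistently mapped ring buffer.
 * In real code the backing store would come from something like:
 *   glBufferStorage(GL_ARRAY_BUFFER, total_size, NULL,
 *                   GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT);
 *   base = glMapBufferRange(GL_ARRAY_BUFFER, 0, total_size,
 *                   GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT);
 * and each region would be guarded by a glFenceSync/glClientWaitSync pair
 * so the CPU never writes a region the GPU is still reading. */
enum { FRAMES_IN_FLIGHT = 3 };

typedef struct {
    unsigned char *base;     /* pointer returned by glMapBufferRange */
    size_t region_size;      /* bytes reserved per in-flight frame */
    size_t cursor;           /* write offset inside the current region */
    int    frame;            /* which region we are writing this frame */
} RingBuffer;

/* Reserve `bytes` inside the current frame's region.  Returns the write
 * pointer (and the buffer offset to pass to the draw call), or NULL if
 * the region is full and the caller must draw/flush first. */
static void *ring_alloc(RingBuffer *rb, size_t bytes, size_t *out_offset) {
    if (rb->cursor + bytes > rb->region_size)
        return NULL;
    size_t offset = (size_t)rb->frame * rb->region_size + rb->cursor;
    rb->cursor += bytes;
    if (out_offset)
        *out_offset = offset;
    return rb->base + offset;
}

/* Call once per frame, after waiting on the fence that guards the
 * region about to be reused. */
static void ring_next_frame(RingBuffer *rb) {
    rb->frame = (rb->frame + 1) % FRAMES_IN_FLIGHT;
    rb->cursor = 0;
}
```

The point is that the per-draw code keeps its shape: instead of glVertex calls it memcpys into the pointer `ring_alloc` hands back, and no map/unmap happens per draw.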
There is no direct link between persistent buffer storage and immediate mode. Furthermore, persistent buffers are not the best possible solution for every problem programmers have with OpenGL. If the data is dynamic, they can boost performance, but if it is static, the classical approach is much better. They also gain no boost for transform feedback (tried in my application). Why would you use persistent buffer storage at all?
Of course it is the worst possible solution to map immediate-mode drawing (of a few vertices) directly to VBOs. It is much slower than immediate mode because of the significant overhead.
I also understand the significant amount of coding needed to optimize everything. If immediate mode suits you, just keep using it. I’m not sure about macOS, but on Windows/Linux you should be able to use the compatibility profile, easily mix the ancient and modern approaches, and slowly drift toward a better and more optimized solution. I’m not sure persistent buffer storage can always beat immediate mode if direct mapping is used, but if you measured the performance and concluded so, then that is excellent news.
And constant mapping/unmapping is even worse in this particular case.
Why? I’d assume an implementation caches immediate-mode commands anyway and executes them in a more optimal manner regarding transfer over the bus at the time you call glEnd(). Why would mapping, or updating via Buffer[Sub]Data(), not work just as well?
The buffer sizes are simply too small, the only way to make it work is to reorganize all the data to accumulate larger amounts, which is way beyond the scope anyone is willing to go.
How does GL_ARB_buffer_storage increase the amount of storage you can allocate for a buffer? I don’t get it. What you get with the extension are mechanisms to make buffer objects immutable (preventing client-side, but not server-side, updates), to map a pointer to buffer storage persistently into client space, and to do things like render-while-mapped (which was not possible otherwise). You also get a vague sort of hint as to where a buffer object’s data store is to be allocated (which, if you read the extension closely, you’ll find is pretty useless, especially with UMAs like Intel hybrid graphics, and is not even guaranteed to work as expected on other architectures), plus certain other controls for mapped buffers and synchronization.
As it stands, and to the best of my knowledge, implementations will not restrict how much stuff you upload to the GPU as long as there is memory. I did some tests a while ago, simply creating 1024k-sized VBOs, and let the code run until around 8GB of system memory was filled … and the implementation never complained with an OUT_OF_MEMORY error. Memory allocation of buffer objects is completely non-transparent, and GL_ARB_buffer_storage, although the attempt was made, did not solve this problem. As a developer you can only hope stuff is put where you expect it to be put.
That’s why the project never went further than immediate mode 3.0, because anything beyond that simply was not doable efficiently.
Maybe you should ask for ideas on that in a separate thread! We may be able to help, you know?
From what they (I don’t know whether Nikki is female or male here) told us so far, I assume they’re dealing with much more than a few vertices. Maybe something like point clouds? That’s why I suggested they open another thread and specify the problem at hand.
I’m not sure for MacOS, but on Windows/Linux
Neither Apple nor Intel (on Linux) will provide ARB_compatibility when using context versions higher than 3.2 - with these vendors on these platforms, you’re screwed if you want to stick to legacy stuff - which I personally find very sexy. We wouldn’t be discussing a more than 20-year-old API for new or refactored applications if ARB_compatibility had never come up …
This is really coming across as what Stack Overflow would call “a rant disguised as a question”.
Is immediate mode currently causing you a problem? If the answer to that is “no” then congratulations! You can continue using immediate mode. Nobody is forcing you to upgrade your code. Stick with GL3.0 and use immediate mode safe in the knowledge that there is so much more legacy software out there using it that it’s not going to go away any time soon.
If the answer is “yes” then evaluate your code. Buffer objects are nothing new - they’ve been part of core OpenGL since version 1.5! So stop treating them as some kind of scary new functionality, because they’re not. So: does your vertex data need to change from frame to frame? If not, you can put it in a static buffer object and just be done with it. Your glBegin/glEnd pairs (and everything between them) can each be converted to a single glDrawArrays and your job is done.
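The glBegin/glEnd-to-glDrawArrays conversion described above is largely mechanical: latch the current attribute state, append interleaved vertices to a client-side array, and issue one draw per pair. A minimal, hypothetical accumulator in C (the `Batch` names are my own; the actual upload and draw calls need a GL context, so they appear only in comments; error handling for realloc is omitted for brevity):

```c
#include <stdlib.h>

/* One interleaved vertex, matching a hypothetical legacy stream of
 * glColor3f + glVertex3f calls. */
typedef struct { float x, y, z; float r, g, b; } Vertex;

typedef struct {
    Vertex *data;
    size_t  count, capacity;
    float   r, g, b;         /* current color, like GL's current-vertex state */
} Batch;

/* Replacement for glColor3f: just latch the current color. */
static void batch_color(Batch *b, float r, float g, float blue) {
    b->r = r; b->g = g; b->b = blue;
}

/* Replacement for glVertex3f: append one vertex carrying the latched color. */
static void batch_vertex(Batch *b, float x, float y, float z) {
    if (b->count == b->capacity) {
        b->capacity = b->capacity ? b->capacity * 2 : 64;
        b->data = realloc(b->data, b->capacity * sizeof(Vertex));
    }
    Vertex v = { x, y, z, b->r, b->g, b->b };
    b->data[b->count++] = v;
}

/* Replacement for glEnd: in real code, upload and draw in one go, e.g.
 *   glBufferSubData(GL_ARRAY_BUFFER, 0, b->count * sizeof(Vertex), b->data);
 *   glDrawArrays(mode, 0, (GLsizei)b->count);
 * then reset for the next primitive batch.  Returns the vertex count drawn. */
static size_t batch_flush(Batch *b) {
    size_t drawn = b->count;
    b->count = 0;
    return drawn;
}
```

The rendering logic above the glBegin/glEnd calls stays untouched; only the bottom layer changes, which is exactly the path-of-least-resistance property being discussed in this thread.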
If your vertex data does need to change then evaluate the kind of changes it needs. Is this CPU-side code you can migrate to a vertex shader? A lot of what you think is dynamic data can actually be handled this way: frame interpolation, time-based stuff, etc. If it fits this description then you can still put it in a static vertex buffer and be done with it.
My experience is that the only real cases where vertex data needs to be absolutely dynamic are (1) a truly dynamic CPU-side particle system, or (2) 2D GUI code. Even in those cases you can still use glBegin/glEnd intelligently (i.e. put them outside a loop rather than inside one) and still get high performance. Or you can use client-side arrays (also not new/scary/risky - GL 1.1 this time). Like I said at the start, nobody is forcing you to use buffer objects, and nor is anybody forcing you to use the newer OpenGL. You’ve got an extremely rich set of options available, most of which are absolutely ubiquitously supported, so instead of complaining about problems, how about you start thinking about solutions?
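For the client-side array route mentioned above, the only fiddly part is getting stride and offsets right for an interleaved layout. A tiny sketch, assuming a made-up position-plus-texcoord vertex (the struct and its layout are my example, not anything from the GL headers):

```c
#include <stddef.h>

/* Hypothetical interleaved vertex for client-side arrays.  With legacy GL
 * you would point the fixed-function arrays at such an array like this:
 *   glVertexPointer  (3, GL_FLOAT, sizeof(V), (char *)verts + offsetof(V, pos));
 *   glTexCoordPointer(2, GL_FLOAT, sizeof(V), (char *)verts + offsetof(V, uv));
 *   glDrawArrays(GL_TRIANGLES, 0, n);
 * The stride is simply sizeof(V); each attribute pointer starts at that
 * attribute's offset within the struct. */
typedef struct {
    float pos[3];   /* x, y, z */
    float uv[2];    /* s, t    */
} V;
```

With all-float members there is no padding on common ABIs, so the stride is 5 floats and the texcoords start 3 floats in; the same arithmetic carries over unchanged if you later move the array into a buffer object.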
[QUOTE=mhagain;1259580]If your vertex data does need to change then evaluate the kind of changes it needs. Is this CPU-side code you can migrate to a vertex shader? A lot of what you think is dynamic data can actually be handled this way: frame interpolation, time-based stuff, etc. If it fits this description then you can still put it in a static vertex buffer and be done with it.
Great! Always the same suggestions that don’t help. I have said before that reorganizing the vertex data would be prohibitive due to the work involved. It simply can’t be done without rewriting half of the application’s rendering code and that’s completely out of the question. I do not want to do such a rewrite, it’d be months of work for no gain.
You always make it sound so simple: just rethink your approach and all will be fine. Well, in the real world it won’t be fine! In the real world you have to deal with huge, crufty code bases that do not like being torn apart and reassembled. The only chance you have is to take the path of least resistance, which in this case means not touching the rendering logic itself, only changing the means of getting your data onto the GPU with as little change to the code as possible.
We put off the transition to the core profile because of that - it simply was too slow - and now, with persistently mapped buffers, we finally have the chance. But what’s stopping us cold is the plain and simple fact that a lot of the computers this needs to run on during the transition phase do not have a dedicated graphics card, thanks to some bean counters deciding that integrated chipsets have become good enough.
As I said earlier, nobody is forcing you to, you’ve said yourself that the current code is working fine, so why on earth would you rewrite it?
I’m totally failing to see what problem you’re facing here. You have a codebase that you say is working fine, you have no pressing need to rewrite it, so not rewriting it is always an option. It will continue to work fine.
If you really really really want to rewrite it you do have other options. You can use client-side vertex arrays for dynamic objects, you can gradually transition without disrupting your codebase too much, and slowly put yourself in a position where you can make the jump to 4.x, but the important thing is that not rewriting it at all is also an option.
That’s why I also said “rant disguised as a question” above, you know. Because you seem to be ignoring the fact that you don’t have to rewrite this code, and focussing instead on complaints and negatives. You don’t have to rewrite, so don’t.