Thank you, Alfonse, for the thorough explanation. I don't consider myself hopelessly uneducated in the matter, so I won't take offense. I have taken a few graphics courses at university (with OpenGL and DX), read dozens of samples, and coded smaller applications myself too.
Coming to graphics from the GPGPU side, especially from OpenCL, this globally visible context, whose only close tie is to the OS device context, is a really troubling concept. OpenCL is a lot more flexible and powerful in that sense: command queues serve as a really good means of synchronizing your code as well as feeding devices with kernels. In fact, in interop applications I will definitely integrate OpenGL routines into this mechanism via clEnqueueMarker and event callback functions that execute drawing asynchronously. (Yet again, there is an obscure, globally visible command queue tied to a DC that can be flushed and finished, but that's about it.) Out-of-order queues are even better, coming close to black magic if used properly.
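Roughly what I have in mind, as a sketch only: the marker completes once the commands enqueued before it finish, and the callback then signals the drawing. The request_redraw signal is a placeholder for whatever the application uses; note that the callback arrives on a driver thread, where no GL context is current.

    #include <CL/cl.h>

    /* Hypothetical app-side signal; the callback runs on a driver thread,
       so it should not call GL directly unless a context is current there. */
    extern void request_redraw(void *user_data);

    static void CL_CALLBACK on_marker_complete(cl_event ev, cl_int status, void *user_data)
    {
        if (status == CL_COMPLETE)
            request_redraw(user_data);  /* wake the render thread to draw */
        clReleaseEvent(ev);
    }

    void enqueue_then_draw(cl_command_queue queue, cl_kernel kernel, size_t n)
    {
        cl_event marker;
        clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &n, NULL, 0, NULL, NULL);
        clEnqueueMarker(queue, &marker);  /* completes after the commands before it */
        clSetEventCallback(marker, CL_COMPLETE, on_marker_complete, NULL);
        clFlush(queue);                   /* ensure submission so the event can fire */
    }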
Anyhow, as you have said, OpenGL is a bad C API, while OpenCL is visibly more flexible. (I know this comes from the fact that OpenGL has been evolving, sometimes having to keep backward compatibility, whereas OpenCL mainly wishes to mimic CUDA with a somewhat more flexible host-side API.) That said, would it really be the work of the devil to suggest that OpenGL expose more of this type of functionality as part of the API (and do some refactoring as well)? It is quite a pain that querying interoperable devices takes 40-50 lines minimum of platform-specific code messing around with window handles, pixel format setup, and device contexts. I know these have to be dealt with at some point when writing a windowed application, but it would be a lot more elegant if a gl_context type existed that could be cast to an HDC or whatever the OS-specific type is.
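For comparison, here is roughly the shape of that dance on Windows, once a GL context already exists (a sketch only, error handling omitted): you cannot even ask which device is interoperable without first obtaining an HGLRC and an HDC from the windowing system, and the query itself is an extension entry point.

    #include <CL/cl.h>
    #include <CL/cl_gl.h>
    #include <windows.h>

    cl_device_id find_interop_device(cl_platform_id platform)
    {
        /* The query is an extension function that must be loaded by name. */
        clGetGLContextInfoKHR_fn getGLContextInfo = (clGetGLContextInfoKHR_fn)
            clGetExtensionFunctionAddressForPlatform(platform, "clGetGLContextInfoKHR");

        /* ...and it needs the live GL context and device context as input. */
        cl_context_properties props[] = {
            CL_GL_CONTEXT_KHR,   (cl_context_properties) wglGetCurrentContext(),
            CL_WGL_HDC_KHR,      (cl_context_properties) wglGetCurrentDC(),
            CL_CONTEXT_PLATFORM, (cl_context_properties) platform,
            0
        };

        cl_device_id device = NULL;
        getGLContextInfo(props, CL_CURRENT_DEVICE_FOR_GL_CONTEXT_KHR,
                         sizeof(device), &device, NULL);
        return device;  /* the device currently driving the GL context */
    }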
OpenCL 1.2 brings access to built-in kernels, i.e. device-native functions such as fixed-function H.264 encoders, for example. This allows simulation output (forgive me for the example, I am a physicist) to be rendered directly to a movie, allowing offline analysis and saving HEAPS of HDD space. Also, if rendering is not fast enough to be interactive, this could serve as a last resort. (Not to mention games could capture movies without interfering with actual gameplay.) Now, wouldn't it be nice to render to texture and encode to a movie without any windowing system's intervention? Why do I have to ask the OS for a DC? (Because OpenGL doesn't have its own.)
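Something along these lines, as far as I can tell from the 1.2 spec (a sketch; the encoder kernel name is made up, since real names are vendor-specific and have to be read from CL_DEVICE_BUILT_IN_KERNELS):

    #include <CL/cl.h>
    #include <stdio.h>

    cl_kernel get_builtin_encoder(cl_context ctx, cl_device_id dev)
    {
        /* The device advertises its built-in kernels as a semicolon-separated list. */
        char names[1024] = "";
        clGetDeviceInfo(dev, CL_DEVICE_BUILT_IN_KERNELS, sizeof(names), names, NULL);
        printf("built-in kernels: %s\n", names);

        /* "vendor_h264_encoder" is a placeholder; use a name from the list above. */
        cl_int err;
        cl_program prog = clCreateProgramWithBuiltInKernels(
            ctx, 1, &dev, "vendor_h264_encoder", &err);
        if (err != CL_SUCCESS)
            return NULL;
        return clCreateKernel(prog, "vendor_h264_encoder", &err);
    }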
About the philosophical differences… actual use of VAOs came across my path just last week, but yes, this is only a drop in the ocean of state-modifying calls.
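For anyone who has not met them: a VAO records the vertex-attribute setup once, so the draw loop collapses to a single bind. A sketch, assuming a GL 3.x context (loaded e.g. via GLEW) and an existing buffer vbo of tightly packed vec3 positions:

    #include <GL/glew.h>

    GLuint make_vao(GLuint vbo)
    {
        GLuint vao;
        glGenVertexArrays(1, &vao);
        glBindVertexArray(vao);           /* the state set below is recorded in vao */
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, (void *)0);
        glEnableVertexAttribArray(0);
        glBindVertexArray(0);
        return vao;
    }

    /* Per frame, one bind replaces the whole attribute setup: */
    /*   glBindVertexArray(vao);                               */
    /*   glDrawArrays(GL_TRIANGLES, 0, vertex_count);          */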
I am well aware that all of this is a dream, but what really surprises me is that nobody says: yes, it would be better. The only statement on this point so far was that ARB dev time can be put to better use. I am no expert and don't know how painful ARB extensions are to produce, so I'll believe whatever you say about this.