Currently I find myself wanting to know what I can expect when allocating memory for buffers and textures. What will go into VRAM? What will be put into system memory, and when? How do implementations handle swapping data between the two? How efficient is that swapping mechanism for a particular implementation? I know: the basic question has been asked many, many times and the answers are usually either “you can’t get info on what memory and how much of it is currently in use by the implementation” or “if you have an AMD or an NVIDIA GPU you can use their proprietary extension to a certain degree.” So far, so good.
What I’d like to know is why exactly, ideally with some driver dev insight, this seems to be next to impossible to standardize. So far we have two proprietary extensions, namely GL_NVX_gpu_memory_info and GL_ATI_meminfo. Obviously vendors at least understand the desire to be able to query information about memory usage. I can understand that different hardware does things differently. I can understand that different platforms do things differently. What I cannot understand is why implementations, which have to use that info anyway to do all the memory management magic under the covers, can’t allow the developer to query it in a uniform manner. The only piece of information we get is GL_OUT_OF_MEMORY - and we don’t necessarily get that when there’s no VRAM left - we get this error when the implementation cannot get any memory at all, system memory for the current process included. Case in point: I can easily allocate 8GB (I stopped at this size) of VBO memory on an HD7970 with 3GB of VRAM in chunks of 1MB per VBO on Linux.
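To illustrate what the non-uniform situation looks like in practice, here is a minimal sketch of a wrapper over the two vendor extensions. The enum values are the ones published in the GL_NVX_gpu_memory_info and GL_ATI_meminfo specifications, and both report sizes in kB. Everything else is an assumption for illustration: `free_video_memory_kb`, `has_extension` and `GetIntegervFn` are hypothetical names, the caller is assumed to supply the space-separated extension string and a function with the shape of `glGetIntegerv`, and returning -1 for "unknown" is just one possible convention.

```cpp
#include <cassert>
#include <cstring>

// Enum values as published in the two extension specifications.
constexpr unsigned GPU_MEMORY_INFO_CURRENT_AVAILABLE_VIDMEM_NVX = 0x9049; // GL_NVX_gpu_memory_info
constexpr unsigned TEXTURE_FREE_MEMORY_ATI = 0x87FC;                      // GL_ATI_meminfo

// Same shape as glGetIntegerv(GLenum, GLint*). Passed in by the caller so the
// sketch does not depend on a particular GL loader or a live context.
using GetIntegervFn = void (*)(unsigned pname, int *data);

// True if `name` occurs as a whole space-separated token in `extensions`.
bool has_extension(const char *extensions, const char *name) {
    const std::size_t len = std::strlen(name);
    for (const char *p = std::strstr(extensions, name); p != nullptr;
         p = std::strstr(p + len, name)) {
        const bool starts = (p == extensions) || (p[-1] == ' ');
        const bool ends = (p[len] == ' ') || (p[len] == '\0');
        if (starts && ends) return true;
    }
    return false;
}

// Returns the free video memory in kB as reported by whichever vendor
// extension is present, or -1 if neither is available (hypothetical convention).
long free_video_memory_kb(const char *extensions, GetIntegervFn get) {
    assert(extensions != nullptr && get != nullptr);
    if (has_extension(extensions, "GL_NVX_gpu_memory_info")) {
        int kb = 0; // NVX writes a single value: currently available dedicated memory
        get(GPU_MEMORY_INFO_CURRENT_AVAILABLE_VIDMEM_NVX, &kb);
        return kb;
    }
    if (has_extension(extensions, "GL_ATI_meminfo")) {
        int info[4] = {0, 0, 0, 0}; // ATI writes four values per pool
        get(TEXTURE_FREE_MEMORY_ATI, info);
        return info[0]; // [0] = total free memory in the texture pool
    }
    return -1; // no vendor extension: nothing to query
}
```

In a real application one would pass `glGetIntegerv` itself and build the extension string from `glGetStringi` in a core profile. Note that even this tiny wrapper has to paper over a semantic mismatch: the NVX value describes dedicated video memory as a whole, while the ATI value is per pool (texture, VBO, renderbuffer), so the two numbers are not strictly comparable.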
One could simply rely on the implementation to do the right thing and handle memory transfers according to the usage at runtime. Fine, I can live with that in principle. But what if I don’t want to? Or, at the very least, what if I want to know what the driver does under the hood so I can choose my usage patterns accordingly? Especially in volume rendering, 1-3 GB of high-speed memory is nothing unless you employ compression algorithms like COVRA. In my daily business, 16GB of volume data is considered standard, so streaming will most likely occur even with compression, unless one is able or willing to sacrifice an unreasonable amount of precision. In that case, knowing the performance characteristics of what the driver does internally is crucial.
Another thing: If it’s possible to get some memory info in a cross-platform way in OpenCL (even cache and cache line sizes), why not in OpenGL?
I guess this all boils down to the following questions:
[ul]
[li]What differences in hardware and OSs prevent a standardized API from being conceived?[/li]
[li]If we already have two proprietary extensions, is determining a least common denominator really impossible?[/li]
[li]Is it possible to get some more info on what implementations do when eviction occurs? Maybe get some performance info through ARB_debug_output/KHR_debug?[/li]
[li]What about performance counters? At least AMD’s GPUPerfAPI provides a host of memory-related counters.[/li]
[/ul]