So I was looking through some NVIDIA presentation slides, and in this document, on slide 52, it states that the order of GPU resource allocation can affect performance; specifically, that render targets, shaders, and textures should be allocated in that order for best results.
I was wondering if somebody could shed some light on the reasoning behind this (not just for NVIDIA cards necessarily, but any architecture). Personally, I'd guess it has something to do with the efficiency of paging resources to and from video memory, and that as a result the ordering would only matter while paging events were occurring. I'm guessing the actual VRAM address of a given resource doesn't matter - all VRAM is created equal, no?
Right now I’m working on an app where plenty of swapping is taking place as the user moves around the world, so any info even remotely related to this topic would be appreciated. Thanks!
I’ve seen words to that effect elsewhere, as well. I think the idea is that you want to be sure that RTs and shaders are always in local video memory, no matter what. Textures tend to be swapped in and out, so they’re generally less of a concern, which is why they come last. I guess you could look at it as a priority scheme, where the priority is implicit in the order of allocation. (shrug)
There’s also the so-called UMA (unified memory architecture), which supposedly makes the driver responsible for managing all resources, keeping them entirely in system memory and paging them in as needed. That’s a driver-dependent feature, so I’m not sure what to make of it. Not really sure why I mentioned it, except that I was puzzled by it.
Personally, I find all this troubling and I sincerely hope the day soon arrives when we needn’t worry about such things as the order of resource allocation. In my situation it’s actually pretty straightforward and quite natural to allocate in that order anyway, since my RTs and shaders tend to be relatively permanent fixtures.
I found out about this the hard way, when I noticed that changing resolutions on the fly (with the implicit size change in render targets) sometimes seriously hurt performance.
It’s annoying to delete and recreate all of my GL objects for a simple window resize.
It’s basically a question of fragmentation.
If you allocate static stuff first (stuff that doesn’t get swapped out), it all ends up packed at one end of the address space, and the textures can then fragment the rest of memory freely without interference from static blocks. If instead a static block lands in the middle of a fragmented memory space, it pins that region down permanently, splitting the free space and preventing large contiguous allocations later.
Thanks for the info guys. I’ve begun to re-work some of my loading code to reorder things and I’ve seen the framerate change for the better in some cases and for the worse in others. I’m still not done yet, so I guess the jury is still out. My ordering is going to be something like:
- Render Targets
- Vertex Buffers
The vertex buffers make me a bit nervous; if I were writing drivers for these things, I might page out the vertex buffers in deference to textures, or even go so far as to draw them directly from system memory across the bus if there wasn’t room on the card. Has anyone come across more info regarding where VBOs should sit in the allocation order?
I think korval has it: beware of fragmentation. Put the most dynamic stuff last, and hope the static stuff is small enough to stay in VRAM.
It has even been suggested that a total eviction of resources can help in the fragmentation department (like between levels of a game, or similar working set changes).
All the swapping is likely to be based on a driver dependent eviction strategy, probably time-stamp/LRU based, so I might give textures and VBOs even odds. Surely RTs and shaders have an intrinsically higher priority than textures and VBOs; that’s about as far as I’m willing to venture.