-What happens if I upload more textures than the remaining VRAM can hold?
For example, uploading eight 2048*2048 BGRA8 textures to a 128MB GPU: what happens on the 7th or 8th upload? Will the driver "virtualize" the memory, swapping between VRAM and RAM when needed, the way Windows virtualizes RAM using the hard drive?
-Is there a way to know how much space is still available in VRAM?
-Is there a tool/database that lists the capabilities (functions, limitations) of GPUs? I used to use GLInfo, but it has been discontinued.
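To put numbers on the eight-texture example, here is a rough sketch of the footprint arithmetic. The function name `texture_bytes` is made up for illustration, and it assumes tightly packed, uncompressed texels; a real driver may add padding or alignment on top of this.

```c
#include <stdint.h>

/* Hypothetical helper: bytes used by one uncompressed 2D texture.
   bytes_per_texel is 4 for BGRA8. If with_mipmaps is non-zero, the
   sizes of all mip levels down to 1x1 are summed (about 4/3 of the
   base level). Assumes no driver padding or alignment. */
uint64_t texture_bytes(uint32_t width, uint32_t height,
                       uint32_t bytes_per_texel, int with_mipmaps)
{
    uint64_t total = 0;
    for (;;) {
        total += (uint64_t)width * height * bytes_per_texel;
        if (!with_mipmaps || (width == 1 && height == 1))
            break;
        if (width > 1)  width /= 2;   /* next mip level is half as wide */
        if (height > 1) height /= 2;  /* and half as tall               */
    }
    return total;
}
```

Under those assumptions, `texture_bytes(2048, 2048, 4, 0)` is 16 MiB, so eight such textures already fill 128 MiB on their own, before counting the framebuffer or mipmaps.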
Vista had hinted that the capability exists, and most GPUs probably have a TurboCache-like interface. So I ran a test under OpenGL 3.2 with a GF8600GT on WinXP SP2: I loaded 640MB of texture data, and the pixels were all there. The only cost was that the framerate fell from 100 to 84 fps (10ms to 12ms), since PCIe was getting loaded.
So, yes: VRAM is getting virtualized, at least on G80+ cards. I can't test it on a Radeon HDxxxx right now.
Pretty sure it has done this since the AGP days. It was slow as hell back then, at least 10 times slower.
Ah, awesome! So I don't have to write a streaming engine (I'm not coding a game, so automatic swapping would be good enough).
How can I be sure all GPUs have this function? Is "TurboCache" the official name for it?
I did some more testing, because what V-Man described on the gamedev.net forums simply could not run at 84-204 fps in the test case I mentioned.
I removed MSAA from my benchmarking app, and these are the numbers (you can convert them to the more relevant milliseconds):
Btw, filtering is trilinear and each texture has mipmaps. The viewport is 1280x720.
case 1: real level + a lot of CPU work (computing a skinned mesh of 40k vertices, for research):
Small textures: 209 fps (all textures are visible onscreen, btw)
900MB textures: 109-190 fps
900MB textures, including a z-pass to remove overdraw: 140-204 fps
case 2: real level only.
Small textures: 740-1310 fps
900MB textures: 190-400-1200 fps (depending on the view: zoomed on complex scene / regular view / bird's eye)
900MB textures with a z-pass: 300-450-1100 fps
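For anyone who wants the milliseconds, a minimal sketch of the conversion (the helper names `fps_to_ms` and `added_ms` are just illustrative):

```c
/* Frame time in milliseconds for a given frame rate. */
double fps_to_ms(double fps)
{
    return 1000.0 / fps;
}

/* Extra milliseconds per frame when the frame rate drops
   from fps_before to fps_after. */
double added_ms(double fps_before, double fps_after)
{
    return fps_to_ms(fps_after) - fps_to_ms(fps_before);
}
```

For example, `fps_to_ms(84.0)` is about 11.9 ms, and `added_ms(209.0, 109.0)` is about 4.4 ms, which is the worst-case cost of the 900MB texture set in case 1.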
But in my case, my RAM and CPU are heavy hitters: C2D E8500 @ 3.8GHz, dual-channel DDR3 @ 1.6GHz (clk 7).
If OpenGL were uploading/replacing whole textures on the GPU, it would have required over 1TB/s of PCIe bandwidth (whereas my PCIe 1.0 card can only give 4GB/s). Thus the GPU is definitely fetching individual texels (or blocks of texels) on its own via DMA.
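The 1TB/s figure comes from multiplying the texture set size by the frame rate. A rough sketch of that estimate, assuming the worst case where the whole 900MB set crosses the bus every frame, and using decimal units (1 GB = 1000 MB); the function name is hypothetical:

```c
/* Hypothetical helper: bus bandwidth in GB/s that would be needed if
   the entire texture set (in MB) were re-uploaded once per frame.
   Uses decimal units: 1 GB = 1000 MB. */
double full_reupload_gb_per_s(double texture_set_mb, double fps)
{
    return texture_set_mb * fps / 1000.0;
}
```

With the numbers above, `full_reupload_gb_per_s(900.0, 1200.0)` gives 1080 GB/s (about 1.08 TB/s), and even the slowest view at 190 fps would need 171 GB/s, far beyond the 4GB/s the card can deliver.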
Excellent, thanks for the numbers!