Bandwidth optimization

Originally posted by Korval:
I’ve never heard anything about this before. There is a good reason to interleave based on linearity of data (DRAM likes linear reads), but I’d never heard anything about a numeric limit on the number of VBOs to use.
Neither have I, but then I didn’t ask.
I still believe it is conceivable that a memory controller could have such a limitation. E.g., AMD disclosed that the K7 prefetching logic can have a maximum of six transactions (= cache lines) in flight.
I realize this example is not at the actual memory controller level.

You guys really should get out more :)

If you read the DirectX 8 and 9 documentation on vertex streams (which are very similar to OpenGL attribute pointers), you will notice that there is a “cap” for how many attribute streams are allowed. A “pure” device for some of the current popular high-end cards returns “2” for this value. The explanation is that different cards have different hardware capabilities, although I expect that for some cards this number may be a driver limitation rather than a hardware limitation. I.e., YMMV :)

Anyway, memory linearity is your friend. So interleave where you can.
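For anyone unsure what interleaving looks like in practice, here is a minimal sketch in plain C. The `Vertex` layout and the helper names (`interleave_vertices`, `vertex_stride`, `normal_offset`, `texcoord_offset`) are made up for illustration; nothing here comes from a driver or from the posts above. It only shows how three separate attribute arrays can be packed so that each vertex is one contiguous read.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

/* Hypothetical layout: all attributes of one vertex sit side by side. */
typedef struct {
    float position[3];
    float normal[3];
    float texcoord[2];
} Vertex;

/* Copy three separate attribute arrays into one interleaved array.
   count is the number of vertices; out must hold count entries. */
void interleave_vertices(const float *positions, const float *normals,
                         const float *texcoords, size_t count, Vertex *out)
{
    assert(positions && normals && texcoords && out);
    for (size_t i = 0; i < count; ++i) {
        for (int c = 0; c < 3; ++c) {
            out[i].position[c] = positions[3 * i + c];
            out[i].normal[c]   = normals[3 * i + c];
        }
        for (int c = 0; c < 2; ++c)
            out[i].texcoord[c] = texcoords[2 * i + c];
    }
}

/* Stride and byte offsets you would hand to the attribute pointer calls. */
size_t vertex_stride(void)   { return sizeof(Vertex); }
size_t normal_offset(void)   { return offsetof(Vertex, normal); }
size_t texcoord_offset(void) { return offsetof(Vertex, texcoord); }
```

With the interleaved array uploaded to a single buffer, every attribute pointer (e.g. `glVertexPointer`, `glNormalPointer`, `glTexCoordPointer`) would use the same stride, `sizeof(Vertex)`, and differ only in its byte offset. Whether this actually beats separate arrays on a given card is something to measure, not assume.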

If you read the DirectX 8 and 9 documentation

The last time I attempted to attribute something in DirectX to hardware limitations (namely, that you should go to great lengths to send batches that are as large as possible to the GPU, using as few D3D DrawPrimitive/GL glDraw* calls as possible), Cass told me a very different story. As such, you should take with a grain of salt any claim about which limitations are imposed by the API and which ones are actually hardware.

You could profile it. However, there’s no way to tell the difference between this factor and DRAM random-access latency, so a drop in performance can’t be directly attributed to this factor.