Writing to AGP Memory using VAR

Simple question - probably been asked before, but can’t find a positive answer.
I upload my vertex array to AGP memory. Every frame, I want to change just the ‘y’ value for every vertex.
Is it best for me to keep my own copy of the vertex array in system memory, change the ‘y’ values in that array, then (after my fence has finished) do a ‘memcpy’ to copy the entire system memory vertex array onto the agp memory vertex array? Or should I stick to my loop (below):-

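float* v = agp_varray + YOFFSET;
float* h = height_array;
for (i = 0; i < numverts; i++) {
    *v = *h++;
    v += VERTEX_STRIDE;  /* assumed name: floats per vertex; the loop as first posted never advanced v */
}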

Thanks for listening, and sorry if you’ve heard this one before.

[This message has been edited by knackered (edited 01-15-2002).]


Technically you should update your system memory shadow, and then blit it across using sequential writes to take advantage of write combining.
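For reference, here is a rough sketch of the shadow-then-blit idea in C. The vertex layout (three tightly packed floats with y in the middle) and the names FLOATS_PER_VERTEX, Y_OFFSET, update_shadow_heights and blit_shadow are assumptions made up for illustration, and the destination is just a pointer that would stand in for the AGP buffer.

```c
#include <stddef.h>

/* Assumed layout: x, y, z per vertex, tightly packed. */
enum { FLOATS_PER_VERTEX = 3, Y_OFFSET = 1 };

/* Step 1: change only the y values, in the system-memory shadow copy. */
static void update_shadow_heights(float *shadow, const float *heights,
                                  size_t numverts)
{
    for (size_t i = 0; i < numverts; i++)
        shadow[i * FLOATS_PER_VERTEX + Y_OFFSET] = heights[i];
}

/* Step 2: copy the whole shadow across in ascending address order, so
   every write to the destination is sequential. Call this only after
   the fence reports that the GPU has finished with the buffer. */
static void blit_shadow(float *agp_dst, const float *shadow, size_t numverts)
{
    size_t count = numverts * FLOATS_PER_VERTEX;
    for (size_t i = 0; i < count; i++)
        agp_dst[i] = shadow[i];
}
```

The point of the split is that the strided y writes land in cached system memory, while the destination only ever sees a straight front-to-back run of writes.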

However, I tried doing a particle test in AGP RAM, and found it was much quicker to just loop through the AGP memory, modifying vertices as I went. Hardly optimal, but turned out faster for some reason.

As a side note, don’t use memcpy for copying arrays to AGP, as it isn’t very efficient. A few simple lines of assembler using rep movsd would probably be better. Or even a loop in C, with an int pointer, copying DWORDs at a time.
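A minimal sketch of that kind of C loop is below. The name copy_dwords is made up, and it assumes both pointers are 4-byte aligned and the size is a multiple of 4. Bear in mind that an optimizing compiler may well turn a loop like this back into a memcpy call, so it is worth timing rather than assuming it wins.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy `bytes` bytes one 32-bit word at a time, in ascending address
   order. Assumes 4-byte aligned pointers and bytes % 4 == 0. The
   buffers are treated as raw 32-bit words; strictly speaking, pointing
   this at float data runs into C's aliasing rules, which memcpy avoids. */
static void copy_dwords(void *dst, const void *src, size_t bytes)
{
    uint32_t *d = dst;
    const uint32_t *s = src;
    size_t words = bytes / sizeof(uint32_t);

    for (size_t i = 0; i < words; i++)
        d[i] = s[i];
}
```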


P.S. If you want the fastest solution, try every method you can think of and time them. It’s the only true way to know which is best.
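Something along these lines would do for a first comparison. It is only a sketch: update_fn, time_update and copy_floats are hypothetical names, and clock() measures CPU time on plain memory here, so for a real test you would point it at the actual AGP buffer and ideally time whole frames.

```c
#include <stddef.h>
#include <time.h>

/* Common signature for every candidate update/copy routine. */
typedef void (*update_fn)(float *dst, const float *src, size_t count);

/* One example candidate: a plain element-by-element copy. */
static void copy_floats(float *dst, const float *src, size_t count)
{
    for (size_t i = 0; i < count; i++)
        dst[i] = src[i];
}

/* Run `fn` `iterations` times and return the average CPU seconds per call. */
static double time_update(update_fn fn, float *dst, const float *src,
                          size_t count, int iterations)
{
    clock_t start = clock();
    for (int i = 0; i < iterations; i++)
        fn(dst, src, count);
    clock_t end = clock();

    return (double)(end - start) / CLOCKS_PER_SEC / iterations;
}
```

Write each method as its own update_fn, call time_update on each with the same buffers and iteration count, and compare the numbers.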

[This message has been edited by Nutty (edited 01-15-2002).]

The memcpy() that comes with MSVC version 6 uses rep movsd internally, so it’s about as efficient as the assembly that you’re talking about. If you want significantly better speed, you’d have to code a more advanced loop that pre-reads large chunks into the L1 cache (to keep the DRAM page open) and then copies from there into the AGP buffer using aligned uncached (non-temporal) writes (typically movntps, or possibly movntq for better compatibility with AMD CPUs).
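A rough sketch of such a loop using SSE intrinsics rather than hand-written assembly follows; _mm_stream_ps is the intrinsic that maps to movntps. Note that it swaps the explicit pre-read for prefetch hints, which is a variation on the idea rather than the same thing. The name stream_copy, the 4 KB chunk size and the 64-byte cache line are all assumptions, and both buffers must be 16-byte aligned with a float count that is a multiple of 4.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE: _mm_load_ps, _mm_stream_ps, _mm_prefetch, _mm_sfence */

enum { CHUNK_FLOATS = 1024 };  /* 4 KB per chunk; assumed, tune by measurement */

/* Assumes dst and src are 16-byte aligned and count is a multiple of 4. */
static void stream_copy(float *dst, const float *src, size_t count)
{
    for (size_t base = 0; base < count; base += CHUNK_FLOATS) {
        size_t left = count - base;
        size_t n = left < CHUNK_FLOATS ? left : CHUNK_FLOATS;

        /* Pass 1: ask for the chunk to be pulled into cache, one hint
           per assumed 64-byte line (16 floats). */
        for (size_t i = 0; i < n; i += 16)
            _mm_prefetch((const char *)(src + base + i), _MM_HINT_T0);

        /* Pass 2: aligned loads from the source, non-temporal stores
           (movntps) to the destination, four floats at a time. */
        for (size_t i = 0; i < n; i += 4)
            _mm_stream_ps(dst + base + i, _mm_load_ps(src + base + i));
    }

    /* Order the streamed stores before anything that follows, such as
       telling the GPU the buffer is ready. */
    _mm_sfence();
}
```

Whether this beats a plain memcpy depends heavily on the CPU and chipset, so treat the chunk size and the prefetch pass as things to measure rather than fixed choices.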

And modifying your system memory buffer and then blitting it to AGP is definitely the right thing to do, especially if you pre-read the buffer before modifying it (again, because of the cache).

If you’re not going this extra distance, chances are it’s going to come out a wash whichever way you write it.