First glTexSubImage2D on a New Texture is Slow

Hey all,

I’m having an issue with asynchronous uploads from a PBO to a texture. Everything works, but the very first glTexSubImage2D call on a new texture takes ~15ms, whereas each subsequent call takes ~0.03ms.

In my test program the code flow is:

  • Create a 1024x1024 RGBA8 texture
  • Create a PBO of 64x64x4 bytes (enough for one 64x64 RGBA8 tile)
  • Upload a flat solid colour to the PBO, which will be used to fill the texture
  • Keep running the program for 5 seconds to give the driver time to copy memory if need be
  • Start copying data from the PBO to the texture, one 64x64 tile per call

The relevant code:

uint textureHandle;
uint pboHandle;
const uint pboSize = 64 * 64 * 4;
byte[] uploadData = new byte[pboSize];

unsafe void InitialiseTexture()
{
    // Create the texture
    textureHandle = Gl.GenTexture();
    Gl.BindTexture(TextureTarget.Texture2d, textureHandle);
    Gl.TexImage2D(TextureTarget.Texture2d, 0, InternalFormat.Rgba8, 1024, 1024, 0, PixelFormat.Rgba, PixelType.UnsignedByte, IntPtr.Zero);

    // Create the PBO
    pboHandle = Gl.GenBuffer();
    Gl.BindBuffer(BufferTarget.PixelUnpackBuffer, pboHandle);
    Gl.BufferData(BufferTarget.PixelUnpackBuffer, pboSize, IntPtr.Zero, BufferUsage.DynamicDraw);

    // Upload data to the PBO
    Gl.BindBuffer(BufferTarget.PixelUnpackBuffer, pboHandle);
    var destPtr = Gl.MapBufferRange(BufferTarget.PixelUnpackBuffer, IntPtr.Zero, pboSize, Gl.MAP_WRITE_BIT);

    fixed (byte* src = uploadData)
    {
        var srcPtr = (IntPtr)src;
        CopyMemory(srcPtr, destPtr, pboSize);
    }

    Gl.UnmapBuffer(BufferTarget.PixelUnpackBuffer);
}

int uploadX = 0;
int uploadY = 0;

// The program runs for 5 seconds before we start copying from the PBO to the Texture
void CopyFromPBOToTexture()
{
    Gl.BindTexture(TextureTarget.Texture2d, textureHandle);
    Gl.BindBuffer(BufferTarget.PixelUnpackBuffer, pboHandle);

    // Upload a small 64x64 section of the texture from the PBO to the texture
    Stopwatch sw = Stopwatch.StartNew();
    Gl.TexSubImage2D(TextureTarget.Texture2d, 0, uploadX * 64, uploadY * 64, 64, 64, PixelFormat.Rgba, PixelType.UnsignedByte, IntPtr.Zero);

    // Log the call time
    sw.Stop();
    Console.WriteLine($"[glTexSubImage2D] [X: {uploadX}] [Y: {uploadY}] {sw.Elapsed.Ticks / 10000.0}ms");

    Gl.BindBuffer(BufferTarget.PixelUnpackBuffer, 0);
    Gl.BindTexture(TextureTarget.Texture2d, 0);

    // uploadX and uploadY keep track of which region of the texture to upload to next
    uploadX++;

    // When an entire row is uploaded, move onto the next row
    if (uploadX >= 1024 / 64)
    {
        uploadY++;
        uploadX = 0;
    }

    // If all rows have been uploaded
    if (uploadY >= 1024 / 64)
    {
        // Texture upload complete
    }
}

The performance logging shows:

[glTexSubImage2D] [X: 0] [Y: 0] 14.1923ms
[glTexSubImage2D] [X: 1] [Y: 0] 0.0196ms
[glTexSubImage2D] [X: 2] [Y: 0] 0.0327ms
[glTexSubImage2D] [X: 3] [Y: 0] 0.0137ms
[glTexSubImage2D] [X: 4] [Y: 0] 0.0443ms
[glTexSubImage2D] [X: 5] [Y: 0] 0.0382ms
[glTexSubImage2D] [X: 6] [Y: 0] 0.0291ms
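
To rule out the tile bookkeeping and the timing code as the source of the delay, the same walk can be run with no GL calls at all. This is only a minimal sketch: TileWalkSketch, TileOffset and TimeMs are hypothetical names made up for illustration, and the empty lambda stands in for the real Gl.TexSubImage2D call. Note that a Stopwatch around the call only measures how long the call takes to return on the CPU side.

```csharp
using System;
using System.Diagnostics;

public static class TileWalkSketch
{
    public const int TextureSize = 1024;
    public const int TileSize = 64;
    public const int TilesPerRow = TextureSize / TileSize; // 16 tiles per row

    // Hypothetical helper: maps a linear tile index (0..255) to the pixel
    // offset that would be passed as xoffset/yoffset to glTexSubImage2D.
    public static (int X, int Y) TileOffset(int index)
    {
        return ((index % TilesPerRow) * TileSize, (index / TilesPerRow) * TileSize);
    }

    // Hypothetical helper: elapsed wall time around one call, in milliseconds.
    public static double TimeMs(Action action)
    {
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        int tileCount = TilesPerRow * TilesPerRow; // 256 uploads cover the texture
        for (int i = 0; i < tileCount; i++)
        {
            var (x, y) = TileOffset(i);

            // Placeholder: the real program would call Gl.TexSubImage2D here.
            double ms = TimeMs(() => { });

            // Only print the first few tiles to keep the output short.
            if (i < 4)
                Console.WriteLine($"[tile {i}] [X: {x}] [Y: {y}] {ms}ms");
        }
    }
}
```

If the first iteration of this stand-alone loop is not noticeably slower than the rest, the cost seen in the log above would seem to come from the GL call itself rather than from the surrounding code.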

I am on an NVIDIA GTX 960 with the latest drivers. Only one texture is updated at any one time, and a texture is not used for rendering until it has been completely uploaded.

Does anyone have any insight into what is causing this?