PBO/Textures/Alignment NVIDIA Bug?

We are having problems initializing a texture from a PBO on NVIDIA cards. Each row of the data in the PBO is 4-byte aligned, since it comes from a Microsoft bitmap. For example, here is a 1 x 2, 4-byte-aligned RGB image.

1, 2, 3, 255
4, 5, 6, 255

(The 255 values are the padding bytes required for the 4-byte alignment, not alpha values.)
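
For reference, the stride of each row is the tightly packed row size rounded up to the next multiple of 4, which can be computed with something like this (width being the image width in pixels):

// Packed RGB row size, rounded up to the next multiple of 4,
// e.g. width 1 -> 3 packed bytes -> 4-byte stride.
int stride = ((width * 3) + 3) & ~3;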

If we load those 8 bytes into a PBO and then create a 1 x 2 RGB texture from it using an unpack alignment of 4, the texture is loaded incorrectly with

1, 2, 3
4, 5, 0

If I do the same thing using system memory as the source, rather than the PBO, the texture is loaded with

1, 2, 3
4, 5, 6

as expected.

This is an issue on both our laptop (GeForce 260M) and our desktop (GeForce 8400 GS) with the latest NVIDIA drivers from their site. Our ATI card works correctly with either the PBO or system memory.

Below is a chunk of code that reproduces the issue.


//
// Data
//
// Byte-offset helper from the ARB_vertex_buffer_object spec.
#define BUFFER_OFFSET(i) ((char *)NULL + (i))

const int texWidth = 1;
const int texHeight = 2;

// Size of the padded data: each row's packed size (texWidth * 3)
// rounded up to the next multiple of 4.
const int texsize = texHeight * (((texWidth * 3) + 3) & ~3);

unsigned char inData[] =
{
    1, 2, 3, 254,   // 254 and 253 are the row padding bytes
    4, 5, 6, 253
};

//
// Create PBO
//
GLuint pboName;
glGenBuffers(1, &pboName);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, pboName);
glBufferData(GL_PIXEL_UNPACK_BUFFER_ARB, texsize, inData, GL_STATIC_DRAW);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, 0);

//
// Create texture
//
GLuint textureName;
glGenTextures(1, &textureName);
glBindTexture(GL_TEXTURE_2D, textureName);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB8, texWidth, texHeight, 0,
    GL_RGB, GL_UNSIGNED_BYTE, 0);

//
// Write system memory to texture
//
glPixelStorei(GL_UNPACK_ALIGNMENT, 4);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, texWidth, texHeight,
    GL_RGB, GL_UNSIGNED_BYTE, inData);

//
// Read the texture
//
const int limit = 3 * texWidth * texHeight;
unsigned char outData[limit];
for (int i = 0; i < limit; ++i)
{
    outData[i] = 0;
}
glPixelStorei(GL_PACK_ALIGNMENT, 1);
glGetTexImage(GL_TEXTURE_2D, 0, GL_RGB, GL_UNSIGNED_BYTE, outData);

//
// Print
//
printf("                     input: %d %d %d %d %d %d
",
    inData[0], inData[1], inData[2], inData[4], inData[5],
    inData[6]);
printf("texture from system memory: %d %d %d %d %d %d
",
    outData[0], outData[1], outData[2], outData[3], outData[4],
    outData[5]);

//
// Write PBO to texture
//
glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, pboName);
glPixelStorei(GL_UNPACK_ALIGNMENT, 4);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, texWidth, texHeight,
    GL_RGB, GL_UNSIGNED_BYTE, BUFFER_OFFSET(0));
glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, 0);

//
// Read the texture
//
for (int i = 0; i < limit; ++i)
{
    outData[i] = 0;
}
glPixelStorei(GL_PACK_ALIGNMENT, 1);
glGetTexImage(GL_TEXTURE_2D, 0, GL_RGB, GL_UNSIGNED_BYTE, outData);

//
// Print
//
printf("          texture from PBO: %d %d %d %d %d %d
",
    outData[0], outData[1], outData[2], outData[3], outData[4],
    outData[5]);
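
On the affected NVIDIA configurations, the printed results match the behavior described above, i.e. something like:

                     input: 1 2 3 4 5 6
texture from system memory: 1 2 3 4 5 6
          texture from PBO: 1 2 3 4 5 0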

After some experimentation, it appears that when NVIDIA’s driver reads the PBO data, it includes the padding bytes when counting how much data it has copied. For example, given a 1 x 3 RGB image,

1, 2, 3, 255
4, 5, 6, 255
7, 8, 9, 255

the NVIDIA driver will stop after counting 9 bytes (the size of the tightly packed texture), but the two padding bytes it reads along the way count toward that total, so only 7 actual color values are copied over. The texture will have

1, 2, 3
4, 5, 6
7, X, X

The X values shown in the texture are uninitialized.
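
If the driver really is counting padding bytes, one workaround until it is fixed would be to strip the padding on the CPU and feed the PBO tightly packed rows with an unpack alignment of 1, so the driver never sees any padding. Here is a rough sketch (untested on the affected drivers; the packed vector and srcStride name are ours, not part of the repro above, and it requires <vector> and <cstring>):

//
// Possible workaround: repack into tight rows, refill the PBO, and
// upload with an unpack alignment of 1 so no padding is involved.
//
const int srcStride = ((texWidth * 3) + 3) & ~3; // 4-byte-aligned BMP stride
std::vector<unsigned char> packed(texWidth * texHeight * 3);
for (int row = 0; row < texHeight; ++row)
{
    memcpy(&packed[row * texWidth * 3], &inData[row * srcStride],
        texWidth * 3);
}

glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, pboName);
glBufferData(GL_PIXEL_UNPACK_BUFFER_ARB, (GLsizeiptr)packed.size(),
    &packed[0], GL_STATIC_DRAW);
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, texWidth, texHeight,
    GL_RGB, GL_UNSIGNED_BYTE, BUFFER_OFFSET(0));
glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, 0);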

I’m pretty sure that this is a bug, but it could be that our code is wrong. Thanks for any help.

Here’s some extra driver information:

  • The desktop with the GeForce 8400 GS is running driver version 197.45.
  • The laptop with the GeForce 260M is running driver version 195.62.

The code works on an ATI Radeon 5870 with Catalyst 10.4.

Patrick