Pixel Buffer Objects

yossi · February 7, 2005, 2:53am

Hello,
PBOs are new to me.
Here are a few questions about the example found in the spec:

Streaming textures using pixel buffer objects:

    const int texWidth = 256;
    const int texHeight = 256;
    const int texsize = texWidth * texHeight * 4;
    void *pboMemory, *texData;
    GLuint texBuffer;  // buffer object name, filled in by glGenBuffers

    // Define texture level zero (without an image); notice the
    // explicit bind to the zero pixel unpack buffer object so that
    // passing NULL for the image data leaves the texture image
    // unspecified.
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, 0);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, texWidth, texHeight, 0,
                 GL_BGRA, GL_UNSIGNED_BYTE, NULL);

    // Create and bind texture image buffer object
    glGenBuffers(1, &texBuffer);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, texBuffer);

    // Setup texture environment
    ...

    texData = getNextImage();

    while (texData) {

        // Reset the contents of the texsize-sized buffer object
        glBufferData(GL_PIXEL_UNPACK_BUFFER_ARB, texsize, NULL,
                     GL_STREAM_DRAW);

        // Map the texture image buffer (the contents of which
        // are undefined due to the previous glBufferData)
        pboMemory = glMapBuffer(GL_PIXEL_UNPACK_BUFFER_ARB,
                                GL_WRITE_ONLY);

        // Modify (sub-)buffer data
        memcpy(pboMemory, texData, texsize);

        // Unmap the texture image buffer
        glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER_ARB);

        // Update (sub-)teximage from texture image buffer
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, texWidth, texHeight,
                        GL_BGRA, GL_UNSIGNED_BYTE, BUFFER_OFFSET(0));

        // Draw textured geometry
        glBegin(GL_QUADS);
        ...
        glEnd();

        texData = getNextImage();
    }

    glBindBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, 0);

Questions:

I see two data transfers in the code:
One - a memcpy from the user array to the PBO.
Two - from the PBO (using glTexSubImage2D) to the actual memory from which the rendering occurs.
Am I right?
If yes - what's the benefit in a streaming video app (there are still two data transfers for each new video frame)?
Is there a way to ‘inject’ data directly into the texture?
With VBOs, mapping gives you access to the vertex buffer and only one data transfer is needed before rendering.
I thought it was the same with PBOs (?)

Many thanks,
Yossi

Korval · February 7, 2005, 9:52am

    // Map the texture image buffer (the contents of which
    // are undefined due to the previous glBufferData)
    pboMemory = glMapBuffer(GL_PIXEL_UNPACK_BUFFER_ARB,
                            GL_WRITE_ONLY);

    // Modify (sub-)buffer data
    memcpy(pboMemory, texData, texsize);

    // Unmap the texture image buffer
    glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER_ARB);

If that’s all you’re doing, don’t bother mapping; just use BufferSubData.

Mapping is useful when you’re generating data, not if you already have it stored in an array.

If yes - what's the benefit in a streaming video app (still two data transfers for each new video frame)?
Well, first, if you’re streaming, you should be generating your data directly into the PBO, rather than storing it in your own internal array.

Second, the advantage is that glTexSubImage2D happens (relatively) asynchronously.
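
A minimal sketch of the first point, in plain C so it can be tried without a GL context. The producer takes a destination pointer; in the spec's loop that pointer would be the one returned by glMapBuffer, so the separate texData array and the memcpy disappear. The function name `generate_frame_into` and the gradient pattern are hypothetical stand-ins for whatever actually produces the frame (a video decoder, for example).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame producer: writes one BGRA8 frame straight into dst.
   dst is only written, in increasing address order, and never read back.
   With a PBO, dst would be the pointer obtained from
   glMapBuffer(GL_PIXEL_UNPACK_BUFFER_ARB, GL_WRITE_ONLY). */
void generate_frame_into(uint8_t *dst, int width, int height, unsigned frame)
{
    assert(dst != NULL && width > 0 && height > 0);

    size_t i = 0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            dst[i++] = (uint8_t)(x + frame);  /* blue  */
            dst[i++] = (uint8_t)(y + frame);  /* green */
            dst[i++] = 0;                     /* red   */
            dst[i++] = 255;                   /* alpha */
        }
    }
}
```

In the loop from the first post, the call would sit between glMapBuffer and glUnmapBuffer, replacing the memcpy.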

Is there a way to ‘inject’ data directly to the texture?
No.

In VBO’s, mapping gives you access to the vertex buffer and only one data transfer is needed before rendering.
I thought this is the same with PBO (?)

It is. However, a PBO doesn’t represent a texture or any other kind of image. PBOs exist to allow pixel transfers to be asynchronous.

inet · September 24, 2006, 11:28am

Question 1:

Could you explain in more detail what glTexSubImage2D does?

Say I call

    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, texWidth, texHeight,
                    GL_BGRA, GL_UNSIGNED_BYTE, texData);

Doesn’t the driver copy the contents of texData directly from system memory to on-board texture memory?
Or does the driver copy along the following path:
system memory -> AGP memory -> on-board graphics memory?

Question 2:

This is related to Question 1 and concerns the OpenGL spec.

begin quote from ogl spec

Streaming textures. If the application uses MapBuffer/UnmapBuffer
to write its data for TexSubImage into a buffer object, at least
one of the data copies usually required to download a texture can
be eliminated, significantly increasing texture download
performance.

end quote

What exactly does it mean that “one of the copies usually required to download a texture can be eliminated”?
As Korval has pointed out, “PBO exists to allow pixel transfers to be asynchronous”. So my understanding is that the PBO doesn’t eliminate a copy, but only allows an asynchronous copy.

Brolingstanz · September 24, 2006, 1:40pm

Wow, it seems like this question has come up a lot lately. What is so difficult to understand?

Map a block of driver controlled memory, write to it, then unmap it so the driver can schedule an asynchronous DMA transfer. Obviously if you don’t have much to do in the interim, there’s not a lot to be gained by this.
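
To make the "not a lot to be gained" remark concrete, here is a tiny idealized timing model in C. It is only a sketch: it assumes the asynchronous transfer overlaps perfectly with the other CPU work, which real drivers do not guarantee, and the function names and millisecond figures are made up for illustration.

```c
#include <assert.h>

/* Idealized frame time when the upload blocks the CPU: writing the pixels,
   the transfer to the board, and the other per-frame CPU work all run
   back to back. */
double frame_ms_blocking(double write_ms, double transfer_ms, double other_ms)
{
    assert(write_ms >= 0 && transfer_ms >= 0 && other_ms >= 0);
    return write_ms + transfer_ms + other_ms;
}

/* Idealized frame time when the transfer is asynchronous: the write still
   costs CPU time, but the transfer is assumed to run fully in parallel
   with the other work, so only the longer of the two counts. */
double frame_ms_async(double write_ms, double transfer_ms, double other_ms)
{
    assert(write_ms >= 0 && transfer_ms >= 0 && other_ms >= 0);
    double overlapped = transfer_ms > other_ms ? transfer_ms : other_ms;
    return write_ms + overlapped;
}
```

With, say, 1 ms of writing, 2 ms of transfer and 3 ms of other work, the model gives 6 ms blocking versus 4 ms asynchronous. With no other work at all, both come out to 3 ms, which is the case where the PBO buys nothing.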

So my understanding is that the PBO hasn’t eliminated a copy, but only allows async copy.
How do you figure that? If you haven’t written to the driver-controlled DMA memory, then you are going to need to copy there first, right?

inet · September 25, 2006, 8:59am

Thanks for the answer. However, it’s still not 100% clear to me.

Let me ask Question 2 again this way:

If I use glTexSubImage2D without a PBO, there is only one copy operation. However, you have to wait until the texture has been completely downloaded to the graphics board.

If I use glTexSubImage2D with a PBO, I still need one copy operation from system memory to driver-controlled memory. Then the actual texture transfer can be performed asynchronously.

So both methods need one copy operation. The difference is only sync or async.

Korval · September 25, 2006, 9:49am

So both methods need one copy operation. The difference is only sync or async.
Yes. Except that you must be careful to write correctly to the driver-controlled memory; you can’t/shouldn’t use it like regular memory, with random access and so forth. Never read from it, and only write sequentially.
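
A small sketch of that access pattern in plain C, using a vertical flip as the example. Here `dst` stands in for the mapped PBO pointer: it is written once, front to back, and never read. All the out-of-order access happens on `src`, which is ordinary application memory. The function name is hypothetical. The usual explanation for the rule (an assumption here, not something stated in this thread) is that mapped memory may be uncached or write-combined, which makes reads and scattered writes slow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies a BGRA8 image into dst upside down.
   dst: written strictly in increasing address order, never read.
   src: regular memory, read in whatever order the flip requires. */
void copy_flipped_sequential(uint8_t *dst, const uint8_t *src,
                             int width, int height)
{
    assert(dst != NULL && src != NULL && width > 0 && height > 0);

    const size_t row_bytes = (size_t)width * 4;
    for (int y = 0; y < height; ++y) {
        /* Destination row y takes the source row counted from the bottom. */
        const uint8_t *src_row = src + (size_t)(height - 1 - y) * row_bytes;
        memcpy(dst + (size_t)y * row_bytes, src_row, row_bytes);
    }
}
```

The version to avoid would flip in place inside the mapped buffer, since swapping rows there means reading from it.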