FBO-> PBO is slow...

Hello all…
Good to see everyone … ^0^

I recently timed my program, and the results are shown below…

(1) Copy image from FBO to OpenGL PBO
(glReadPixels(0)) => takes 4.2 ms

(2) Copy image from FBO to system memory
(glReadPixels(system memory)) => takes 4.9 ms


I expected GPU_buffer->GPU_buffer to be faster than GPU_buffer->CPU_buffer…
But… they seem about the same…
It's very strange, isn't it?


Is the pixel format/type that you request with glReadPixels() the same as that of the FBO? Are you doing your time measurement correctly?

The steps I followed to copy data from FBO->PBO are shown below…

  1. Create Buffer()
  2. Bind Buffer()
  3. glDrawBuffer(GL_BACK)
  4. renderScene()
  5. glReadPixels(0, 0, width, height,GL_RGB,GL_UNSIGNED_BYTE,
    BUFFER_OFFSET(0));
  6. Bind Buffer(0)

I didn't set any parameters on the FBO…
I just created a buffer and read the FBO's back buffer data into it…

And the timing method I used is shown below…

#include <windows.h>
#include <stdio.h>

LARGE_INTEGER m_nFreq ;
LARGE_INTEGER nt0,nt1 ;

QueryPerformanceFrequency(&m_nFreq);
QueryPerformanceCounter(&nt0);

Test_func() ;  

QueryPerformanceCounter(&nt1);

printf("Test_func : %f ms.\n", (float)((nt1.QuadPart - nt0.QuadPart) * 1000 / (double)m_nFreq.QuadPart));


You can't measure timings that way. All GL commands are buffered and executed when the GPU has time. Only a readback operation causes a pipeline flush before it returns to the caller.
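
To make that concrete, here is a minimal sketch in C of how the measurement could be bracketed. It is an illustration under assumptions, not the original poster's code: elapsed_ms is a hypothetical helper name, and the glFinish() placement is shown only as comments because it needs a live GL context. The idea is to call glFinish() before starting the timer, so rendering that is already queued is not counted, and again before stopping it, so the copy has really completed.

```c
#include <stdint.h>

/* Convert two performance-counter readings to milliseconds.
   ticks_per_second is the value QueryPerformanceFrequency reports. */
double elapsed_ms(int64_t start_ticks, int64_t end_ticks,
                  int64_t ticks_per_second)
{
    /* Convert to double before scaling so a long interval cannot overflow. */
    return (double)(end_ticks - start_ticks) * 1000.0
           / (double)ticks_per_second;
}

/* Intended call order around the readback (comments only, since this
   part needs a live GL context and <windows.h>):

     glFinish();                      // drain rendering queued so far
     QueryPerformanceCounter(&t0);
     glReadPixels(...);               // the operation being timed
     glFinish();                      // wait until the copy has finished
     QueryPerformanceCounter(&t1);
     ms = elapsed_ms(t0.QuadPart, t1.QuadPart, freq.QuadPart);
*/
```

glFinish() stalls the whole pipeline, so this is only suitable for measurement, not for production code.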

Can you post more information about your problem… like a code sample, hardware, OS, driver version, etc.?

GL_RGB

That should be your problem. 24-bit formats are generally not hardware-accelerated, which probably leads you into a software fallback path. Try GL_RGBA or GL_BGRA and see if it's faster.
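
A related detail when switching formats: the readback buffer size depends on the bytes per pixel and on row padding. With the default GL_PACK_ALIGNMENT of 4, each row written by glReadPixels starts on a 4-byte boundary, so a 3-byte format such as GL_RGB gets padded rows whenever width * 3 is not a multiple of 4. The sketch below shows the arithmetic only. The function names are hypothetical, and it assumes GL_UNSIGNED_BYTE components with GL_PACK_ROW_LENGTH and the skip parameters left at their defaults.

```c
#include <stddef.h>

/* Bytes in one row as glReadPixels writes it: the pixel data rounded up
   to a multiple of GL_PACK_ALIGNMENT (1, 2, 4 or 8; the GL default is 4). */
size_t packed_row_bytes(size_t width, size_t bytes_per_pixel,
                        size_t pack_alignment)
{
    size_t raw = width * bytes_per_pixel;
    return (raw + pack_alignment - 1) / pack_alignment * pack_alignment;
}

/* Total buffer size needed for a width x height readback. */
size_t readback_bytes(size_t width, size_t height,
                      size_t bytes_per_pixel, size_t pack_alignment)
{
    return packed_row_bytes(width, bytes_per_pixel, pack_alignment) * height;
}
```

For example, a 641-pixel-wide GL_RGB row holds 1923 bytes of pixel data but occupies 1924 bytes in the buffer, while a 4-byte format such as GL_BGRA never needs padding at alignment 4.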

Thanks for your reply, yooyo!! ^0^
My program and computer specs are shown below…


          *** My program ***

int PBO_datasize = width * height * 3 * sizeof(unsigned char) ;

//Create PBO
GLuint IMG_Buffer ;
glGenBuffers (1, &IMG_Buffer);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, IMG_Buffer);
glBufferData(GL_PIXEL_UNPACK_BUFFER, PBO_datasize,NULL, GL_STREAM_DRAW_ARB);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);

//Using Texture to render the scene
glDrawBuffer(GL_BACK);
.
.
render_scene() ;
.
.

// Get Img From FBO
QueryPerformanceCounter(&nt8 );

glBindBuffer(GL_PIXEL_PACK_BUFFER, IMG_Buffer);
glReadPixels(0, 0, width, height, GL_RGB,GL_UNSIGNED_BYTE, 0);
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);

QueryPerformanceCounter(&nt9 );
printf("FBO->PBO : %f ms.\n", (float)((nt9.QuadPart - nt8.QuadPart) * 1000 / (double)m_nFreq.QuadPart));

//Unbind
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);


          *** My computer ***
  1. CPU         : Intel Core 2 Duo E8400, 3000 MHz
  2. Motherboard : ASUS P5Q Pro
  3. GPU         : NVIDIA GTX 280
  4. GPU driver  : 6.14.11.7781
  5. RAM         : Kingston DDR2 800 1G * 4
  6. OS          : Windows XP

Wow… that sounds great…
I will try it soon and report my test results…!
Thanks for your reply, skynet… ^0^

Use GL_BGRA or GL_BGR instead of GL_RGB.

I tried both of them… Neither seems to be any faster…
It still takes about 4~5 ms.