PBuffer Performance

I am trying to develop a small sample of hardware shadow mapping. Rendering my scene normally at 1024x1024 gives me ~100 fps. If I just activate a 1024x1024 pbuffer, glClear() its color and depth buffers, and deactivate it once per frame, my fps drops below ~50 fps.
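For reference, the per-frame sequence on the Linux side is roughly this (just a sketch - dpy, pbuffer and pbufCtx stand for the Display, GLXPbuffer and pbuffer context created once at startup):

// Save whatever is current, switch to the pbuffer, clear, switch back
GLXDrawable oldDrawable = glXGetCurrentDrawable();
GLXContext oldContext = glXGetCurrentContext();
glXMakeCurrent(dpy, pbuffer, pbufCtx);              // activate
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT); // clear
glXMakeCurrent(dpy, oldDrawable, oldContext);       // deactivate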

Is this type of performance hit normal for using a single pbuffer?

Note - right now this is on Linux with the nvidia drivers on a GeForce Go 4200. I am porting to Win32 on a Ti4600 right now to see if I get the same performance hit.

Thanks in advance for any help

A drop in performance is not uncommon but yours is a little extreme.

Can you post the code you use to activate the buffer (switch contexts), clear, etc.?

Of course it all depends on your port to Win32 as it may simply be a driver issue…

If you get 100 fps rendering ** ONE ** frame, it is logical that you only get 100/2 = 50 fps when rendering ** TWO ** frames (the real color/depth/stencil buffer[s], plus the pbuffer)…

@+
Cyclone

Well I ported to Win32 on the same machine, same card. Now I get ~111 fps normally and it only drops to ~90 fps if I activate/clear/deactivate the pbuffer.

Here is some pseudo code for the activation:
m_hOldGLRC = wglGetCurrentContext();
m_hOldDC = wglGetCurrentDC();
wglMakeCurrent(m_hDC, m_hGLRC);

Then: glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

Finally to deactivate:
wglMakeCurrent(m_hOldDC, m_hOldGLRC);

Then I render the scene normally.

I am starting to think the bulk of the issue may be linux driver related. I am going to make sure I update to the latest Linux drivers and cross my fingers…

Thanks in advance for any advice or insight…

Sounds like vsync to me.

Latest drivers same problem…

vsync good point - Let me double-check…

Well I enabled __GL_SYNC_TO_VBLANK and the normal fps drops to ~60, and with pbuffer clearing it stays at ~50.

I am requesting an rgb(8/8/8) pbuffer with 24-bit depth.

Does anyone have any other ideas? I am posting to the NV News Nvidia Linux forum…

Thanks for any help

Two things come to mind when using pbuffers. First, do your pbuffer and the window framebuffer have the same pixel format? If so, try using the same GL context - it might speed up context switching. Second, you wouldn’t happen to do a glCopyTex[Sub]Image from the pbuffer, would you? If you do, make sure that the pbuffer and the texture have the exact same internal format.
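Something like this on the WGL side, for the first point (rough sketch - hWindowDC/hWindowGLRC are placeholder names for the window's DC and context, and the wgl*ARB entry points come from wglGetProcAddress as usual):

// Ask for a pbuffer format that matches the window: RGB 8/8/8, 24-bit depth
int attribs[] = {
    WGL_DRAW_TO_PBUFFER_ARB, GL_TRUE,
    WGL_SUPPORT_OPENGL_ARB, GL_TRUE,
    WGL_PIXEL_TYPE_ARB, WGL_TYPE_RGBA_ARB,
    WGL_RED_BITS_ARB, 8,
    WGL_GREEN_BITS_ARB, 8,
    WGL_BLUE_BITS_ARB, 8,
    WGL_DEPTH_BITS_ARB, 24,
    0 };
int format = 0;
UINT numFormats = 0;
wglChoosePixelFormatARB(hWindowDC, attribs, NULL, 1, &format, &numFormats);

int pbAttribs[] = { 0 };
HPBUFFERARB hPbuffer = wglCreatePbufferARB(hWindowDC, format, 1024, 1024, pbAttribs);
HDC hPbufferDC = wglGetPbufferDCARB(hPbuffer);

// If the formats really do match, the window's HGLRC can be bound to the
// pbuffer DC as well, which avoids a full context switch:
wglMakeCurrent(hPbufferDC, hWindowGLRC);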

Originally posted by cyclone:
If you get 100 fps rendering ** ONE ** frame, it is logical that you only get 100/2 = 50 fps when rendering ** TWO ** frames (the real color/depth/stencil buffer[s], plus the pbuffer)…

You are assuming that they are both the same size (& format), that you are doing exactly the same thing in both, and that pbuffers operate at exactly the same rate as the windowed buffer.

Originally posted by zeckensack:
Sounds like vsync to me

Except for the fact that it involves pbuffers - which aren’t affected by vsync (unless you call wglSwapBuffers twice).

It looks like a driver issue to me (your drop in performance for the Win32 port looks “reasonable”), but roffe’s suggestion is worth looking at. One other thing to try is not clearing the color buffer - i.e. just clear the depth - and see if that buys you some performance. It’s also probably worth removing your clear entirely to see whether the problem stems from the context switch or the clear (I’d guess the context switch - but it’s worth checking at any rate).

Thanks rgpc - I’ve done some further testing in Linux:

Normal (no pbuffer use): ~100 fps
Pbuffer activate/deactivate only: ~72 fps
Activate / clear depth / deactivate: ~65 fps
Activate / clear depth + color / deactivate: ~50 fps

Roffe - I was planning to start using glTexSubImage2D next, and I will check the pixel format.

I was also going to try a different card to see if I get the same slowdown.

Thanks again for your help and any other ideas/insights
:)

Well, after tabling the performance issue for the moment, I’ve moved on to the next step and I’m having the following issue:

If I create a color texture and copy the pbuffer to it with glTexSubImage2D, I can render it on a quad no problem. It shows as a color image of the view from the light.
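For concreteness, one way that color copy could look (a sketch only - the glReadPixels/glTexSubImage2D round trip goes through client memory, so it works even without shared contexts; color_map is a hypothetical texture object):

static GLubyte pixels[TEX_SIZE * TEX_SIZE * 3];

// with the pbuffer context current:
glReadPixels(0, 0, TEX_SIZE, TEX_SIZE, GL_RGB, GL_UNSIGNED_BYTE, pixels);

// with the window context current:
glBindTexture(GL_TEXTURE_2D, color_map);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, TEX_SIZE, TEX_SIZE, GL_RGB, GL_UNSIGNED_BYTE, pixels);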

But if I create a depth texture with:
glGenTextures(1,&depth_map);
glBindTexture(GL_TEXTURE_2D,depth_map);
glTexImage2D(GL_TEXTURE_2D,0,depth_format,TEX_SIZE,TEX_SIZE,0,GL_DEPTH_COMPONENT,GL_UNSIGNED_INT,0); // depth_format is e.g. GL_DEPTH_COMPONENT24
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_MODE_ARB, GL_COMPARE_R_TO_TEXTURE_ARB);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_FUNC_ARB,GL_LEQUAL);

and copy the pbuffer to it with:
glBindTexture(GL_TEXTURE_2D,depth_map);
glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0, TEX_SIZE,TEX_SIZE);

When I render the quad with:
glBindTexture(GL_TEXTURE_2D,depth_map);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_MODE_ARB, GL_NONE);
glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_EXT);
glTexEnvi(GL_TEXTURE_ENV, GL_COMBINE_RGB_EXT, GL_REPLACE);
glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE0_RGB_EXT, GL_TEXTURE);
glTexEnvi(GL_TEXTURE_ENV, GL_OPERAND0_RGB_EXT, GL_SRC_COLOR);
glBegin(GL_QUADS);
// Draw the map as a textured quad: assign the texture coordinates and vertices
glTexCoord2f(1.0f, 0.0f); glVertex3f(x + width, y, 0);
glTexCoord2f(1.0f, 1.0f); glVertex3f(x + width, y + height, 0);
glTexCoord2f(0.0f, 1.0f); glVertex3f(x, y + height, 0);
glTexCoord2f(0.0f, 0.0f); glVertex3f(x, y, 0);
glEnd();

All I get is a white quad.

Any ideas?
Thanks in advance for any help/insight
:)


Are you using glGetError() at all? If not stick one in after you do your bind and/or copy and see if an error code returns - that’ll point you in the right direction at least.
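A throwaway helper along these lines is usually enough (hypothetical, just to show the pattern):

#include <stdio.h>
#include <GL/glu.h> // gluErrorString(); glu.h pulls in gl.h

// Drain and report any pending GL errors, tagged with where the check was made
static void checkGLError(const char *where)
{
    GLenum err;
    while ((err = glGetError()) != GL_NO_ERROR)
        fprintf(stderr, "GL error at %s: %s\n", where, (const char *) gluErrorString(err));
}

Calling checkGLError("depth copy") right after the glBindTexture/glCopyTexSubImage2D pair will flag problems like incompatible formats straight away.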

Also, have you called wglShareLists() (or rather the Linux equivalent of it - if there is an equivalent - I don’t use Linux but that’s the first thing that springs to mind).

The Good News - I wasn’t calling glGetError() before, so I peppered my code with calls. Sure enough I was getting a “stack overflow” error in my drawing code. It was due to a forgotten glPopMatrix(). I quickly fixed that. Good advice rgpc. :)

The Bad News: I am still not able to render the depth map onto a quad. It just shows up all white. No GL errors.

Right now I am in Windows, but I am not using wglShareLists because there is no Linux counterpart yet, and I need to be cross-platform. But glCopyTexSubImage2D should still work, right?

Question: Do I need to set glPixelTransferf to transfer depth values before calling glCopyTexSubImage2D?

Thanks in advance for any help

I think I know what my problem may be. From one of the nvidia shadow map samples:
“With GL_TEXTURE_COMPARE_SGIX set to GL_FALSE, a GL_DEPTH_COMPONENT texture behaves as a GL_LUMINANCE texture. However, while the depth components may be 16-bit or 24-bit values, be warned that viewing the depth components as luminance values will only show you the most significant bits…”

My hunch is that I don’t have enough variation in the significant bits of my depth values to see anything. I will research further and post again…
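One quick way to test that hunch (rough sketch - the helper and names below are just illustrative; call it with the pbuffer context current, after rendering from the light's point of view):

#include <stdio.h>
#include <GL/gl.h>

#define TEX_SIZE 1024 // same value as used above

static GLfloat depths[TEX_SIZE * TEX_SIZE];

void dumpDepthRange(void)
{
    // Read the raw depth values back and report how much variation there is
    glReadPixels(0, 0, TEX_SIZE, TEX_SIZE, GL_DEPTH_COMPONENT, GL_FLOAT, depths);

    GLfloat lo = 1.0f, hi = 0.0f;
    for (int i = 0; i < TEX_SIZE * TEX_SIZE; ++i) {
        if (depths[i] < lo) lo = depths[i];
        if (depths[i] > hi) hi = depths[i];
    }
    printf("depth range: %f .. %f\n", lo, hi);
}

If nearly everything sits at or near 1.0 (the far plane), a luminance view of the depth map will look uniformly white.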