Texture access in Vertex Shader slow?

Hi,

I have run into a problem that strikes me as really strange:
I have a 512x512 GL_RGBA32F texture containing vertex positions which should be rendered as points. I do that by setting up a VBO which contains the texture coordinates of each texel and rendering its contents as points. In the vertex shader, I use gl_Vertex.xy to access my texture and set gl_Position to the texel value (see the vertex shader code below). The fragment shader only sets a color to mark the pixel.
In another project, I got about 600 fps on an NVIDIA GTX 260 when doing this, which seems rather good. Now for the strange part: I set up exactly the same thing in a new project and now I get only about 120 fps! Since I didn’t change anything in the shaders, I suspect that I set something up the wrong way. Are there any OpenGL states which could cause this extreme drop in frame rate?

I know this problem is rather unspecific, but I would appreciate any help or suggestions!
Best regards, Michael


#version 120
#extension GL_EXT_gpu_shader4 : enable

uniform sampler2D positionTex;

vec4 texel = vec4( 1.0);

void main(void) {
    // read vertex position from texture
    texel = texelFetch2D( positionTex, ivec2( gl_Vertex.xy), 0);
    gl_Position = gl_ModelViewProjectionMatrix * vec4( texel.xy, 0.0, 1.0);
}

PS: It doesn’t matter if I use texelFetch2D or texture2D…
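For reference, the VBO contents described above could be generated on the CPU roughly as follows. This is only a minimal sketch, assuming the coordinates are integer texel indices stored as floats (which is what the ivec2( gl_Vertex.xy) cast in the shader suggests); the function name makeTexelCoords is made up for illustration and is not taken from the original project.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): builds one (x, y) pair
// per texel, in row-major order, as integer texel indices stored as floats.
std::vector<float> makeTexelCoords(int width, int height) {
    std::vector<float> coords;
    coords.reserve(static_cast<std::size_t>(width) * height * 2);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            coords.push_back(static_cast<float>(x)); // texel column
            coords.push_back(static_cast<float>(y)); // texel row
        }
    }
    return coords;
}
```

For the 512x512 texture this yields 262144 points (524288 floats), which would then be uploaded once with glBufferData and drawn as GL_POINTS with two components per vertex.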

Hey all,

I dug around a bit in my code and I noticed an even stranger behaviour! This is my rendering code, which uses the shader posted above (it is the “drawPointShader”):


    // render spheres to framebuffer object
    glViewport( 0, 0, this->texWidth, this->texHeight);
    glBindFramebufferEXT( GL_FRAMEBUFFER_EXT, this->visibilityFBO);
    glClear( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glClearColor( -1.0f, 0.0f, 0.0f, 1.0f);
    this->RenderSpheres();
    glBindFramebufferEXT( GL_FRAMEBUFFER_EXT, 0);
    // reset viewport to normal view
    glViewport( 0, 0, this->width, this->height);

    // START draw overlay
    glMatrixMode(GL_PROJECTION);
    glPushMatrix();
    glLoadIdentity();
    glMatrixMode(GL_MODELVIEW);
    glPushMatrix();
    glLoadIdentity();
    
    glOrtho( 0.0, double( this->sphereCount), 0.0, 1.0, 0.0, 1.0);

    glPushAttrib( GL_LIGHTING_BIT);
    glDisable( GL_LIGHTING);

    // set viewport, get and set clearcolor, start rendering to framebuffer
    glClearColor( 0.0f, 0.0f, 0.0f, 0.0f);
    glViewport( 0, 0, this->sphereCount, 1);
    glBindFramebufferEXT( GL_FRAMEBUFFER_EXT, this->visibleSpheresFBO);
    glClear( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    // enable and set up point drawing shader
    this->drawPointShader.Enable();
    glUniform1i( this->drawPointShader.ParameterLocation( "positionTex"), 0);
    // bind textures
    glBindTexture( GL_TEXTURE_2D, this->visibilityTex);
    // draw points
 
    // ... draw points via precomputed VBO containing the texture coordinates [0..w][0..h]

    glBindTexture( GL_TEXTURE_2D, 0);
    // disable point drawing shader
    this->drawPointShader.Disable();

    // stop rendering to framebuffer, reset clear color
    glBindFramebufferEXT( GL_FRAMEBUFFER_EXT, 0);
    glClearColor( clearCol[0], clearCol[1], clearCol[2], clearCol[3]);

    // reset viewport to normal view
    glViewport( 0, 0, this->width, this->height);

    glPopAttrib();
    glMatrixMode(GL_PROJECTION);
    glPopMatrix();
    glMatrixMode(GL_MODELVIEW);
    glPopMatrix();

It is sort of a visibility test for the spheres which are rendered first.
I spotted a bug: the first glClearColor call (directly after the first glClear) obviously should come before that glClear to have any effect. Yet with the code as posted (with the bug), I get nearly 600 fps as mentioned above, but if I move the glClearColor call up in front of the glClear (without the bug), my frame rate drops to 100 fps.
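To spell out why the order matters: glClearColor only stores a value in the GL state, and glClear fills the buffer with whatever value is current at the time of the call. The following is a minimal mock of that behaviour in plain C++. It makes no real OpenGL calls; MockGL and the two helper functions are hypothetical names used only for illustration, and any clamping of the clear color is ignored.

```cpp
#include <array>
#include <cassert>

using Color = std::array<float, 4>;

// Hypothetical stand-in for the relevant piece of GL state (not real OpenGL).
struct MockGL {
    Color clearColor{{0.0f, 0.0f, 0.0f, 0.0f}}; // GL's initial clear color
    Color buffer{{1.0f, 1.0f, 1.0f, 1.0f}};     // stand-in for a one-pixel color buffer

    // Like glClearColor: only records the value, touches no buffer.
    void setClearColor(float r, float g, float b, float a) {
        clearColor = Color{{r, g, b, a}};
    }
    // Like glClear( GL_COLOR_BUFFER_BIT): uses the value that is current now.
    void clear() { buffer = clearColor; }
};

// Order as posted: clear first, then set the color (too late for this clear).
Color clearThenSet() {
    MockGL gl;
    gl.clear();
    gl.setClearColor(-1.0f, 0.0f, 0.0f, 1.0f);
    return gl.buffer;
}

// Corrected order: set the color first, then clear.
Color setThenClear() {
    MockGL gl;
    gl.setClearColor(-1.0f, 0.0f, 0.0f, 1.0f);
    gl.clear();
    return gl.buffer;
}
```

In the posted order, the first glClear therefore uses whatever clear color was set last; assuming the routine runs once per frame, that is the clearCol value restored at the end of the previous frame, not ( -1, 0, 0, 1). The two variants thus leave different contents in the visibility texture that the point shader reads.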

I am checking for GL errors (and get none), so I think that I didn’t completely mess up any GL states.

I’m completely lost and confused by the whole thing. Any ideas?
Best regards, Michael

How about posting a small working test program for folks to try? That will get you some quick feedback on other cards and drivers, and it makes it easier to look for problems. You might also find your problem while you’re distilling out the code and discover that it was one of your own mistakes.

And I don’t think you actually said the card you were having this problem on was the same one you got your original good frame rates with.

Also, read this: Performance (Humus) and The evils of fps (RTR blog).

Hi,

I will see if I can write a small test case, thank you for the suggestion! I was also considering driver issues, but the other program runs on the same machine with higher frame rates. It surely is a mistake I made, but as I said, I am not able to spot it, and I get the somewhat strange behaviour I mentioned in my last post.

As for the two articles: you are obviously right that fps alone is a poor performance measure, but the two programs do exactly the same thing (and nothing else). Since I am not comparing the performance of two different techniques, I think the comparison makes sense; even if it is inaccurate and not very meaningful, the huge difference in speed remains. For a comparison or an evaluation in a technical paper, I totally agree with you that using only fps is not a good idea, but fps can help you get an overall impression of the performance of your program.

Best regards, Michael
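To put the numbers from this thread in frame-time terms, here is a minimal sketch of the conversion. The function names are made up for illustration; the only assumption is that frame time in milliseconds is 1000 divided by the frame rate.

```cpp
#include <cassert>

// Hypothetical helper: frame time in milliseconds for a given frame rate.
double frameTimeMs(double fps) {
    return 1000.0 / fps;
}

// Hypothetical helper: extra time per frame when dropping from one frame
// rate to another (positive if the second frame rate is lower).
double extraCostMs(double fpsBefore, double fpsAfter) {
    return frameTimeMs(fpsAfter) - frameTimeMs(fpsBefore);
}
```

With these, 600 fps corresponds to roughly 1.7 ms per frame and 120 fps to roughly 8.3 ms, so the drop reported in the first post is about 6.7 ms of additional work per frame.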