Low rendering performance

The part of the code where the vertices are submitted to the GPU looks like this:

   int VertexOffset          = FKeyframeNum * GetActiveVertexCount();
   int NextFrameVertexOffset = FNextKeyframeNum * GetActiveVertexCount();

   glVertexAttribPointerARB( 1, 3, GL_FLOAT, GL_FALSE, 0, FVertices + NextFrameVertexOffset );
   glEnableVertexAttribArrayARB( 1 );

   glEnableClientState(GL_VERTEX_ARRAY);

   // normals
   if (FNormals) 
   {
      glNormalPointer(GL_FLOAT, 0, FNormals + VertexOffset );
      glEnableClientState(GL_NORMAL_ARRAY);
   } 
   else
   {
      glDisableClientState(GL_NORMAL_ARRAY);
   }
   // textures
   for (int Tex = FrameBuffer->GetRendererExtensions()->GetMaxTextureUnits()-1; 
        Tex >= 0; --Tex )
   {
      glActiveTextureARB(GL_TEXTURE0+Tex);
      glClientActiveTextureARB(GL_TEXTURE0+Tex);

      if ( Tex < static_cast<int>( FTexCoords.size() ) ) 
      {
         glTexCoordPointer( GetAllocationInfo().FTexChannels, GL_FLOAT, 0, FTexCoords[Tex] );
         glEnableClientState(GL_TEXTURE_COORD_ARRAY);
      }
      else
      {
         glDisableClientState(GL_TEXTURE_COORD_ARRAY);
      }
   }

   glVertexPointer(3, GL_FLOAT, 0, FVertices + VertexOffset );

   GetIndices()->DrawElements( GetPrimitive(), GetActiveVertexCount() );

Maybe something is wrong here?

FrameBuffer->GetRendererExtensions()->GetMaxTextureUnits()
Is that actually making a glGet* call? If so, that’s generally not a good idea. If you really need to ask this every time you render something (hint: you don’t), then cache the value.
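
Something along these lines, i.e. one glGet at init time and a plain read afterwards (a sketch; the names are illustrative, not from the actual code):

   GLint MaxTextureUnits = 0;   // cached once, reused every frame

   void CacheRendererCaps()
   {
      // the only glGet, done when the renderer is initialized
      glGetIntegerv( GL_MAX_TEXTURE_UNITS_ARB, &MaxTextureUnits );
   }

   // the render loop then only reads the cached value:
   // for ( int Tex = MaxTextureUnits - 1; Tex >= 0; --Tex ) { ... }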

I see a lot of gl*Pointer calls, but I don’t see any calls that actually bind the VBO(s) containing the vertex data. What does that look like?

Man, no offense, but that's some terrible shading in that shader.
Fix the **** first, before you complain about performance.

Keep it real, nuff said.

By the way, with
VBO format is: vertex (3 floats) + normals (3 floats) + texcoords (2 floats)
how is it possible to do bump mapping with this (unless you use a fixed tangent of vec3(1,0,0), etc.)?
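
Normally you'd want a per-vertex tangent in there; a rough sketch of such an interleaved layout (field names purely illustrative):

   // Illustrative interleaved vertex with a tangent for bump mapping.
   struct TVertex
   {
      float Position[3];   // x, y, z
      float Normal[3];     // nx, ny, nz
      float Tangent[3];    // tx, ty, tz - needed to build the TBN basis
      float TexCoord[2];   // u, v
   };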

In 320x240 the FPS increases to 25 (vs. 20 FPS in 800x600). Seems I'm not fill-rate limited.
So fewer pixels need to be drawn, the FPS goes up, and you're !not! fill-rate limited? Run that by me again (just joking :slight_smile: ).
Personally I think you're drawing too many triangles per terrain batch. From my testing, 33x33 or 65x65 verts per batch is about ideal performance-wise (65x65 verts is 64x64 quads, i.e. roughly 8K triangles per batch), perhaps 129x129 if you have ****e lighting, i.e. not spot or point lights.

edit-
(for the love of the lord, Jimi Hendrix, this sticking of asterisks in place of words is childish)

Originally posted by Korval:

Most applications will have to call it once per rendered object. Having 11 calls suggests having 11 objects, which is not a lot.

Maybe I’m terribly wrong but as far as I can recall most VBO setup work is done in the glVertexPointer call. I don’t see why you need to respecify the VBO format every frame.

I can render loads more than 330 polygons per second using immediate mode, each of which implicitly needs to do whatever glVertexPointer does.
I don’t think rendering a polygon in immediate mode requires calling glVertexPointer at all. I guess there’s an internal VBO for that case; the driver just needs to glBufferSubData into it every frame.
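
Roughly this kind of thing, i.e. the frame's vertices get streamed into one reused buffer and drawn from there (a purely illustrative sketch, not what any particular driver actually does):

   #include <vector>

   GLuint StreamVBO = 0;                 // created once with glGenBuffersARB
   std::vector<float> FrameVertices;     // filled by the per-vertex calls

   void FlushFrameVertices()
   {
      // one upload and one pointer setup per flush, not per polygon
      glBindBufferARB( GL_ARRAY_BUFFER_ARB, StreamVBO );
      glBufferSubDataARB( GL_ARRAY_BUFFER_ARB, 0,
                          FrameVertices.size() * sizeof(float),
                          &FrameVertices[0] );
      glVertexPointer( 3, GL_FLOAT, 0, 0 );
      glDrawArrays( GL_TRIANGLES, 0, GLsizei( FrameVertices.size() / 3 ) );
      FrameVertices.clear();
   }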

Is that actually making a glGet* call?
It isn’t. All those values are precached.

but I don’t see any calls that actually bind the VBO(s)
Here they go:

void clVBOVertexArray::FeedIntoGPU() const
{
   glBindBufferARB(GL_ARRAY_BUFFER_ARB, FID);

   clVertexArrayFeeder::FeedIntoGPU();

   // clean-up
   glBindBufferARB(GL_ELEMENT_ARRAY_BUFFER_ARB, 0);
   glBindBufferARB(GL_ARRAY_BUFFER_ARB, 0);
}

The code for the clVertexArrayFeeder::FeedIntoGPU() method was posted above, and the GetIndices()->DrawElements() call looks like this:

void clVBOElementsArray::DrawElements(Lenum Primitive, int ActiveVertexCount ) const
{
   // binds the element array buffer (and unmaps it if it was mapped)
   UnLock();

   int VertexCount = ( this == VAManager->GetCommonIndices() ) ? ActiveVertexCount : FCount;

   // indices come from the bound GL_ELEMENT_ARRAY_BUFFER, so the last
   // argument is an offset of zero, not a client-side pointer
   glDrawElements( Primitive,
                   VertexCount,
                   FShortIndices ? GL_UNSIGNED_SHORT : GL_UNSIGNED_INT,
                   0 );

   FrameBuffer->UpdateStats( Primitive, VertexCount );
}

FShortIndices is always FALSE for now.

Here is one more piece of code that was mentioned:

void clVBOElementsArray::UnLock() const
{
   glBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, FID );

   if (!FLocked) return;

   FATAL( glUnmapBufferARB(GL_ELEMENT_ARRAY_BUFFER_ARB)==GL_FALSE,
          "Unable to unmap GL VBO elements buffer");

   FIndices = NULL;  

   FLocked = false;
}

FLocked is always FALSE in this case.
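
For reference, the matching map/write/unmap side of this pattern generally looks something like the sketch below (illustrative names only, not the actual Lock() code; as noted, this path isn't exercised while FLocked stays false):

void UpdateIndices( GLuint BufferID, const unsigned int* Src, int Count )
{
   glBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, BufferID );

   void* Ptr = glMapBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB );
   if ( !Ptr ) return;

   memcpy( Ptr, Src, Count * sizeof(unsigned int) );

   glUnmapBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB );
}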

Maybe I’m terribly wrong but as far as I can recall most VBO setup work is done in the glVertexPointer call.
Because there’s only one set of vertex pointer state?

To render something out of a VBO you need to do the following steps:

  • Bind the VBO.
  • Set the gl*Pointers with offsets for each of the attributes you need to use.
  • Call glDraw*.

That's for each object. Unless you're drawing the same object over and over again, you need to call gl*Pointer on every render, because the offsets change.
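
A minimal sketch of those three steps for one object (the Mesh struct and its offsets are just illustrative):

   // 1. Bind the VBO holding this object's vertex data.
   glBindBufferARB( GL_ARRAY_BUFFER_ARB, Mesh.VertexVBO );

   // 2. Set the pointers as byte offsets into the currently bound VBO.
   glVertexPointer( 3, GL_FLOAT, 0, (const GLvoid*)Mesh.PositionOffset );
   glNormalPointer( GL_FLOAT, 0, (const GLvoid*)Mesh.NormalOffset );

   // 3. Bind the index buffer and issue the draw call.
   glBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, Mesh.IndexVBO );
   glDrawElements( GL_TRIANGLES, Mesh.IndexCount, GL_UNSIGNED_INT, 0 );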

For Sergey:

I’d download glIntercept and get a log of the OpenGL calls you’re making. It looks very much like something pathological is going on, but the code alone isn’t enough to tell what it is.

I make more calls to glBindBuffer and gl*Pointer than he does, and I have an inferior GPU. I can get from 100 to 200 FPS.
The fewer calls the better, but 11 is nothing to worry about.

Release your exe and maybe others can test it for you.

Ysaneya wrote:
Indices, unsigned int. Anything else is dangerous.
Depending on hardware, I’d say anything larger than unsigned short is “dangerous” for performance. Probably not applicable for this case, but I wanted to add it for completeness.
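
In practice that means switching to 16-bit indices whenever a batch stays under 65536 vertices; a quick sketch (buffer names illustrative):

   // 16-bit indices are enough while a batch has fewer than 65536 vertices.
   std::vector<unsigned short> Indices;   // instead of unsigned int

   glBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, IndexVBO );
   glBufferDataARB( GL_ELEMENT_ARRAY_BUFFER_ARB,
                    Indices.size() * sizeof(unsigned short),
                    &Indices[0], GL_STATIC_DRAW_ARB );

   glDrawElements( GL_TRIANGLES, GLsizei( Indices.size() ),
                   GL_UNSIGNED_SHORT, 0 );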

Your shaders have way too many varyings. Try to reduce them somehow, and try testing with a simple shader.
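
For instance, something as dumb as a single-varying pass-through pair, just to see whether the bottleneck moves (illustrative GLSL, not the actual shaders):

   // Minimal test shaders: one varying, one texture fetch, no lighting.
   const char* TestVS =
      "varying vec2 TexCoord;                                       \n"
      "void main()                                                  \n"
      "{                                                            \n"
      "   TexCoord    = gl_MultiTexCoord0.xy;                       \n"
      "   gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;   \n"
      "}                                                            \n";

   const char* TestFS =
      "varying vec2 TexCoord;                                       \n"
      "uniform sampler2D Texture0;                                  \n"
      "void main()                                                  \n"
      "{                                                            \n"
      "   gl_FragColor = texture2D( Texture0, TexCoord );           \n"
      "}                                                            \n";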