Instancing objects causes lag (Python)


Hello, this question has probably been asked a million times already, so I would be happy even if someone just provides a link to a post or document where I can read about a solution.
So my issue is:
With my basic knowledge I managed to render text and, with GLFW, to make it dynamic (basically, when the user types, OpenGL renders that letter). I exported from Blender an OBJ file that has all the letters and symbols I need. Then with an OBJ parser I went through all of the lines and parsed all the vertex and index data as well as the object names (letter names). In the end the OBJ parser returns a dictionary with the assigned VAO, indices, vertex count and bounding box for every letter.
Then in my text class, in the draw method, I loop through a string, bind the relevant VAO for each letter and apply some offsets so that the next letter is rendered after the previous one.
All good, but I noticed that every instance drops the frame rate really hard. I use the ms-per-frame method from an FPS tutorial, and at 100 instances it’s about 10 ms per frame. I want to use the text for UI of course, but if 100 characters lower the performance so much, it’s kind of not worth it.
Here is the code of the draw method:

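        for l in string:
            if l == " ":
                x_pos += self.size_x_offset
            elif l == '\n':
                # New line: move down one row and go back to the left margin.
                y_pos -= self.size_y_offset
                x_pos = 25
                increment = 0
            elif l == '\t':
                # A tab advances by four spaces.
                x_pos += self.size_x_offset*4
            else:
                # Any other character is drawn as a letter mesh.
                glUniform3f(self.color_loc, *self.color_text)
                glDrawElements(GL_TRIANGLES, self.letters[l][2], GL_UNSIGNED_INT, self.letters[l][1])
                # Advance the pen using the letter's bounding box.
                x_pos += self.letters[l][4][3]*self.size
                increment += self.size*.05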

No doubt I’m doing something wrong… but what is it?
I hope some of you can point me to my error and tell me what I can do to fix it.
I doubt that OpenGL is unable to handle many instances. And it’s not my hardware, because Blender handles a lot of instances with no issue on my computer.


Some tips to make it faster:

First, don’t use a different VAO for each letter. Store all of the letters in a single VAO, and select specific letters through the count and indices arguments passed to glDrawElements(). Also, you shouldn’t need to bind and unbind the shader program for each letter.

With that, the per-letter overhead should be reduced to the glUniform* calls and a glDrawElements() call for each letter.
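
A minimal sketch of that single-VAO layout, assuming each letter's mesh is available as a flat list of vertex floats plus a list of indices local to that letter. The function name pack_letters, the input format and the position-only vertex layout are hypothetical, not taken from the original code. All vertices go into one array for the VBO and all indices into one array for the EBO; per letter, only the index count and the byte offset into the EBO are kept, so the lookup by letter still works through a dictionary.

```python
import ctypes

FLOATS_PER_VERTEX = 3   # assumption: position only (x, y, z)
INDEX_SIZE = 4          # bytes per GL_UNSIGNED_INT index


def pack_letters(meshes):
    """Merge per-letter meshes into one vertex list and one index list.

    meshes: dict mapping letter -> (vertex_floats, local_indices).
    Returns (vertices, indices, table), where table[letter] is
    (index_count, byte_offset_into_index_buffer).
    """
    vertices, indices, table = [], [], {}
    for letter, (verts, local_idx) in meshes.items():
        base_vertex = len(vertices) // FLOATS_PER_VERTEX
        byte_offset = len(indices) * INDEX_SIZE
        vertices.extend(verts)
        # Rebase the indices so they point into the shared vertex array.
        indices.extend(i + base_vertex for i in local_idx)
        table[letter] = (len(local_idx), byte_offset)
    return vertices, indices, table


# Two tiny made-up "letters": a triangle and a quad (two triangles).
meshes = {
    "A": ([0, 0, 0, 1, 0, 0, 0, 1, 0], [0, 1, 2]),
    "B": ([0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0], [0, 1, 2, 0, 2, 3]),
}
vertices, indices, table = pack_letters(meshes)

# With the shared VAO bound once, drawing one letter would then look like:
#   count, offset = table["B"]
#   glDrawElements(GL_TRIANGLES, count, GL_UNSIGNED_INT, ctypes.c_void_p(offset))
```

The rebasing step matters: once the meshes share one vertex buffer, an index of 0 in the second letter must refer to that letter's first vertex, not to the first vertex of the whole buffer.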

But you can improve performance further by eliminating the glUniform* calls and rendering multiple letters with a single glMultiDrawElements() or glMultiDrawElementsIndirect() call. To eliminate the glUniform* calls, either

* Add an integer vertex attribute (use glVertexAttribIPointer to pass integer attributes) which holds the index of the letter within the string, and use this to index into a uniform array of per-letter data. This may need to be in a UBO, SSBO or texture, as the amount of space in the default uniform block is limited. Or:
* If you can rely upon support for OpenGL 4.6 or the ARB_shader_draw_parameters extension, you can use gl_DrawID, which holds the index of the draw operation within a multi-draw call.
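
As a rough sketch of what the "per-letter data" could be, the code below computes one pen offset per drawn letter and packs the offsets as an array of vec4 in std140 layout, ready to be uploaded to a UBO. The names layout_offsets and pack_std140_vec4, the advances dictionary and the layout constants are all hypothetical; the whitespace rules mirror the draw loop in the question.

```python
import struct


def layout_offsets(text, advances, space, line_height, left=0.0, top=0.0):
    """Return (letters, offsets): the drawable letters and their (x, y) pen positions."""
    letters, offsets = [], []
    x, y = left, top
    for ch in text:
        if ch == " ":
            x += space
        elif ch == "\t":
            x += space * 4
        elif ch == "\n":
            x, y = left, y - line_height
        else:
            letters.append(ch)
            offsets.append((x, y))
            x += advances[ch]
    return letters, offsets


def pack_std140_vec4(offsets):
    """Pack (x, y) pairs as an array of vec4 (16 bytes per element in std140)."""
    return b"".join(struct.pack("4f", x, y, 0.0, 0.0) for x, y in offsets)


advances = {"H": 2.0, "i": 1.0}   # made-up letter widths
letters, offsets = layout_offsets("Hi\nH i", advances, space=1.0, line_height=3.0)
ubo_bytes = pack_std140_vec4(offsets)
```

The resulting bytes would be uploaded once per string to a buffer bound as GL_UNIFORM_BUFFER, and the vertex shader would add the entry selected by gl_DrawID (or by the integer attribute) to each vertex position. vec4 is used rather than vec2 because std140 pads array elements to 16 bytes anyway.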


Thanks for the suggestions.
I guess I have to look up how to do those things. I tried a single shader assignment and got about a 15-20% boost. I also tried the UBO setup… to have a single uniform call. I guess I haven’t set it up correctly, because I am not seeing the rendering; still, the ms/frame counter showed about a 50% boost, with 200 instances now taking 100 ms.
I haven’t looked up glMultiDrawElements() yet. I will try to find some info after I figure out the UBO setup.

Also, about the single VAO… basically your suggestion is to have all of the vertex data for the letter shapes stuffed into one big list/array, and all the index data of all the shapes into another list/array, then feed those to the VBO and EBO respectively. Then have a Python dictionary with only the letter as key and its indices, and use those for every letter?
You suggested that I feed some kind of index number to a buffer and then use it for every letter. I wonder how I will know which index represents which letter? With the Python dict it was a really easy setup, since I had the letters as the keys, so every letter in the string looked up different data, rather than having indices in a list/array, for example.


A bit of double posting, but to give an update:
I tried the single VAO for all of the letter shapes (meshes), and that also gave me a speedup, with about 100 instances now taking 30-40 ms.

I still haven’t figured out the glMultiDrawElements() thing. I can’t find any proper tutorials on the topic. But I did try this:

  1. In the draw method, I first loop through the string and take, from that Python dict, all the indices for the letters of the string, as well as the number of points.
  2. I add all of those to a new list (array) of indices and increase an integer holding the number of points.
  3. I make a single draw call with glDrawElements(), giving it the new list of indices and the new number of points.

The result is that all letters are drawn one on top of the other, since I don’t know how to translate each letter mesh within the draw call. I was thinking about doing this via the vertex shader… but I wonder, is it really possible?

However, the good thing is… that now 100 instances take about 10-14 ms per frame, and I was able to reach 600 instances in 100 ms.
So overall the single draw call gave about a 10x speed improvement, but the downside is that now I have no clue how to offset the shapes.

Today I tried the “FreeType/render me a texture” method, and the performance was pretty close to the single-VAO mesh thing I did: 100 instances gave me around 20-30 ms per frame. But I did this test on a computer with a more powerful graphics card (Radeon RX 570), while the above tests were done on a laptop (Nvidia GT 740M). I will try the FreeType method on the laptop tonight to have a more accurate comparison.

But it would still be great if you guys could give me some hints on how to use a single draw call and have each shape at its proper offset, and not all in one place.


The call:

glMultiDrawElements(mode, count, type, indices, drawcount);

is equivalent to

for (int i = 0; i < drawcount; i++)
    glDrawElements(mode, count[i], type, indices[i]);

So you need to get your code into the latter form, i.e. not changing anything (uniforms, program, VAO, etc) between draw calls. By itself, that may improve performance.
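
A small sketch of getting into that form, assuming a per-letter table of (index count, byte offset into the index buffer). The names build_multidraw_args and draw_as_loop are hypothetical, and the draw_elements parameter is a stand-in for glDrawElements so that the sketch runs without a GL context.

```python
def build_multidraw_args(text, table):
    """Collect per-draw index counts and byte offsets for the drawable letters."""
    counts, offsets = [], []
    for ch in text:
        if ch in table:   # whitespace has no mesh and is skipped
            count, byte_offset = table[ch]
            counts.append(count)
            offsets.append(byte_offset)
    return counts, offsets


def draw_as_loop(counts, offsets, draw_elements):
    """The loop form that a single multi-draw call replaces."""
    for count, byte_offset in zip(counts, offsets):
        draw_elements(count, byte_offset)


# Hypothetical table: letter -> (index_count, byte_offset_into_index_buffer).
table = {"a": (6, 0), "b": (12, 24)}
counts, offsets = build_multidraw_args("ab a", table)

# Record the calls instead of issuing real GL draws.
calls = []
draw_as_loop(counts, offsets, lambda c, o: calls.append((c, o)))
```

In a real program draw_elements would wrap glDrawElements(GL_TRIANGLES, count, GL_UNSIGNED_INT, ctypes.c_void_p(byte_offset)), and len(counts) would be the drawcount of the multi-draw call. Note that nothing here positions the letters: with no state changes between draws, the per-letter offset has to come from data the shader can look up per draw.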

How are you positioning glyphs at the moment? You show the code to maintain the variables x_pos and y_pos, but you don’t show how those are being used.