glGenerateMipmap takes too long

My application renders text character by character into separate textures.
Since the text changes from time to time, I have to make this process as fast as I can, and
remove as much of it as I can from the GL rendering thread (I use PBOs). However, if I disable
mipmapping and use only GL_LINEAR, the text does not look good, while with mipmapping enabled,
generating the mipmaps can take a long time and disrupt rendering.
glGenerateMipmap takes too much time, up to 100+ ms for text consisting of ~30 characters.
Is there anything I can do to make mipmap generation asynchronous?

It doesn’t matter how many characters it takes up; what matters is how big the texture is and how many times you’re doing it.

If you’re rendering strings of text into a texture to display it, there’s really no reason to mipmap it. You should simply render it into the texture at the appropriate resolution you intend to display it.

Thanks for the response.
Unfortunately the text can change a lot; it can even be a continuously scrolling information line.
The height of the characters is arbitrary; I added options from 96 to 384 pixels high. This
may be overkill, but I wanted it to look good even if it is magnified to 2-3 times its original size.
The average setting for height is 256 pixels. I was very surprised to see that heavy minification (1/4-1/8 of
the original size) can look quite bad without mipmaps (I attached two pictures to demonstrate: one is with GL_LINEAR
and the other is with GL_LINEAR_MIPMAP_LINEAR).
Do you think I have to sacrifice mipmaps or is it a viable option to create mipmaps ‘manually’
and asynchronously using the CPU?

I wanted it to look good even if it is magnified to 2-3 times its original size.

Then you should focus on improving how you’re rendering it. For example, you can use this Valve-developed technique for rendering glyphs. That way, you don’t have to render them to a texture. You just render glyphs where you want them and they’ll scale up.

Have a look at this implementation of Valve’s paper.

I am not sure if this technique helps me with this but I am going to try.
The problem I have is minification and not magnification. I render text to
relatively high resolution bitmaps (height can vary from 96-384 pixels) but it is usually viewed at a distance.
If I don’t use mipmapping, the higher the minification the worse it looks. I made some progress with mipmapping (I use
glTexParameteri(GL_TEXTURE_2D, GL_GENERATE_MIPMAP, GL_TRUE) instead of glGenerateMipmap and this seems to
be easier on the CPU but it is still not good enough. Unfortunately I have to render fonts dynamically, I cannot use pre-rendered bitmaps
because I want to support all languages without the nightmare of laying out text myself.
Is it even possible to create all the mipmaps ‘manually’ (by the application) and upload them with a PBO in another thread?
Rendering of the fonts is already done in another thread, and I only support multi-core CPUs, so I don’t mind.
Is it possible to reserve more memory using glBufferData and upload all the mipmaps at once?
Or do I have to use multiple PBOs (one for each mipmap level) and upload each one’s contents to a specific level?

You could use a fragment shader to do a custom filtering on your font texture depending on screensize. Works pretty well actually.

Why exactly can’t you render all your characters into textures once on startup? What rendering code do you use?

Could you elaborate on custom filtering? I have to support Unicode for all possible languages. I tried a text layout engine (Uniscribe) before, but I had problems with certain fonts and certain languages (e.g. pre-converting a Chinese font consisting of ~30,000-40,000 glyphs would be more or less impossible), so I finally decided to let the OS take care of that. I do not even handle text as a series of characters: text is rendered line by line as a series of contours created by Windows, and I use these contours to create either polygons (which works just fine) or a series of bitmaps. The latter is more problematic, since the only way I know to make them high quality is to use mipmapping, which can be slow because much of it happens in the rendering thread.

Let’s see if I understand the problem: you have to render some text, which requires computing mipmaps whenever a new text line is encountered, and that is too slow? How often does your text change, then? Assuming that a user needs at least a few seconds to read a line of text, it cannot be the case that mipmaps cannot be generated fast enough for this. There has to be another problem, that is, a problem in the structure of the solution.
Have you tried building a dictionary of word textures and stitching words together to form a text line? I’ve had good experiences with such an approach. A dictionary of a few thousand distinct words is often more than enough to form a whole page of text (I don’t know about Asian languages though…). This keeps the new textures and mipmaps that have to be generated relatively small and hence should improve performance.

Computing mipmaps is only too slow because it happens in the rendering thread. When they are created, one or two frames take much longer to render which causes jumps in the animation. I don’t think dictionaries can be used. The application has to support all possible languages, not just one. I need to be able to render ANY text and on the fly. I still think the best approach could be to create all mipmaps in the application but in a separate thread and upload them asynchronously using PBOs. This way it is only a slightly bigger memory transfer which would more than likely not disrupt rendering. The question is (I have never done anything like this, nor did I find any info on the net): do I need to use a separate PBO with its own mapped memory buffer for each mipmap level or is it possible to upload all mipmaps at once as a continuous buffer?
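Generating the chain on the CPU in a worker thread is straightforward: each level is a 2×2 box-filter average of the previous one, with no GL involvement at all. A sketch for tightly packed 8-bit single-channel data (RGBA works the same per channel; names are illustrative, and even power-of-two dimensions are assumed for brevity):

```cpp
#include <cstdint>
#include <vector>

// Halve an 8-bit grayscale image with a 2x2 box filter.
// Assumes even width/height; a real version clamps odd edges.
std::vector<std::uint8_t> downsample2x(const std::vector<std::uint8_t>& src,
                                       int w, int h) {
    std::vector<std::uint8_t> dst((w / 2) * (h / 2));
    for (int y = 0; y < h / 2; ++y)
        for (int x = 0; x < w / 2; ++x) {
            int sum = src[(2 * y) * w + 2 * x]
                    + src[(2 * y) * w + 2 * x + 1]
                    + src[(2 * y + 1) * w + 2 * x]
                    + src[(2 * y + 1) * w + 2 * x + 1];
            dst[y * (w / 2) + x] = std::uint8_t((sum + 2) / 4); // rounded average
        }
    return dst;
}

// Build the whole chain down to 1x1 (power-of-two input assumed).
// This is the part a worker thread can run before the PBO upload.
std::vector<std::vector<std::uint8_t>> buildMipChain(
        std::vector<std::uint8_t> level0, int w, int h) {
    std::vector<std::vector<std::uint8_t>> chain;
    chain.push_back(std::move(level0));
    while (w > 1 && h > 1) {
        chain.push_back(downsample2x(chain.back(), w, h));
        w /= 2; h /= 2;
    }
    return chain;
}
```

The GL thread then only has to copy the finished levels out of the mapped PBO with one glTexSubImage2D per level, which is a plain memory transfer.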

Either you’re writing very small text or we’re talking about different things. A screen full of text isn’t likely to consist of more than a few thousand words at maximum, regardless of the language. My idea was to generate a map “word in Unicode → texture” that contains the last few thousand words that were used, extending it on the fly as needed. When a new text line is to be drawn onto the screen, it is likely that it consists of, or at least contains, words already on the screen somewhere, which means that those words are in the map and hence do not need to be recomputed. As a word is much smaller than a text line, mipmap generation will be faster. Of course I don’t know your application, nor do I know the repetition patterns for East Asian languages, but I’d advise looking for the repetition of words (not CPU words, language words) on the screen. If the most recent text line contains words already on the screen, you’re computing mipmaps repeatedly for the same shapes to display it.
If that isn’t the case, i.e. there are no repetitions, you could perhaps generate the mipmaps for the text line to be drawn in a separate thread, not using OpenGL for this at all. If the text line shows up just one or two frames after issuing the order to draw it, the user is unlikely to see any difference.

We are talking about different things. As I said, I get Windows to render my text lines as a series of contours to avoid all the problems with character-to-glyph layout in all the exotic languages. Words and dictionaries would not help at all. Even if I render letters and words that already occur in other lines, I would still use a lot less memory than if I converted all the glyphs of a font covering multiple language scripts (e.g. simple Arial has more than 3000 glyphs). I render text into fairly high resolution bitmaps, so I don’t have problems with magnification, just minification. The visual quality of the text is very important, so the rather imperfect antialiasing that bilinear filtering offers is not good enough. I can either find a way to offload as much of mipmap creation as possible to another thread, or use a shader to do something similar to mipmap access. I would appreciate any help if someone has experience with either route.

I render text into fairly high resolution bitmaps so I don’t have problems with magnification, just minification.

And the whole point of the Valve paper is to do the opposite. You use smaller images (so minification works just fine), and you let the distance testing stuff handle magnification. That’s what it’s for: to make magnification look better than “imperfect antialiasing bilinear filtering”.

They suggest calculating the distance field texture from a 4096×4096 binary image. That is fine if you can create all your textures before the application runs, and it is a wonderful technique for saving GPU memory, but doing it repeatedly for every character seems to me a little too much effort.

“Too much effort”? You’re using CPU rasterization on strings, uploading them, and then generating mipmaps for them. And you call doing some simple image processing “too much effort”?

And you don’t have to “do it repeatedly for every character”; you do it for your string. Stop trying to copy-and-paste their algorithm into your code. Instead, understand how it works and therefore how to best apply it to your needs.

Your current algorithm is:

  • CPU rasterize a string
  • Upload the pixel data
  • Generate mipmaps from it

I’m saying you should change that to:

  • CPU rasterize a string
  • Reduce it to a distance field at a smaller resolution
  • Upload the distance field

I fail to see how computing a distance field is going to be a significant CPU burden. Especially compared to all the time spent rasterizing the string.
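To make the “reduce it to a distance field” step concrete, here is a naive sketch: for each low-res output texel, search the high-res binary coverage image for the nearest opposite-state pixel and encode the clamped signed distance into a byte. All names and parameters are illustrative, and the O(spread²) search per texel is only adequate for short strings; real implementations use a linear-time transform such as Felzenszwalb-Huttenlocher:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Build a low-res signed distance field from a high-res binary image.
// hi is w*h, nonzero = inside glyph. scale = hi-res pixels per field texel.
// spread = search radius in hi-res pixels; distances are clamped to it.
std::vector<std::uint8_t> distanceField(const std::vector<std::uint8_t>& hi,
                                        int w, int h, int scale, int spread) {
    int ow = w / scale, oh = h / scale;
    std::vector<std::uint8_t> out(ow * oh);
    for (int oy = 0; oy < oh; ++oy)
        for (int ox = 0; ox < ow; ++ox) {
            int cx = ox * scale + scale / 2, cy = oy * scale + scale / 2;
            bool inside = hi[cy * w + cx] != 0;
            float best = float(spread);
            // Brute-force search for the nearest opposite-state pixel.
            for (int y = std::max(0, cy - spread); y < std::min(h, cy + spread + 1); ++y)
                for (int x = std::max(0, cx - spread); x < std::min(w, cx + spread + 1); ++x)
                    if ((hi[y * w + x] != 0) != inside)
                        best = std::min(best, std::hypot(float(x - cx), float(y - cy)));
            float signedDist = inside ? best : -best;           // +inside, -outside
            float v = 0.5f + 0.5f * signedDist / float(spread); // map to [0,1]
            out[oy * ow + ox] = std::uint8_t(std::lround(
                std::clamp(v, 0.0f, 1.0f) * 255.0f));
        }
    return out;
}
```

At draw time the texture is sampled with plain bilinear filtering and thresholded at 0.5, which is what makes both minified and magnified edges stay sharp.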

This is your opinion and I appreciate it. This is mine:

  1. CPU rasterize strings into moderately sized RGBA textures with colour, shadows, edges, etc. We are definitely not talking about
    GBs of bitmaps. I already have a good working solution.
  2. Upload these textures
  3. Generate mipmaps

The latter is causing problems, so I might change it to this:

  1. Generate mipmaps with the CPU.
  2. Upload to all mipmap levels. More data will need to be uploaded but it is still not that significant.

You are suggesting this:

  1. CPU rasterize strings into huge monochrome textures. I need to make a completely new solution
    for this.
  2. CPU calculate distance fields. This is also something completely new that needs to be done.
  3. Upload the lower resolution distance field textures.
  4. Write one or more new shaders to add colour, shadows, edges, etc. Probably needs more
    time in execution than a single interpolated access from a mipmapped texture.

I feel it is safer to go with the first alternative unless it proves to be impossible.

What about generating the mipmaps one level per frame?

Perhaps, but I hope it will not be necessary. If I create the mipmaps ‘manually’ using the CPU, I can use a separate core/thread that does it in the background without affecting OpenGL rendering. I only need to transfer the finished mipmaps from the PBO to the GPU in the OpenGL thread. That should be possible.

You are suggesting this:

  1. CPU rasterize strings into huge monochrome textures. I need to make a completely new solution
    for this.
  2. CPU calculate distance fields. This is also something completely new that needs to be done.
  3. Upload the lower resolution distance field textures.
  4. Write one or more new shaders to add colour, shadows, edges, etc. Probably needs more
    time in execution than a single interpolated access from a mipmapped texture.

1: I never said anything about “huge monochrome textures”. You can rasterize them at whatever resolution you feel comfortable with. The paper recommends using a large size because the technique works better when starting from large data. But it still works when starting from smaller data; your current rendering size is probably sufficient. More importantly, you shouldn’t need a “completely new solution” for that, because your current solution should be able to handle an arbitrary resolution. You’re just using Windows’s native CPU text rendering system, which already can render at whatever size you desire.

4: If you had read the paper in any detail, you would find that the overhead is minimal. Indeed, that’s the entire point: it costs very little to implement. You don’t even need shaders to get the basic form of it to work. I don’t know where “shadows, edges” is coming from, since you never mentioned text effects like drop-shadows and the like.

Will it require effort to implement? Of course. But it will be both faster and better than your current approach.

You did not ask whether I needed text effects, and I did not mention them because my question was not about how to make them. I hope I can still base my decision not to follow your suggestion, for the time being, on that. I would have appreciated some help on the best way to upload data to the mipmap levels of a texture from another thread, but I guess I won’t get it from you.