Rendering large amounts of text.

M.Mortier · May 26, 2004, 5:16am

Hello,

I’m a bit new to openGL (well not really but I’ve been having an on-and off relationship with it for quite some time as I don’t make 3d applications very often). My apologies if this belongs in the Beginner’s section.

But, I wanted to render large pieces of text on the screen. This appears to be a bit unnatural to do…
Here are two strategies (in descending order of naivety) I’ve tried, and their problems.

(The display lists I’ve used are “compiled” ones, of course…)

1) Init : Store each character in a display list (a textured quad).
Draw : Render text by calling the display lists for each character that is supposed to be drawn right now.
Problem : big slowdown from 5000+ letters on, due to the high pixel fill, I think…(I want to be able to zoom in and out of the text), since all the quads face the camera. Is that the reason why it slows down there?

2) Init : Store blocks of 1000 or so signs in display lists.
Draw : call the lists that effectively need to be drawn.
No improvement whatsoever.

3) A strategy I want to try, but I don’t know how (I don’t think it’s that hard using buffers but your opinions on its efficiency would help me a lot in thinking straight).
Render blocks of text using method 2) or 1) to a buffer, and then store that buffer as a texture.
Then use it on big quads.

Is method 3) the most efficient way of doing this? It needs a lot of memory I think, storing all those rather big textures in VRAM…? (if I want to store them in display lists) Supposing I work with 1Mb+ files, I don’t even think they’d fit.

Does anybody have any better ideas for me to try, or tell me what I’m doing wrong?

I do want to work in 3d and no rasters or anything.

Thank you lots for any help,

M. Mortier

yooyo · May 26, 2004, 8:22am

Try putting the font in a texture. Then for each letter build mapping coordinates and render it using quads.
It will be fast enough.

I have a few classes that can build a texture from any font in Windows, take care of kerning and spacing, and render strings. I can send them to you if you want.

yooyo

imported_jwatte · May 26, 2004, 9:30am

If you render lots of text, then in my opinion (based on doing a lot of this) you should use a native bitmap (GDI, libttf, whatever) and native font renderer to draw the text, then upload that bitmap and use it as a texture containing all the text.

Typically, you can get away with a grayscale bitmap (8 bits per pixel) and uploading it as GL_ALPHA format, to save on texture memory.

fathom · May 26, 2004, 10:49pm

sounds like he’s doing textured quads already.

are you sure your slow-down is fill related? how big are your quads? what card and os? you got a sample frame?

how are you generating your textures? many bitmap font systems tend to just assume a fixed width for fonts and not worry about the empty pixels. probably ends up taking 2 or 3 times as long per quad as needed because of this.

make sure that your quads are only big enough to get the letter on screen. figure that each letter in your texture is predominantly black/transparent. the less of this black/transparent texture that’s dealt with, the better.
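
One way to find out how much of each glyph cell is empty is to scan the cell for its tightest non-transparent box. The sketch below is only an illustration: `tight_bounds` and the 8x8 `CELL` bitmap are made-up names and data, and it assumes the font texture is available on the CPU as rows of alpha values.

```python
def tight_bounds(cell):
    """Return (x0, y0, x1, y1), the smallest box holding every non-zero
    alpha value in `cell` (a list of rows), or None for a blank glyph.
    x1 and y1 are exclusive."""
    rows = [y for y, row in enumerate(cell) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(cell[0])) if any(row[x] for row in cell)]
    return cols[0], rows[0], cols[-1] + 1, rows[-1] + 1


# Hypothetical 8x8 alpha cell holding a narrow "l"-like glyph.
CELL = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 255, 0, 0, 0, 0],
    [0, 0, 0, 255, 0, 0, 0, 0],
    [0, 0, 0, 255, 0, 0, 0, 0],
    [0, 0, 0, 255, 0, 0, 0, 0],
    [0, 0, 0, 255, 255, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

x0, y0, x1, y1 = tight_bounds(CELL)
covered = (x1 - x0) * (y1 - y0)
print(covered, "of", len(CELL) * len(CELL[0]), "texels actually need a quad")
```

The box would then drive both the quad size and the texture coordinates for that letter, so the fixed-width padding never gets rasterized.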

one thing that could probably help quite a bit would be to create low-res “outlines” that fit your font texture a little better.

take “A” for example. if you were to use a textured triangle that approximated the shape of the A instead of a quad that had bunches of useless space, you’d speed things up. of course, generating the shapes is not trivial.

evanGLizr · May 26, 2004, 11:14pm

Originally posted by M.Mortier:

But, I wanted to render large pieces of text on the screen. This appears to be a bit unnatural to do…

Some ideas:
a. Partially mentioned by jwatte and yourself, try to cache words: render full words to a texture (either with OpenGL or GDI) and reuse that texture.
b. Cull the letters/words that are not visible (are really all those 5000 letters visible on the screen or are you doing some kind of scrolling?).
c. Alpha test the letters (this will save you fill rate if you are fill-rate limited).

Using one display list per letter is overkill in my opinion. If anything, combine it with approach a) (cache words) by creating one display list of the whole word rather than of just a letter. If the text is dynamic, you probably want to use vertex arrays (VBOs, etc) instead of display lists.

My suspicion is that you are CPU bound, rather than fill rate bound (how big are your letters? how many have you got really visible on the screen? are you alpha blending?).

M.Mortier · May 27, 2004, 1:40am

Well,

Yes, I’m already using textured quads. Perhaps I was a little ambiguous.
I have a texture that contains the 256 characters, and I use offsets of that texture to draw each letter, in a quad. I store these letters in display lists.
In method 2) I then group 5000 of those letters (and y-translations after each line too of course) into another compiled display list. That doesn’t improve anything though.

One of you said I should use words instead of letters. Let’s assume I do that (I may as well split the text up into lines then), how can I then render one word in one quad, using only that texture? I think I’ll need a quad per letter, either way to be able to set all coordinate points of the texture? Or can I effectively spread various parts of the 256char texture over one quad?

I have a PIV1.8 with 1G RAM and a 64Mb GFIII card, I don’t think I’m CPU bound. I just think there’s too many quads facing the camera? (I’m also using blending and smooth textures, which slows the whole thing down a bit, just a bit though.) 5000 quads slows everything down that’s for sure.
I don’t have a screenshot cause I have to run 1 mile from my home to here to be able to be online. I’ll bring it next time I do that.

Well, I think a native bitmap renderer that would render blocks of text to a texture would be a good idea. Just a little worried about texture size, but I guess that’ll be all right.

The only thing I feel that is missing is some more optimizing. I mean, if I have the sentence “I am an ape”, then for each “a” the offset to the 256character texture is the same. Why can’t I use that to my advantage and cut texture size for the texture that represents the whole sentence?

M.Mortier · May 27, 2004, 1:50am

And I am culling btw, I just wanted to be able to draw at least 5000 letters on screen (to zoom out of the text)… thing is I want to make a dynamic reader kind of thing, for large raw text documents… so you can mark certain things, zoom out massively, and then flow back to certain marked spots… It’s an exercise in openGL really more than anything else, I know it’s a bit silly to do it in 3d, but still… there seemed so much “shared reference” optimization possible, no?

I’ll try the bitmap render thing… I actually had the weird idea in mind at first to put the text in an offscreen tk widget or s’thing and then throw that widget’s graphics to a bitmap buffer (taking care of text wrapping etc while we’re at it) - but that doesn’t work since I need to store all the text somewhere, not just the part that is visible in one window.

JustHanging · May 27, 2004, 7:06am

Hi,

Unless you’re drawing letters on top of each other, there’s no way you’re fill limited. 5000 quads shouldn’t choke a GF3 either, so it’s probably something else. Please make sure you’re not:

- Binding a texture before each letter
- Using glTranslate to place each letter

What you should do is create vertex and texture coordinate arrays of the entire text (or blocks of it since you’re culling) and draw it all using a single drawArrays call. You don’t have to update the arrays unless the text changes.
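
A rough sketch of building those arrays on the CPU, not anyone's actual code from this thread. It assumes a square atlas of 16x16 equal cells with character code c in column c % 16 and row c // 16, fixed-width glyphs, and v growing downward in the atlas; `build_text_arrays` and its parameters are hypothetical names. The two flat lists are what would be handed to glVertexPointer and glTexCoordPointer before a single glDrawArrays call with GL_QUADS.

```python
def build_text_arrays(text, glyph_w=1.0, glyph_h=1.0, grid=16):
    """Return (vertices, texcoords) as flat lists, four 2D corners per glyph.

    Assumed layout: grid*grid equal atlas cells, code c in column c % grid
    and row c // grid. Newlines move the pen instead of emitting a quad.
    """
    vertices, texcoords = [], []
    step = 1.0 / grid
    pen_x, pen_y = 0.0, 0.0
    for ch in text:
        if ch == "\n":
            pen_x, pen_y = 0.0, pen_y - glyph_h
            continue
        code = ord(ch) & 0xFF
        u0, v0 = (code % grid) * step, (code // grid) * step
        u1, v1 = u0 + step, v0 + step
        x0, y0 = pen_x, pen_y
        x1, y1 = pen_x + glyph_w, pen_y + glyph_h
        # Corner order: bottom-left, bottom-right, top-right, top-left.
        vertices += [x0, y0, x1, y0, x1, y1, x0, y1]
        texcoords += [u0, v1, u1, v1, u1, v0, u0, v0]
        pen_x += glyph_w
    return vertices, texcoords


verts, uvs = build_text_arrays("ab\nc")
print(len(verts) // 2, "vertices for", len(verts) // 8, "quads")
```

Since every letter's position is baked into the vertex data, no glTranslate or per-letter state change is needed, and the lists only have to be rebuilt when the text changes.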

Oh, one more thing, you actually could be fill-limited if you’re zooming out without using mipmapping. An easy way to see if you’re running out of fillrate is to make the window smaller and see if it improves the framerate.

-Ilkka

Madoc · May 27, 2004, 8:22am

We have an app that draws far more than 5000 characters at once. We use tricks when they are distant and have good culling schemes but they’re not even necessary on something like a GF3.

You can’t possibly be fill limited, as JustHanging said, unless you’re drawing those characters really big and overlaid many times. Excluding that, I wouldn’t do more than make sure you’re using a single channel texture (ie only alpha) for the sake of efficiency.

From my experience, getting the text into biggish vertex arrays should be more than enough for that many characters, 10k unlit tris is not much for your GF3. Once using vertex arrays, VBO is always a good added bonus. I doubt display lists are a good choice, probably more so if you’re filling them in immediate mode.

I haven’t tried myself but it could be good to use words for the texture coordinate so you get smaller 16 byte aligned vertices (unless you’re using 2D vertices, something we can’t do, but smaller may still be better).

Edit:
I thought there might be some confusion about “words”, which is intended as 16 bit values, of course. An apt thread for the ambiguity of the term.
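
To make the size argument concrete, here is a small sketch of such a vertex: three 32-bit floats for position plus two 16-bit words for the texture coordinate, 16 bytes in total. The layout and the name `pack_vertex` are assumptions for illustration. As far as I know, fixed-function glTexCoordPointer accepts signed shorts and does not normalize them, so the texel-unit values below would need a texture-matrix scale of 1/atlas_size; treat that detail as an assumption too.

```python
import struct

# Hypothetical layout: x, y, z as 32-bit floats, then u, v as signed 16-bit words.
VERTEX = struct.Struct("<3f2h")


def pack_vertex(x, y, z, u, v, atlas_size=256):
    """Pack one vertex; u and v are texel positions in [0, atlas_size]."""
    if not (0 <= u <= atlas_size and 0 <= v <= atlas_size):
        raise ValueError("texture coordinate outside the atlas")
    return VERTEX.pack(x, y, z, u, v)


blob = pack_vertex(1.0, 2.0, 0.0, 16, 96)
print(VERTEX.size, "bytes per vertex, versus",
      struct.calcsize("<5f"), "with float texture coordinates")
```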

imported_jwatte · May 27, 2004, 6:02pm

My suggestion has very little to do with textured quads – there is only one quad per block of text. This solution (upload the text as a bitmap texture) is pretty much always the fastest, unless you’re really, really low on texture memory (which you usually aren’t).

Originally posted by evanGLizr:

Alpha test the letters (this will save you fill rate if you are fill-rate limited).

All modern cards implement alpha testing/blending very close to the memory controller, so if you touch ANY of the pixels in a “block” (4x4 pixels on some architectures, I think) then you pay for the entire block. Thus, alpha testing is only a win if you have wide swathes of transparent space that each is at least 7x7 pixels or more of transparency. That typically only happens on really large font sizes, which in turn mean that you can fit nowhere near 5,000 characters on screen…

Korval · May 27, 2004, 7:33pm

Originally posted by jwatte:

This solution (upload the text as a bitmap texture) is pretty much always the fastest, unless you’re really, really low on texture memory (which you usually aren’t).

Not necessarily. If the text, for whatever reason, needs to be uploaded frequently (a score, a timer [with decimal second precision], etc) you will be better served rendering it directly.

In any case, dropping 5000 letter-sized quads, especially if you can put them in a vertex array, shouldn’t be any real problem for any card.

evanGLizr · May 27, 2004, 7:52pm

Originally posted by jwatte:

Alpha test the letters (this will save you fill rate if you are fill-rate limited).
All modern cards implement alpha testing/blending very close to the memory controller, so if you touch ANY of the pixels in a “block” (4x4 pixels on some architectures, I think) then you pay for the entire block. Thus, alpha testing is only a win if you have wide swathes of transparent space that each is at least 7x7 pixels or more of transparency. That typically only happens on really large font sizes, which in turn mean that you can fit nowhere near 5,000 characters on screen…

True, but he mentions that he’s also zooming into the text, plus I don’t think having 4x4 empty aligned texels is so uncommon, especially if you handle your font in a non proportional way (smallcaps, L, J, P, etc).

Edit: Not that I think that he’s fill-rate limited anyway…

Madoc · May 27, 2004, 10:57pm

I have to say Jon Watte’s suggestion is very interesting. Where you have a certain amount of static text it’s clearly the most efficient way to render and it has the added bonus of a broad choice of fonts and hassle-free proportional fonts. I haven’t actually heard this approach suggested before despite its simplicity.

Unfortunately, it wouldn’t work easily for our app. Texture memory is a problem and it would be complicated with the size of some of the fonts. Needless to say, our app doesn’t display text in a conventional way, we would need many thousands of high-res blocks of text.

def · May 28, 2004, 1:12am

Have you considered the display lists as the performance killer?
If you are just storing glTexCoord and glVertex calls you shouldn’t see any improvement with display lists but you might be experiencing the display list overhead accumulating with 5000 letters…

M.Mortier · May 28, 2004, 4:16am

Well, you’re right I wasn’t fill limited at all…- changing the window size doesn’t really do anything to the framerate. I suppose I’m cpu limited then like a lot of you said (although I’m not sure how that works when I’m putting everything in a compiled display list - and I’m not binding a texture before each letter either btw. I thought display lists like those were stored in the video hardware. But I was probably wrong then.)
I’m a bit puzzled by what’s happening then… Are you sure that when you change the window size the effect should be noticeable when you have fillrate problems? (I mean perhaps in some situations the effect isn’t linear but logarithmic so it could be barely visible?)

Well thanks for all the advice, I’ll try to combine everything and post back when it works - once I figure out how to store vertex/texture arrays in the 3d hardware instead of in the system, or find a free font renderer that works on multiple platforms…

Madoc · May 28, 2004, 4:39am

M.Mortier, have you actually tried using vertex arrays (DrawArrays or DrawElements)? If not, do this first. If you then want to use video or agp memory for them, use VBO, which is really easy to use.
As def said, display lists themselves might be a problem. Your final display list is very likely doing 5000 glCallList and nothing else. Display lists are fundamentally a means of batching commands, don’t expect them to do anything exceedingly clever.
I would guess that even just drawing everything in immediate mode will be faster. You could start by putting everything in one display list directly in immediate mode without the per character DLs.

Forget fill, really. Yes, if you shrink your viewport considerably and there’s no (or hardly any) improvement you are not fill limited.

imported_jwatte · May 28, 2004, 12:33pm

Originally posted by Madoc:

we would need many thousands of high-res blocks of text

Do you draw them AT THE SAME TIME though?

You could set aside one big texture, or several smaller textures, for the text that’s actually drawn during the current frame. You’d set aside more space for the formatted bitmaps in main RAM, which hopefully is more plentiful than VRAM.

Then, when rendering, you’d use an LRU cache of texture images; when you need to render a specific block of text, you see if it’s already in a texture; if so, bind the texture, and put the texture first in the LRU list. If not, you take the texture last in the LRU list, and TexSubImage() your prepared data into it, and stick it first in the list.

Your LRU list should be bigger than the rendering needs of a single frame, for ideal frame rate
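
The caching scheme described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a tested renderer: `TextureLRU` and the `upload(slot, block_id)` callback are hypothetical, with the callback standing in for binding the slot's texture and calling glTexSubImage2D with the bitmap prepared in main RAM. The post puts the most recently used texture first in the list; the sketch keeps it last in an OrderedDict, which amounts to the same ordering.

```python
from collections import OrderedDict


class TextureLRU:
    """Maps text-block ids onto a fixed pool of texture slots."""

    def __init__(self, slot_count, upload):
        self.free = list(range(slot_count))
        self.used = OrderedDict()  # block_id -> slot, most recently used last
        self.upload = upload       # hypothetical callback: upload(slot, block_id)

    def acquire(self, block_id):
        if block_id in self.used:
            self.used.move_to_end(block_id)  # hit: mark as most recent
            return self.used[block_id]
        if self.free:
            slot = self.free.pop()
        else:
            _, slot = self.used.popitem(last=False)  # evict least recent
        self.upload(slot, block_id)
        self.used[block_id] = slot
        return slot


uploads = []
cache = TextureLRU(2, lambda slot, block: uploads.append((slot, block)))
for block in ["p1", "p2", "p1", "p3", "p1"]:
    cache.acquire(block)
print(len(uploads), "uploads for 5 requests")
```

With only two slots, the third distinct block evicts the one that was touched longest ago, and repeated requests for a resident block cost no upload at all.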

Madoc · May 28, 2004, 11:03pm

I kind of assumed it was obvious that I didn’t mean at the same time. The real problem is that we’re already dealing with thousands of images that need to be swapped in and out and cleverly LoDed.
The other thing is that we’re not just displaying text but a large 3D environment tapestried with it. It is actually possible to see it all at once and it has to be well mipmapped. Of course, we don’t actually render the real text when it’s small but we need detailed knowledge of the formatting to make a convincing fake.

I think it’s an excellent method, but not for this application, it would hit on existing bottlenecks.

fathom · May 29, 2004, 12:59am

m.mortier, just for grins, draw as wireframe or even disable the actual draw command altogether just to get a gauge of what system is affecting your slow-down. you might find something totally unrelated is messing things up.

if the wireframe is fast, then try disabling texturemapping. if it’s fast without texturemapping enabled, then there might be some issue with your texture creation (like somebody mentioned resampling earlier)…

zeckensack · May 30, 2004, 12:38am

Re texture memory of jwatte’s model:
There’s an upper bound, actually. You don’t really need more texels than you have pixels in your viewport. If the view changes, you can discard whatever won’t be visible anymore to make room for the new stuff.

Of course it’s more clever to use a bit more space and do some caching. Say you have a 1600x1200 viewport and use ALPHA8 textures and settle for 3x “needed” plus full mipmaps, you end up with 3 × 1600 × 1200 × 1.3 bytes ~= 7.5 megs of texture memory. Peanuts …
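
The estimate is easy to rerun for other viewports. A small sketch, with `text_cache_bytes` and its parameter names invented for illustration; it assumes one byte per texel (ALPHA8) and uses 4/3 as the cost of a full mip chain, where the figure above rounds that factor to 1.3.

```python
def text_cache_bytes(width, height, overdraw=3, bytes_per_texel=1, mip_factor=4 / 3):
    """Bytes for `overdraw` viewports' worth of texels plus a full mip chain."""
    return width * height * overdraw * bytes_per_texel * mip_factor


budget = text_cache_bytes(1600, 1200)
print(round(budget / 1e6, 2), "million bytes with a 4/3 mip factor")
print(round(text_cache_bytes(1600, 1200, mip_factor=1.3) / 1e6, 2),
      "million bytes with the rounded factor of 1.3")
```

Either way the result stays well under the 64 MB of the card mentioned earlier in the thread.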