Best way to draw many small objects and redraw them interactively: is it possible/advisable to draw without a VBO?

For drawing lots of small shapes, the most efficient way I found in OpenGL was to create an array of points and an array of indices, and use glDrawElements. For example, drawing multiple rectangles for a web page.
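A minimal CPU-side sketch of that batching idea, assuming each rectangle is stored as four 2D points and drawn as two triangles. The `RectBatch` type and its `addRect` method are hypothetical names used only for illustration; they are not from the poster's code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical CPU-side batch: 2D positions plus triangle indices.
struct RectBatch {
    std::vector<float> vertices;         // x0, y0, x1, y1, ...
    std::vector<std::uint32_t> indices;  // three indices per triangle

    // Append one axis-aligned rectangle as two triangles sharing 4 vertices.
    void addRect(float x, float y, float w, float h) {
        // Index of the first vertex this rectangle adds.
        const std::uint32_t base =
            static_cast<std::uint32_t>(vertices.size() / 2);
        const float corners[8] = {x, y, x + w, y, x + w, y + h, x, y + h};
        vertices.insert(vertices.end(), corners, corners + 8);
        const std::uint32_t tri[6] = {base, base + 1, base + 2,
                                      base, base + 2, base + 3};
        indices.insert(indices.end(), tri, tri + 6);
    }
};
```

After filling the batch, the two vectors would be uploaded to a vertex buffer and an element buffer, and all rectangles drawn with a single glDrawElements(GL_TRIANGLES, ...) call using GL_UNSIGNED_INT indices.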

Our current design has a MultiShape2D object that uses a vertex buffer of points, and then draws solid shapes, and then lines and points on top of the solid shapes using glDrawElements.

But for a GUI element that is changing as the cursor moves over it, refreshing is a problem.
It seems like the easiest way to draw a small GUI object might be to just have a render function which draws the shape. Why bother to create a VBO if the numbers constantly change anyway? For example, if I created a button with a rectangle and text, and clicking on the button changes the color of the rectangle and changes the text, then this would require updating all the values that make up those shapes. And since the objects are not huge, we might be talking about a VBO for the rectangle composed of 4 points (8 floats), and another vertex buffer object for the letters with 4*4*number of characters.

Before vertex buffer objects, OpenGL supported drawing arrays to the screen. I am having trouble finding the call that passes in the array. So the questions are:

  1. Is it better to always use vertex buffer objects even for small objects like a single rectangle?
  2. Is it possible to just draw an array without creating a vbo, and if so can someone point me to an example? I could use immediate mode but that is deprecated.

With a VBO, the code might look like:
void init() {
  glGenVertexArrays(1,&vao);
  glBindVertexArray(vao);

  glGenBuffers(1,&vbo);
  glBindBuffer(GL_ARRAY_BUFFER,vbo);
  glBufferData(GL_ARRAY_BUFFER,sizeof(GLfloat)*
        vertices.size(),&vertices[0],GL_DYNAMIC_DRAW);
  // Describe how the data is received by the shader: attribute 0, 2 floats per vertex
  glVertexAttribPointer(0,2,GL_FLOAT,GL_FALSE,0,(void*)0);
  glEnableVertexAttribArray(0); // enable state is stored in the VAO, so set it once here
}
void render() {
  shader->setMat4("projection",*(parentCanvas->getProjection()));
  shader->setVec4("solidColor",style->getFgColor());
  glBindVertexArray(vao);
  // Draw call (assumed: vertices holds 2D points forming a triangle fan)
  glDrawArrays(GL_TRIANGLE_FAN,0,vertices.size()/2);
}

Consider for example, rendering a box with text on it, and highlighting a word. Rendering the box requires 4 points and a color. Rendering the words renders each letter using a texture. These require different shaders so they are separate operations. Highlighting a word with a different background color would require drawing a different rectangle.

I’m assuming the easiest and fastest way to do this would be to draw the entire rectangle in one color, then draw a second rectangle in the highlight color, so the highlighted area is drawn twice. Then text is drawn on top of that. We currently specify the color as a uniform in the shader.

If I scale that up to hundreds of text boxes on a screen, what is the least-cost way of doing this?
I don’t just mean optimum speed, but minimizing the complexity of the code.

It’s for this reason that I was thinking that for smaller interactive GUI objects, it would be easier to just have all the code in render, no initialization, no vertex buffer objects.

Asking this question makes me realize I don’t even know how a shader works at the lowest level. Suppose I select the following vertex shader:
#version 330
layout (location = 0) in vec3 aPos; // the position variable has attribute position 0

uniform mat4 projection;
uniform vec4 solidColor;

out vec4 ourColor; // output a color to the fragment shader

// TODO: add a uniform bool to choose between a per-vertex color and the solid color
void main() {
	gl_Position = projection*vec4(aPos, 1.0);
	ourColor = solidColor; // pass the uniform color through to the fragment shader
}

Can I set projection and color, draw a rectangle, then set color again and draw a different rectangle?
This is aside from the fact that it may be considered better to have a list of colors and that I shouldn’t do it.

Nobody’s stopping you from putting those in the same buffer. Or just every frame, writing all of your UI data to a buffer, whether it changes or not.

Well, if you’re using the core profile of OpenGL, there isn’t one: core requires vertex data to come from buffer objects.

And if you’re using compatibility GL, then it’s the same function you use with buffers (glVertexAttribPointer), just called without a buffer bound to GL_ARRAY_BUFFER. That’s why the function name ends in Pointer.

Well, unless you’re using separate attribute formats, in which case no, there’s no way to do it because reading from things that aren’t buffer objects is silly and no GPU actually supports it.

See, that’s the thing. When you render using client-side memory, you’re really just making the implementation copy your data into GPU-accessible memory and then having the GPU read from that. So it’d be better to just use a buffer object with appropriate streaming techniques.

If I just changed a color, then it would be easy to update the data and redraw. But if it’s text, and text is inserted, that means new points inserted into the buffer. That’s why a single buffer with all the points is unappealing.

Note that the vertices/triangles for the characters don’t have to be in any particular order. If you want to insert characters into the middle of a string, you can just append the new data to the end of the VBO/EBO.
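To make the "append to the end" point concrete, here is a rough sketch, assuming a fixed-advance font and one record per glyph quad. `GlyphQuad`, `TextMesh`, and their members are hypothetical names; a real mesh would store four vertices with positions and texture coordinates per glyph.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one glyph quad in the vertex buffer.
struct GlyphQuad {
    char ch;  // which character this quad draws
    float x;  // pen position, filled in by layout()
};

struct TextMesh {
    std::vector<GlyphQuad> quads;    // buffer order: oldest quad first
    std::vector<std::size_t> order;  // string order -> slot in quads

    // Insert a character at string position pos. The new quad goes at the
    // end of the buffer; only the small order table is edited in the middle.
    void insert(std::size_t pos, char ch) {
        quads.push_back({ch, 0.0f});
        order.insert(order.begin() + static_cast<std::ptrdiff_t>(pos),
                     quads.size() - 1);
    }

    // Recompute pen positions in string order (assumed fixed advance).
    void layout(float advance) {
        for (std::size_t i = 0; i < order.size(); ++i)
            quads[order[i]].x = advance * static_cast<float>(i);
    }

    // The string as the user sees it, independent of buffer order.
    std::string text() const {
        std::string s;
        for (std::size_t slot : order) s += quads[slot].ch;
        return s;
    }
};
```

Characters after the insertion point still need new positions, but those are overwritten in their existing slots; no vertex data has to be shifted to make room.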

In terms of performance, probably the main thing is to separate data which changes frequently (every frame or few frames) from data which is largely static. For rapidly-changing data, construct the buffers, render the data, and recycle the buffers each frame. Overwriting data in-place risks synchronisation.

For almost-static data, figure out if you can implement the dynamic aspects separately from the static aspects. E.g. for text which is static but with dynamic highlights, draw the background, draw any highlight rectangles, draw the text. Or if you’re changing the foreground, consider whether you can draw the highlighted text over the top of the normal text, or implement the highlight using blending or logic ops. If you need to briefly prevent something from being drawn, a stencil may be more efficient than having to rebuild an otherwise-static element array.

How does writing data into the buffer risk synchronization? The API is single threaded, the copying is happening while drawing is not happening, right?

Usually, drawing is always happening. Most OpenGL functions simply append a command to the queue and return immediately; the GPU executes the command at some point in the future.

If you modify a buffer which is a source of data for a pending command, the driver either has to make a copy of the existing data or (more likely) stall until any pending commands which use that data have completed. The only exception is if you map a buffer with GL_MAP_UNSYNCHRONIZED_BIT, in which case any pending commands may use either the old or new data (depending upon exactly when the GPU gets around to executing them).
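One common way to sidestep that stall is to split a large buffer into several slices and write to a different slice each frame, so the GPU can still read last frame's slice. The sketch below only models the bookkeeping on the CPU. `StreamRing` is a hypothetical helper, frames are assumed to be numbered from 1, and the `gpuDone` counter stands in for what real code would learn from a fence (glFenceSync / glClientWaitSync).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kRegions = 3;  // assumed: three slices (triple buffering)

// Hypothetical helper: hands out one slice of a large buffer per frame.
class StreamRing {
public:
    explicit StreamRing(std::size_t regionBytes)
        : regionBytes_(regionBytes), lastUsed_{} {}

    // Byte offset of the slice to write during this frame.
    std::size_t offsetFor(std::uint64_t frame) const {
        return static_cast<std::size_t>(frame % kRegions) * regionBytes_;
    }

    // True if the slice for `frame` can be overwritten without waiting,
    // given that the GPU has finished every frame up to `gpuDone`
    // (0 means no frame has finished yet).
    bool safeToWrite(std::uint64_t frame, std::uint64_t gpuDone) const {
        return lastUsed_[frame % kRegions] <= gpuDone;
    }

    // Record that draw commands for `frame` read from its slice.
    void markUsed(std::uint64_t frame) { lastUsed_[frame % kRegions] = frame; }

private:
    std::size_t regionBytes_;
    std::array<std::uint64_t, kRegions> lastUsed_;  // 0 = never used
};
```

When `safeToWrite` returns false, the application would have to wait on the fence for that slice, or allocate fresh storage, before writing.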

This is detailed in the link on buffer object streaming I gave you.

True.

Not necessarily, if max performance and/or dev time matters.

In practice you’ll find it a considerable challenge to meet, much less beat, the perf of some OpenGL drivers’ client-array implementations with buffer-object-based implementations. Why? Those client-array streaming implementations were coded by driver gurus who have complete access to the GPU specs, hardware, and low-level driver state, and can hand-tune them for best performance on their GPUs.

So by all means, give it a go! It’s definitely educational and fun to implement Buffer Object Streaming methods. However, bench what you end up with against client arrays (providing vertex data in client-side memory). You may very well find you’ve still got more tuning to do!

Ways to draw this kind of data, as OpenGL has evolved, include:

1.0: Immediate mode.
1.1: Client-side vertex arrays.
1.5: Vertex buffer objects.
2.0: Generic vertex attributes.
3.x: VAOs, buffer object streaming.
4.x: Separate attribute format, buffer storage.

The big change came with OpenGL 3.x core profiles, which require a VAO to be bound for all drawing, and data must be sourced from buffer objects. If you’re using a core profile, then you must use buffer objects and you should probably look at streaming and/or persistent mapping techniques for getting best performance.

Likewise, if you’re using VAOs at all, even in a compatibility profile, then you also must source your data from buffer objects.

So if you want to use either immediate mode or client-side arrays, then you will be using deprecated functionality.

Note that in all of these cases I haven’t mentioned shaders; it’s 100% possible and legal in compatibility-profile OpenGL to use buffer objects without shaders, or shaders without buffer objects. This should be obvious (from the fact that buffer objects were in GL1.5 but shaders in 2.0, if nothing else), but you still sometimes see people getting mixed up on this score. (Likewise using buffer objects doesn’t require VAOs either, and that should also be obvious for the same reason.)

Basic steps to source vertex data from client arrays include:

  • Bind VAO 0 - you’ll often see this as “unbinding” in tutorials, but it’s actually not - what it actually does is revert OpenGL behaviour to pre-VAO OpenGL, where vertex attribute state is global to the context rather than contained in a container object.
  • Bind 0 to GL_ARRAY_BUFFER - likewise, this is not “unbinding”, but instead tells OpenGL to source vertex data from client memory (i.e. system memory pointers).
  • Use system memory pointers for your gl*Pointer calls instead of those weird-looking offsets-from-zero.
  • Draw stuff.

For immediate mode, this is slow:

	for (int i = 0; i < count; i++)
	{
		glBegin (GL_QUADS);
		glVertex2f (..);
		glVertex2f (..);
		glVertex2f (..);
		glVertex2f (..);
		glEnd ();
	}

Whereas this is fast:

	glBegin (GL_QUADS);

	for (int i = 0; i < count; i++)
	{
		glVertex2f (..);
		glVertex2f (..);
		glVertex2f (..);
		glVertex2f (..);
	}

	glEnd ();

You probably should also bind VAO 0 and GL_ARRAY_BUFFER 0 if using immediate mode; it’s not (I believe) strictly required, but it does make for cleaner code and cleaner state.

The take-home from all of this is that compatibility-profile OpenGL has quite a rich set of drawing commands, and you can freely choose among them to suit your own requirement. You do need to exercise a little care about how you use them, and while core-profile purists may be dismissive of them (and may home in on them as causes of issues in any code that does use them), they are nonetheless useful. They do have a reputation for being slow, but they’re not that slow, particularly if used carefully and properly.

Core-profile OpenGL, of course, offers a single code-path for everything that drivers can optimize around, and is more likely to be forward-compatible with newer features and extensions.

Pretty sure this isn’t correct. In a compatibility profile, you just bind VAO handle 0 (which activates the default vertex array) and queue your client arrays batch.

However, I believe it would be correct to say that you can’t build a VAO for a client arrays batch.

I’m probably just not being sufficiently clear, but my intent is essentially what you said: a non-zero VAO requires buffer-sourced vertex data.

VAOs predate VBOs, so they certainly can’t require them.

Now that’s definitely wrong, because VBOs were introduced in the GL_ARB_vertex_buffer_object extension and OpenGL 1.5, which date to 2003, whereas GL_ARB_vertex_array_object dates to 2008.

Last time I looked 2003 was earlier than 2008, so no - VBOs are the earlier functionality and VAOs can certainly require them.

ARB_vertex_array_object references APPLE_vertex_array_object, the most recent revision of which is dated 2002.

According to ARB_vertex_array_object , glBindVertexArray works with both ARB and APPLE VAOs on implementations which support both, and the APPLE flavour only supports client-side vertex arrays (both fixed-function and generic).

Interesting, but APPLE_vertex_array_object does not capture the state for generic vertex attributes. So you couldn’t use them with VBOs; not without some change to VBO to include themselves into APPLE_VAO’s tables.

But this is all academic anyway; it is wrong to say that VAOs can’t be used with client-side arrays.

ARB_VAO, OpenGL 3.0, and all compatibility GL profiles specifically include client-side pointer data in their VAO state. And not just the built-in ones. Among the state included in the state table entitled “Vertex Array Object State” is:

  • GL_VERTEX_ARRAY: the enable/disable state set by glEnable/DisableClientState(GL_VERTEX_ARRAY).

  • GL_VERTEX_ARRAY_POINTER: The “pointer” set by calls to glVertexPointer.

  • GL_VERTEX_ARRAY_BUFFER_BINDING: The value set when calling glVertexPointer, specifying which buffer object was bound to GL_ARRAY_BUFFER at the time that call was made.

That is 100% of the information needed to tell whether the gl_Vertex attribute comes from a buffer object or a client-side pointer. And all of it is part of the VAO.

Similar state is stored for the generic and non-generic attributes too.

Sweet. Thanks for the correction. I’d forgotten that detail.

(too many years since using VAOs, since NVIDIA bindless perf beats VAOs for VBO batches, and less point in trying with client arrays.)

https://www.khronos.org/registry/OpenGL/extensions/ARB/ARB_vertex_array_object.txt

This extension differs from GL_APPLE_vertex_array_object in that client memory cannot be accessed through a non-zero vertex array object…

An INVALID_OPERATION error is generated if any of the *Pointer commands specifying the location and organization of vertex data are called while a non-zero vertex array object is bound, zero is bound to the ARRAY_BUFFER buffer object, and the pointer is not NULL… This error makes it impossible to create a vertex array object containing client array pointers.

https://www.khronos.org/registry/OpenGL/specs/gl/glspec46.compatibility.pdf

An INVALID_OPERATION error is generated if a non-zero vertex array object is bound, no buffer is bound to ARRAY_BUFFER, and pointer is not NULL.

One could get pedantic and say that VAO 0 can contain client memory pointers, but the intent of this specification is clear.

(If you really wanted to be pedantic, you could say that this is an artificial restriction in the GL spec, that the state data to allow client arrays in VAOs is not prevented by any of the above, and then go to town digging your heels in on semantics of that point.)
