I have been experimenting with giving all my various objects their own dedicated shaders, each compiled/linked separately, each with its own unique program ID. Then when I render a scene with multiple objects, I call my Render() method for each object, which sets its own program ID to be the active program ID using glUseProgram(), then it renders itself. I have a couple of basic questions regarding the general scope of different program IDs.
Does each shader program ID have its own set of textures? That is, is GL_TEXTURE0 different for each shader program ID?
If I compile/link several different shader programs, each with its own primitives to render, each with its own textures, is the texture space separate and distinct from each other because they are different program IDs?
Is there a maximum number of program IDs I can use in a given OpenGL context?
Is there a performance hit for using many program IDs and then just changing the active program ID for rendering the various objects in a scene?
No. Texture units (GL_TEXTURE0, GL_TEXTURE1, etc.) are context state, shared by every program; binding a texture to a unit is visible to whichever program is active when you draw. What each program does have is its own set of default-block uniforms, so associating sampler variables with texture units via glUniform1i() affects a specific program (the one bound at the point that glUniform1i() is called).
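A minimal sketch of that distinction, assuming two hypothetical programs that each declare a sampler uniform named "diffuseMap" (the names and unit numbers are illustrative):

```c
#include <GL/gl.h>

/* Texture units are context state: binding a texture to GL_TEXTURE0
   affects whichever program is bound at draw time.  The sampler->unit
   association set by glUniform1i(), however, is stored per program. */
void setup_samplers(GLuint progA, GLuint progB)
{
    /* glUniform1i() modifies only the currently bound program */
    glUseProgram(progA);
    glUniform1i(glGetUniformLocation(progA, "diffuseMap"), 0); /* unit 0 */

    glUseProgram(progB);
    glUniform1i(glGetUniformLocation(progB, "diffuseMap"), 1); /* unit 1 */
}
```

So the two programs can sample from different units, but a texture bound to GL_TEXTURE0 is the same texture for both of them.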
Realistically, no. It’s only limited by available memory.
Yes. Multiple draw calls with significant state changes between them is slower than multiple draw calls with similar state or a single draw call. Changing the program is about the most significant state change possible.
If you’re changing programs because e.g. one object has a diffuse map while another has a constant colour, it may be faster to use the same program and use a 1x1 diffuse map for the latter.
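The 1x1 diffuse map trick can be sketched as follows; since sampling it always returns (1, 1, 1, 1), a shader that computes "texture colour x material colour" degenerates to the constant colour, so textured and untextured objects can share one program:

```c
#include <GL/gl.h>

/* A 1x1 opaque-white texture: sampling it returns (1,1,1,1), so
   multiplying by it is a no-op and untextured objects can reuse
   the textured-object shader. */
GLuint make_white_texture(void)
{
    static const unsigned char white[4] = { 255, 255, 255, 255 };
    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 1, 1, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, white);
    /* No mipmaps for a 1x1 texture; nearest filtering is sufficient */
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    return tex;
}
```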
In general, the rendering process should be structured around what’s convenient for OpenGL rather than trying to coerce it into a “textbook OOP” structure. Don’t split rendering of objects into separate draw calls simply because they’re separate objects. Don’t split objects’ vertex data into separate VBOs simply because they’re separate objects. If it’s feasible (from OpenGL’s perspective) to render most of the scene in a single draw call, then do so; don’t split up rendering to fit some pre-conceived object structure. And don’t try to maximise flexibility without realising that it will have a performance cost.
That is exactly what I have done, but I see the wisdom in your counsel. When I started learning OpenGL several months ago, I used single shaders and one program ID. But over time I was trying out new things and my shader code kept growing, and it quickly became a mess. I then did an OOP design and put the shader compile/link, vertex attributes, etc., in a base class so my concrete classes implement various objects. My application's run loop just calls the Render() method for each object. Each object uses its own program, binds its own VAO, then calls glDrawElements(). This is only a platform for my experimenting and learning OpenGL, so I am only dealing with fewer than 10 objects at a time.
But how do large complex OpenGL applications deal with this? Do these apps really use a single draw call to render hundreds or thousands of objects? Are there any design patterns for OpenGL applications that manage hundreds/thousands of objects?
That’s certainly conceivable, although it’s not necessary to get the entire scene down to the absolute minimum number of draw calls. Certainly, they wouldn’t use one draw call per object for thousands of objects.
In short, you determine up front what types of objects will make up the scene, and which types are similar enough that they could share a draw call. Beyond that, it depends upon various factors. One of which is how static or dynamic the set of objects is. If the scene has a relatively fixed set of objects which is small enough to fit into video memory, you’d create and populate the various buffers during initialisation. You won’t necessarily draw everything every frame, but glMultiDrawElements() makes it fairly straightforward to draw an arbitrary subset of the objects. Each object might store a reference to the VAO containing the object’s data and the start/end indices which correspond to that particular object. The only data which needs to change each frame is the parameters to the draw calls and any uniforms. Rendering is then just an issue of determining which regions of each VAO to draw, and drawing them.
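The per-frame work then reduces to building the parallel arrays that glMultiDrawElements() takes: one index count and one byte offset per visible object. A sketch, assuming a hypothetical per-object record of where its indices live in the shared element buffer:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-object record: where this object's indices sit
   inside the shared element buffer (units: indices, not bytes). */
typedef struct {
    int firstIndex;   /* offset into the element buffer, in indices */
    int indexCount;   /* number of indices for this object */
} ObjectRange;

/* Build the parallel arrays glMultiDrawElements() expects:
   a count and a byte offset for each object to be drawn. */
void build_draw_params(const ObjectRange *objs, int n,
                       int *counts, const void **offsets)
{
    for (int i = 0; i < n; ++i) {
        counts[i]  = objs[i].indexCount;
        /* With GL_UNSIGNED_INT indices, the byte offset is
           firstIndex * sizeof(uint32_t) */
        offsets[i] = (const void *)(uintptr_t)(objs[i].firstIndex
                                               * sizeof(uint32_t));
    }
}

/* At render time (GL calls shown for context only):
   glBindVertexArray(vao);
   glMultiDrawElements(GL_TRIANGLES, counts, GL_UNSIGNED_INT,
                       offsets, drawCount);                        */
```

Culled or hidden objects are simply left out of the arrays; the buffers themselves never change.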
If you have objects which are dynamically created by the game logic, you’d need to allow for that. Either by allocating arrays capable of holding the worst-case number of objects, or by allocating an array for each “chunk” of objects, where the appropriate chunk size is determined empirically. Above a certain size, one large chunk won’t be noticeably faster than two smaller chunks, so it’s a balance between draw call overhead and efficient use of memory.
Probably the biggest issue is figuring out how to handle per-object data (e.g. transformations). You can’t change uniforms in the middle of a draw call, but you can e.g. add an integer vertex attribute holding an object ID which is used as an index into a uniform array (or an array stored in a UBO or SSBO or texture). If you have many objects with identical topology, you can use instanced rendering.
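The object-ID attribute approach might look like this; the attribute location and array size are assumptions for the sketch:

```c
#include <GL/gl.h>

/* Sketch: an integer "objectID" vertex attribute (location 3 is an
   arbitrary choice here) indexing a transform array in a UBO.  Every
   vertex of an object carries that object's ID, so a single draw
   call can apply a different matrix to each object. */
void setup_object_id_attribute(GLuint vbo)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    /* Note glVertexAttribIPointer (capital I): the attribute stays
       an integer in the shader rather than being converted to float. */
    glVertexAttribIPointer(3, 1, GL_UNSIGNED_INT, 0, (const void *)0);
    glEnableVertexAttribArray(3);
}

/* Matching vertex shader excerpt (GLSL):

   layout(location = 3) in uint objectID;
   layout(std140) uniform Transforms { mat4 model[256]; };
   ...
   gl_Position = projection * view * model[objectID] * vec4(pos, 1.0);
*/
```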
For textures, you can have an array of sampler variables, but you’re limited by the number of texture units. Array textures allow you to effectively store multiple textures in one, using a single texture unit, but all layers have the same format, dimensions, and sampling parameters (filters and wrap mode). This is another area where you need to choose between flexibility and performance.
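An array texture can be set up roughly as follows; this assumes GL 4.2+ for glTexStorage3D (older contexts can substitute glTexImage3D), and all layers must share the stated format and dimensions:

```c
#include <GL/gl.h>

/* Sketch: store `layers` same-sized RGBA8 images in one 2D array
   texture occupying a single texture unit. */
GLuint upload_array_texture(int width, int height, int layers,
                            const unsigned char *const *pixels)
{
    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D_ARRAY, tex);
    /* Immutable storage: 1 mip level, RGBA8, shared by every layer */
    glTexStorage3D(GL_TEXTURE_2D_ARRAY, 1, GL_RGBA8,
                   width, height, layers);
    for (int layer = 0; layer < layers; ++layer)
        glTexSubImage3D(GL_TEXTURE_2D_ARRAY, 0, 0, 0, layer,
                        width, height, 1,
                        GL_RGBA, GL_UNSIGNED_BYTE, pixels[layer]);
    return tex;
}

/* In GLSL the layer is selected by the third texture coordinate:
   uniform sampler2DArray tex;
   vec4 c = texture(tex, vec3(uv, float(layerIndex)));              */
```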