When to and when not to use Instancing

When does instanced rendering actually pay off compared to regular indexed rendering? Does it depend on the size of the mesh (vertex count and additional per-instance data such as textures, materials, and matrices) or on the number of instances?

I am increasingly getting the impression that I have some misconceptions about instancing.

I originally wanted to construct every model in my engine from multiple instances of the same mesh. I'd use the joint matrices of a skeleton to render, for example, multiple cubes that form a body.
But now I've decided that I would like a bit more flexibility. Each joint of a skeleton will be the reference for only a few vertices (usually 4), and using the indices I want to connect 3 vertices into a face, each of which might use a different matrix (the matrix of its reference joint in the model skeleton). This way models will be smoother, and it allows far more natural animations.
However, to use this technique, I would have to store a whole body model in one piece to be able to index the vertices of other joints. But will instancing still be worth it then? If one human uses only one mesh and ~17 animation matrices, and I draw this human, say, 50 times in a scene, I would have to send 50 * 17 * 16 = 13600 floats (!) to the pipeline (put all matrices into a buffer and bind it when calling glDrawElementsInstanced) to render 50 differently animated humans with one draw call.
Whereas if I did the matrix animation calculations on the CPU and then sent only a single indexed mesh with just one matrix to the shader pipeline, I would have to issue 50 draw calls for 50 humans, but every package would be a lot smaller. What would be the best way to do it? Is there something I misunderstood?

Instancing lets you generate m*n vertices with only O(m+n) elements in your attribute arrays. The biggest saving is when both m and n are large.
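To put rough numbers on the m*n versus m+n point, here is a minimal sketch. The helper names are made up for illustration, and it counts array elements only (one per vertex or per instance), ignoring how large each element is.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration; not part of any OpenGL API.

// Elements needed if every instance's vertices are written out in full.
constexpr std::size_t expandedElements(std::size_t verticesPerMesh,
                                       std::size_t instances) {
    return verticesPerMesh * instances;
}

// Elements needed with instancing: one copy of the mesh (m entries)
// plus one per-instance record (n entries).
constexpr std::size_t instancedElements(std::size_t verticesPerMesh,
                                        std::size_t instances) {
    return verticesPerMesh + instances;
}

// Example figures (assumed): a 1000-vertex mesh drawn 50 times.
static_assert(expandedElements(1000, 50) == 50000, "m * n");
static_assert(instancedElements(1000, 50) == 1050, "m + n");
```

With a small mesh or only a handful of instances the two numbers are close, which is why the saving only becomes significant when both are large.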

If you don’t need instancing, you can avoid the overhead of multiple draw calls by using glMultiDrawElements().
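As a rough sketch of what that involves: glMultiDrawElements takes two parallel arrays, one of index counts and one of byte offsets into the bound element buffer. The code below only builds those arrays. MeshRange, MultiDrawArgs and buildMultiDrawArgs are hypothetical names, and it assumes all meshes share one index buffer of 32-bit indices (GL_UNSIGNED_INT).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one mesh packed into a shared index buffer.
struct MeshRange {
    std::size_t firstIndex;  // position of the mesh's first index in the buffer
    std::size_t indexCount;  // number of indices the mesh uses
};

// The two parallel arrays that glMultiDrawElements expects.
struct MultiDrawArgs {
    std::vector<int> counts;           // `count` array (GLsizei is a 32-bit int)
    std::vector<const void*> offsets;  // `indices` array: byte offsets into the buffer
};

MultiDrawArgs buildMultiDrawArgs(const std::vector<MeshRange>& meshes) {
    MultiDrawArgs args;
    for (const MeshRange& m : meshes) {
        args.counts.push_back(static_cast<int>(m.indexCount));
        // Assumption: 4-byte indices, so byte offset = firstIndex * 4.
        const std::uintptr_t byteOffset = m.firstIndex * sizeof(std::uint32_t);
        args.offsets.push_back(reinterpret_cast<const void*>(byteOffset));
    }
    return args;
}

// With a context current and the vertex array and element buffer bound,
// the draw itself would then be a single call along these lines:
//   glMultiDrawElements(GL_TRIANGLES, args.counts.data(), GL_UNSIGNED_INT,
//                       args.offsets.data(),
//                       static_cast<GLsizei>(args.counts.size()));
```

Note that state such as uniforms cannot change between the sub-draws of one such call, so any per-object data still has to reach the shader some other way.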

In your example, the difference between instanced and non-instanced rendering is that without instancing you’d have to send 50 sets of vertex coordinates to the GPU, whereas instancing lets you use one set 50 times. I’d assume that you won’t be able to send the animation matrices as instanced attributes because you’d exceed the GL_MAX_VERTEX_ATTRIBS limit (which is only required to be at least 16): each 4x4 matrix occupies four attribute locations, so 17 matrices would need 68.
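The attribute budget can be checked with a little arithmetic. This is only a sketch with made-up helper names. It relies on two facts: a mat4 vertex attribute takes four consecutive locations, and 16 is the minimum the spec guarantees for GL_MAX_VERTEX_ATTRIBS. The real limit on a given driver should be queried with glGetIntegerv(GL_MAX_VERTEX_ATTRIBS, ...).

```cpp
// Illustrative constants and helper names; not part of the OpenGL API.
constexpr int kLocationsPerMat4 = 4;              // one vec4 column per location
constexpr int kMinRequiredMaxVertexAttribs = 16;  // guaranteed minimum, not the actual limit

// Attribute locations consumed by `matrixCount` mat4 attributes.
constexpr int locationsForMatrices(int matrixCount) {
    return matrixCount * kLocationsPerMat4;
}

// True if the matrices plus the other attributes (position, normal, ...)
// fit within the guaranteed minimum number of attribute locations.
constexpr bool fitsInGuaranteedLimit(int matrixCount, int otherAttributeLocations) {
    return locationsForMatrices(matrixCount) + otherAttributeLocations
           <= kMinRequiredMaxVertexAttribs;
}

// 17 joint matrices need 68 locations, far beyond the guaranteed 16.
static_assert(locationsForMatrices(17) == 68, "17 mat4 attributes use 68 locations");
static_assert(!fitsInGuaranteedLimit(17, 1), "17 matrices do not fit");
// A single per-instance matrix plus a few ordinary attributes does fit.
static_assert(fitsInGuaranteedLimit(1, 3), "one matrix fits");
```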