Hi, I’m implementing skinning on the GPU and I’m getting strange performance results. The bottleneck of my application is the vertex shader, so I switched from GL_TRIANGLES embedded in display lists, which sends three vertices per triangle, to indexed geometry. However, the glDrawElements-based implementation runs at about the same speed as the display lists, even though it needs to transform about 6 times fewer vertices. The bottleneck is still the vertex shader: if I use a simpler vertex shader, I get a significant speedup. Can anyone explain this, or suggest how to do hardware skinning in OpenGL efficiently? Thanks a lot!
You are not limited by vertex bandwidth, so you get the same performance from an approach that uses vertex bandwidth well (glDrawElements) as from one that doesn’t. You are limited by vertex throughput, which is caused by your heavy vertex shader.
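To make the bandwidth-versus-throughput distinction concrete, here is a rough sketch of a two-stage bottleneck model: the frame takes as long as the slower of "moving vertex data" and "running the vertex shader". Every number below (vertex counts, vertex size, bandwidth, shader rate) is made up for illustration and does not come from the original poster's setup. The sketch also assumes the vertex shader runs the same number of times in both cases, which would happen if already-transformed vertices were not being reused.

```python
def frame_time_ms(bytes_submitted, shader_invocations,
                  bandwidth_bytes_per_s, invocations_per_s):
    """Frame time under a simple 'slowest stage wins' model."""
    transfer_s = bytes_submitted / bandwidth_bytes_per_s
    shading_s = shader_invocations / invocations_per_s
    return 1000.0 * max(transfer_s, shading_s)


# Hypothetical hardware figures, chosen only for illustration.
BANDWIDTH = 2e9        # bytes of vertex data per second
SHADER_RATE = 20e6     # heavy vertex shader invocations per second

VERTEX_BYTES = 32      # hypothetical size of one vertex
TRIANGLES = 200_000    # hypothetical mesh size

# Non-indexed GL_TRIANGLES: three full vertices per triangle.
plain_bytes = 3 * TRIANGLES * VERTEX_BYTES

# Indexed: about 6x fewer vertices, plus one 16-bit index per corner.
indexed_bytes = (3 * TRIANGLES // 6) * VERTEX_BYTES + 3 * TRIANGLES * 2

# Assumption: the shader still runs once per triangle corner in both cases.
invocations = 3 * TRIANGLES

plain_ms = frame_time_ms(plain_bytes, invocations, BANDWIDTH, SHADER_RATE)
indexed_ms = frame_time_ms(indexed_bytes, invocations, BANDWIDTH, SHADER_RATE)

print(f"non-indexed: {plain_ms:.1f} ms, indexed: {indexed_ms:.1f} ms")
```

In this model the indexed version moves far less data, but the frame time does not change because the shading term dominates in both cases. Only lowering the per-vertex cost or the number of shader invocations would move it.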
As for optimizing, I can’t comment on that since I don’t know your implementation.
Heh, speaking of limited throughput: yesterday I was going crazy trying to work out why my engine ran at ~10 fps on my GF6600GT while rendering ~200 polys with a heavy shader (drawn 50 times so I could simulate a more complex map). Then I realized I was in fact limited by pixel shader throughput, because I was redrawing the entire screen at a resolution of 800x600 (~0.5 Mpixels) with a shader that has a max throughput of 266 Mpixels/s. DON’T do such a stupid thing.
P.S. I know, it’s a little bit OT.
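The fill-rate numbers in the post above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes each of the 50 passes shades every pixel of the 800x600 screen exactly once, and that the shader actually reaches the quoted peak of 266 Mpixels/s. The function name is made up for illustration.

```python
def max_fps(width, height, passes, peak_pixels_per_s):
    """Upper bound on frame rate when every pass shades the full screen."""
    pixels_per_frame = width * height * passes
    return peak_pixels_per_s / pixels_per_frame


# Figures taken from the post; full-screen coverage per pass is an assumption.
fps = max_fps(width=800, height=600, passes=50, peak_pixels_per_s=266e6)
print(f"pixels per frame: {800 * 600 * 50}")
print(f"upper bound: {fps:.1f} fps")
```

Under those assumptions the upper bound comes out at roughly 11 fps, which is in line with the ~10 fps reported, so pixel shader throughput alone is enough to explain the slowdown.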