Just out of curiosity, would it be worthwhile to precompute everything before you send it to the renderer, rather than just building the modelview and projection matrices, passing them to GL, and letting it do all the work on all of the vertices?
Imagine this: with the pipeline you suggest, you create the modelview matrix (M), the projection matrix (P) and 30,000 vertices (v[1] to v[30000]). So the rendering goes roughly like: for(i=1;i<=30000;i++) M*v[i]; for(i=1;i<=30000;i++) P*v[i]; That's 60,000 matrix-vector multiplications, all done in GL.
What if you created a single transformation for the entire model (T) that described both the modelview and projection transformations? Then your rendering would be: for(i=1;i<=30000;i++) T*v[i]; That's 30,000 multiplications. Now if you wrote your own superfast matrix multiplication routine in asm or whatever, the calculations would all be done in less than half the time.
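To make the idea concrete, here is a rough C sketch of the two approaches. It is only an illustration under my own assumptions: Mat4, Vec4, mat4_mul, mat4_transform, transform_two_pass and transform_combined are made-up names, not GL calls, and I'm assuming column-major matrices with T built as P*M so that T*v gives the same result as P*(M*v).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; none of this is part of GL.
   Matrices are column-major: element (row, col) lives at m[col*4 + row]. */
typedef struct { float m[16]; } Mat4;
typedef struct { float x, y, z, w; } Vec4;

/* Returns a * b, so that (a*b)*v equals a*(b*v). */
static Mat4 mat4_mul(const Mat4 *a, const Mat4 *b)
{
    Mat4 out;
    for (int col = 0; col < 4; col++) {
        for (int row = 0; row < 4; row++) {
            float sum = 0.0f;
            for (int k = 0; k < 4; k++)
                sum += a->m[k * 4 + row] * b->m[col * 4 + k];
            out.m[col * 4 + row] = sum;
        }
    }
    return out;
}

/* Returns t * v for a single homogeneous vertex. */
static Vec4 mat4_transform(const Mat4 *t, Vec4 v)
{
    Vec4 r;
    r.x = t->m[0] * v.x + t->m[4] * v.y + t->m[8]  * v.z + t->m[12] * v.w;
    r.y = t->m[1] * v.x + t->m[5] * v.y + t->m[9]  * v.z + t->m[13] * v.w;
    r.z = t->m[2] * v.x + t->m[6] * v.y + t->m[10] * v.z + t->m[14] * v.w;
    r.w = t->m[3] * v.x + t->m[7] * v.y + t->m[11] * v.z + t->m[15] * v.w;
    return r;
}

/* Two passes: M on every vertex, then P on every result (2n products). */
static void transform_two_pass(const Mat4 *M, const Mat4 *P,
                               const Vec4 *in, Vec4 *out, size_t n)
{
    assert(M && P && in && out);
    for (size_t i = 0; i < n; i++) out[i] = mat4_transform(M, in[i]);
    for (size_t i = 0; i < n; i++) out[i] = mat4_transform(P, out[i]);
}

/* One pass: build T = P*M once, then apply T to every vertex
   (n matrix-vector products plus a single matrix-matrix product). */
static void transform_combined(const Mat4 *M, const Mat4 *P,
                               const Vec4 *in, Vec4 *out, size_t n)
{
    assert(M && P && in && out);
    Mat4 T = mat4_mul(P, M);
    for (size_t i = 0; i < n; i++) out[i] = mat4_transform(&T, in[i]);
}
```

Both functions should fill out[] with the same values for the same M, P and input vertices; the combined version just pays for one extra 4x4 multiply up front instead of a second pass over all the vertices.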
Is there a reason not to do this? Or am I missing something? (I've never had to deal with an API to make 3D graphics before, so I probably just need a paradigm shift.)