I’m looking for resources on optimizing code in C++. Not so much in the algorithm department, but more like being cache and branch prediction friendly. Can anyone point me to any websites or books that they have found very useful or have helped them out?
I know that this is not an OpenGL-related question; however, a lot of the people I have been interacting with here are very knowledgeable on the subject.
A few pointers:
Scott Meyers’ “More Effective C++” is an excellent book with a great chapter on efficiency.
For the ultimate in object-oriented performance programming, learn how to do template metaprogramming. http://www.oonumerics.org/blitz/ is a good place to start. The Blitz++ author’s space-filling-curve approach to stepping through large arrays is an innovative way to improve cache hits.
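To make the space-filling-curve idea a bit more concrete, here is a minimal sketch of a Z-order (Morton) index. This is an assumption chosen for illustration: it is one of the simplest space-filling curves and is not necessarily the curve Blitz++ itself uses. The names `spread_bits` and `morton_index` are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Spread the low 16 bits of v so that bit i moves to bit 2*i
// (a zero bit is inserted between every pair of original bits).
inline std::uint32_t spread_bits(std::uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Hypothetical helper: interleave the bits of x and y (each < 65536)
// into a single Z-order index. x occupies the even bits, y the odd bits.
inline std::uint32_t morton_index(std::uint32_t x, std::uint32_t y) {
    assert(x < 65536u && y < 65536u);
    return spread_bits(x) | (spread_bits(y) << 1);
}
```

If a 2D array is stored so that element (x, y) lives at `morton_index(x, y)`, elements that are close together in 2D tend to be close together in memory as well, which is where the cache benefit would come from. Whether it actually helps depends on the access pattern, so measure before committing to it.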
Kai C++ is a highly recommended optimizing compiler, from everything I’ve heard. It does a great job of reducing the overhead of C++.
[This message has been edited by Zeno (edited 02-05-2002).]
Just a hint:
Only optimize your innermost loops; nowadays compilers are able to produce quite well-optimized code.
(Try some performance testing of the FPU’s SQRT against a SQRT generated by a C++ compiler and you’ll see the difference; the compiler usually wins.)
Also, optimize your “high frequency call” functions in ASM, like math, collisions, and sorting.
just my 2-euro cents.
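For anyone who wants to try the kind of timing comparison suggested in this post, a rough harness might look like the sketch below. It only times the compiler-generated `std::sqrt` path; a hand-written variant would be passed to the same timer. `sum_sqrt` and `time_ms` are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <vector>

// The "inner loop" under test: sums the square roots of the inputs.
inline double sum_sqrt(const std::vector<double>& values) {
    double total = 0.0;
    for (double v : values) {
        assert(v >= 0.0);  // sqrt of a negative value would give NaN
        total += std::sqrt(v);
    }
    return total;
}

// Hypothetical helper: runs f() once and returns the elapsed
// wall-clock time in milliseconds.
template <typename F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A call such as `time_ms([&] { result = sum_sqrt(data); })` gives one measurement. Make sure `result` is actually used afterwards (printed, for example), otherwise the optimizer may remove the loop entirely, and repeat the run several times on a large input, since a single short measurement is mostly noise.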
rIO: I found that using inline assembler can cause slower code. (I spent a whole day porting floating-point instructions to AMD 3DNow!, and the code performed worse than the original.)
I think I’ve read that the optimizer can’t optimize code well if it is mixed with inline assembler instructions (which sounds logical). Does anyone have experiences or hints concerning good inline assembler usage?
Thanks to jwatte,
Who pointed me to Intel’s developer website. This was what I was looking for, cache and branch prediction friendly info and not regurgitated theory on optimization. Thanks man.
I also wrote back to let anyone who doesn’t already know in on Intel’s monthly articles about game optimization. Here is the link to one of the articles:
I also downloaded the 300+ page ‘optimizing the Pentium processor’ spec; it was a lot more informative.
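For readers who land on this thread wondering what “cache and branch prediction friendly” looks like in code, here is a small sketch. It is not taken from the Intel material; the `Grid` type and all function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major grid in one contiguous buffer:
// element (r, c) is stored at data[r * cols + c].
struct Grid {
    std::size_t rows;
    std::size_t cols;
    std::vector<int> data;
};

// Cache friendly: the inner loop walks consecutive addresses.
inline long long sum_row_major(const Grid& g) {
    assert(g.data.size() == g.rows * g.cols);
    long long total = 0;
    for (std::size_t r = 0; r < g.rows; ++r)
        for (std::size_t c = 0; c < g.cols; ++c)
            total += g.data[r * g.cols + c];
    return total;
}

// Same result, but the inner loop jumps by `cols` elements each step,
// which tends to touch a new cache line per access on large grids.
inline long long sum_col_major(const Grid& g) {
    assert(g.data.size() == g.rows * g.cols);
    long long total = 0;
    for (std::size_t c = 0; c < g.cols; ++c)
        for (std::size_t r = 0; r < g.rows; ++r)
            total += g.data[r * g.cols + c];
    return total;
}

// Branchy version: a data-dependent `if` inside the loop.
inline std::size_t count_at_least_branchy(const std::vector<int>& v, int threshold) {
    std::size_t n = 0;
    for (int x : v) {
        if (x >= threshold) ++n;
    }
    return n;
}

// Branch-light version: the comparison result (0 or 1) is added directly.
inline std::size_t count_at_least_branchless(const std::vector<int>& v, int threshold) {
    std::size_t n = 0;
    for (int x : v) {
        n += static_cast<std::size_t>(x >= threshold);
    }
    return n;
}
```

Both pairs compute the same answers; any difference is in speed, and only shows up on large, unpredictable inputs. A modern optimizer may well turn the branchy loop into the branchless one on its own, so treat this as something to measure, not a guaranteed win.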