Using MMX functions: how to?

Does anybody know how to call MMX functions from a high-level language? I need a way to multiply matrices fast. Is there some kind of DLL that passes function calls through to the MMX hardware?

thanks,

Edo

Intel has something called the Math Kernel Library (MKL), which contains various matrix and vector functions (a subset of BLAS and LAPACK, if you’ve heard of them) that have been optimized for Intel processors. There is a bit of a learning curve with it, but I have used the MKL in some research apps with good success. You can download the MKL here:
http://support.intel.com/support/performancetools/libraries/mkl/index.htm

I also recently noticed the Intel Small Matrix Library (SML), which only handles small matrices (so it doesn’t help me), but it’s a C++ implementation, so it might be a lot easier to use. It’s here:
http://developer.intel.com/vtune/compilers/cpp/matrix_lib.htm

Hehe, you might as well also look into using 3DNow! to do such things (if it is available). Check out AMD’s 3DNow! math and utility library and code samples here http://www.amd.com/devconn/3dsdk/downloads/library.zip

If I use these libraries, then I cannot do my rendering with OpenGL, or at least will have difficulty doing so.
Am I right?

If I use these libraries, then I have to write all the transformations myself, am I right?
Can these libraries be used together with OpenGL?

The video card’s OpenGL driver is likely already optimized for MMX, 3DNow(2)!, and SSE. But many 3D apps need to do additional math for culling, collisions, shadow volumes, and so on. It is these types of things that are prime candidates for those libraries above. In certain situations you may have to do your own object-to-world and world-to-view transforms, particularly when needing to depth sort transparent polygons. These libraries would also be helpful in those situations (for finding the depth of the polygons, not the sorting itself, of course).
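To make the depth-sorting case concrete, here is a minimal scalar sketch in C of finding a polygon's depth before sorting. The names Vec3, Mat4, eye_z and polygon_depth are made up for illustration and do not come from any of the libraries above. It assumes an OpenGL-style column-major modelview matrix, points with w = 1, and uses the centroid as the depth key, which is only one of several possible choices.

```c
#include <stddef.h>

/* Hypothetical types for illustration only; not part of MKL, SML or the 3DNow! SDK. */
typedef struct { float x, y, z; } Vec3;

/* Column-major 4x4 matrix (OpenGL layout): element (row r, col c) is m[c * 4 + r]. */
typedef struct { float m[16]; } Mat4;

/* Eye-space z of an object-space point under the modelview matrix (w assumed to be 1). */
static float eye_z(const Mat4 *mv, Vec3 p)
{
    return mv->m[2] * p.x + mv->m[6] * p.y + mv->m[10] * p.z + mv->m[14];
}

/* Depth key for a polygon: the average eye-space z of its vertices.
 * With the usual OpenGL convention the camera looks down -z in eye space,
 * so a more negative value means farther away. Returns 0 for an empty polygon. */
static float polygon_depth(const Mat4 *mv, const Vec3 *verts, size_t count)
{
    float sum = 0.0f;
    for (size_t i = 0; i < count; ++i)
        sum += eye_z(mv, verts[i]);
    return count ? sum / (float)count : 0.0f;
}
```

You would compute polygon_depth for each transparent polygon, sort on that key (farthest first), and then hand the polygons to OpenGL in that order. The inner multiply-add in eye_z is the part a SIMD library would speed up.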

Does it speed up your program significantly if you use the MMX, SSE, 3DNow! etc. extensions?

It can, but just how significant depends on where your app’s load is. That is, if your app is heavy on the rendering side (fill rate limited), then you’ll see some improvement, though not likely a very significant one. If it is heavy on the math side (CPU limited), then it can be a very significant boost in speed.

[This message has been edited by DFrey (edited 09-19-2000).]

For those with MSVC++ 6.0, you might want to get the processor pack for it from MSDN. It adds intrinsic support for 3DNow!, 3DNow! 2, SSE, SSE 2, and of course MMX. It also installs MASM 6.15, which is nice to have, eh?
One thing, though: it requires SP4 for VS6.
Intel even provided a very snazzy C++ abstraction of the SSE instructions, so it is super simple to build an SSE optimized math library. I didn’t notice anything similar from AMD, though. Then again, AMD does offer prebuilt math and engineering libs and source code directly.

Here is the link to the processor pack: http://msdn.microsoft.com/vstudio/downloads/ppack

[This message has been edited by DFrey (edited 09-19-2000).]

Thanks DFrey! That’s good info.

Is anyone aware of any real-world before and after benchmarks of code compiled using the processor pack? Just curious.

Ummm, that’s going to depend on how you use the intrinsics. The compiler doesn’t automatically optimize your code with the enhanced CPU instructions. You have to write that code yourself. The processor pack just makes it easier to do that.
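For a rough idea of what "providing that code" looks like, here is a small sketch in C using the SSE intrinsics from xmmintrin.h to multiply a 4x4 matrix by a 4-component vector. The function name mat4_mul_vec4 and the HAVE_SSE macro are my own, the matrix is assumed to be column-major, and there is a plain scalar fallback for builds without SSE. Treat it as an illustration, not as tuned code.

```c
/* Use SSE intrinsics when the compiler targets SSE, otherwise fall back to scalar code. */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* out = M * v, where M is a column-major 4x4 matrix (m[c * 4 + r] is row r, column c).
 * The name and signature are hypothetical, chosen for this example. */
static void mat4_mul_vec4(const float m[16], const float v[4], float out[4])
{
#if HAVE_SSE
    /* Each column is scaled by one component of v, then the four results are summed. */
    __m128 r = _mm_mul_ps(_mm_loadu_ps(m + 0), _mm_set1_ps(v[0]));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(m + 4),  _mm_set1_ps(v[1])));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(m + 8),  _mm_set1_ps(v[2])));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(m + 12), _mm_set1_ps(v[3])));
    _mm_storeu_ps(out, r);   /* unaligned store, so out needs no special alignment */
#else
    for (int row = 0; row < 4; ++row)
        out[row] = m[row] * v[0] + m[4 + row] * v[1]
                 + m[8 + row] * v[2] + m[12 + row] * v[3];
#endif
}
```

The unaligned loads and stores keep the example simple. With 16-byte aligned data you could switch to _mm_load_ps and _mm_store_ps, and how much any of this gains over the scalar loop depends on the app, as noted earlier in the thread.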

Ah. I just assumed it would be another compiler option.

There I go showing my ignorance in public again.