Processor specific math operations

Csiki · January 4, 2004, 5:33am

Originally posted by zeckensack:
[b]Pedestrian polymorphism:[quote]

//header:
//these are implemented in a separate assembly module
extern "C"
{
  void __stdcall x86_execute_vertex_op_chain(ubyte** target,
    const Vertex** sources,
    uint vertex_count,
    const VertexOpChain* op_chain);

  void __stdcall AMD_execute_vertex_op_chain(ubyte** target,
    const Vertex** sources,
    uint vertex_count,
    const VertexOpChain* op_chain);

  void __stdcall SSE_execute_vertex_op_chain(ubyte** target,
    const Vertex** sources,
    uint vertex_count,
    const VertexOpChain* op_chain);
}

//global:
void (__stdcall* GeometryPipe::convert_verts)(ubyte** target,
  const Vertex** src,
  uint count,
  const VertexOpChain*)=plain_C_execute_vertex_op_chain;

//init code:
if (cpu.got_3dnow())
{
  convert_verts=AMD_execute_vertex_op_chain;
}
else
if (cpu.got_SSE()&&(config.allow_sse))
{
  convert_verts=SSE_execute_vertex_op_chain;
}
else
{
  convert_verts=x86_execute_vertex_op_chain;
}

The above assumes that the target is x86. You can easily extend it to other architectures, too, if you have an assembler for the platform. Otherwise just use the plain C fallback implementation (plain_C_execute_vertex_op_chain above).

And most important of all, use NASM for every single piece of x86 assembly code you’re ever going to write. NASM can produce object files linkable with all compilers known to man.[/b][/QUOTE]
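
For readers who want to try the function-pointer dispatch quoted above without the assembly modules, here is a minimal, self-contained sketch. Everything in it is an assumption made for illustration: `CpuFeatures`, `ScaleFn`, `select_scale`, `init_dispatch` and the `scale_*` routines are hypothetical names, the "3DNow!" and "SSE" variants are plain C++ stand-ins for what would really be separate assembly routines, and the feature flags are passed in by hand instead of being read from the CPU.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical feature flags. A real program would fill these in from
// a CPU detection routine; here they are supplied by the caller.
struct CpuFeatures {
    bool has_3dnow;
    bool has_sse;
};

// One signature shared by every implementation, so that a single
// function pointer can refer to any of them.
using ScaleFn = void (*)(float* dst, const float* src,
                         std::size_t count, float factor);

// Portable fallback that works on every target.
void scale_plain_c(float* dst, const float* src,
                   std::size_t count, float factor) {
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = src[i] * factor;
    }
}

// Stand-ins for the processor-specific routines. In the quoted post
// these live in a separate assembly module; here they just forward to
// the fallback so the sketch compiles anywhere.
void scale_3dnow(float* dst, const float* src,
                 std::size_t count, float factor) {
    scale_plain_c(dst, src, count, factor);
}

void scale_sse(float* dst, const float* src,
               std::size_t count, float factor) {
    scale_plain_c(dst, src, count, factor);
}

// The global entry point starts out at the fallback, so it is safe to
// call even if init_dispatch() never runs.
ScaleFn scale_verts = scale_plain_c;

// Same priority order as the quoted init code: 3DNow! first, then SSE
// (only if the configuration allows it), otherwise the fallback.
ScaleFn select_scale(const CpuFeatures& cpu, bool allow_sse) {
    if (cpu.has_3dnow) return scale_3dnow;
    if (cpu.has_sse && allow_sse) return scale_sse;
    return scale_plain_c;
}

// Run once at startup; every later call goes through scale_verts.
void init_dispatch(const CpuFeatures& cpu, bool allow_sse) {
    scale_verts = select_scale(cpu, allow_sse);
    assert(scale_verts != nullptr);
}
```

After `init_dispatch(cpu, allow_sse)` has run, callers simply write `scale_verts(dst, src, count, factor)` and never need to know which variant was picked. The selection costs one branch chain at startup and one indirect call per invocation afterwards.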

This is exactly what DirectX does.
NASM: see my reply above.

DJSnow · January 4, 2004, 6:06am

@csiki:

>>Platform = x86 + windows
As you can see, my argument came from our very different views of the terms “platform” and “computer system”. I will keep my definition, though (I think it fits better than yours).

>>I don’t joke when I write PowerPC,
>>Athlon64 etc. not x86 is the world.
I didn’t doubt that; everyone can see that lots of other “computer systems” with their own specific instructions are on the rise.

>>then you unable to use the code analyzers.
Why do you have to use CodeAnalyst? When I tried the 3DNow! SDK, it was nothing more than a “potential better profiler especially for AMD cpu’s” - yes, you could really do some fine-tuning with it - but to integrate the libraries and classes from the 3DNow! SDK (for example the vector/matrix code), I don’t need to use it?!
I simply dropped in the files from the SDK (yes, it was a hack), ran some tests, brought the results to our lead, and I was finished. And being on MSVC6, we had no problems, as you can guess.

Csiki · January 4, 2004, 7:27am

Originally posted by DJSnow:
@csiki:
>>then you unable to use the code analyzers.
Why do you have to use CodeAnalyst? When I tried the 3DNow! SDK, it was nothing more than a “potential better profiler especially for AMD cpu’s” - yes, you could really do some fine-tuning with it - but to integrate the libraries and classes from the 3DNow! SDK (for example the vector/matrix code), I don’t need to use it?!
I simply dropped in the files from the SDK (yes, it was a hack), ran some tests, brought the results to our lead, and I was finished. And being on MSVC6, we had no problems, as you can guess.

Using AMD CodeAnalyst is completely independent of 3DNow!.
I use it to find the bottlenecks in my apps.

DJSnow · January 4, 2004, 5:13pm

@csiki:
>>Codeanalyst is totally independent from
>>the 3dnow
As I said above: “…potential better profiler especially for AMD cpu’s…”
So you know why I don’t need to use it.
And yes, I wasn’t interested in finding bottlenecks - it was purely a research exercise to find out how much speed you can gain by using processor-specific instructions.
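
A speed comparison of that kind can be sketched with a small timing harness. This is only an illustration under stated assumptions, not the test setup used in the thread: `time_ms`, `ScaleFn`, `scale_plain_c` and `scale_unrolled` are hypothetical names, and the manually unrolled loop merely stands in for a routine written with processor-specific instructions.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Shared signature so both variants can be timed by the same harness.
using ScaleFn = void (*)(float* dst, const float* src,
                         std::size_t count, float factor);

// Baseline implementation in plain C++.
void scale_plain_c(float* dst, const float* src,
                   std::size_t count, float factor) {
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = src[i] * factor;
    }
}

// Hypothetical "optimized" variant: unrolled four elements at a time,
// with a tail loop for the remaining zero to three elements.
void scale_unrolled(float* dst, const float* src,
                    std::size_t count, float factor) {
    std::size_t i = 0;
    for (; i + 4 <= count; i += 4) {
        dst[i]     = src[i]     * factor;
        dst[i + 1] = src[i + 1] * factor;
        dst[i + 2] = src[i + 2] * factor;
        dst[i + 3] = src[i + 3] * factor;
    }
    for (; i < count; ++i) {
        dst[i] = src[i] * factor;
    }
}

// Calls fn `repeats` times on `count` floats and returns the elapsed
// wall-clock time in milliseconds.
double time_ms(ScaleFn fn, std::size_t count, int repeats) {
    std::vector<float> src(count, 1.5f);
    std::vector<float> dst(count, 0.0f);

    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        fn(dst.data(), src.data(), count, 2.0f);
    }
    const auto stop = std::chrono::steady_clock::now();

    // Read one result through a volatile so the work is not discarded.
    volatile float sink = dst.empty() ? 0.0f : dst[0];
    (void)sink;

    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Comparing `time_ms(scale_plain_c, n, reps)` against `time_ms(scale_unrolled, n, reps)` for the same `n` and `reps` gives a rough ratio. The numbers depend heavily on compiler flags, buffer size and cache state, so several runs at realistic sizes are needed before drawing any conclusion.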