fp40 profile and Cg Optimization

Hi all,

Does anyone know where we can find resources
on Cg programming with the fp40 profile, especially
the dynamic array feature in fp40? I’ve searched
the latest Cg toolkit downloaded from NVIDIA, but
found nothing (not even an example).

In addition, other than the nine steps mentioned
in Appendix C of the Cg user manual, are there any
other useful techniques to optimize Cg fragment
programs?

Thanks so much in advance.

Philip

By dynamic arrays I assume you are referring to the unsized array support in Cg. This is a compile-time feature (along with interfaces) and does not require the fp40 profile. It is documented in the Cg User Manual Addendum, and there’s also a chapter in the GPU Gems book:

http://developer.nvidia.com/object/gpu_gems_home.html

Our GPU programming guide provides more advice on shader optimization:
http://developer.nvidia.com/object/gpu_programming_guide.html

Is the compile-time dynamic array feature fast enough to use at runtime as a way to avoid creating multiple shaders, or is it just a more elegant version of using macros?

Thanks Simon.
Sorry for the confusion.
Unfortunately, I meant non-constant
array indexing. My case is:

// 12 constant matrices
const float3x3 mat[12] = { … };

// i is an index looked up from a texture
vec = mul(mat[i], vec);

Is this doable in Cg or GLSL nowadays
(it seems not, to me), or will it be in
the near future?

Using three “h3texRECT” calls (to get
the matrices from textures) is
quite expensive.

Thanks a lot for your suggestions.

Philip


Right, you can’t index into constant memory in fragment programs on any current hardware (you can in vertex programs). You have to use textures instead; the performance of this should not be bad on current hardware.
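
A minimal sketch of the texture-based approach, for illustration only. It assumes the 12 matrices are packed into a 3x12 floating-point rectangle texture (hypothetical name matTex) with one matrix per texture row and one matrix row per texel, and that the index comes from a float texture (hypothetical name idxTex) storing the index directly in its red channel. The function name and this layout are assumptions, not something from the Cg toolkit.

```cg
// Sketch only: matTex, idxTex, and the texture layout are assumptions.
// matTex is 3 texels wide and 12 texels high; texel (r, m) holds
// row r of matrix m in its xyz channels.
float3 transformByIndexedMatrix(float3 vec,
                                float2 coord,
                                samplerRECT idxTex,
                                samplerRECT matTex)
{
    // Matrix index (0..11), assumed to be stored unscaled in a float texture.
    float i = texRECT(idxTex, coord).x;

    // Rectangle textures use unnormalized coordinates; add 0.5 to hit texel centers.
    float y = floor(i) + 0.5;
    float3 r0 = texRECT(matTex, float2(0.5, y)).xyz;
    float3 r1 = texRECT(matTex, float2(1.5, y)).xyz;
    float3 r2 = texRECT(matTex, float2(2.5, y)).xyz;

    // Three dot products against the rows give the same result as mul(mat[i], vec).
    return float3(dot(r0, vec), dot(r1, vec), dot(r2, vec));
}
```

This still costs three fetches per matrix, as in the h3texRECT version above; if the index is stored in a normalized 8-bit texture instead, it would need to be rescaled before use.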