ALU operation - Texture Operation

Hi everyone!

I’d like to know whether anyone remembers a paper arguing that the current trend in GPU programming is to favor ALU operations over texture reads for certain functions, because the ALU route can be faster. (For example, today it would be slower to use a renormalization texture than renormalization code in the shader.)

I remember a GPU Gems 3 article, but I don’t have the time to read it all again :wink:

Thanks for your help.

I don’t have it with me at the moment, but GPU Gems 2 contains one or two articles mentioning that as well.
One is the chapter about general GPU optimisation; the other place where I believe it is mentioned is the chapter about “Improved Perlin Noise”, which uses just shader code.

The reasoning is that with increasing communication times (measured in clock ticks, which are getting shorter), the length of communication pipelines is increasing compared to the time it takes to calculate something. Most likely you’ll find a paper like that on the nvidia site though.

Thank you T101. I’ll take a look in my GPU Gems 2 book this afternoon.

I have found that the ALU:texture-read ratio was 6:1 in 2003:
http://ati.amd.com/developer/SwedenTechDay/03_Clever_Shader_Tricks.pdf

Is it still about the same today, or is the ratio higher now?

Don’t look at me - I don’t have any hard figures.

But I’d guess that nvidia’s statements about communication versus calculation at least apply to the current nvidia architecture (even though the 6800 was new at the time), and presumably they’ll apply to ati as well.

And they seem like reasonable assumptions in general terms anyway.
The specific mix might be different depending on GPU memory speed, cache, number of shader units, and probably some innate differences between ati and nvidia architectures as well.

ok, I agree. That’s what I thought.

The 6:1 figure is a recommendation for where to aim for best performance. For X1900 the hardware ratio is 3:1, but the effective ratio increases with many factors, such as going to trilinear or anisotropic filtering, 64- or 128-bit formats, etc. For HD 2900 the hardware ratio is more like 5:1 (depending on how you count), and the recommendation might be something like 10:1 for a typical texture mix.
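To make the ratio arithmetic concrete, here is a rough C++ sketch that compares a shader's ALU:texture instruction ratio with a hardware ratio such as the 3:1 or 5:1 figures above. The helper `likelyBound` and the `Bound` enum are hypothetical names invented for illustration, and the model is deliberately first-order: it ignores latency hiding, caches, and the extra cost of filtering or wide formats, which is why the recommended target sits above the raw hardware ratio.

```cpp
// Hypothetical helper (not from any vendor tool): classify a shader by
// comparing its ALU:texture instruction ratio with the hardware's
// ALU:texture throughput ratio. First-order model only.
enum class Bound { Texture, Balanced, ALU };

inline Bound likelyBound(int aluOps, int texFetches, double hwRatio) {
    // A shader with no texture fetches can only be limited by ALU work.
    if (texFetches == 0) return Bound::ALU;

    double shaderRatio = static_cast<double>(aluOps) / texFetches;

    // Fewer ALU ops per fetch than the hardware can issue: the texture
    // units are the limit and ALUs sit idle.
    if (shaderRatio < hwRatio) return Bound::Texture;
    // More ALU ops per fetch than the hardware ratio: ALU-limited.
    if (shaderRatio > hwRatio) return Bound::ALU;
    return Bound::Balanced;
}
```

Under this model, a shader with 6 ALU instructions and 4 fetches on 3:1 hardware comes out texture-limited, so moving a lookup into ALU code could help; a shader already at 40 ALU instructions per 4 fetches on 5:1 hardware would not benefit from that trade.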

Generally speaking, before considering a texture lookup optimization, make sure it’s worthwhile. Vector normalization by cubemap lookup is obsolete. It replaces a maximum of three ALU instructions anyway.
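For reference, the three ALU instructions in question are presumably the usual dot product, reciprocal square root, and multiply. The CPU-side C++ sketch below mirrors that sequence; the `Vec3` type and the function names are illustrative assumptions, and the DP3/RSQ/MUL labels in the comments are my reading of which three instructions are meant, not something stated in this thread.

```cpp
#include <cmath>

// Illustrative 3-component vector; not taken from any particular library.
struct Vec3 { float x, y, z; };

// Dot product: corresponds to a single DP3-style shader instruction.
inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Normalize in three logical steps, mirroring the assumed shader sequence.
// Assumes v is non-zero; a zero vector yields inf/NaN components.
inline Vec3 normalize(const Vec3& v) {
    float lenSq  = dot(v, v);                // DP3: squared length
    float invLen = 1.0f / std::sqrt(lenSq);  // RSQ: reciprocal square root
    return { v.x * invLen, v.y * invLen, v.z * invLen };  // MUL: scale
}
```

In a shader this is normally just the built-in `normalize`, so the cubemap version would spend a texture fetch to save very little arithmetic.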
