Maximum number of texture accesses

Hello,

I’m writing a fragment shader in GLSL, and after 4 accesses to a texture my shader falls back to software execution. I had a look at GLInfo2 (delphi3d.net) and found that “Max. native texture indirections” was the only texture parameter with a value of 4.

Does this parameter really describe the maximum number of texture accesses? And if so, what is the “texture instructions” parameter, which has a higher value?

4 texture accesses seems extremely low to me (I have a Radeon 9800), and even on the X800 the “Max. native texture indirections” parameter is still 4!

Whereas NVIDIA’s first GeForce FX (5200) reports 1024 for “Max. native texture indirections”! Does that mean we can access a texture 1024 times in a fragment shader? That is a lot more than ATI!

I’m wondering: if I buy a GeForce FX this afternoon, will I actually get these 1024 texture accesses?

There is no limit on the number of texture accesses or indirections on NVIDIA’s hardware other than the maximum number of instructions.
Check this thread:
http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=012439
or search the forum for “indirections”.
Instead of a GF FX you might want to look at a GF 6xxx, which supports even more GLSL features, such as dynamic loops in fragment programs and texture access in vertex shaders.

I think that the number of indirections doubles as the max number of times that you may access the same texture. Or, more specifically, accessing the same texture counts as an indirection on that texture.

And I think, specifically, that when you say “that same texture” you mean “that same sampler.” Thus, if you want to read from the same texture image more times, you can bind it to more than one sampler.

I’ve bought a GeForce FX 5200, and for its price (60€) it’s a real pleasure to program… NVIDIA is far more advanced in terms of fragment shaders than ATI!

As described in the ARB_fragment_program spec, the Radeons group fragment programs/shaders into “nodes”. A node consists of zero or more texture accesses, followed by one or more ALU instructions. Now the drivers can reorder texture accesses to avoid the creation of additional nodes. However, whenever you use the output of one texture access directly or indirectly as the input of another texture access, a new node must be created because of hardware limitations. That’s also called an “indirection” for obvious reasons.

As soon as you run out of nodes (and the R300+ only support four), the shader cannot be executed in hardware.

If your texture accesses aren’t really dependent and could be executed in parallel, you might try updating your driver - I guess optimization techniques still have to mature.
Also, if you sample the same texture more than once, you could try binding it to more than one texture sampler, as another poster has suggested before.

Originally posted by Korval:
I think that the number of indirections doubles as the max number of times that you may access the same texture. Or, more specifically, accessing the same texture counts as an indirection on that texture.
I’d have to disagree there. I wrote a GLSL shader last week that made 9 accesses to the same sampler (each access was jittered slightly from the main coord by a vec2()), and it compiled and ran in hardware just fine on my 9800 XT with the most recent drivers (I checked the output log to make sure it was running in hardware, as what I was doing was crashing the frame rate anyway).

I’d be interested to see how you accessed your texture, because on my 9800 SE, even when accessing the same texture, everything falls back to software after 4 reads.

If you take a look at Humus’s latest demo, you can see that he does 13 texture reads (11 dependent) per fragment, and it’s running in hardware on my 9700 Pro.

texture indirection != texture reads

Indeed. Those 11 samples belong to the same texture indirection. The base texture and bumpmap belong to another one. A total of two indirections.
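To make the reads-versus-indirections distinction concrete, here is a minimal sketch of a shader with six texture reads but only two levels of dependency. It is not the Humus demo; the sampler names (baseMap, bumpMap, blurMap), the offsets, and the blend weights are made up for illustration. Whether a given driver actually packs it into two nodes is an assumption that depends on its compiler.

```glsl
// Hypothetical samplers, named for illustration only.
uniform sampler2D baseMap;
uniform sampler2D bumpMap;
uniform sampler2D blurMap;

void main(void)
{
    vec2 uv = gl_TexCoord[0].xy;

    // Level 1: the coordinates come straight from an interpolant,
    // so these two reads depend on no earlier texture result.
    vec4 base = texture2D(baseMap, uv);
    vec2 bump = texture2D(bumpMap, uv).rg * 2.0 - 1.0;

    // Level 2: every coordinate below depends on the bump read,
    // but none depends on another read in this group, so all
    // four reads sit at the same dependency depth.
    vec2 p = uv + bump * 0.01;
    vec4 sum = texture2D(blurMap, p + vec2(-0.005, 0.0));
    sum += texture2D(blurMap, p + vec2( 0.005, 0.0));
    sum += texture2D(blurMap, p + vec2(0.0, -0.005));
    sum += texture2D(blurMap, p + vec2(0.0,  0.005));

    gl_FragColor = base * 0.5 + sum * 0.125;
}
```

What matters for the limit is the length of the longest chain in which one read feeds the coordinates of the next, not the total number of texture2D calls.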

Well, maybe I’m not doing something right.

Here’s a simple fragment shader to illustrate my problem (it happens on my Radeon 9800 SE and Radeon 9600 Pro):

uniform sampler2D tex;

void main(void)
{
    vec2 test = vec2(gl_TexCoord[0]);
    vec4 value;

    // first read
    value = texture2D(tex, test);
    test += value.rg * .01;
    // second read
    value = texture2D(tex, test);
    test += value.rg * .01;
    // third read
    value = texture2D(tex, test);
    test += value.rg * .01;
    // fourth read
    value = texture2D(tex, test);
    test += value.rg * .01;
    // fifth read: turn to software
    value = texture2D(tex, test);
    test += value.rg * .01;

    gl_FragColor = value;
}

When the same texture is read more than 4 times, everything falls back to software.

  
varying vec2 TexCoord;
uniform sampler2D src;  // texture to sample
uniform mat3 filter;    // filter to use

void main()
{
    vec4 finalcolor = texture2D(src, TexCoord) * filter[1][1];
    finalcolor += texture2D(src, TexCoord + vec2(-0.1, -0.1)) * filter[0][0];
    finalcolor += texture2D(src, TexCoord + vec2(0, -0.1)) * filter[1][0];
    finalcolor += texture2D(src, TexCoord + vec2(0.1, -0.1)) * filter[2][0];
    finalcolor += texture2D(src, TexCoord + vec2(-0.1, 0)) * filter[0][1];
    finalcolor += texture2D(src, TexCoord + vec2(0.1, 0)) * filter[2][1];
    finalcolor += texture2D(src, TexCoord + vec2(-0.1, 0.1)) * filter[0][2];
    finalcolor += texture2D(src, TexCoord + vec2(0, 0.1)) * filter[1][2];
    finalcolor += texture2D(src, TexCoord + vec2(0.1, 0.1)) * filter[2][2];
    gl_FragColor = finalcolor * gl_Color;
}

That’s what I used, and it runs in hardware. I’m guessing the reason yours runs in software is down to how you adjust the texture coords.

The problem is that in my fragment shader every texture read takes its coordinates from the result of the previous texture read (and it cannot be otherwise for what I’m doing).
Is there no way to run this in hardware on ATI?

I’m afraid not. You’ll have to go multipass to get around the problem.
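For what it’s worth, one way to go multipass might look like the sketch below: each pass reads the running coordinate written by the previous pass, does a single dependent read, and writes the updated coordinate back out. The coordTex sampler, the 1:1 mapping between gl_TexCoord[0] and the render target, and a render target precise enough to hold coordinates (for example a floating-point format) are all assumptions, not something confirmed in this thread.

```glsl
// One intermediate pass of a hypothetical ping-pong scheme.
uniform sampler2D tex;       // the texture from the original shader
uniform sampler2D coordTex;  // hypothetical: coordinates written by the previous pass

void main(void)
{
    // Assumed to address coordTex 1:1 with the pixel being written.
    vec2 screenUV = gl_TexCoord[0].xy;

    // First level: fetch the running coordinate from the previous pass.
    vec2 test = texture2D(coordTex, screenUV).rg;

    // Second level: the dependent read from the original shader.
    vec4 value = texture2D(tex, test);
    test += value.rg * 0.01;

    // Store the updated coordinate for the next pass.
    gl_FragColor = vec4(test, 0.0, 1.0);
}
```

The very first pass could start from gl_TexCoord[0] instead of coordTex, and the final pass would write value rather than the coordinate. Since the fetch from coordTex uses up one level of dependency itself, a four-indirection limit would presumably leave room for up to three dependent reads per pass rather than one.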
