Heavy shader: can it be optimized by splitting the quad?

Hello!

I’m writing a space simulator where I draw the planet first, then draw the atmosphere with a full-screen quad. Its shader ray-casts against the planet and atmosphere spheres to calculate light scattering and fog (it’s not very realistic though; I’ve used my own formulae).

My question is: if instead of drawing a full-screen quad I draw, say, 8 smaller quads to cover the screen, will it run faster? The idea is to parallelize the atmosphere drawing so the GPU can use different ‘drawing units’ at the same time (if this makes any sense).

Thanks

Well, I’ve done it and there is no speed-up, so I guess my idea was bad :frowning:

I’ve tested with 4x2 quads, 8x4 and 16x8 quads and my framerate is the same, about 38 fps. Without drawing the atmosphere the framerate is 200.

Edit: using Nvidia 8600GT

Even on a single triangle, the fragment shading occurs in parallel.

What could offer a speed-up is to restrict the covered surface: instead of covering the full screen, draw a coarse disk (an octagon should do) that covers the planet and its atmosphere.

And any optimization on the fragment shader will have a big impact too.

  • Optimize the shader if possible (an early-exit condition based on distance to the planet center?).
  • Otherwise, render the ray-casting quad to a texture at a smaller (half) screen resolution, then draw that texture enlarged into the viewport.
  • In general, the fastest way to render a full-screen pass is to render a single triangle that is clipped to the viewport. (Reason: the hardware shades fragments in blocks of 2x2 pixels, and when rendering two triangles the blocks along the shared diagonal are only partially covered by each triangle, so they get processed twice.)
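The single-triangle idea from the last bullet could be sketched as the GLSL 1.20 vertex shader below. It is only a sketch under assumptions: the application is assumed to submit exactly three vertices already in clip space, (-1,-1), (3,-1) and (-1,3), the projection is assumed to be a standard perspective one, depth testing and depth writes are assumed to be off for this pass, and the varying name `eyePosition` is illustrative.

```glsl
#version 120
// Sketch: full-screen pass drawn as ONE oversized triangle.
// Assumption: the application sends three clip-space vertices
//   (-1,-1), (3,-1), (-1,3)
// so the viewport is fully covered and the excess is clipped away.

// Homogeneous eye-space position, used by the fragment shader to build a view ray
varying vec4 eyePosition;

void main()
{
  // Vertices are already in clip space: no model-view or projection transform.
  // z = 0.0 keeps the triangle safely between the near and far clip planes.
  gl_Position = vec4( gl_Vertex.xy, 0.0, 1.0 );

  // Un-project back to eye space. With a perspective projection the resulting
  // w is positive, so normalize( eyePosition.xyz ) in the fragment shader
  // gives the view-ray direction for that pixel.
  eyePosition = gl_ProjectionMatrixInverse * gl_Position;
}
```

Since the triangle is clipped to the viewport, the same pixels are shaded as with a quad, just without the diagonal seam.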

As Zbuffer said, any optimization of the fragment shader will have a big impact. If it isn’t possible to do this, and your calculation is still very expensive, then maybe you could try switching to using quite a few more polygons + performing the heavy calculation per-vertex instead of per-pixel.

You could form the atmosphere from a disk with many sectors + concentric rings, and then only calculate the atmosphere color per-vertex and allow the GPU to interpolate these values.

E.g. if you form a disk with 60 sectors and 14 rings, you would have 1620 triangles (60*(1 + 13*2)) using 841 unique vertices (14*60 + 1). You would perform the heavy calculation 841 times whatever the on-screen size of the atmosphere, whereas per-pixel the atmosphere could only cover up to 29x29 (= 841) pixels before the heavy calculation runs more times than with the per-vertex method.

Of course more triangles will be slower to render than a quad, but triangle-count probably isn’t a limiting factor at the moment (the number of times you perform your heavy calculation is), and image quality will be lower with too few triangles forming the disk, so you’d need to find a balance.

Even on a single triangle, the fragment shading occurs in parallel.

Good to know!

The performance is good when the camera is far from the planet (most rays fall into deep space and the fragments are discarded early).

The framerate drops when you are near, for example landed on the planet. In this case I can’t restrict the geometry (octagon) or use a highly tessellated disk.

But as Dan Barlett says, I could tessellate the full-screen quad (and move the computation from the fragment shader to the vertex shader), though I would get interpolation artifacts unless I tessellate a lot. Most of the atmosphere is a soft gradient, but the sun(s) have a glow when viewed from inside the atmosphere, and when the planet is viewed from far away the atmosphere silhouette is thin, so I would have artifacts in these cases.

Rendering to a lower-resolution texture is also a good idea; it would be similar to tessellating the quad.

Edit: Oh, and yes, I must optimize the shader somehow.

Thanks to all!

You can post it here, so that we’ll have a lot of fun optimizing it and helping you at the same time :wink:
This is a GLSL thread after all.

Ok! Here are the shaders.

Vertex shader:

/*
  Shader for rendering a planet atmosphere.

*/

// Interpolated position in eye coordinates
varying vec4 eyePosition;

void main()
{

  // Compute position in eye coordinates
  eyePosition = gl_ModelViewMatrix * gl_Vertex;

  // Obtain projected position
  gl_Position = ftransform();
	
}
        

The fragment shader:

/*
  Shader for rendering a planet atmosphere.

  All lights are assumed to be positional. At most 2 lights are taken into account.
  Because the lights are assumed to be far away, they are treated as directional:
  the light position (in eye space) is used as a direction, so it must be normalized.


  Uniforms:
	numLights - The number of (positional) lights activated in OpenGL. GL_LIGHT0 through GL_LIGHT(numLights-1)
	atmosphereAltitude - Altitude at which atmosphere density becomes very small
	atmosphericPressure - Atmosphere pressure at 0 altitude
	atmosphericScale - Atmospheric pressure constant
	atmosphereColor - atmosphere diffused light color
	planetRadius - Planet radius
	planetEyePosition - Planet position in eye coordinates
*/

const int NUM_SAMPLES_ATMOSPHERE = 5;

const float E = 2.718281828;

//const float ATMOSPHERE_DENSITY = 0.002;
const float ATMOSPHERE_DENSITY = 0.01;

const vec3 REDDISH_ATMOSPHERE = vec3( 0.7, 0.2, 0.1 );

// Number of active lights (set by application)
uniform int numLights;

// Atmosphere altitude
uniform float atmosphereAltitude;

// Atmospheric pressure at surface level (units = terrestrial pressures)
uniform float atmosphericPressure;

// atmosphericScale is -g / H (g: surface gravity in earths gravities, H = atmosphere height scale)
uniform float atmosphericScale;

// Atmosphere color
uniform vec3 atmosphereColor;

// Planet radius
uniform float planetRadius;

// Planet position in eye coordinates
uniform vec3 planetEyePosition;

// Interpolated position of vertex in eye coordinates
varying vec4 eyePosition;

// Computes ray-sphere intersection.
// rayDir must be unit length.
// Returns the ray parameters for first and second intersection in that order.
// Only positive intersections (parameter > 0) are computed.
// If the first is < 0, there is no intersection.
// Else, if the second is < 0, there is only one intersection.
vec2 raySphereIntersection( vec3 rayOrigin, vec3 rayDir, vec3 spherePos, float sphereRadius ) {

  vec3 centerToOrigin = rayOrigin - spherePos;
  float b = 2.0 * ( dot( rayDir, centerToOrigin ) );
  float c = dot( centerToOrigin, centerToOrigin ) - sphereRadius * sphereRadius;
  float d = b * b - 4.0 * c;
  if ( d < 0.0 ) {
    // No intersections
    return vec2( -1.0, -1.0 );
  }
  d = sqrt( d );
  float t1 = 0.5 * ( -b - d );
  float t2 = 0.5 * ( -b + d );
  if ( t1 < 0.0 ) {
    return vec2( t2, -1.0 );
  }
  return vec2( t1, t2 );
}

void main()
{
  vec3 rayDir = normalize( eyePosition.xyz );

  vec2 tAtm = raySphereIntersection( vec3( 0.0 ), rayDir, planetEyePosition, planetRadius + atmosphereAltitude );

  if ( tAtm.x < 0.0 ) {
    // Ray lies outside of planet and atmosphere
    discard;
  }

  vec2 tPlanet = raySphereIntersection( vec3( 0.0 ), rayDir, planetEyePosition, planetRadius );

  // t1 and t2 are the ray parameters for the start and end points of the ray inside the atmosphere
  float t1 = 0.0;
  float t2 = 0.0;
  float rayTouchesOnlySky = 0.0;
  if ( tPlanet.x < 0.0 ) {
    // The ray doesn't intersect the planet.
    if ( tAtm.y > 0.0 ) {
      // The ray enters and exits the atmosphere without touching the planet
      t1 = tAtm.x;
      t2 = tAtm.y;
    }
    else {
      // The camera is inside the atmosphere and the ray exits it without touching the planet.
      t1 = 0.0;
      t2 = tAtm.x;
      rayTouchesOnlySky = 1.0;
    }
  }
  else {

    // The ray touches the planet.
    if ( tAtm.y > 0.0 ) {
      // The ray enters the atmosphere and then touches the planet.
      t1 = tAtm.x;
      t2 = tPlanet.x;
    }
    else {
      // The camera is inside the atmosphere and the ray touches the planet.
      t1 = 0.0;
      t2 = tPlanet.x;
    }
  }

  float factorHeight = 1.0 / ( atmosphereAltitude );
  float planetEyePositionLength = length( planetEyePosition );
  float cameraHeight = planetEyePositionLength - planetRadius;
  //float density;
  float densityIlluminated;
  float alpha;

  // Sampling of atmosphere
  densityIlluminated = 0.0;
  float dt = ( t2 - t1 ) / float( NUM_SAMPLES_ATMOSPHERE );
  vec3 sampleIncrem = rayDir * dt;
  vec3 samplePos = rayDir * t1 + sampleIncrem * 0.5;
  for ( int i = 0; i < NUM_SAMPLES_ATMOSPHERE; i++ ) {

    vec3 planet2sample = samplePos - planetEyePosition;

    // Compute air density
    float sampleHeight = max( 0.0, length( planet2sample ) - planetRadius );

    // Compute illuminated air density
    vec3 normal = normalize( planet2sample );
    float dotLightNormal = dot( normal, normalize( gl_LightSource[ 0 ].position.xyz /*- samplePos*/ ) );
    if ( dotLightNormal > -0.0 ) {

      float d = exp( atmosphericScale * sampleHeight ) * dt;

      densityIlluminated += d * dotLightNormal;
    }

    // Increment sample position
    samplePos += sampleIncrem;
  }

  densityIlluminated *= ATMOSPHERE_DENSITY * atmosphericPressure;
  alpha = densityIlluminated;

  vec3 lightDir = normalize( gl_LightSource[ 0 ].position.xyz );
  float dotLightNormal = max( 0.0, - dot( planetEyePosition, lightDir ) / planetEyePositionLength );

  if ( cameraHeight < atmosphereAltitude ) {

    // Camera is inside the atmosphere, add light diffusion
    float d = exp( atmosphericScale * cameraHeight );
    densityIlluminated = min( 1.0, max( densityIlluminated, rayTouchesOnlySky * sqrt( dotLightNormal ) * d ) );
    alpha = densityIlluminated;
  }

  // Computes reddening ('redding') of the color from the angles between the view ray, the sun and the planet normal

  float dotLightRayRaw = dot( rayDir, lightDir );
  float dotLightRay = max( 0.0, dotLightRayRaw );
  float dotRayNormal = max( 0.0, - dot( planetEyePosition, rayDir ) / planetEyePositionLength );
  float redding = pow( dotLightRay, 50.0 ) + ( 0.5 + 0.5 * dotLightRayRaw ) * pow( 1.0 - dotRayNormal, 2.0 );
  // Computes sun color
  vec3 sunColor = mix( gl_LightSource[ 0 ].diffuse.xyz, REDDISH_ATMOSPHERE, 1.0 - dotLightNormal );

  if ( numLights > 1 ) {
    // Only two suns contribute diffused light
    lightDir = normalize( gl_LightSource[ 1 ].position.xyz );
    dotLightNormal = max( 0.0, - dot( planetEyePosition, lightDir ) / planetEyePositionLength );

    dotLightRayRaw = dot( rayDir, lightDir );
    dotLightRay = max( 0.0, dotLightRayRaw );
    dotRayNormal = max( 0.0, - dot( planetEyePosition, rayDir ) / planetEyePositionLength );

    float redding1 = pow( dotLightRay, 50.0 ) + ( 0.5 + 0.5 * dotLightRayRaw ) * pow( 1.0 - dotRayNormal, 2.0 );
    vec3 sunColor1 = mix( gl_LightSource[ 1 ].diffuse.xyz, REDDISH_ATMOSPHERE, 1.0 - dotLightNormal );

    redding = 0.5 * ( redding + redding1);
    sunColor = mix( sunColor, sunColor1, redding );   
  }


  // Store color with alpha
  gl_FragColor = vec4( mix( atmosphereColor, sunColor, redding ), alpha );


}

Comments:

I use GLSL 1.2, but I’m planning to convert the engine to GL 3.3/GLSL 1.5

I use only 2 lights for performance. Although I can have 4 suns in a solar system, only 2 of them (a close binary) affect a planet.

I use a camera-centered world for numeric precision, so the camera is at the origin (0,0,0) and the planet is at planetEyePosition. Rays therefore go from the origin to eyePosition, which is the fragment position on the quad.
The units for planet drawing are kilometers; the units for object drawing are meters.

Have fun!

Btw you can visit my blog and see some pictures and a video (sorry, no pictures of the atmosphere. I’m going to make a post in the blog with pictures for you)

The blog url is below.

I’ve just posted some pictures of the atmosphere on the blog; the link is below.

Err, maybe I’m dumb, but I see no links :frowning:

It’s in my signature… doesn’t it show up?

Well here it is:
http://antaresgame.blogspot.com/

It might be that signature display is turned off in my forum settings.
Nice project you have!

Though, there is the Infinity project, which aims for similar goals and is much more mature (I’m sure you are aware of it):
http://www.infinity-universe.com

About your shader: there are tons of ‘if’ statements in there. The driver can convert them to non-branching variants (max, min, clamp, etc.), but in practice it’s not always that smart. So I suggest doing it on your own.
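As one illustration of that suggestion, the if/else tree in main() that picks the ray segment inside the atmosphere could be folded into selects with mix(). This is only a sketch: the helper name `atmosphereSegment` and the packing of the result into a vec3 are made up here, and whether it actually beats the branches depends on the driver and hardware, so it needs measuring. It is meant to be pasted into the posted fragment shader.

```glsl
// Hypothetical helper, not part of the posted shader.
// tAtm and tPlanet are the vec2 results of raySphereIntersection() for the
// atmosphere sphere and the planet sphere, with the same conventions as the
// original. It is meant to be called after the discard test, so tAtm.x >= 0.0.
// Returns vec3( t1, t2, rayTouchesOnlySky ).
vec3 atmosphereSegment( vec2 tAtm, vec2 tPlanet )
{
  // 1.0 when the ray hits the planet (the original tests tPlanet.x < 0.0 for a miss)
  float hitsPlanet = step( 0.0, tPlanet.x );

  // 1.0 when the camera is outside the atmosphere (original test: tAtm.y > 0.0)
  float cameraOutside = float( tAtm.y > 0.0 );

  // Segment start: atmosphere entry point, or the camera itself (t = 0.0)
  float t1 = cameraOutside * tAtm.x;

  // Where the ray would leave the atmosphere if no planet were in the way
  float tExit = mix( tAtm.x, tAtm.y, cameraOutside );

  // Segment end: the planet surface if it is hit, otherwise the atmosphere exit
  float t2 = mix( tExit, tPlanet.x, hitsPlanet );

  // 1.0 only when the camera is inside the atmosphere and the ray misses the planet
  float onlySky = ( 1.0 - hitsPlanet ) * ( 1.0 - cameraOutside );

  return vec3( t1, t2, onlySky );
}
```

In main() it would be used as `vec3 seg = atmosphereSegment( tAtm, tPlanet );`, reading t1, t2 and rayTouchesOnlySky from seg.x, seg.y and seg.z.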

To be continued…

Nice project you have!

Thanks!

Though, there is an Infinity project

Yeah, I know it. It’s multiplayer and the engine is astounding. He has been developing it for many years! Two or three days ago he uploaded a new video to his blog:
http://www.gamedev.net/community/forums/mod/journal/journal.asp?jn=263350&reply_id=3643238

I hope I will finish the game in less time :stuck_out_tongue:
But it’s difficult to say, since I don’t know exactly how I want the game to be. Initially I wanted a Frontier-like game, though some people have suggested completely different ideas. For the moment I think I will do the space combat simulator part.

About the shader, yes, you’re right! For example, the if inside the for loop can be replaced with a max function between 0.0 and dotLightNormal. It is a good optimization because it’s inside the for loop!
I’m going to try it right now, to see if the fps increases.
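The sampling loop with the if removed could look roughly like the sketch below. The helper name `illuminatedDensity` and its parameter list are hypothetical. It clamps the light term with max( 0.0, dotLightNormal ), which adds nothing for samples facing away from the light, just as the original if did. It also assumes lightDir is normalized once by the caller, and it moves the multiplication by dt out of the loop.

```glsl
// Hypothetical helper, not part of the posted shader.
const int NUM_SAMPLES_ATMOSPHERE = 5;

// Integrates the illuminated air density along the ray between t1 and t2.
// lightDir is assumed to be normalized once by the caller.
float illuminatedDensity( vec3 rayDir, float t1, float t2,
                          vec3 planetEyePosition, float planetRadius,
                          float atmosphericScale, vec3 lightDir )
{
  float dt = ( t2 - t1 ) / float( NUM_SAMPLES_ATMOSPHERE );
  vec3 sampleIncrem = rayDir * dt;
  vec3 samplePos = rayDir * t1 + sampleIncrem * 0.5;

  float sum = 0.0;
  for ( int i = 0; i < NUM_SAMPLES_ATMOSPHERE; i++ ) {
    vec3 planet2sample = samplePos - planetEyePosition;
    float dist = length( planet2sample );
    float sampleHeight = max( 0.0, dist - planetRadius );

    // Same value as dot( normalize( planet2sample ), lightDir ), reusing dist
    float dotLightNormal = dot( planet2sample, lightDir ) / dist;

    // max() replaces the branch: samples facing away from the light add 0.0
    sum += exp( atmosphericScale * sampleHeight ) * max( 0.0, dotLightNormal );

    samplePos += sampleIncrem;
  }

  // dt is constant over the loop, so it is applied once here
  return sum * dt;
}
```

The result would still be scaled by ATMOSPHERE_DENSITY * atmosphericPressure afterwards, as in the original main().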

Well, from 37-38 to 38-39 fps. It’s something :frowning: