Small sprites - Favour fragment time over vertex time?

I would like to hear about people's experiences optimizing GLSL, particularly when it comes to particles/sprites.

Can any performance be gained by saving OpenGL some vertex processing (using three fewer vertices per sprite)?

Are point-sprites considerably faster than quads?


I need to make a lot of sprites of small size (64 pixels each), and I can choose between making them as QUADS or as POINTS (i.e. point sprites). The only bad part is that I need to interpolate some data over the sprites (not texture coordinates, but a variable I use for an effect). The value range is derived from a 32-bit texture which designates "range categories", e.g. 0 ==> 0…1, 1 ==> 2…3.
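For concreteness, here is a hedged sketch of how such a category might expand into a range, assuming the pattern of the example continues (category c covers 2c…2c+1); the names `category` and `t` are my own, not from the poster's code:

```glsl
// Hypothetical GLSL sketch: expand a range category into (base, step),
// assuming the example pattern continues (category c covers 2c ... 2c+1).
float base = 2.0 * category;   // lower bound of the category's range
float step = 1.0;              // width of each range in this example
float value = base + t * step; // t is a 0..1 interpolant across the sprite
```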

Now, I can:

  1. Use point sprites, which makes the texture coordinates the only varying entity over the fragment, but hardwired to the range 0…1. I must do something like:

// Vertex shader: fetch the per-sprite range once and pass it down.
varying float baseX, stepX;
varying float baseY, stepY;

baseX = ...
baseY = ...
stepX = ...
stepY = ...

// Fragment shader: remap the 0…1 coordinates into the sprite's range.
varying float baseX, stepX;
varying float baseY, stepY;

  i = baseX + gl_TexCoord[0].s * stepX;
  j = baseY + gl_TexCoord[0].t * stepY;

This approach saves me three texture lookups (and some data movement with my vertex buffers) in the vertex shader, but has the penalty of 128 extra multiplications in the fragment shader per sprite (two multiply-adds per fragment over a 64-pixel sprite).
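Put together, a minimal sketch of the fragment side of this path might look as follows. This assumes point sprites with GL_COORD_REPLACE enabled on the host side (so gl_TexCoord[0] carries the 0…1 sprite coordinates) and base/step arriving as varyings from the vertex shader; the final write is a stand-in, not the poster's actual effect:

```glsl
// Hypothetical fragment shader for the point-sprite path (option 1).
// base*/step* arrive from the vertex shader; gl_TexCoord[0] holds the
// 0..1 point-sprite coordinate when GL_COORD_REPLACE is enabled.
varying float baseX, stepX;
varying float baseY, stepY;

void main()
{
    float i = baseX + gl_TexCoord[0].s * stepX; // one MAD per fragment
    float j = baseY + gl_TexCoord[0].t * stepY; // one MAD per fragment
    gl_FragColor = vec4(i, j, 0.0, 1.0);        // stand-in for the real effect
}
```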

  2. Use quads and have GL interpolate automatically:

// Vertex shader: figure out which corner we're in (0, 1, 2, 3)
// and write the values to interpolate into gl_TexCoord[0].
if (corner == 0) {
  gl_TexCoord[0].s = ...
  gl_TexCoord[0].t = ...
} else if (corner == 1) {
  ...
}

// Fragment shader: read the interpolated values directly.
  i = gl_TexCoord[0].s;
  j = gl_TexCoord[0].t;

This approach keeps the fragment shader minimal but adds extra texture lookups in the vertex shader (even worse, the lookups yield the same value at each of the four corners).

What is your experience when it comes to texture-lookups versus multiplications?

Any thoughts appreciated.
-Rene Jensen

I don’t get it - where are you using vertex shader texture lookups?

In the setup of baseN/stepN. Anyway, I recently realized that you can only sample from a narrow range of texture formats in the vertex shader, and none of them are suitable.

Ok, in short, this is a particle thing, and I need to upload as little data as possible. The best scheme I have come up with is to use 32 bits per particle for the X/Y location plus the other parameters that determine its visual appearance.

My solution is to use VBOs and glTexCoordPointer. Two 16-bit words (the X and Y components) are then used in the setup of the particle like this:

float Q1 = gl_MultiTexCoord0.x;
float Q2 = gl_MultiTexCoord0.y;
// Now we have 32 bits of data. Extract the various bitfields from them and set up the sprite for the fragment program.
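The extraction step hinted at above might look like this. GLSL 1.x has no integer bitwise operators, so floor() stands in for a shift; the 8/8 field split and the field meanings are a made-up example, not the poster's actual layout:

```glsl
// Hypothetical unpacking of one 16-bit word Q1 into two 8-bit fields.
// GLSL 1.x lacks bitwise operators, so emulate shift/mask with floor().
float hi = floor(Q1 / 256.0);  // high byte, e.g. a range category
float lo = Q1 - hi * 256.0;    // low byte, e.g. a size or colour index
```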

But I still have to do some extra calculations in the fragment program:

float X = scaleX * gl_TexCoord[0].s + offsetX;
// ... rather than just using a texture coordinate prefabricated in the vertex shader:
float X = gl_TexCoord[0].s;

If I calculate the range of texture coordinates in the vertex shader, I have to use GL_QUADS instead of GL_POINTS and transfer four times as much data to graphics memory. That is probably even worse, as I feel bandwidth should be saved wherever possible.

So I'll take my chances and bet that the fragment shader can handle the small overhead of a multiply-add without becoming the real bottleneck.

Well, that depends on how big your particles are going to be. If the viewer gets close to them, you're going to pay the price.

No larger than 8x8.

I wish I could measure speed properly on Linux, sigh :frowning:

This topic was automatically closed 183 days after the last reply. New replies are no longer allowed.