Sand pixels on ATI

Below is a very simple code sample that draws 20K very narrow translucent triangles.

    glAlphaFunc(GL_GREATER, 1/255.0f);
    glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
    glEnable(GL_BLEND);                                // blending for the 0.5 opacity
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

    gluPerspective(45.0f, (GLfloat)width/(GLfloat)height, 0.1f, 200.0f);


    glTranslatef(0.0f, 0.0f,-199.0f);
    glRotatef(-3, 0.0f, 0.0f, 100.0f);

    float x1 = -80;
    float x2 =  80;
    float y1 = -75;
    float y2 = -75;
    float dx1, dx2, dy1, dy2;
    float d = 10000;

    glBegin(GL_TRIANGLES);
    for(int i = 0; i < int(d); i++)
    {
        dy1 = ((rand() & 0xFF) + 10) / d;
        dy2 = ((rand() & 0xFF) + 10) / d;
        dx1 = dy1/2;
        dx2 = dy2/2;
        glColor4d(1, 0, 0, 0.5);
        glVertex2f(x1,     y1);
        glVertex2f(x2,     y2);
        glVertex2f(x2-dx2, y2+dy2);
        glVertex2f(x2-dx2, y2+dy2);
        glVertex2f(x1+dx1, y1+dy1);
        glVertex2f(x1,     y1);
        y1 += dy1;
        y2 += dy2;
        x1 += dx1;
        x2 -= dx2;
    }
    glEnd();

The triangles are all CCW, and by design they can’t overlap. There are no T-junctions either. But the triangles are very narrow and have a slight slope. On my ATI FireGL Mobility I see a lot of sand pixels, with a changing pattern when rotating. On nVidia it’s all perfect, no matter how you rotate or resize it. I use a red color over a black background, with 0.5 opacity. The sand appears where pixels from different triangles overlap (are painted twice).

Does this mean ATI has stitching problems? It looks like it does. Can someone comment on it?

Do you have anti-aliasing enabled on the nVidia card? What about the polygon smooth hint?

No, all anti-aliasing is disabled.
Actually, when it’s enabled on ATI, there are many more sand pixels, but they look kind of “smoother” (which is to be expected).
On nVidia I couldn’t reproduce a single sand pixel with any settings.

I really don’t see what’s going on in that code… I’m seeing rand() being used to generate vertices, I’m seeing floats being used, I’m seeing precision errors all over the place.
Maybe I’m seeing things, but to my mind it’s a miracle you’re not seeing ‘sand’ on nVidia too.
If you generated the vertices in a vertex array then used an index array to stitch them all together, all would be well.
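For illustration, here is a minimal sketch of that suggestion (the function name and layout are mine, not from the original sample): the strip’s vertices are generated once into an array, and each triangle refers to them by index, so every shared vertex is stored exactly once and both triangles that touch it see bit-identical coordinates.

```c
#include <stdlib.h>

typedef struct { float x, y; } Vertex;

/* Builds 2*(n+1) vertices (left/right end of each rung) and 6*n indices
 * (two triangles per quad), mirroring the generation loop above.
 * Returns the number of indices written. */
int build_strip(int n, Vertex *verts, unsigned *idx)
{
    float x1 = -80, x2 = 80, y1 = -75, y2 = -75;
    float d = 10000;
    int ni = 0;

    verts[0] = (Vertex){x1, y1};   /* first left vertex  */
    verts[1] = (Vertex){x2, y2};   /* first right vertex */

    for (int i = 0; i < n; i++) {
        float dy1 = ((rand() & 0xFF) + 10) / d;
        float dy2 = ((rand() & 0xFF) + 10) / d;
        x1 += dy1 / 2;  y1 += dy1;
        x2 -= dy2 / 2;  y2 += dy2;
        verts[2*i + 2] = (Vertex){x1, y1};
        verts[2*i + 3] = (Vertex){x2, y2};

        /* quad (oldL=2i, oldR=2i+1, newR=2i+3, newL=2i+2) split in two */
        idx[ni++] = 2*i;     idx[ni++] = 2*i + 1; idx[ni++] = 2*i + 3;
        idx[ni++] = 2*i + 3; idx[ni++] = 2*i + 2; idx[ni++] = 2*i;
    }
    return ni;
}
```

The two arrays could then be fed to glVertexPointer/glDrawElements in a single call.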

The index array won’t help; I checked it.
The code should be clear, but I will explain it. Two small positive random values are generated. Then a convex quadrilateral is calculated and split into two triangles. The accumulated error doesn’t matter; what matters is that all the triangles are perfectly stitched (shared vertices have exactly the same coordinates). Well, unless “x+dx” and “x+=dx” give different results, but I don’t believe they do.

The perfect result on nVidia proves that there are no actual holes or overlaps; otherwise the sand pixels would be visible there too (I checked this as well, by adding overlaps).

The code just reproduces the problem in the simplest possible way. The actual problem comes from polygon triangulation. On ATI, some “seams” appear from time to time when rendering with translucency:

And finally, alas, ATI confirmed the problem:

It is possible that the subpixel precision on our older HW was lower than on Nvidia’s HW. I tested your sample on X1900 and it looks perfect.
So, apparently they consider the Mobility FireGL V3200 an “old” one.
I suppose the HW handles narrow triangles poorly, i.e., a narrow triangle may “flip”.

To finalize the question: alas, ATI can have not only overlapping pixels, but also sand holes (marked in red):

Which OS is this?

It’s regular WinXP. Do you think it could be the video driver?

It’s possible. If it had been Linux, I wouldn’t have been surprised. I’ve seen that problem there myself, but only with antialiasing. So it obviously can happen due to the driver, but I haven’t seen anything like that on Windows, so I don’t know what’s going on there.

Who at ATI did you talk to? The explanation sounds a bit weird. Unless you have T-junctions or small differences between the vertices, you shouldn’t get pixel leakage, regardless of subpixel precision. Of course, you may want to write it out to a vertex array (preferably indexed) to ensure you don’t have any LSB differences between instances of the same vertex. I wouldn’t necessarily rely on a+b giving the same result as a += b.

Thanks for your comments.

I don’t think it’s because of the driver (but it can be).

The answer from ATI is “politically correct” and “diplomatic” :slight_smile: that is, they never say straight out, “yes, there is a problem”. I’ll know the name on Monday, when I have access to my office mail box.
I agree the answer is not very relevant; it’s not about subpixel accuracy, it’s a question of edge stitching. Well, I suspect there’s something wrong with the so-called tie-breaking rules or something like that.

Theoretically, strictly adjacent triangles must stitch flawlessly. The problems appear when the triangles become too narrow:

Consider two following cases:

In the first one we have triangles ABC and ADE with a very small distance CD (say, one LSB difference in floats). Suppose it results in a one-pixel hole along “AC”. But what if we just add another triangle, ACD? Theoretically it should add this single pixel and nothing else. But you have to admit this task is very difficult, even for monsters like ATI.

The second example is a T-junction, and it’s more obvious. Initially we have only 3 triangles, and then we add a fourth, CEB, to paint over the T-junction. Theoretically the mesh is correct, but when CEB is very narrow, point E can easily move into the CBD triangle (during the transformations and/or when rounding off), so the triangle will flip.

I admit the operations “+” and “+=” may give different results, but I don’t believe they do in practice; besides, the defects appear in other cases too, even when using an indexed vertex mesh.

Originally posted by Maxim Shemanarev:
Theoretically the mesh is correct, but when CEB is very narrow, point E can easily move into the CBD triangle (during the transformations and/or when rounding off), so the triangle will flip.
Do you mean “the triangle will change its orientation (frontfacing<=>backfacing)”?
Is culling enabled in your code?

Originally posted by Maxim Shemanarev:
I admit the operations “+” and “+=” may give different results, but I don’t believe they do in practice; besides, the defects appear in other cases too, even when using an indexed vertex mesh.
Well, since the results of both operations are rounded to 32-bit precision in the same way (when glVertex2f is called), there should be no difference.
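A quick standalone check of that claim (my own snippet, not from the thread): both forms perform a single rounded single-precision addition, so the bits come out identical. The one caveat is x87-style extended precision, where a value kept in a register can differ from one stored to memory; on SSE2 targets both are plain 32-bit adds.

```c
/* Compare "sum = a + b" against "acc = a; acc += b" bit-for-bit
 * (equality of identical IEEE-754 results, not an epsilon test). */
int same_bits(float a, float b)
{
    float sum = a + b;   /* the value glVertex2f would get from "x+dx"  */
    float acc = a;
    acc += b;            /* the value it would get after "x += dx"      */
    return sum == acc;
}
```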

Do you mean “the triangle will change its orientation (frontfacing<=>backfacing)”?


Is culling enabled in your code?
No, but when we have a case close to a T-junction it doesn’t really matter. The difference is that a pixel can be painted 3 times when culling is disabled and only twice when it’s enabled. Both are bad. T-junctions are bad anyway, and worst of all, they may appear implicitly during tessellation (more precisely, not the T-junctions themselves, but cases that are very close to T-junctions).
But as practice shows, on ATI there can be artifacts even with a simple triangle-fan case.

Well, I looked into this, and it seems that it is related to subpixel precision after all. Theoretically it shouldn’t happen; to me it seems that it should be fully possible to ensure that the computations for identical edges are exactly the same, so that this wouldn’t happen regardless of precision, but I’m probably overlooking some issue involved.
Anyway, I could reproduce the problem on R300 and R420. I also tried this in D3D and it was the same, so it’s not a driver thing. Initially R520 was fine, but making the triangles even thinner made the problem occur there too. I also saw the problem on a GeForce 7800GTX. So apparently no card does this perfectly; it’s just a matter of how thin a triangle they can tolerate, where R300/R420 seem to be a fair bit more sensitive than current-generation cards. But even for R300/R420 you have to be pretty extreme for there to be a problem. You’re drawing triangles that each contribute only a couple of disconnected pixels.


The problem isn’t artificial, actually. As I said before, it comes from polygon tessellation algorithms that may produce very narrow triangles in certain cases. I use a variant of Seidel’s algorithm (monotonization), which is the cheapest one in practice. Constrained Delaunay triangulation typically produces a much nicer result, but it’s computationally very expensive (all the implementations I tried are 10-50 times slower).

The problem isn’t artificial, actually.
No, it’s not artificial, but it’s not exactly common either.

That’s not to say that ATi shouldn’t be properly flogged for not being able to handle these cases, and for not having fixed it after 3 hardware revisions. But you have to figure that this stuff has been around since the R300 (at least), which means it’s been a problem for 4+ years or so. It can’t be too serious if the first report of the problem took 4 years to appear.

Unfortunate? Certainly. Boneheaded? Probably not (assuming they got some performance boost or transistor-logic tradeoff for not doing it the right way).

What you may have to do is create an algorithm that finds these sliver triangles and removes them topologically from the mesh after you tessellate it. Not a fast algorithm, to be sure, but it certainly looks like you’ll need it to make sure your code works on ATi hardware.

On the plus side, you’ll be removing these sliver triangles from your mesh, thus (slightly) improving running speed.

Note that the problem exists on nVidia too. The X1900 and 7800GTX showed about the same sensitivity.

Originally posted by Korval:
On the plus side, you’ll be removing these sliver triangles from your mesh, thus (slightly) improving running speed.
Not just slightly. Drawing very thin triangles is very inefficient. One would think that a sample that just outputs a constant color and reads half a million triangles from system memory would be vertex or bus-transfer limited, but it’s not. I get 88fps when looking at the triangle strip, and 183fps when looking in another direction. Modern graphics cards render in pixel quads. If you only draw a pixel here or there, you’re essentially wasting the majority of the available fragment-shading power. And if you do antialiasing, multisampling essentially becomes supersampling with such thin triangles. That 88fps became 10fps with 6xAA.
I don’t know the hardware details, but I wouldn’t be surprised if the rasterizer can’t automagically find the next pixel to render within one clock when there are large disconnects; so while I already doubt the rasterizer can feed the fragment shader fast enough in the first place with this simple shader, I wouldn’t be surprised if it was even more idle than usual when the pixels within a primitive are disconnected.

Well, I agree it’d be nice to clean up the mesh, since it would really solve the problem and improve performance. But it’s expensive, because we’d have to create some kind of adjacency table and sometimes modify a lot of data (in order not to add T-junctions, for example). And the task requires tessellation on the fly (Flash SWF rendering). So, in most cases we just tessellate a shape once, render it, and forget about it forever.

This kind of ‘sand’ problem is common in software renderers. Most people don’t notice it there because they don’t feed the renderer the right test cases. You need to be really careful when doing the computations.

A well-known example is computing the intersection of a triangle edge with a scanline. If the triangle has points (a, b, c) and you want to compute the intersection of a scanline with edge ab, it matters whether you compute (a - b) or (b - a), which can give different results due to float precision. If the edge ab is shared with another triangle, you must do the subtraction the same way, (a - b) or (b - a), as you did for the first triangle, or you can get holes.
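A common remedy in software rasterizers (a sketch of the idea, not this poster’s code) is to canonicalize each edge’s endpoint order before intersecting; both triangles sharing the edge then evaluate the identical expression and round identically, so the edge cannot open up:

```c
/* x coordinate where edge (a, b) crosses scanline y.
 * Assumes a non-horizontal edge (ay != by).  The endpoints are first
 * put into a canonical order (smaller y, then smaller x, first), so
 * the result is bit-identical regardless of the order the caller
 * passes them in. */
float edge_x_at_scanline(float ax, float ay, float bx, float by, float y)
{
    if (ay > by || (ay == by && ax > bx)) {
        float t;
        t = ax; ax = bx; bx = t;   /* swap endpoints */
        t = ay; ay = by; by = t;
    }
    return ax + (bx - ax) * (y - ay) / (by - ay);
}
```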

In a software renderer with anti-aliasing based on pixel coverage, it’s not a problem. You see, if you had 256x or more AA, these artifacts wouldn’t be noticeable at all. I did some research and wrote a Flash compound-shape rasterizer with perfect edge stitching and anti-aliasing:
(Windows executable)
It’s equivalent to 65536x FSAA. Most importantly, it becomes insensitive even to T-junctions. But hardware has a very long way to go before it gets pixel-coverage AA.

But it’s expensive because we have to create some kind of adjacency table and modify a lot of data sometimes (in order not to add T-junctions, for example).
Have you considered using a winged-edge or quad-edge data structure? I did some tessellation work with quad-edge data structures (subdivision surfaces), and they were excellent for that task. They’d work pretty well for most triangular tessellation, I imagine. Finding sliver edges (where the distance between two verts is below some tolerance) is trivial, and removing those verts is equally trivial.

Now, the non-trivial part is converting from/to standard mesh structures. But that’s more time-consuming than difficult to implement.