Computing new gl_FragDepth value

Hi there!
(This question is about OpenGL ES, but that category seems to be for OpenGL ES-specific issues, while this one covers general GLSL questions.)
I want to draw non-polygonal objects, in this case simple spheres. Since there are no real polygons to draw for each individual object, my approach is to draw a box around the “camera” and change the colour and depth of each fragment based on a ray that I cast out and intersect with the sphere. I currently find the intersection point with a method similar to the one in Wikipedia’s article on line–sphere intersection (I would provide a link but I am unable to do so at the moment):

// Fragment shader. 
#version 320 es
precision mediump float;
// The dimensions of the sampler we use.
uniform uvec3 box_size;
// A sampler texture containing the sphere data.
// XYZ are the coordinates of each sphere and W is
// the radius.
uniform highp sampler3D points;
uniform mediump mat4x4 model;
uniform mediump mat4x4 view;
// This has to be highp since we use it in the vertex shader.
uniform highp mat4x4 projection;
// Scales the radius.
uniform float sphere_scale;

// The position of the box which we are processing.
in vec3 f_Pos;

out vec4 outColour;

void main() {
    vec3 line = normalize(f_Pos);
    // Whether the intersection was found or not.
    bool worked = false;
    // Loops over each of the spheres to try for an
    // intersection. I will most definitely improve
    // this as loops of this kind are slow.
    for(uint i = 0u; i < box_size.x; i++) {
        float p_x = float(i) / float(box_size.x);
        for(uint j = 0u; j < box_size.y; j++) {
            float p_y = float(j) / float(box_size.y);
            for(uint k = 0u; k < box_size.z; k++) {
                float p_z = float(k) / float(box_size.z);
                // Acquire sphere data.
                vec4 sphere = texture(points, vec3(p_x, p_y, p_z));
                // Transform the sphere coordinates into world space.
                vec3 center = (model * view * vec4(sphere.xyz, 1.0)).xyz;
                float radius = sphere.w * sphere_scale;
                // From here I go through a sphere-line intersection.
                // Note that this assumes `o` from the wiki article is
                // the origin since the eye remains at the origin and
                // we transform the rest of the world around it.
                vec3 lc = line * center;
                float ll = dot(line, line);

                float determinant = dot(lc * lc, vec3(1.0));
                determinant -= ll * (dot(center, center) - radius * radius);

                // If we got a hit...
                if(determinant >= 0.0) {
                    worked = true;
                    // Find the actual intersection.
                    float sqr = sqrt(determinant);
                    // Negative intersection will be closer.
                    float dist_minus = (dot(lc, vec3(1.0)) - sqr) / ll;
                    // We've found distance so we have a point now.
                    vec4 point = vec4(line * dist_minus, 1.0);
                    // This is the point of this thread.
                    // How does one go about calculating this?
                    gl_FragDepth = (projection * point).w;
                }
            }
        }
    }

    // If we found an intersection then set colour.
    if(worked) {
        outColour = vec4(1., 1., 0., 1.);
    }
    // It seems that not assigning anything doesn't do anything...?
    // Which is what we want; but somewhat unexpected.
}
In essence, I calculate a point of intersection (if there is one) and try to assign a new value to gl_FragDepth so that it will be processed as if it were at the right distance instead of the box’s position which is right up against the projection’s near plane.

If this is an XY problem please let me know; there may be a much easier way of drawing this.

I eventually want to do something like ray marching; hence my current attempt at a simple intersection.

The shader needs to execute a discard statement if the ray doesn’t intersect the sphere, i.e. if the discriminant (b^2-4ac) is negative. Simply not assigning to gl_FragDepth and/or outColor will result in those variables containing undefined values. In the case of gl_FragDepth, this means that you’re using an undefined value for the depth test, so whether or not it passes is undefined.

Unless you’re mixing ray-traced spheres with other geometry which is using gl_FragCoord.z as the depth value, the calculation of gl_FragDepth isn’t particularly important so long as it is monotonic (i.e. fragments farther from the viewpoint have greater depth values). You can use the eye-space -Z (note: negative) or the interpolation parameter (dist_minus).

If you do need gl_FragDepth to be consistent with gl_FragCoord.z, transform the eye-space position by the projection matrix, divide Z by W, then convert from [-1,1] to [0,1]:

vec4 tpoint =  projection * point;
gl_FragDepth = (tpoint.z/tpoint.w+1.0)*0.5;
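As a CPU-side sanity check (a sketch, not from the thread), the same clip → NDC → window mapping can be computed for the z component alone, assuming a standard perspective projection matrix and the default glDepthRange(0, 1); the name `window_depth` is made up for the example:

```rust
// Hypothetical CPU check of the depth mapping above: applies only the z and w
// rows of a standard perspective projection to an eye-space z, then converts
// the resulting NDC z from [-1, 1] to window depth in [0, 1].
fn window_depth(near: f32, far: f32, eye_z: f32) -> f32 {
    // z/w rows of the usual perspective matrix:
    let clip_z = -(far + near) / (far - near) * eye_z
        - 2.0 * far * near / (far - near);
    let clip_w = -eye_z;
    let ndc_z = clip_z / clip_w; // perspective divide, in [-1, 1]
    (ndc_z + 1.0) * 0.5          // same conversion as in the shader
}

fn main() {
    // A point on the near plane maps to depth 0, one on the far plane to 1.
    assert!(window_depth(0.1, 100.0, -0.1).abs() < 1e-4);
    assert!((window_depth(0.1, 100.0, -100.0) - 1.0).abs() < 1e-4);
}
```

This is also a quick way to see why `(projection * point).w` alone was wrong: the w component is just `-eye_z`, which is monotonic but not in the [0, 1] range the depth buffer expects.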

Perfect! That works very well.

A few extra questions though;

I’m getting artifacts such as this one; I presume that’s a mediump/highp issue?

I actually generate multiple spheres in my non-GLSL code, but I only seem to get one when rendering. I have to assume that’s because the sampler3D clamps the values. Do you know of any way to make it not clamp my values? This is my current setup for a (supposedly) non-clamped texture:

fn create(memory: &[T], size: [usize; 3]) -> Result<Self, String> {
    let mut id = 0;
    let internal_format = T::INTERNAL_FORMAT;
    unsafe {
        gl::GenTextures(1, &mut id);
        gl::BindTexture(gl::TEXTURE_3D, id);
        gl::TexParameteri(gl::TEXTURE_3D, gl::TEXTURE_MIN_FILTER, gl::LINEAR as i32);
        gl::TexParameteri(gl::TEXTURE_3D, gl::TEXTURE_MAG_FILTER, gl::LINEAR as i32);
        gl::TexParameteri(gl::TEXTURE_3D, gl::TEXTURE_WRAP_S, gl::MIRRORED_REPEAT as i32);
        gl::TexParameteri(gl::TEXTURE_3D, gl::TEXTURE_WRAP_T, gl::MIRRORED_REPEAT as i32);
        gl::TexParameteri(gl::TEXTURE_3D, gl::TEXTURE_WRAP_R, gl::MIRRORED_REPEAT as i32);
        gl::TexImage3D(
            gl::TEXTURE_3D,
            0,
            internal_format as i32,
            size[0] as i32,
            size[1] as i32,
            size[2] as i32,
            0,
            T::FORMAT,
            gl::FLOAT,
            memory.as_ptr() as *const _);
        Ok(Self::new(id, size[0] as _, size[1] as _, size[2] as _))
    }
}

Where T is four floats in an array and its associated constants are as follows:

impl TexData for [f32; 4] {
    const FORMAT: GLuint = gl::RGBA32F;
    const INTERNAL_FORMAT: GLuint = gl::RGBA32F;
    type Element = f32;
    const ELEMENTS: usize = 4;
}

(I apologize for the lack of syntax highlighting and the non-standard C/C++/Java; Rust isn’t supported as a syntax highlighter on some Discourse sites.)

Given that I can actually feel the phone getting warmer after about 3 minutes of running, I assume this is quite inefficient. Would it be better to run this as a post-processing step or in a compute shader, or just to optimize this specific shader a lot?

Maybe. Try it on desktop.

Reading from a sampler doesn’t perform any clamping. Normalised textures are limited to the range [0,1] (or [-1,1] for a signed normalised texture), but that isn’t relevant if you’re using a floating-point texture. Note that sized formats aren’t valid for the format parameter; if the texture data is four floats per pixel, format should be GL_RGBA and type should be GL_FLOAT.

For many small spheres, you’d probably be better off drawing their bounding cubes with the centre and radius as attributes (possibly instanced, if the version you’re targeting supports instancing). Then the fragment shader only needs to consider a single sphere, and won’t be invoked for fragments known not to contain a sphere. Better still if you can render in roughly front-to-back order.

If you’re planning on extending this to recursive ray-tracing, you’ll need to use some form of spatial index rather than iterating over every sphere for every fragment.
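To illustrate the idea (a sketch, not from the thread): a minimal uniform grid that bins sphere indices by the cell containing their centre, so a query inspects one cell instead of every sphere. The names (`Grid`, `near`) are made up for the example:

```rust
use std::collections::HashMap;

// Minimal uniform-grid spatial index: spheres are binned by the grid cell
// containing their centre; a query returns only the spheres in one cell.
struct Grid {
    cell: f32,
    map: HashMap<(i32, i32, i32), Vec<usize>>,
}

impl Grid {
    // Integer cell coordinates of the cell containing point `p`.
    fn key(&self, p: [f32; 3]) -> (i32, i32, i32) {
        (
            (p[0] / self.cell).floor() as i32,
            (p[1] / self.cell).floor() as i32,
            (p[2] / self.cell).floor() as i32,
        )
    }

    fn build(cell: f32, centres: &[[f32; 3]]) -> Self {
        let mut g = Grid { cell, map: HashMap::new() };
        for (i, &c) in centres.iter().enumerate() {
            let k = g.key(c);
            g.map.entry(k).or_default().push(i);
        }
        g
    }

    // Indices of spheres whose centre lies in the same cell as `p`.
    // (A real traversal would also walk the neighbouring cells along the ray.)
    fn near(&self, p: [f32; 3]) -> Vec<usize> {
        self.map.get(&self.key(p)).cloned().unwrap_or_default()
    }
}

fn main() {
    let centres = [[0.5, 0.5, 0.5], [10.0, 10.0, 10.0]];
    let grid = Grid::build(1.0, &centres);
    assert_eq!(grid.near([0.4, 0.4, 0.4]), vec![0]);
    assert_eq!(grid.near([10.2, 10.2, 10.2]), vec![1]);
}
```

A real ray marcher would step the ray cell-by-cell (e.g. a 3D DDA) rather than querying single points, but the binning above is the core of the index.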


I cannot, since my program isn’t structured that way (and it would be a large hassle to restructure), but my phone apparently supports highp, so changing it to that and adding some lighting results in this:

Which is great!

That makes much more sense now that I look at the specification again (I personally find that particular reference page difficult to navigate). Interestingly, changing that made the program a lot slower (about 50% slower by visual inspection).

I’ll implement instancing then and get back to you; thanks for all the help!

Oh no! I just realized I have a large error in my code; I meant to break out of multiple loops, not just the innermost one.

Are there equivalents to loop labels or the like, instead of goto or setting the loop variables to their end values? That is, unless there’s a more optimized way of doing so.

Out of curiosity; since I’m not going to use the loop after all and will just stick to instancing.

No. break and continue only apply to the innermost loop, and there isn’t a goto statement. The only way to abort multiple loops is with return or discard.

Bear in mind that an early exit isn’t guaranteed to save time. Older hardware doesn’t have jumps; break etc. continue to execute the loops, but with assignments and other side effects disabled. Newer hardware can perform jumps, but only once all invocations within a group (warp, wavefront) have requested termination.

Even without instancing, rendering only the bounding cubes is likely to be quicker. Instancing just avoids the need to duplicate per-cube data for each vertex. The same effect can be obtained by having a single integer attribute which is used to index into a uniform array (a texture or SSBO can be used if you need to exceed the maximum size of a uniform array).


Thank you for the exceptional help! I’ve done as suggested and used instancing, as it is available (I’m on OpenGL ES 3.2 because of geometry shaders; instancing was introduced in 3.0 according to the docs, although I may eventually go down a version or two).

Another issue has popped up; I realized that my sphere implementation is broken somewhere, such that the sphere appears “bounded” in a cube of sorts. To better explain this, here’s a video:

(I apologize for the bad quality; I don’t know of any other free video hosting sites)

I’m not completely sure as to what is causing this; the bounding cubes are all there correctly, and the code should reflect that:

// Fragment
#version 320 es
precision highp float;
uniform vec3 light;
uniform vec4 light_colour;
// This has to be highp since we use it in the vertex shader.
uniform highp mat4x4 projection;
uniform highp mat4x4 view;

in vec3 f_Pos;
in vec3 f_SpherePos;
in vec3 f_SphereSize;

out vec4 outColour;

float closest_intersection(vec3 abc, vec3 L, vec3 C) {
    // Scale the ray and ellipsoid into a sphere:
    L /= abc;
    C /= abc;

    // From here I go through a sphere-line intersection.
    // Note that this assumes `o` from the wiki article is
    // the origin since the eye remains at the origin and
    // we transform the rest of the world around it.
    vec3 lc = L * C;
    float ll = dot(L, L);

    float discriminant = dot(lc * lc, vec3(1.0));
    discriminant -= ll * (dot(C, C) - 1.0);

    // If we got a hit...
    if (discriminant >= 0.0) {
        // Find the actual intersection.
        float sqr = sqrt(discriminant);
        // Negative intersection will be closer.
        float dist_minus = (dot(lc, vec3(1.0)) - sqr) / ll;
        return dist_minus;
    } else {
        return 0.0;
    }
}

void main() {
    vec3 line = normalize(f_Pos);

    float distance = closest_intersection(vec3(1., 1., 1.), line, f_SpherePos);
    // If we got a hit...
    if(distance != 0.0) {
        vec3 point = vec3(line * distance);
        // Go through lighting process. The light colour will eventually be replaced
        // with some kind of reflection or just become semi-opaque. 
        vec4 color = vec4(0.3, 0.89, 0.87, 0.1);
        vec3 P = point;
        vec3 e_n = normalize(-P);
        vec3 n = normalize(point - f_SpherePos);

        vec3 L = (view * vec4(light, 1.0)).xyz;

        vec3 l_n = normalize(P - L);// -norm(L - P);

        vec3 l_r = reflect(l_n, n);

        float b_spec = clamp(dot(e_n, l_r), 0.0, 1.0);
        float b_diff = clamp(dot(n, l_r) + 0.2, 0.2, 1.0);
        outColour.a = 1.0;
        outColour.rgb = (b_spec * light_colour).rgb;
        outColour.rgb += b_diff * color.a * color.rgb;
        outColour.a = max(b_spec, b_diff * color.a);

        vec4 tpoint = projection * vec4(point, 1.0);
        gl_FragDepth = (tpoint.z / tpoint.w + 1.0) * 0.5;
    } else {
        //Uncomment to see bounding boxes.
//        outColour = vec4(1., 0., 0., 1.);
//        vec4 tpoint = projection * vec4(f_Pos, 1.0);
//        gl_FragDepth = (tpoint.z / tpoint.w + 1.0) * 0.5;

        outColour = vec4(0.);
        gl_FragDepth = 1.0;
    }
}
// Vertex
#version 320 es
uniform mat4x4 projection;
uniform mat4x4 model;
uniform mat4x4 view;

in vec4 v_Pos;
in vec3 iv_SpherePos;
in vec3 iv_SphereSize;

out vec3 f_Pos;
out vec3 f_SpherePos;
out vec3 f_SphereSize;

void main() {
    vec4 pos = v_Pos;
    // Scale this since I always pass in a cube with width 1.0, to not worry about size.
    pos.xyz *= iv_SphereSize;
    // Move the bounding box to where the sphere is.
    pos.xyz += iv_SpherePos;
    // Transform into view space.
    pos = view * model * pos;
    gl_Position = projection * pos;
    // f_Pos is used as the coordinate relative to the eye (Which is always at the origin).
    f_Pos = pos.xyz;
    // Transform the size w/o translations because the size should only rotate and scale.
    f_SphereSize = (view * model * vec4(iv_SphereSize, 0.0)).xyz;
    // Transform the pos with translations because it's a position.
    f_SpherePos = (view * model * vec4(iv_SpherePos, 1.0)).xyz;
}
I realize I’m being rather obnoxious just posting a large wall of code, and I want to apologize for doing so, but I’m rather clueless as to where this could possibly occur.
If needed I can add more code, but I have to assume I didn’t solve my intersection equation correctly, as I ended up with this:

d = (
    2 * (CxLx + CyLy + CzLz) ±
        √( ‖-2LC‖² - 4(L • L)(C • C - r²) )
    ) / (2 (L • L))

Which I managed to simplify by factoring out the 4 in the discriminant:

d = (
    CL • (1, 1, 1) ±
        √( (LC)² • (1, 1, 1) - (L • L)(C • C - r²) )
    ) / (L • L)

Where C is the center, L is the line direction normalized, and r is the radius. Since I scale by the reciprocal of the size of the sphere, r = 1.
Also, multiplication of points/vectors is shader-style component-wise multiplication (so CL = C * L).

FWIW, I get: (L·L)t²-2(L·C)t+(C·C)=r², meaning that the discriminant is 4(L·C)²-4(L·L)(C·C-r²)
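That quadratic can be checked numerically on the CPU; here is a sketch (not from the thread) using the scalar dot(L, C) form, with the ray origin at the eye. The name `nearest_hit` is made up for the example:

```rust
// Nearer root of (L·L)t² − 2(L·C)t + (C·C − r²) = 0, with the ray origin at
// the eye (0,0,0); returns None when the discriminant is negative (no hit).
fn nearest_hit(l: [f32; 3], c: [f32; 3], r: f32) -> Option<f32> {
    let dot = |a: [f32; 3], b: [f32; 3]| a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    let ll = dot(l, l);
    let lc = dot(l, c); // scalar L·C, not a component-wise product
    let disc = lc * lc - ll * (dot(c, c) - r * r);
    if disc < 0.0 {
        return None;
    }
    Some((lc - disc.sqrt()) / ll) // the “minus” root is the closer intersection
}

fn main() {
    // Ray along +z toward a unit sphere centred at (0, 0, 5): first hit at t = 4.
    assert_eq!(nearest_hit([0.0, 0.0, 1.0], [0.0, 0.0, 5.0], 1.0), Some(4.0));
    // Ray along +x misses that sphere entirely.
    assert_eq!(nearest_hit([1.0, 0.0, 0.0], [0.0, 0.0, 5.0], 1.0), None);
}
```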

The issue appears to be that you’ve replaced (L·C)² with ((L*C)*(L*C))·<1,1,1>, and they aren’t the same. Note that (a*b)·<1,1,1> is just a·b, so (L·C)² = ((L*C)·<1,1,1>)², which isn’t the same as ((L*C)*(L*C))·<1,1,1>.


float lc = dot(L,C);
float discriminant = lc*lc - dot(L,L) * (dot(C,C) - 1.0);
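The difference is easy to confirm numerically (a sketch, not from the thread; `hadamard` is a made-up name for the component-wise product):

```rust
fn dot(a: [f32; 3], b: [f32; 3]) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

// Component-wise product, like GLSL's `L * C` on vec3s.
fn hadamard(a: [f32; 3], b: [f32; 3]) -> [f32; 3] {
    [a[0] * b[0], a[1] * b[1], a[2] * b[2]]
}

fn main() {
    let (l, c) = ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]);
    let lc = hadamard(l, c); // [4, 10, 18]

    // Summing the components of L∘C IS the dot product L·C (both are 32)...
    assert_eq!(dot(lc, [1.0, 1.0, 1.0]), dot(l, c));
    // ...but summing the SQUARED components is not (L·C)²: 440 vs 1024.
    assert_ne!(dot(lc, lc), dot(l, c) * dot(l, c));
}
```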

Also: the way that you’re handling the size won’t work in general. For an axis-aligned box, you can represent the scaling transformation as a vector. But once you rotate it, it’s no longer axis-aligned. So you’d need to convert iv_SphereSize to a 3×3 matrix and transform that by the 3×3 model-view matrix. Then instead of dividing by a size vector you’d multiply by the inverse scaling transformation.
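A quick numeric illustration of that point (a sketch, not from the thread): composing a rotation with a diagonal (axis-aligned) scale yields a matrix with off-diagonal terms, so no per-axis divide can undo it:

```rust
// 3×3 matrix multiply for row-major [[f32; 3]; 3] matrices.
fn mul(a: [[f32; 3]; 3], b: [[f32; 3]; 3]) -> [[f32; 3]; 3] {
    let mut r = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            for k in 0..3 {
                r[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    r
}

fn main() {
    // Axis-aligned scale by (2, 1, 1): representable as a vector.
    let s = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]];
    // 90° rotation about the z axis.
    let rot = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]];

    let m = mul(rot, s);
    // The combined transform has an off-diagonal entry (m[1][0] = 2), so it
    // can no longer be expressed as a per-axis scale vector.
    assert_eq!(m[1][0], 2.0);
    assert_ne!(m, s);
}
```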

FWIW, that’s basically the approach I’m using here (fragment shader). I start with the NDC positions where the ray intersects the near and far planes, then transform both points into unit-sphere space via the inverse MVP matrix.

This topic was automatically closed 183 days after the last reply. New replies are no longer allowed.