help: copy default depthbuffer to texture; test in vertex shader

bootstrap · September 17, 2013, 5:32pm

My application needs to do the following:

1: Render the entire environment and objects to the default framebuffer and depthbuffer per usual practice.

2: Draw 1+ million point vertices in point-sprite mode.

But there is a problem, a strange requirement that makes this process difficult.

Each point-sprite will be 1x1 to 65x65 pixels in diameter depending on the RGBA of the vertex (the brighter the vertex, the larger the point-sprite). The vertex shader can inspect RGBA and set gl_PointSize accordingly, so that part is not a problem.
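To make the sizing rule concrete, here is a minimal CPU-side sketch in C++ of the mapping I have in mind for the vertex shader. The helper name `pointSizeFromColor`, the use of the brightest channel as "brightness", and the linear ramp are all assumptions for illustration; the real curve would probably be based on stellar magnitude.

```cpp
#include <algorithm>

// Hypothetical helper (not from any library): map a star's RGB, each in
// the range 0..1, to a point-sprite diameter in pixels.
// Assumption: a simple linear ramp from 1 pixel (black) to 65 pixels (full
// brightness). The shader would write the result to gl_PointSize.
float pointSizeFromColor(float r, float g, float b)
{
    const float kMinSize = 1.0f;
    const float kMaxSize = 65.0f;

    // Treat the brightest channel as the brightness of the star.
    float brightness = std::max(r, std::max(g, b));

    // Clamp so out-of-range colors cannot exceed the 1..65 pixel limits.
    brightness = std::min(std::max(brightness, 0.0f), 1.0f);

    return kMinSize + brightness * (kMaxSize - kMinSize);
}
```

With this ramp a black vertex gets a 1 pixel sprite, a half-bright vertex gets 33 pixels, and a full-white vertex gets the maximum 65 pixels.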

The problem is this.

The entire point-sprite must be drawn or not-drawn depending upon whether the value in the depth-buffer that corresponds to the provoking vertex (exact center of the point sprite) is 1.000 or not.

The purpose is this. When any depth-buffer pixel contains 1.000, this [presumably] means nothing has been rendered on that pixel, and therefore stars are not obscured by any object rendered during this frame.

Why do this? The answer is simple - to correctly represent real phenomenon. Except the sun, all stars are, for all practical purposes, point sources — their sizes are infinitesimally tiny and therefore for practical purposes zero. However, bright stars look much larger than faint stars because the bright light is scattered in the eyeball, in camera lenses, on the light sensor in the eyeball, on film, on CCD and CMOS image sensors, etc. The point is, the actual light source is zero in size, but the image blur or bloom can be quite large (even larger than 65x65 pixels sometimes).

So what? Let me explain with an example that makes the problem obvious. Imagine you are lying in a lawn chair in some remote location with a clear, dark sky full of stars. As the earth rotates and the stars slowly drift across the sky, you stare at a bright star slowly approaching your flag pole (or the side of your barn). As this bright star gets closer and closer to the obstacle, the star will remain at full brightness and the big honking 65 pixel image blur will begin to overlap the obstacle. Why? Because the source of all that light is the star, and the star is infinitesimally tiny. The blur happens in your eyeball and on the light sensor in your eye.

So the star will continue to slowly approach the obstacle until the exact center of the 65 pixel diameter blur intersects the obstacle, at which point half the 65 pixel blur overlaps the image of the obstacle. At the exact instant the star (and the center of the 65 pixel blur) reaches the edge of the obstacle, all starlight is instantly obscured and the entire 65 pixel blur instantly vanishes.

This is the reason the entire 1x1 to 65x65 pixel diameter point-sprite must be either fully displayed or fully discarded based upon whether the original point vertex falls on a previously rendered pixel, or a never-rendered pixel.

I have searched and searched for tricky ways to deal with this problem with conventional techniques and conventional characteristics of the [latest and greatest versions of the] OpenGL pipeline, and come up empty.

But I must represent this phenomenon correctly in my application. The only way I can find to represent this phenomenon correctly is through the following steps each frame:

#1: Render the entire scene to the default framebuffer/depthbuffer in the conventional manner.

#2: Copy the default depthbuffer to a texture (attached to an FBO if necessary).

#3: Enable new vertex and fragment shaders designed for rendering the stars.

#4: Draw the 1+ million vertices with point-sprites enabled.

#5: In the vertex shader, compute where the vertex will be drawn in the framebuffer and access the texture the depth-buffer was copied to by step #2 above. If the vertex shader finds the value in the texture at that location is less than 1.0000, then the vertex shader discards the vertex. *
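The test in step #5 can be sketched on the CPU as follows. This is only a C++ stand-in for what the vertex shader would do with a texture lookup, not working OpenGL code. `Vec4`, `DepthImage`, and `starIsVisible` are hypothetical names, and I am assuming the depth copy is stored bottom row first with 1.0 meaning "nothing rendered here".

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Vec4 { float x, y, z, w; };

// CPU stand-in for the copied depth texture (assumption: row 0 is the
// bottom row, values are in 0..1, and 1.0 means the pixel was never drawn).
struct DepthImage {
    int width;
    int height;
    std::vector<float> depth;
};

// Returns true when the pixel under the star's center was never rendered.
// 'clip' is the star position after the model-view-projection transform.
bool starIsVisible(const DepthImage& img, const Vec4& clip)
{
    // Behind the camera: the perspective divide is meaningless here.
    if (clip.w <= 0.0f) return false;

    // Perspective divide gives normalized device coordinates in -1..1.
    float ndcX = clip.x / clip.w;
    float ndcY = clip.y / clip.w;
    if (ndcX < -1.0f || ndcX > 1.0f || ndcY < -1.0f || ndcY > 1.0f) return false;

    // Map -1..1 to 0..1 texture coordinates, then to an integer pixel.
    float u = ndcX * 0.5f + 0.5f;
    float v = ndcY * 0.5f + 0.5f;
    int px = std::min(static_cast<int>(u * img.width), img.width - 1);
    int py = std::min(static_cast<int>(v * img.height), img.height - 1);

    // The whole sprite is kept only if the center pixel is still at 1.0.
    float d = img.depth[static_cast<std::size_t>(py) * img.width + px];
    return d >= 1.0f;
}
```

The point of the sketch is that only one depth sample is read per star, at the projected center, and the result decides the fate of the entire sprite regardless of its size.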

The vertex shader cannot actually discard vertices, so it would probably need to move the vertex behind the camera, or perform some other nefarious trick to force the vertex to be clipped off during the very next step of the fixed-function pipeline. This should, I assume, prevent the point sprite from being created, especially if I also set gl_PointSize to 0.0?
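Here is a small C++ sketch of the "fake discard" idea, again as a CPU stand-in rather than shader code. It relies on the OpenGL rule that a clip-space position is inside the view volume only when x, y, and z all lie between -w and +w. The names `ClipPos`, `insideClipVolume`, `rejectedPosition`, and `emitStar` are hypothetical, and the sketch pushes the vertex outside the volume on every axis instead of specifically behind the camera, which is my assumption about what would be most robust.

```cpp
struct ClipPos { float x, y, z, w; };

// OpenGL clip volume test: -w <= x, y, z <= w.
bool insideClipVolume(const ClipPos& p)
{
    return -p.w <= p.x && p.x <= p.w &&
           -p.w <= p.y && p.y <= p.w &&
           -p.w <= p.z && p.z <= p.w;
}

// A position that fails the test on all three axes (2 > w == 1), so the
// point should be rejected by clipping before any sprite is rasterized.
ClipPos rejectedPosition()
{
    return ClipPos{2.0f, 2.0f, 2.0f, 1.0f};
}

// What the vertex shader would write to gl_Position: the real projected
// position for a visible star, the rejected position otherwise.
ClipPos emitStar(const ClipPos& projected, bool visible)
{
    return visible ? projected : rejectedPosition();
}
```

Whether every driver really drops a large point sprite whose center is clipped this way is exactly the kind of thing I am unsure about.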

Okay, now that you have about 1000 times more background than you wanted to hear, the following are my questions.

#1: How does my app copy the default depthbuffer to a texture (regular texture or attached to an FBO)?

#2: What are the advantages of copying to a regular texture versus a texture attached to an FBO? Offhand it seems wasteful to copy to an FBO, because the FBO probably requires the code to create and attach a huge chunk of memory to the color-buffer attachment when in fact that has no purpose.

#3: What should the format of the texture be that the default depthbuffer gets copied to? Since the application will be drawing millions of vertices each frame, the efficiency of each test matters. Therefore, if a huge cache advantage exists by making the format of the texture 1-bit per pixel or 8-bits per pixel instead of the 24 or 32 bits per pixel the depthbuffer contains, is there some way to make a copy to a smaller format still support the required test (detect “pixel rendered upon versus not”)?
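To illustrate what I mean by a smaller format, here is a CPU-side C++ sketch that reduces a depth image to an 8-bit "rendered versus not" mask. It only shows the data I would like to end up with; I do not know whether OpenGL can produce it efficiently on the GPU, which is the question. `coverageMaskFromDepth` is a hypothetical name, and the choice of 255 for "rendered" and 0 for "empty" is arbitrary.

```cpp
#include <cstdint>
#include <vector>

// Reduce 32-bit depth values to one byte per pixel:
//   0   = depth is still 1.0, nothing was rendered on this pixel
//   255 = something was rendered, so a star centered here is hidden
std::vector<std::uint8_t> coverageMaskFromDepth(const std::vector<float>& depth)
{
    std::vector<std::uint8_t> mask;
    mask.reserve(depth.size());
    for (float d : depth) {
        mask.push_back(static_cast<std::uint8_t>(d >= 1.0f ? 0 : 255));
    }
    return mask;
}
```

The mask is a quarter the size of a 32-bit depth copy, which is where I would hope to see the cache advantage.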

#4: What function is the most efficient to copy the default depth-buffer to a regular texture, or a texture attached to an FBO? Also, does the target/destination of the copy need to be the depth component of an FBO, or can it be the color component (given an appropriate format is specified).

#5: What is the appropriate format to specify for the target to make sure the process works correctly, and to optimize speed of rendering the vertices (point-sprites)?

PS: It did occur to me that I could try to figure out a way for every fragment shader to determine whether the pixel being processed would ultimately cause a write to the framebuffer and depthbuffer. This seems a bit complex to figure out, but also has a problem. It means my app could not let anyone render with (previously tested and working) fragment shaders, because they do not contain the code to perform this process. As a result, this approach doesn’t sound so brilliant.