framebuffer_object combined with vertex texture fetch

Hi!

I want to set up a framebuffer_object with a depth_component texture and render data into it. Afterwards I want to use that texture as a vertex texture when drawing, so that no time is lost copying.

So far so good but here it says that the only hardware accelerated texture formats for vertex textures are GL_RGBA_FLOAT32_ARB, GL_RGB_FLOAT32_ARB, GL_ALPHA_FLOAT32_ARB, GL_LUMINANCE32_ARB, GL_INTENSITY32_ARB, GL_FLOAT_RGBA32_NV, GL_FLOAT_RGB32_NV, GL_FLOAT_RG32_NV, or GL_FLOAT_R32_NV.

And here it is said that “DEPTH_COMPONENT textures are treated as ALPHA, LUMINANCE, or INTENSITY”. That is the specification for fragment shaders, though, and I have no idea which format would be right otherwise.

So the only possible formats seem to be GL_ALPHA_FLOAT32_ARB, GL_LUMINANCE32_ARB and GL_INTENSITY32_ARB.

But these formats don’t even exist!

Does anyone have an idea how to solve this?

Depth buffers are integer formats. You can’t use those as hardware accelerated vertex textures directly. There needs to be at least one conversion in the path.

How do you want to use the depth data as vertex texture? (Thinking about “render to vertex arrays”.)

I think my framebuffer_object is a floating-point fbo.

Currently I do use “render to vertex arrays”. But to be able to use the data I have to copy the pixels (with glReadPixels) out of the renderbuffer (which is connected to the fbo) into the vertex buffer.

My aim is to avoid the copying.

I don’t quite get it. You have an FBO with color AND depth?
The color buffer can be float, the depth component will be 24 bits in an unsigned integer. You cannot have a floating point depth buffer render target.

Nobody keeps you from rendering user defined float data into a float buffer though.
If you only wanted one component, you could render into a GL_R32F type, which is possible on NVIDIA according to this document:
http://download.nvidia.com/developer/presentations/2005/SIGGRAPH/fbo-status-at-siggraph-2005.pdf

Bind that as HW accelerated R32F vertex texture and you could source it per vertex with a texture coordinate.
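
To make the per-vertex lookup concrete, here is a minimal sketch of how the texture coordinates could be generated so that every vertex samples the centre of exactly one texel. The function names are hypothetical, and it assumes one vertex per texel, laid out row by row.

```c
#include <assert.h>

/* Normalized coordinate of the centre of texel `index` along an axis of
   `size` texels, as used with a GL_TEXTURE_2D target. */
float texel_center_normalized(int index, int size)
{
    assert(size > 0 && index >= 0 && index < size);
    return ((float)index + 0.5f) / (float)size;
}

/* Unnormalized coordinate of the same texel centre, as used with a
   rectangle texture target, which addresses texels in [0, size). */
float texel_center_rect(int index, int size)
{
    assert(size > 0 && index >= 0 && index < size);
    return (float)index + 0.5f;
}

/* Fill `out` with width*height (s, t) pairs, one per texel, row by row.
   `out` must have room for 2 * width * height floats. */
void fill_lookup_coords(float *out, int width, int height)
{
    assert(out != 0 && width > 0 && height > 0);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            *out++ = texel_center_normalized(x, width);
            *out++ = texel_center_normalized(y, height);
        }
    }
}
```

The resulting array would be uploaded once as a static texture coordinate attribute; only the texture contents change from frame to frame.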

The point is, this is probably slower than copying the rendered data to a vertex buffer object (VBO): call glReadPixels with a pixel buffer object (PBO) bound, so that the PBO’s memory block becomes your VBO attribute data. If done right the data should stay in video memory. All theory here. :wink:

You mean this isn’t possible?
glTexImage2D(texture_target, 0, GL_DEPTH_COMPONENT24, texWidth, texHeight, 0, GL_DEPTH_COMPONENT, GL_FLOAT, NULL);

I am using GL_FLOAT as a type and my rendering works fine… under my FBO code

This is probably possible, but it will not give you floating-point precision in the stored texture. GL_FLOAT in this call only defines the type of the data you are passing in; the internal format is still a 24 bit integer.
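
A small CPU-side model of that precision loss, as a sketch only: the helper names are hypothetical, and round-to-nearest is an assumption about how a driver would convert the float input to a 24 bit code.

```c
#include <assert.h>
#include <stdint.h>

#define DEPTH24_MAX 16777215u /* 2^24 - 1, the largest 24 bit code */

/* Nearest 24 bit fixed-point code for a depth value in [0, 1].
   Round-to-nearest is an assumption, not taken from any driver. */
uint32_t depth_to_code24(float depth)
{
    assert(depth >= 0.0f && depth <= 1.0f);
    return (uint32_t)((double)depth * DEPTH24_MAX + 0.5);
}

/* Depth value that a 24 bit code represents. */
float code24_to_depth(uint32_t code)
{
    assert(code <= DEPTH24_MAX);
    return (float)((double)code / DEPTH24_MAX);
}
```

Two distinct floats such as 1e-9f and 2e-9f both map to code 0 here, so they come back as the same depth even though they were passed in as GL_FLOAT.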

This topic is about render to texture on an FBO.
You’re talking about texture download.
What your call means exactly is

glTexImage2D( // Download or create a 2D texture
  texture_target,  // To this texture target 
  0, // Root level of detail
  GL_DEPTH_COMPONENT24, // Internal format will be 24 bit unsigned integer
  texWidth, // Width and
  texHeight, // height of the internal data. Input data must fit taking pixelstore settings into account.
  0, // no border texels
  GL_DEPTH_COMPONENT, // User input data is one component which means depth.
  GL_FLOAT, // User input format is 32 bit IEEE floating point.
  NULL); // No user data given, just create the texture object.

Of course this works, but for fast up- and downloads of depth data you should use GL_UNSIGNED_INT (with the data in the most significant bits).

You mean, if I replaced GL_FLOAT with GL_UNSIGNED_INT, hardware accelerated vertex texture fetches should be supported?

In theory glReadPixels from FBO to VBO should be faster because everything is copied from GPU to GPU, not to RAM. That’s my opinion, too. But I’m not sure if that is the case in practice…

No, you both misunderstood each of your separate issues.

HW accelerated vertex texture fetches on NVIDIA boards strictly require R32F or RGBA32F internal formats. Period.
Read the texture format tables in this document:
http://developer.nvidia.com/object/gpu_programming_guide.html

Again, there is no floating point depth buffer inside the video memory on today’s consumer hardware. => You cannot have an FBO with depth attachment which is float type.

Reading and writing a depth buffer is fastest with formats where no conversions need to be applied.
GL_UNSIGNED_INT will just return the 24 bits in the most significant bits.
Using that data as vertex attribute will not be the fastest path. The vertex pipelines deal best with float data. So there is your dilemma.

You would need to readback your depth data as floats.
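
For illustration, this is roughly the conversion that has to happen somewhere in that path, written as a CPU sketch. The function name is hypothetical; the layout (24 significant bits in the most significant positions of a 32 bit unsigned integer) is the one described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert `count` depth values, each stored as a 32 bit unsigned integer
   with the 24 significant bits in the most significant positions, to
   floats in [0, 1]. */
void depth_msb24_to_float(const uint32_t *src, float *dst, size_t count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; ++i) {
        uint32_t code = src[i] >> 8;         /* drop the 8 low bits */
        dst[i] = (float)code / 16777215.0f;  /* divide by 2^24 - 1 */
    }
}
```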

But this is not a problem if you could render your data to a float color buffer (not a depth buffer!) which contains the data you want to readback. Float to float needs no conversion so that should be fast.

Sorry, I can’t explain it more simply without more details about what you actually render and how you’re using that data.

Okay I see…

So my last chance is to render to a float color buffer texture (A32B32G32R32F) and use a fragment shader to only write the depth. Then using that as accelerated vertex texture.

Right? :slight_smile:

You still haven’t described what you’re planning exactly.
Writing depth into a RGBA32F buffer seems like a waste. That document in the link I gave says 1-component FBO rendering to R32F is supported on NVIDIA hardware.
If you really need depth information from a complex rendering which does depth testing, yes, you can write the current fragment depth into the red channel of the color buffer WHILE you have a depth buffer active during rendering. That way you would get the visible depth information in float format. Remember that you need to clear that color buffer with the glClearDepth value which is 1.0 normally.
(Ok, good idea. Why didn’t I write this the first time? :smiley: )
That texture would be directly usable as vertex texture.
It remains to be tested which of the four variants is the fastest:
1.) Render to depth, copy-convert to float, use as VBO.
2.) Render to depth, copy-convert to float, use as vertex texture.
3.) Render depth data to float color buffer, copy float, use as VBO.
4.) Render depth data to float color buffer, use as vertex texture.

4 is what you wanted. Depending on the size of the data 1 or 3 could be faster since reading the vertex texture needs a vertex attribute anyway and the lookup is not for free.
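
Variant 4 could be sketched with two GLSL 1.10 shaders, kept here as C strings the way a C program would hand them to the shader compiler. This is only an outline under assumptions: a GL_TEXTURE_2D float texture, the uniform name depthTex, texture coordinate set 0 carrying the lookup position, and the fetched depth simply replacing the vertex z. All of these are hypothetical choices for illustration.

```c
#include <assert.h>

/* Pass 1 fragment shader: store the fragment's window-space depth in the
   red channel of the float colour attachment. The colour buffer should be
   cleared to the depth clear value (normally 1.0) beforehand. */
const char *const depth_to_red_fs =
    "void main()\n"
    "{\n"
    "    gl_FragColor = vec4(gl_FragCoord.z, 0.0, 0.0, 1.0);\n"
    "}\n";

/* Pass 2 vertex shader: fetch that depth once per vertex.
   depthTex and the use of gl_MultiTexCoord0 are hypothetical. */
const char *const depth_fetch_vs =
    "uniform sampler2D depthTex;\n"
    "void main()\n"
    "{\n"
    "    float d = texture2DLod(depthTex, gl_MultiTexCoord0.xy, 0.0).r;\n"
    "    vec4 pos = vec4(gl_Vertex.xy, d, 1.0);\n"
    "    gl_Position = gl_ModelViewProjectionMatrix * pos;\n"
    "}\n";

/* Shader source for rendering pass 1 or 2 of variant 4. */
const char *variant4_shader(int pass)
{
    assert(pass == 1 || pass == 2);
    return pass == 1 ? depth_to_red_fs : depth_fetch_vs;
}
```

The depth test still runs against the regular depth attachment in pass 1, so the red channel ends up holding the visible depth only.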

What I’m planning is: Render depth data into a texture while modifying it with vertex/pixel shaders and then use this modified data for a second rendering pass. As said this already works with glReadPixels but I would like to test how fast it could get using VTF.

I agree with you in terms of these 4 variants (btw: thank you for your encouragement so far!)

In the meantime I tried number 2 and 4 (number 1 already works) but I don’t get my FBO running with such a texture attachment, which was my initial concern.
I’ve tried these two attachments:

glTexImage2D(GL_TEXTURE_2D, 0, GL_FLOAT_R32_NV, WIDTH, HEIGHT, 0, GL_R, GL_FLOAT, NULL);

glTexImage2D(GL_TEXTURE_2D, 0, GL_FLOAT_RGB32_NV, WIDTH, HEIGHT, 0, GL_RGB, GL_FLOAT, NULL);

Attaching it with

glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, GL_TEXTURE_2D, texture_id, 0);

The result is a GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT_EXT.
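
When chasing a status like this, a small helper that turns the status value into a readable name can help. The sketch below repeats the enum values from the EXT_framebuffer_object specification so that it compiles without OpenGL headers; the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Status values from the EXT_framebuffer_object specification. */
#ifndef GL_FRAMEBUFFER_COMPLETE_EXT
#define GL_FRAMEBUFFER_COMPLETE_EXT                      0x8CD5
#define GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT_EXT         0x8CD6
#define GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT_EXT 0x8CD7
#define GL_FRAMEBUFFER_INCOMPLETE_DIMENSIONS_EXT         0x8CD9
#define GL_FRAMEBUFFER_INCOMPLETE_FORMATS_EXT            0x8CDA
#define GL_FRAMEBUFFER_INCOMPLETE_DRAW_BUFFER_EXT        0x8CDB
#define GL_FRAMEBUFFER_INCOMPLETE_READ_BUFFER_EXT        0x8CDC
#define GL_FRAMEBUFFER_UNSUPPORTED_EXT                   0x8CDD
#endif

/* Readable name for a framebuffer status value. */
const char *fbo_status_name(unsigned int status)
{
    switch (status) {
    case GL_FRAMEBUFFER_COMPLETE_EXT:
        return "GL_FRAMEBUFFER_COMPLETE_EXT";
    case GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT_EXT:
        return "GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT_EXT";
    case GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT_EXT:
        return "GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT_EXT";
    case GL_FRAMEBUFFER_INCOMPLETE_DIMENSIONS_EXT:
        return "GL_FRAMEBUFFER_INCOMPLETE_DIMENSIONS_EXT";
    case GL_FRAMEBUFFER_INCOMPLETE_FORMATS_EXT:
        return "GL_FRAMEBUFFER_INCOMPLETE_FORMATS_EXT";
    case GL_FRAMEBUFFER_INCOMPLETE_DRAW_BUFFER_EXT:
        return "GL_FRAMEBUFFER_INCOMPLETE_DRAW_BUFFER_EXT";
    case GL_FRAMEBUFFER_INCOMPLETE_READ_BUFFER_EXT:
        return "GL_FRAMEBUFFER_INCOMPLETE_READ_BUFFER_EXT";
    case GL_FRAMEBUFFER_UNSUPPORTED_EXT:
        return "GL_FRAMEBUFFER_UNSUPPORTED_EXT";
    default:
        return "unknown framebuffer status";
    }
}

/* Report a non-complete status and stop in debug builds. */
void require_complete_fbo(unsigned int status)
{
    if (status != GL_FRAMEBUFFER_COMPLETE_EXT)
        fprintf(stderr, "FBO not usable: %s\n", fbo_status_name(status));
    assert(status == GL_FRAMEBUFFER_COMPLETE_EXT);
}
```

In a real program the argument would be the result of glCheckFramebufferStatusEXT(GL_FRAMEBUFFER_EXT), checked right after the attachment call.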

1.) You don’t want RGB32F, but RGBA32F, because RGB32F is not an accelerated vertex texture format.
2.) GL_R is not GL_RED! GL_R is used for TexGen, TexImage needs GL_RED.
Adding glGetError calls while debugging would have caught the invalid enum.
3.) Look up the extension which defines the GL_FLOAT_*_NV types: the NV_float_buffer extension, http://oss.sgi.com/projects/ogl-sample/registry/NV/float_buffer.txt
and that says:
“floating-point texture maps must be 2D, and must use the NV_texture_rectangle extension.”
4.) Texture target GL_TEXTURE_2D needs GL_RGBA32F_ARB from ARB_texture_float
http://oss.sgi.com/projects/ogl-sample/registry/ARB/texture_float.txt

That means: try your first line with the GL_TEXTURE_RECTANGLE_ARB target, GL_FLOAT_R32_NV internalFormat and GL_RED format,
or your second line with GL_RGBA32F_ARB (== GL_RGBA_FLOAT32_ATI from http://oss.sgi.com/projects/ogl-sample/registry/ATI/texture_float.txt) and GL_RGBA format.
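
To illustrate point 2: GL_R and GL_RED are different enum values, so a mix-up ends in GL_INVALID_ENUM. The sketch below repeats the values from the standard gl.h header so that it compiles without OpenGL headers; the helper names are hypothetical.

```c
#include <assert.h>

/* Values as defined in the standard gl.h header. */
#ifndef GL_INVALID_ENUM
#define GL_NO_ERROR          0
#define GL_INVALID_ENUM      0x0500
#define GL_INVALID_VALUE     0x0501
#define GL_INVALID_OPERATION 0x0502
#define GL_STACK_OVERFLOW    0x0503
#define GL_STACK_UNDERFLOW   0x0504
#define GL_OUT_OF_MEMORY     0x0505
#define GL_RED               0x1903 /* pixel format: red channel */
#define GL_R                 0x2002 /* TexGen coordinate name */
#endif

/* Readable name for a glGetError result. */
const char *gl_error_name(unsigned int err)
{
    switch (err) {
    case GL_NO_ERROR:          return "GL_NO_ERROR";
    case GL_INVALID_ENUM:      return "GL_INVALID_ENUM";
    case GL_INVALID_VALUE:     return "GL_INVALID_VALUE";
    case GL_INVALID_OPERATION: return "GL_INVALID_OPERATION";
    case GL_STACK_OVERFLOW:    return "GL_STACK_OVERFLOW";
    case GL_STACK_UNDERFLOW:   return "GL_STACK_UNDERFLOW";
    case GL_OUT_OF_MEMORY:     return "GL_OUT_OF_MEMORY";
    default:                   return "unknown GL error";
    }
}

/* Debug guard for the `format` argument of a glTexImage2D call. */
void check_teximage_format(unsigned int format)
{
    assert(format != GL_R && "GL_R is a TexGen coordinate name; use GL_RED");
}
```

In a real program, gl_error_name(glGetError()) would be printed right after the suspect glTexImage2D call.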

(Yes, I’m one of those guys reading manuals. :wink: )

Okay. Now it works. :wink:

I’ve done number 4 now. But RGB32F IS an accelerated vertex texture format, as written in the document I posted on top and as experienced now:

glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB32F_ARB, SHADOW_MAP_WIDTH, SHADOW_MAP_HEIGHT, 0, GL_RGB, GL_FLOAT, NULL);

But I’ll try number 2 as well.

What I don’t quite understand is that I see a speedup of up to a factor of 2! I have got an AGP card (6800GT).
Apparently glReadPixels from FBO to VBO goes through system memory (?).
With VTF enabled, the framerate never drops below the one without VTF…

Thanks for your great help, Relic. I must admit, I didn’t read all the manuals, but I have also found that each manual teaches something different. And some documents (e.g. the one posted on top) actually list formats that don’t even exist?

Originally posted by Schleichmichl:
“Okay. Now it works. :wink: I’ve done number 4 now. But RGB32F IS an accelerated vertex texture format, as written in the document I posted on top and as experienced now:”

No, not really. But it’s stored internally as RGBA32F according to this document
http://download.nvidia.com/developer/OpenGL_Texture_Formats/nv_ogl_texture_formats.pdf

Accelerated vertex texture formats on GF6 and GF7 are only R32F and RGBA32F, see table 4.6 last column:
http://developer.nvidia.com/object/gpu_programming_guide.html

If you don’t read manuals, this one is a must if you program for NVIDIA GPUs!

I think the “FLOAT_RG32F” is wrong in the OGL 2.0 docs list you cited.

“What I don’t quite understand is that I see a speedup of up to a factor of 2! I have got an AGP card (6800GT).
Apparently glReadPixels from FBO to VBO goes through system memory (?).”

Did you define a pixelbuffer object (PBO), which becomes your VBO later, as I said?
Hmm, AGP, dunno. On PCI-Express hardware readback should be faster.

“Thanks for your great help, Relic.”
My pleasure. :slight_smile:
“I love it when a plan comes together.” - Hannibal Smith, A-Team.

Okay I see… Going to read more manuals in the future. :slight_smile:

I bound my VBO with a PBO before copying, yes.
But isn’t it the case that on PCI-Express, readback should only be faster when copying memory to GPU or when using CPU? :rolleyes:

Readback would be from video to video memory if all went well.
But AGP or PCI-Express memory can be used to store textures and geometry in hardware addressable ways. Copying to there would be slower than video to video, and PCI-Express got better there.

PBOs are tricky at times. Here’s some more stuff to read:
http://download.nvidia.com/developer/Papers/2005/Fast_Texture_Transfers/Fast_Texture_Transfers.pdf
and this
http://developer.nvidia.com/attach/6427

Test this:
http://download.developer.nvidia.com/dev…s/UserGuide.pdf
App is here:
http://download.developer.nvidia.com/developer/SDK/Individual_Samples/featured_samples.html

Ok, you should be able to use the search function yourself by now. :wink:

Ok. :wink: Thanks!