ARB_IMAGING subset

I would like to apply a 2D filter to a dynamically generated texture image.
I use glCopyTexSubImage2D to copy the frame buffer to a texture. Reading the spec, I got the impression that a convolution filter can be applied during the pixel transfer.
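For concreteness, here is a minimal sketch of what I have in mind, assuming a GL 1.2.1 implementation that exposes the imaging subset (the kernel, texture name, and sizes are just placeholders):

    /* Define a 3x3 filter; it is applied during subsequent pixel
       transfers such as glCopyTexSubImage2D. */
    GLfloat kernel[3][3] = {
        { 1/9.0f, 1/9.0f, 1/9.0f },   /* placeholder box blur */
        { 1/9.0f, 1/9.0f, 1/9.0f },
        { 1/9.0f, 1/9.0f, 1/9.0f },
    };
    glConvolutionFilter2D(GL_CONVOLUTION_2D, GL_RGB, 3, 3,
                          GL_RGB, GL_FLOAT, kernel);
    /* The default border mode, GL_REDUCE, shrinks the output by
       (kernel size - 1); a constant border keeps the sizes matched. */
    glConvolutionParameteri(GL_CONVOLUTION_2D, GL_CONVOLUTION_BORDER_MODE,
                            GL_CONSTANT_BORDER);
    glEnable(GL_CONVOLUTION_2D);

    /* The copy is a pixel transfer, so the filter should apply here. */
    glBindTexture(GL_TEXTURE_2D, tex);   /* 'tex' created earlier */
    glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0, 100, 100);

    glDisable(GL_CONVOLUTION_2D);        /* avoid filtering other transfers */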

Has anyone done this before? How big is the performance overhead?

Thanks a lot.

Ruigang

The overhead is likely to be very large.

  • Matt

Quite discouraged to hear this.

Is this a driver problem, or is it due to the hardware architecture?
SGI and NVIDIA both support the ARB_IMAGING subset. Do you think they both have substantial performance penalties for enabling the convolution filter?

Only very-high-end hardware supports ARB_imaging in hardware. I don’t believe anyone in the PC space supports it. It would be much too expensive to build in…

  • Matt

I’m not a hardware guru or driver developer… but could the driver use the GPU’s multipliers for convolution filters, and would this (considering the overhead of sending the multiplications to the hardware and reading back the result) be faster than letting the CPU do it?

Or maybe the driver could simulate convolutions using multiple passes through the texture environment combiners or register combiners…?
Just an idea.

It is generally assumed that ARB_imaging computations are done in full floating-point precision. They certainly need to have floating-point range.

You can implement your own 3x3 color matrix using register combiners on GeForce and up. However, matrix elements can only be in the [-1,1] range, and precision is limited.

My main suggestion would be that you could perform computations on pixels in floating-point using GL_POINTS and vertex programs. One example of this is the Mandelbrot set vertex program that we’ve illustrated. I’m not sure if that one is available in demo form.

  • Matt

I just wrote a very simple test program.
The performance overhead is indeed huge.
With convolution enabled, the frame rate drops from 60 fps to 30 fps or less (I am using an NVIDIA GeForce2 GTS), and the source image is really small (100x100).

A question for NVIDIA gurus: is there any hardware support for the convolution functions?

Another question: I don’t seem to get the right effect. I enabled GL_CONVOLUTION_2D and supplied a 2D filter; is there anything else I should do?

thanks.

Hey, the BOM for adding an M56k is less than $20. It can convolve like nobody’s business with a little footwork, especially if you do it in the frequency domain, where convolution becomes pointwise multiplication (although that’s substantially harder in 2D than 1D).

I believe the problem is lack of market demand rather than some inherent hardware cost.

In a market where pennies count, have fun selling people on a $20 chip.

  • Matt

Where can I find the spec for ARB_imaging? It’s not in the www.opengl.org extension registry or in the nvOpenGLspecs PDF.

Originally posted by zed:
Where can I find the spec for ARB_imaging? It’s not in the www.opengl.org extension registry or in the nvOpenGLspecs PDF.

It is mentioned in the latest version of nvOpenGLspecs.pdf, but all it says is:

NOTE: This extension does not have its own specification document, since
it has been included in the OpenGL 1.2.1 Specification (downloadable
from www.opengl.org). Please refer to the 1.2.1 Specification for
more information.

Hopefully that answers your question.

Hmm, I’m not an NVIDIA guru, but yes, you can take the approach of using the GeForce GPU to accelerate your operations on textures.

If this particular operation can be done that way, it might make a good entry for the competition that NVIDIA is running at the moment.

If you’re interested in this area, there is work showing how to generate plasma clouds on the GPU, and I think there might be something on fire textures accelerated by the GPU as well.

How about a different approach: could you write the code optimized for 3DNow! or something similar?
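Something like this scalar loop (single-channel, borders skipped; purely a sketch of my own) is what a 3DNow!/SSE path would vectorize:

    /* Naive 3x3 convolution on the CPU; the multiply-accumulate in the
       inner loops is the part worth hand-optimizing. */
    void convolve3x3(const float *src, float *dst,
                     int w, int h, const float k[3][3])
    {
        int x, y, i, j;
        for (y = 1; y < h - 1; ++y) {
            for (x = 1; x < w - 1; ++x) {
                float sum = 0.0f;
                for (j = -1; j <= 1; ++j)
                    for (i = -1; i <= 1; ++i)
                        sum += k[j + 1][i + 1] * src[(y + j) * w + (x + i)];
                dst[y * w + x] = sum;
            }
        }
    }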

Since ideas are being thrown around… by convolution filter, do you mean something like the effect of looking through curved glass? If so, I think cubic environment mapping might be useful. Just turn off automatic normal generation and supply normals for a convex shape…

Yeah, OK, it’s just a thought; it may be stupid, and I’ll understand if it attracts a few flames.

I still can’t get the desired result
on a GeForce2 with the latest driver 11.xx.

Is there any sample code available? I remember seeing a filter demo from an NVIDIA presentation at our school, but I can’t find it on NVIDIA’s website.

thanks

Originally posted by mcraighead:
You can implement your own 3x3 color matrix using register combiners on GeForce and up. However, matrix elements can only be in the [-1,1] range, and precision is limited.

How? I’ve come back to this topic recently because I want to automatically generate normals from texture slices by calculating voxel gradients using convolution filters as a preprocessing step. Like yang11, I can’t get ConvolutionFilter2D or SeparableFilter2D to work at all on my textures.

If I specify a convolution filter like GLfloat filter[3][3] = {0.0f, …, 0.0f}; I should get a black output (shouldn’t I?), but it doesn’t affect the output at all. I’m just binding one simple texture in either luminance or luminance_alpha mode (or do I have to use DrawPixels for convolution filters to work?). Do you just ignore convolution in current drivers, Matt, or am I doing something stupid?
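The minimal test I’m running is essentially this (the sizes and the 'pixels' array are placeholders); I’d expect the resulting texture to be all black:

    /* All-zero kernel: if convolution really happens during the
       glTexImage2D pixel transfer, every texel should come out 0. */
    GLfloat zeros[9] = { 0.0f };   /* remaining elements default to 0.0f */

    glConvolutionFilter2D(GL_CONVOLUTION_2D, GL_LUMINANCE, 3, 3,
                          GL_LUMINANCE, GL_FLOAT, zeros);
    glEnable(GL_CONVOLUTION_2D);

    /* 'pixels' is my 64x64 luminance image. */
    glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, 64, 64, 0,
                 GL_LUMINANCE, GL_FLOAT, pixels);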

How can I use the combiners or shaders to simulate a 2D filter? I can live with the [-1,1] restriction if I have to. I don’t understand how I can access the texels surrounding my result texel using the combiners while binding just one texture (is it possible?). I would have thought I’d have to bind a ±x texture and a ±y texture relative to the result texel. An nvparse string example would be great if you’ve got the time, Matt.

Alternatively, if anyone has any suggestions for calculating the gradients cheaply in hardware on NV20, I’m open to them.

Thanks for your time

Originally posted by ffish:
How? I’ve come back to this topic recently because I want to automatically generate normals from texture slices by calculating voxel gradients using convolution filters as a preprocessing step. Like yang11, I can’t get ConvolutionFilter2D or SeparableFilter2D to work at all on my textures.

I have confirmed with NVIDIA that there is a bug in the filter functions. This was with driver 11.xx. I have not heard anything further from NVIDIA, nor have I tested the 12.xx driver.

If you want to test your code, run it on an SGI machine; it will do what you expect.

Thanks for the update, yang11. That’s a shame; I’m using the 12.xx drivers and it’s not working. I guess I’ll do it in software myself. No big deal, since it’s a preprocessing step for me, and an easy one at that. Still interested in combiner solutions, though. Anyone?
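For anyone curious, the software fallback is just central differences over the volume, roughly this (my own helper; x-major voxel layout assumed, borders clamped):

    /* Central-difference gradient of a scalar volume, used to generate
       normals as a preprocessing step. */
    void gradients(const float *v, float *gx, float *gy, float *gz,
                   int w, int h, int d)
    {
        int x, y, z;
        for (z = 0; z < d; ++z)
        for (y = 0; y < h; ++y)
        for (x = 0; x < w; ++x) {
            int i  = (z * h + y) * w + x;
            int x0 = x > 0 ? x - 1 : x,  x1 = x < w - 1 ? x + 1 : x;
            int y0 = y > 0 ? y - 1 : y,  y1 = y < h - 1 ? y + 1 : y;
            int z0 = z > 0 ? z - 1 : z,  z1 = z < d - 1 ? z + 1 : z;
            gx[i] = 0.5f * (v[(z * h + y) * w + x1] - v[(z * h + y) * w + x0]);
            gy[i] = 0.5f * (v[(z * h + y1) * w + x] - v[(z * h + y0) * w + x]);
            gz[i] = 0.5f * (v[(z1 * h + y) * w + x] - v[(z0 * h + y) * w + x]);
        }
    }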

I remember seeing a shader in NVEffectsBrowser on the NVIDIA site that implemented a high-pass filter for edge detection. I think it worked by using the GF3’s 4 texture units to send the same texture 4 times, but with a one-texel offset in each direction (by playing with texture coordinates), so that you can actually access each neighboring texel of any given texel. If you need to access 8 neighbors instead of 4, maybe you can do multiple additive passes into the frame buffer if the filter you want to apply is separable.
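The texture setup part is roughly this, if I remember right (a sketch assuming ARB_multitexture; 'tex', 'texWidth', and 'texHeight' are placeholders, and the combiner setup that actually sums the units is omitted):

    /* Bind the same texture to 4 units, each shifted one texel along
       one axis via the texture matrix, so a fragment can sample its
       four axis neighbors in a single pass. */
    const GLfloat du = 1.0f / texWidth, dv = 1.0f / texHeight;
    const GLfloat offs[4][2] = { {-du, 0}, {du, 0}, {0, -dv}, {0, dv} };
    int unit;

    for (unit = 0; unit < 4; ++unit) {
        glActiveTextureARB(GL_TEXTURE0_ARB + unit);
        glEnable(GL_TEXTURE_2D);
        glBindTexture(GL_TEXTURE_2D, tex);
        glMatrixMode(GL_TEXTURE);       /* texture matrix is per-unit */
        glLoadIdentity();
        glTranslatef(offs[unit][0], offs[unit][1], 0.0f);
    }
    glMatrixMode(GL_MODELVIEW);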

Hope this helps.

Yannick

The offset texture method is what I was thinking I’d have to do. I’d preferably be accessing 8 neighbours (or 26 would be even better!). I remember seeing something that sounds like the demo you’re talking about a while ago, but I couldn’t find it after a quick search of the NVIDIA site. I’ll have a more thorough search tomorrow. Thanks for the help.