(Shamelessly crossposted from http://www.nvnews.net/vbulletin/showthread.php?p=2180041).
I am taking my first foray into OpenGL programming, using the NVidia OpenGL library included with the driver. Everything I am trying to do works fine (it’s fast), until I try to do a colourspace conversion on my frame (e.g. to produce a ‘negative’ image). I am trying to use the GL_COLOR matrix, like so:
glMatrixMode(GL_COLOR); glLoadMatrixf(my_rgb_inversion_matrix); glCopyPixels(0,0,w,h,GL_COLOR);
Very slow (<1 fps), CPU maxed out. Resultant images are inverted as expected.
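For reference, my_rgb_inversion_matrix is along these lines (a sketch, not my exact code): it computes 1 − C for each colour channel, using the alpha component (assumed to be 1.0) to supply the +1 bias. Note glLoadMatrixf expects column-major order.

```c
#include <stddef.h>

/* Colour matrix computing out = 1 - in for R, G, B (alpha passes through).
   Column-major, i.e. m[col*4 + row], as glLoadMatrixf expects. */
static const float rgb_inversion_matrix[16] = {
    -1.0f,  0.0f,  0.0f,  0.0f,   /* column 0 */
     0.0f, -1.0f,  0.0f,  0.0f,   /* column 1 */
     0.0f,  0.0f, -1.0f,  0.0f,   /* column 2 */
     1.0f,  1.0f,  1.0f,  1.0f    /* column 3: the +1 bias, taken from alpha */
};

/* Apply a column-major 4x4 matrix to an RGBA vector -- only here to
   sanity-check the matrix on the CPU; GL applies it during pixel transfer. */
void color_matrix_apply(const float m[16], const float in[4], float out[4])
{
    for (int row = 0; row < 4; ++row) {
        out[row] = 0.0f;
        for (int col = 0; col < 4; ++col)
            out[row] += m[col * 4 + row] * in[col];
    }
}
```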
Using the built-in identity matrix for the colour matrix, everything runs really fast (100+ fps), but obviously the image is not inverted:
glMatrixMode(GL_COLOR); glLoadIdentity(); glCopyPixels(0,0,w,h,GL_COLOR);
But weirdly, if I supply my own identity matrix, then it is really slow again (and, as expected, there is no inversion):
glMatrixMode(GL_COLOR); glLoadMatrixf(my_identity_matrix); glCopyPixels(0,0,w,h,GL_COLOR);
I have an NVidia Quadro FX 3500 and am using NVidia driver 190.53, all on Red Hat Enterprise Linux 4.6.
The OpenGL documentation says that it’s only valid to call glMatrixMode with GL_COLOR if the GL_ARB_imaging extension is supported. Checking for this extension at runtime shows it’s present (as does nvidia-settings), and calling glGetError() does not report an error when I call the function with this parameter.
HOWEVER, I am not doing anything clever with function pointers to ensure that the imaging extension is available/used at compile/link time, and if I try to use a function like glColorTable then the build fails with an undefined symbol.
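For what it’s worth, the runtime check I do is a substring test on the string returned by glGetString(GL_EXTENSIONS) - roughly like the helper below (the extension string is passed in as a parameter here, so it can be tested without a GL context; this is a sketch, not my exact code):

```c
#include <string.h>

/* Return 1 if `name` appears as a whole, space-delimited token in a
   GL extension string (the result of glGetString(GL_EXTENSIONS)). */
int has_extension(const char *ext_string, const char *name)
{
    size_t len = strlen(name);
    const char *p = ext_string;
    while ((p = strstr(p, name)) != NULL) {
        /* Token must be delimited by spaces or the string boundaries,
           so "GL_ARB_imaging" doesn't match "GL_ARB_imaging_subset". */
        if ((p == ext_string || p[-1] == ' ') &&
            (p[len] == ' ' || p[len] == '\0'))
            return 1;
        p += len;
    }
    return 0;
}
```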
So I think it’s one of these:
1. The imaging extension is not being used by the underlying implementation because I am not doing something at compile/run time that I should be. However, in that case I would expect glGetError to complain about passing GL_COLOR into the matrix function.
2. This is a horrible way to do colourspace conversion of full frames and I’m stupid for trying it. However, since the OpenGL 1.x imaging subset specifies it, I would expect it to be implemented in the hardware, but it clearly isn’t for me.
3. For some reason, although the hardware supports this sort of colourspace conversion, the NVidia OpenGL library does not use the hardware to do it, and falls back to processing every pixel on the CPU, which is why I’m seeing such horrible performance.
4. A variation of 3: it should be supported and work in hardware, but a bug in the driver means it’s falling back to a software path.
I believe I could achieve a similar effect using a fragment shader, but that is a lot more complicated for my tiny brain, which barely understands OpenGL (after 3 days of using it). I am interested in why this doesn’t work as I expect it to. I understand that the OpenGL spec does not say that anything has to be fast - but in that case, how do I code for multiple platforms (gfx hardware) if some OpenGL operation might be arbitrarily orders of magnitude slower than I would expect?
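For reference, the fragment-shader alternative would be only a few lines of GLSL (1.10-era, to match this driver generation) - a sketch, with the sampler uniform name tex being arbitrary:

```c
/* Sketch of a GLSL 1.10 fragment shader that inverts the sampled colour.
   The uniform name "tex" is arbitrary; the C string would be passed to
   glShaderSource() and compiled at runtime. */
static const char *invert_fragment_shader =
    "uniform sampler2D tex;\n"
    "void main() {\n"
    "    vec4 c = texture2D(tex, gl_TexCoord[0].st);\n"
    "    gl_FragColor = vec4(1.0 - c.rgb, c.a);\n"
    "}\n";
```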
For background info, if I do the colourspace conversion on the CPU before I pass the pixels into OpenGL, then it runs at hundreds of FPS (at max CPU). I want to push the conversion to the gfx card so that I can convert N image streams at once without loading the CPU with the (naturally parallel) task of colourspace conversion.
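For concreteness, the CPU-side conversion is just a loop like this (a sketch, assuming 8-bit RGBA frames):

```c
#include <stddef.h>

/* In-place RGB inversion of an 8-bit RGBA frame on the CPU -- the kind
   of per-pixel work I'm trying to push onto the graphics card. */
void invert_rgba(unsigned char *pixels, size_t npixels)
{
    for (size_t i = 0; i < npixels; ++i) {
        unsigned char *px = pixels + 4 * i;
        px[0] = 255 - px[0];  /* R */
        px[1] = 255 - px[1];  /* G */
        px[2] = 255 - px[2];  /* B */
        /* alpha left untouched */
    }
}
```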
Thanks for any and all info that people can give!