I am writing a driver using the OpenGL API. The problem I have is texture creation time. I am using glTexImage. I have to create (not update) textures of approximately 1920x1080 RGBA. Each operation I perform cannot exceed 10ms. In practice, creating one big texture takes much longer. I cannot pre-create the textures because their size and number change over time.
I am using a Quadro FX 4500.
What can I do to improve the situation?
Any suggestions will be appreciated.
Well, it looks like you could call glTexImage once and then use glTexSubImage to update the data; that can be faster because no reallocation is needed.
Then try different texture formats, such as GL_BGRA_EXT; some are quicker.
1920*1080*4*100 = 791MB per second, that is quite a lot.
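For reference, the arithmetic behind that figure can be checked with a short sketch. It assumes 4 bytes per RGBA texel and one full-size upload per 10ms slot (100 per second), which appears to be where the factor of 100 comes from; the variable names are purely illustrative.

```python
# Rough upload bandwidth for one 1920x1080 RGBA texture per 10 ms slot.
# Assumptions: 4 bytes per texel (8-bit RGBA), 100 uploads per second.
WIDTH, HEIGHT = 1920, 1080
BYTES_PER_TEXEL = 4
UPLOADS_PER_SECOND = 100  # one upload every 10 ms

bytes_per_texture = WIDTH * HEIGHT * BYTES_PER_TEXEL
bytes_per_second = bytes_per_texture * UPLOADS_PER_SECOND
mib_per_second = bytes_per_second / (1024 * 1024)

print(f"{bytes_per_texture} bytes per texture")
print(f"{mib_per_second:.0f} MB per second")
```

Note that the 791 figure only comes out when a megabyte is taken as 1024*1024 bytes.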
Give PBO texture streaming a try.
EDIT: if the texture size really does change, use glTexImage2D to allocate the largest possible size, then use glTexSubImage2D on the part of interest and change the texcoords to cut out the unwanted parts.
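The texcoord adjustment in that EDIT can be sketched as follows. This is only an illustration: the helper name and the 2048x2048 allocation are assumptions, and it assumes the used region starts at texel (0, 0) of the larger texture.

```python
def sub_rect_texcoords(used_w, used_h, alloc_w, alloc_h):
    """Return (s_max, t_max) so that the range [0, s_max] x [0, t_max]
    covers only the used region of a larger allocated texture.

    Assumes the used region starts at texel (0, 0).
    """
    if not (0 < used_w <= alloc_w and 0 < used_h <= alloc_h):
        raise ValueError("used region must fit inside the allocated texture")
    return used_w / alloc_w, used_h / alloc_h


# Hypothetical case: a 1920x1080 image stored in a 2048x2048 texture.
s_max, t_max = sub_rect_texcoords(1920, 1080, 2048, 2048)
print(s_max, t_max)
```

The quad would then use texcoords from (0, 0) to (s_max, t_max) instead of (1, 1), so the unused border of the big texture is never sampled.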
Apart from Zbuffer’s suggestion, I noticed that you mentioned that each “operation” must not exceed 10ms. I don’t know what counts as an operation in your particular scenario, but you could perform several (sub)“operations” to upload the texture data.
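One way to read this suggestion is to split the upload into horizontal bands and send one band per time slot. The sketch below only computes the bands; the function name and the band size of 256 rows are assumptions for illustration. Each (y_offset, row_count) pair would be used as the yoffset and height arguments of a glTexSubImage2D call. This spreads out the upload cost only and does not shorten the initial allocation.

```python
def row_bands(height, rows_per_band):
    """Split `height` rows into (y_offset, row_count) bands.

    The last band may be shorter than `rows_per_band`.
    """
    if height <= 0 or rows_per_band <= 0:
        raise ValueError("height and rows_per_band must be positive")
    return [(y, min(rows_per_band, height - y))
            for y in range(0, height, rows_per_band)]


# Hypothetical split of a 1080-row texture into bands of at most 256 rows.
bands = row_bands(1080, 256)
for y_offset, row_count in bands:
    print(y_offset, row_count)
```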
Thank you all for the help.
I am already streaming textures to and from the GPU using PBOs and the BGRA format. Transfer speed is not the problem. The problem is that I am working with the GPU and other hardware. The other hardware imposes a read back from the GPU every 16ms. About 4-5ms is used for streaming images up and down. That leaves me a time slot of about 10ms for my other tasks. There are moments when content creation needs a few huge textures, but after some time, without interrupting the driver’s work, I have to release them to create many small textures. Creating huge textures is slow, especially when mipmapping is involved. One OpenGL call takes longer than 10ms.
I also have a D3D version of the driver. Creating D3D textures is a lot faster. The slow creation of OpenGL textures probably results from OpenGL’s use of managed textures. I do not need “backup” copies of the textures in system memory. I am looking for some way to speed it up. As a last resort I will create a few big textures and partition them into small ones in my code, but I would like to avoid that if possible.
Originally posted by Jan Z:
I also have a D3D version of the driver. Creating D3D textures is a lot faster. The slow creation of OpenGL textures probably results from OpenGL’s use of managed textures. I do not need “backup” copies of the textures in system memory. I am looking for some way to speed it up. As a last resort I will create a few big textures and partition them into small ones in my code, but I would like to avoid that if possible.
Yes, the driver keeps a backup copy, but I don’t know whether that is the case for PBOs. A backup can be kept anywhere: AGP memory, RAM, or the hard disk.
It could be that the mipmap creation process is slow. Are you using GL_GENERATE_MIPMAP?
Yes, I am generating mipmaps. Are mipmaps generated in hardware by default on the Quadro FX 4500?
I am setting GL_GENERATE_MIPMAP to TRUE.
I don’t know if mipmapping is done by the hw in all cases. It is for power-of-2 textures. For NPOT, I don’t know what they do. I think you have the equivalent of a GeForce 6800 or 7800, which supports GL 2.0 NPOT textures, so mipmaps are possible. I don’t know if the actual mipmap generation is done by the hw.
You should test without using mipmaps.
Why would you need to allocate new textures continuously in the first place?
Since your application is so ‘texture generation intensive’, perhaps you should consider allocating a few larger textures just once and treating them as ‘texture maps’, often called texture atlases (http://http.download.nvidia.com/developer/SDK/Individual_Samples/samples.html).
You can then decide which sub-rectangles of each texture are allocated to whatever purpose you need them for. Kind of like a memory allocation manager.
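A minimal sketch of such a sub-rectangle manager is shown below, using a simple "shelf" strategy: rectangles are placed left to right, and a new row is started when the current one is full. The class name and the 2048x2048 size are assumptions for illustration. It never frees or repacks regions, which a real driver with changing texture sizes would need.

```python
class ShelfAllocator:
    """Very simple row-based packer for sub-rectangles of one big texture.

    Illustration only: no freeing, no defragmentation, no rotation.
    """

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.shelf_x = 0  # next free x position on the current shelf
        self.shelf_y = 0  # y position where the current shelf starts
        self.shelf_h = 0  # height of the tallest rectangle on this shelf

    def allocate(self, w, h):
        """Return the (x, y) origin of a free w x h region, or None."""
        if w <= 0 or h <= 0 or w > self.width or h > self.height:
            return None
        x, y, shelf_h = self.shelf_x, self.shelf_y, self.shelf_h
        if x + w > self.width:
            # Current shelf is full: start a new shelf below it.
            x, y, shelf_h = 0, y + shelf_h, 0
        if y + h > self.height:
            return None  # does not fit; allocator state is left unchanged
        self.shelf_x = x + w
        self.shelf_y = y
        self.shelf_h = max(shelf_h, h)
        return (x, y)


# Hypothetical usage with a 2048x2048 texture allocated once up front.
atlas = ShelfAllocator(2048, 2048)
first = atlas.allocate(1024, 512)
second = atlas.allocate(1024, 256)
third = atlas.allocate(512, 512)
too_big = atlas.allocate(2048, 2048)
print(first, second, third, too_big)
```

The returned origin would be used both as the offset for glTexSubImage2D and to derive the texcoords for that sub-rectangle.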
Do the large NPOT textures really have to be mipmapped?
If yes, could you use one or more POT textures (more, if you can change the geometry to tile), where h/w mipmapping seems certain to be supported? It seems to me the memory overhead wouldn’t be worth mentioning if you used one 2048x1024 and one 2048x64 to “emulate” 1920x1080.
One potential problem with this is that if you generate all mipmaps (hint: if you don’t need all levels, see GL_TEXTURE_MAX_LEVEL to save both time and memory), the height of the larger one can be halved 10 times but that of the smaller only 6 times, so the two chains stop lining up at the smaller levels.
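The numbers in this suggestion can be checked with a short sketch. It assumes the usual GL rule that each mip level halves both dimensions (rounding down, with a minimum of 1) until 1x1 is reached; the function names are illustrative. Under that rule both textures get the same number of levels, and the 10 and 6 above correspond to how many times each height can be halved before it reaches 1.

```python
def mip_chain(width, height):
    """List the (w, h) size of every level of a full mip chain."""
    levels = [(width, height)]
    while width > 1 or height > 1:
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels


def halvings_until_one(size):
    """Count how many times `size` can be halved before it reaches 1."""
    count = 0
    while size > 1:
        size //= 2
        count += 1
    return count


for w, h in [(2048, 1024), (2048, 64)]:
    print(f"{w}x{h}: {len(mip_chain(w, h))} levels, "
          f"height reaches 1 after {halvings_until_one(h)} halvings")

# Base-level memory overhead of the two POT textures versus 1920x1080.
padded_texels = 2048 * 1024 + 2048 * 64
exact_texels = 1920 * 1080
overhead_percent = 100 * (padded_texels - exact_texels) / exact_texels
print(f"overhead: {overhead_percent:.1f}%")
```

Setting GL_TEXTURE_MAX_LEVEL to 6 on both textures would be one way to stop the chains at the last level where the two heights still correspond.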
Generating mipmaps is really expensive.
At 2k (POT) texture size this will take more than 10ms even with hw acceleration.
You should do some benchmarking on the Quadro hardware with POT textures and see if you can get anywhere close to the 10ms.