I am fairly new to OpenGL programming and I have a question concerning hardware/software performance.
I work on a Win2k platform with an ASUS GF 4 card (AGP slot). I have written a multithreaded Win32 application using VC 6 that renders exclusively in 2D. Basically, one thread writes random data into a simple unsigned byte array. The values in this array (ranging from 0 to 255) serve as texture data that is displayed in two colours (0 corresponds to black and 255 to the brightest yellow). The other thread receives this array through a message pipe and is responsible for displaying the textures on a GL_QUADS primitive. For this I use glTexSubImage*() to constantly replace the texture data currently resident in texture memory with the newly created data from the array, i.e. from the pipe. The update rate I would like to reach is 8 kHz, which means that 8 Mbyte/sec of texture data accumulate and have to be rendered. But performance is quite poor: I can only reach about 3.5 kHz before my CPU usage hits 100% and the textures start to jitter.
I would like to know whether there is any way to increase the performance. Isn’t this performance a little bit low? Where might the bottleneck be (e.g. transferring the texture data from system memory to texture memory, or the message pipe)?
I am really stuck with this problem and would be very thankful for any hints and pieces of advice.
The threading and the pipe are probably where a lot of your overhead is. Do you actually need to multithread? There are probably a lot of unnecessary copies going on.
Uploading the texture data could be a bottleneck (just to make sure I’m understanding correctly, you’re trying to upload a 1024x1 8-bit texture 8000 times per second?). The 8MB/s can quickly turn into much more if the driver has to make temporary copies.
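To put rough numbers on that, here is a back-of-the-envelope sketch in C. It assumes the 1024x1, 8-bit texture guessed at above, which has not been confirmed. The function name and the `copies` parameter (how many times each frame's data gets copied on its way to the card, e.g. through the pipe, a driver staging buffer and the final upload) are hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>

/* Bytes moved per second for a texture that is re-uploaded at a fixed rate.
 * 'copies' is a hypothetical count of how often each frame's data is copied
 * (pipe, driver staging buffer, final upload, ...); it must be at least 1. */
unsigned long long upload_bytes_per_second(unsigned long long width,
                                           unsigned long long height,
                                           unsigned long long bytes_per_texel,
                                           unsigned long long updates_per_second,
                                           unsigned long long copies)
{
    assert(copies >= 1);
    return width * height * bytes_per_texel * updates_per_second * copies;
}
```

With a single copy, `upload_bytes_per_second(1024, 1, 1, 8000, 1)` gives 8,192,000 bytes/s (about 7.8 MiB/s); with three copies it is 24,576,000 bytes/s. Even the larger figure is a modest volume, which suggests that the per-call cost of 8000 uploads a second may matter more than the raw byte count, though that is a guess until it is measured.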
The other possibility, especially since you’re saying that the CPU usage hits 100%, could be the code that fills the texture. You mentioned random data - random number generators can be quite slow. If you’re actually generating a random number for each texel, that’s 1024 random numbers per frame. If you top out at 3.5K frames/s, that would be around 3.5 million random numbers per second - I assume this is your main bottleneck.
a) try a faster random number generator
b) if texture upload turns out to be a bottleneck, consider using GL_NV_pixel_data_range to write your data directly to video memory and avoid unnecessary copies.
Or you may want to consider rendering semi-random data to a pbuffer (e.g. by rendering several larger pre-generated random textures on top of each other with different blend modes and random offsets) and using the pbuffer as a texture. That way you wouldn’t be limited by the CPU and avoid texture uploads altogether.
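As an illustration of suggestion a), here is a minimal sketch in C of one cheap generator, Marsaglia's 32-bit xorshift. It is only one possible choice, not necessarily the best one for this application, and the helper name `fill_texels` is made up for the example. It uses all four bytes of each 32-bit output, so filling 1024 texels takes 256 generator steps. Whether this actually beats the random function currently in use depends on the C runtime, so it should be timed before drawing conclusions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One step of Marsaglia's 32-bit xorshift generator (shifts 13, 17, 5).
 * The state must never be zero, or the generator stays at zero forever. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Fill 'count' texels with pseudo-random bytes (hypothetical helper).
 * 'count' must be a multiple of 4 because each generator step yields
 * four bytes, written low byte first. */
void fill_texels(unsigned char *texels, size_t count, uint32_t *state)
{
    size_t i;
    assert(*state != 0);
    assert(count % 4 == 0);
    for (i = 0; i < count; i += 4) {
        uint32_t r = xorshift32(state);
        texels[i]     = (unsigned char)(r & 0xFF);
        texels[i + 1] = (unsigned char)((r >> 8) & 0xFF);
        texels[i + 2] = (unsigned char)((r >> 16) & 0xFF);
        texels[i + 3] = (unsigned char)(r >> 24);
    }
}
```

A caller would keep one nonzero `uint32_t` seed per producer thread and call `fill_texels(buffer, 1024, &seed)` once per frame. The statistical quality is limited, but for noise that is only looked at it is probably sufficient.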
Thanks for your advice. It was really helpful because you were right. The random data generation is quite slow, so I removed it. The multithreading, which is essential for my project, is also lowering the performance significantly. At the moment I am trying to find other solutions and workarounds, because, as I found out, displaying and refreshing a simple texture 8000 times a second is not a problem for my graphics card.
Any further advice and hints are always welcome.