Difference in performance using glTexImage2D and glCompressedTexImage2D

It appears that glTexImage2D is faster than glCompressedTexImage2D by about 80 - 100%.

I have code that uses glTexImage2D to load uncompressed texture data (e.g. RGB24) and glCompressedTexImage2D to load compressed data (e.g. S3TC DXT1).

I have timed the execution on both an NVIDIA GPU and Intel HD 4000 graphics. The timing difference between compressed and uncompressed uploads is similar on both. I have only profiled this on Windows, using the latest NVIDIA and Intel drivers. For the NVIDIA driver, I specifically enabled full graphics performance in its control panel.

I would like to understand why compressed data takes longer to load onto the GPU than uncompressed data.

I would appreciate it if somebody could enlighten me on this. Thanks in advance.

How are you determining this?

Let’s see some timing code.

Also, how are you uploading it? Directly, or via PBO? If the latter, how long are you waiting before you latch the data?

glCompressedTexImage should be faster in all circumstances, as it has to send much less data to the GPU (note: I’m ignoring cases where the texture being updated may be used by a pending draw call here). The only situation I can think of where it may not be is if it needs to do any format conversion during the load, but the API doesn’t allow for that. Perhaps those NV control panel options may be causing your driver to do such a format conversion? Try disabling those texture quality options - put everything back to “application-controlled” and do the tests again. Otherwise you’ve got a bug in your own code around your loading/timing, or you’re not doing a 100% apples-to-apples comparison.
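
To put rough numbers on the "much less data" point, here is a minimal sketch in C that computes the upload size for both formats. The function names are made up for illustration, and it assumes tightly packed RGB24 rows (no row padding) and the standard DXT1 layout of 8 bytes per 4x4 texel block.

```c
#include <stddef.h>

/* Hypothetical helper: bytes for a tightly packed RGB24 image
 * (3 bytes per texel, assuming no row padding). */
size_t rgb24_size(size_t width, size_t height)
{
    return width * height * 3;
}

/* Hypothetical helper: bytes for a DXT1 image.
 * DXT1 stores each 4x4 texel block in 8 bytes; partial blocks at the
 * right and bottom edges still occupy a full block. */
size_t dxt1_size(size_t width, size_t height)
{
    size_t blocks_x = (width + 3) / 4;
    size_t blocks_y = (height + 3) / 4;
    return blocks_x * blocks_y * 8;
}
```

For a 1024x1024 texture this gives 3,145,728 bytes for RGB24 and 524,288 bytes for DXT1, a 6:1 ratio, so on transfer size alone the compressed upload should win.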

A wild guess is that your GPU reorders the texture into a D3D-style layout. For uncompressed data, a top-left [0,0] origin with offset 0 at the top left is equivalent to a bottom-left [0,0] origin with offset 0 at the bottom left, so nothing has to be rearranged. For compressed data, however, the per-texel bitfields inside each block need to be flipped vertically.
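
To illustrate what such a vertical flip would involve for DXT1, here is a rough sketch in C. It is not a claim about what any driver actually does; the function names are hypothetical, and it assumes the standard DXT1 block layout (two 16-bit colors followed by four index bytes, one byte per texel row) and image dimensions that are multiples of 4.

```c
#include <stddef.h>
#include <stdint.h>

/* Flip one 8-byte DXT1 block vertically in place.
 * Bytes 0..3 hold the two endpoint colors and stay untouched.
 * Bytes 4..7 hold the 2-bit indices, one byte per row of 4 texels,
 * so reversing those four bytes reverses the rows. */
void dxt1_flip_block_vertical(uint8_t block[8])
{
    uint8_t tmp;
    tmp = block[4]; block[4] = block[7]; block[7] = tmp;
    tmp = block[5]; block[5] = block[6]; block[6] = tmp;
}

/* Flip a whole DXT1 image vertically in place.
 * Assumes width and height are multiples of 4 (an assumption of this
 * sketch, not a requirement of the format). */
void dxt1_flip_image_vertical(uint8_t *data, size_t width, size_t height)
{
    size_t blocks_x = width / 4;
    size_t blocks_y = height / 4;
    size_t row_bytes = blocks_x * 8;
    size_t i, y;

    /* Step 1: flip the texel rows inside every block. */
    for (i = 0; i < blocks_x * blocks_y; ++i)
        dxt1_flip_block_vertical(data + i * 8);

    /* Step 2: reverse the order of the block rows. */
    for (y = 0; y < blocks_y / 2; ++y) {
        uint8_t *top = data + y * row_bytes;
        uint8_t *bottom = data + (blocks_y - 1 - y) * row_bytes;
        for (i = 0; i < row_bytes; ++i) {
            uint8_t tmp = top[i];
            top[i] = bottom[i];
            bottom[i] = tmp;
        }
    }
}
```

The uncompressed case needs nothing like step 1, which is the extra per-block work this guess is about.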

Another issue may be that you have automatic mipmap generation enabled. For a compressed format, generating each lower level would require decompress, downscale, and recompress steps, while uncompressed textures need only the downscale step.