Map Texture Objects

If you are streaming a texture, the cost of the format change to make the texture swizzled/tiled is going to be quite big next to the performance loss of the texture not being tiled/swizzled. However, if a texture is static (or its contents are generated by the GPU and the GPU has the hardware bits to tile/swizzle it during render), then one wants it tiled/swizzled.

So… really, back to exactly what was stated in the beginning: provide an additional set of internalFormat enums that say “I want the texture linearly stored so I can map and stream it”.
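To make that concrete, something along these lines, where GL_RGBA8_LINEAR is a made-up token used purely for illustration and not defined by any spec or extension:

    // Hypothetical: GL_RGBA8_LINEAR is an invented internalFormat enum meaning
    // "store this texture linearly so it can be mapped and streamed".
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8_LINEAR, width, height, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, NULL);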

How often do you use a texture on only a single surface, even when you update the texture every frame? Almost never, except the obvious case of video streaming. In that case, I agree it might make sense, but even then, it is questionable whether it would improve overall performance (I believe file loading and decoding will still be more expensive than doing an upload from a PBO).

But for all other cases, it is unlikely to even be as fast as PBOs + tiled textures, and as I understood it, most comments were talking about the general case, e.g. when you just do typical texture streaming as required by a “loading-free” renderer that displays huge worlds.

Tiling is not an issue. There are ways to make a tiled texture appear as linear to the CPU. After all, textures have to be mapped in glTexImage2D anyway (except for some ancient Intel GPUs, which have more options), so exposing the map/unmap interface for textures doesn’t really add anything new. Swizzling as in ARB_texture_swizzle isn’t an issue either, because it has nothing to do with how the texture is stored in memory. What you probably thought is that the internal format might have its components in a different order, or it can be a completely different format than the one the user requested. Such information can be exposed by adding new glGet queries.
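In fact, ARB_internalformat_query2 already goes part of the way there; assuming a GL 4.3-level driver, you can for example ask which internal format the implementation would actually prefer to use in place of the one you requested:

    // ARB_internalformat_query2 (GL 4.3): query the implementation-preferred
    // internal format for a requested one.
    GLint preferred;
    glGetInternalformativ(GL_TEXTURE_2D, GL_RGBA8,
                          GL_INTERNALFORMAT_PREFERRED, 1, &preferred);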

I don’t consider tiling and swizzling an issue at all. There would be no swizzling on the driver side anyway. The user would have to store the image in the native component ordering and in the actual format being used by the hardware.

What we need is usage flags for textures and one of them would be “I wanna upload, draw once, upload, draw once…”. Drivers would decide how to implement that (some would use a linear texture, others may take a different approach).
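Buffer objects already work roughly that way with their usage hints, so a texture equivalent is easy to picture. In the sketch below GL_STREAM_DRAW is real, but GL_TEXTURE_USAGE and GL_STREAM_UPLOAD are invented tokens, shown only to illustrate the idea:

    // Buffers already carry a usage hint the driver is free to interpret:
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    glBufferData(GL_PIXEL_UNPACK_BUFFER, size, NULL, GL_STREAM_DRAW);

    // A hypothetical texture-level equivalent (invented tokens):
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_USAGE, GL_STREAM_UPLOAD);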

Because the problem is irrelevant.
It’s already been solved.
By glTexImage and glTexSubImage.
In the early 1990s.

One may well ask - why is there a fixation on raising this as an objection? Any vendor-specific internal representation is no business of OpenGL’s; OpenGL does not and should not specify anything in that regard. And yet it was jumped on in the very second post.

Think about this and it becomes really easy.

Map a texture for reading and what happens? The pipeline needs to stall, flush, and the driver can pull back the texture data and give you a pointer. What happens during that “driver pull back” stage is no business of the OpenGL specification and irrelevant to this suggestion. The driver can convert it from a tiled/swizzled format to linear or it can just suck it back from a linear internal representation if that is how the driver decided to store the texture. It can even give a direct pointer if that is what the driver decides is appropriate. It does not matter. It’s completely irrelevant. It’s internal driver behaviour.
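For comparison, this is roughly what “map for reading” amounts to today, done by hand through a pack PBO (pbo, tex, w and h stand in for whatever your code already has); a real map call would simply collapse this into one function:

    // Read back through a pack PBO, then map the buffer for reading. Whatever
    // de-tiling/stalling is needed happens inside the driver, just as it would
    // behind a map-for-reading call.
    glBindTexture(GL_TEXTURE_2D, tex);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    glBufferData(GL_PIXEL_PACK_BUFFER, w * h * 4, NULL, GL_STREAM_READ);
    glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_UNSIGNED_BYTE, (void *)0);
    void *src = glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, w * h * 4, GL_MAP_READ_BIT);
    // ...read the pixels through src...
    glUnmapBuffer(GL_PIXEL_PACK_BUFFER);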

Map a texture for writing and what happens? The driver just hands you a pointer. It does not matter what internal representation the texture used, you’re not going near that, you’re not going to read from the internal representation, this is a map-for-writing, you’re just writing to a pointer and you can assume that for the purposes of your program it’s linear. The pointer may be to the actual texture memory, it may be to a scratch memory region, it does not matter; that’s for the driver to decide. Unmap and what happens? The driver takes that data you wrote to that pointer and - if it gave you a scratch memory pointer - writes it back. Using the exact same code path that has been used by glTexImage and glTexSubImage since time immemorial.
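Sketched out, with glMapTexture2D and glUnmapTexture2D as invented entry points (no such functions exist in core GL) and fill_pixels standing in for your own code, the write path is just:

    // Hypothetical map-for-writing flow; the entry points are made up.
    void *dst = glMapTexture2D(tex, 0, GL_WRITE_ONLY);  // driver hands back a pointer you treat as linear
    fill_pixels(dst);                                    // write only; never read through this pointer
    glUnmapTexture2D(tex, 0);                            // driver writes back if and when it chooses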

That last point is key. The driver gets to decide when to do the write back. It can decide “OK, the texture is not currently being used for drawing, I can safely write back now without needing to incur a pipeline stall”. Or it can decide “not OK, the texture is currently being used for drawing, I’m going to keep this memory hanging around until it’s no longer used and write back then”. Or it can even decide “I gave the programmer a pointer to an internal linear representation so I don’t even need to do a write back”. But that’s internal driver behaviour.

So that’s why the “problem” is being ignored - because it’s about as relevant a problem as fears of asphyxiation on fast-moving trains.

Using the exact same code path that has been used by glTexImage and glTexSubImage since time immemorial.

Um, you are aware that ignoring the problem of swizzling/etc makes mapping, in virtually all cases (since virtually all textures are swizzled), no better than just using a Pixel Buffer Object with the implementation-preferred pixel transfer parameters (which, thanks to internalformat_query2, we can now ask for)?
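For reference, those queries look something like this on a GL 4.3 / ARB_internalformat_query2 implementation:

    // Ask the driver which client format/type it prefers for uploads into RGBA8,
    // then stage the data in a PBO in exactly that layout and upload with
    // glTexSubImage2D using (GLenum)fmt / (GLenum)type.
    GLint fmt, type;
    glGetInternalformativ(GL_TEXTURE_2D, GL_RGBA8, GL_TEXTURE_IMAGE_FORMAT, 1, &fmt);
    glGetInternalformativ(GL_TEXTURE_2D, GL_RGBA8, GL_TEXTURE_IMAGE_TYPE, 1, &type);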

You’re basically saying that you want a feature that might give you performance, but you can’t rely on it in any real-world circumstances. Plus, without the ability to explicitly ask for unswizzled/etc textures, you can’t do anything to improve your chances of actually mapping the texture (rather than just a lame PBO).

Yes, there are times when mapping a buffer object means that you don’t actually get GPU memory. But you can do things to improve your chances, like double-buffering or using GL_MAP_INVALIDATE_RANGE_BIT or GL_MAP_UNSYNCHRONIZED_BIT or whatever. None of these make guarantees, but they do help; not using these techniques leads to degraded performance.
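The usual pattern, for the sake of the argument (pbo[2], size, w, h and fill_pixels are placeholders for whatever your streaming code does):

    // Double-buffered PBO upload; the unsynchronized map is safe here because
    // the GPU can only still be reading from the other PBO of the pair.
    int i = frame & 1;
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo[i]);
    void *ptr = glMapBufferRange(GL_PIXEL_UNPACK_BUFFER, 0, size,
                                 GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT);
    fill_pixels(ptr);
    glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, w, h,
                    GL_RGBA, GL_UNSIGNED_BYTE, (void *)0); // sourced from the bound PBO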

What you’re suggesting would, in virtually all reasonable scenarios, never give you an actual mapped pointer. And there is nothing you can do to affect that in any way whatsoever.

In short, if “mapping a texture” can’t give you a reasonable shot at getting a pointer to honest-to-God GPU memory, what’s the point? How is it any better than using a PBO?

It should be noted that the only actual OpenGL extension to provide this functionality does in fact have a parameter for asking for linear textures, which enforces a specific order. And it even forbids mapping at all if you don’t use it.
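If I remember right, the extension in question is INTEL_map_texture, and its usage goes roughly like this (from memory; check the extension spec for the exact parameter names and rules):

    // INTEL_map_texture, as I recall it: request a linear layout before the
    // image is specified, then map/unmap the level directly.
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MEMORY_LAYOUT_INTEL,
                    GL_LAYOUT_LINEAR_INTEL);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, w, h, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, NULL);

    GLint stride;
    GLenum layout;
    void *ptr = glMapTexture2DINTEL(tex, 0, GL_MAP_WRITE_BIT, &stride, &layout);
    // ...write rows, honouring the returned stride...
    glUnmapTexture2DINTEL(tex, 0);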

Or to put it another way, actual IHVs, people whose job it is to make hardware go fast (theoretically at least; they are Intel ;) ), who considered the problem decided that the swizzle issue was important to making mapping useful.

Tell me about them, because I’m not aware of any. Tiling/swizzling means hardware-implementation-dependent reordering of texels for better texture cache coherency.

No, they don’t. They don’t even have to be visible to the CPU. Memcpy-ing texture data to CPU visible video memory is not a common practice for a good reason.

Of course not, but that’s not what I was talking about. Component swizzle has nothing to do with the tiling/texel swizzle that was discussed in this topic.

Also, as Alfonse stated, if mapping a tiled texture just gives you some arbitrary chunk of memory that the driver will eventually upload to tiled video memory, it is not going to help you at all compared to PBOs. In fact, I would go as far as to say that the fact that the same trick is allowed by the spec in the case of buffer mapping already hurts.