glGetTexSubImage

I have a need for a glGetTexSubImage command. I won’t explain how it should work, because it’s obvious.

We perform terrain editing on the GPU using shaders:
http://www.youtube.com/watch?v=t1OOxpO-bZA

Since the heightmap modification occurs on the GPU, we have to retrieve that data back to system memory, for physics and raycasting. Modifying a small section of terrain requires that we retrieve the entire heightmap, when we really only need a subsection of it. This creates a noticeable delay that is unnecessary.

Did you try using an FBO to read the texture data? I.e., attach the texture to an FBO, bind it as the read FBO, then call glReadPixels() on the area you want to retrieve. It’s not a single API call, but it can be bundled up into a convenient function.
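A minimal sketch of that workaround, assuming a 2D heightmap texture with a single float channel (for example GL_R32F), a current GL 3.0+ context, and mip level 0. The helper name read_tex_region and the GLReadbackFns table are made up for this illustration. The table exists only so the snippet compiles without committing to a particular GL header or loader; each member is meant to have the signature of the real gl* entry point of the same name, and the enum values are the ones from the standard GL headers.

```c
#include <stddef.h>

/* Minimal stand-ins for the GL typedefs and enums used below; a real
   program would take these from its GL header or loader instead. */
typedef unsigned int GLenum;
typedef unsigned int GLuint;
typedef int GLint;
typedef int GLsizei;

#define GL_TEXTURE_2D           0x0DE1
#define GL_PACK_ALIGNMENT       0x0D05
#define GL_FLOAT                0x1406
#define GL_RED                  0x1903
#define GL_READ_FRAMEBUFFER     0x8CA8
#define GL_FRAMEBUFFER_COMPLETE 0x8CD5
#define GL_COLOR_ATTACHMENT0    0x8CE0

/* Hypothetical table of entry points, filled in by the application with
   glGenFramebuffers, glBindFramebuffer, and so on. */
typedef struct {
    void   (*GenFramebuffers)(GLsizei n, GLuint *ids);
    void   (*BindFramebuffer)(GLenum target, GLuint fbo);
    void   (*FramebufferTexture2D)(GLenum target, GLenum attachment,
                                   GLenum textarget, GLuint texture, GLint level);
    GLenum (*CheckFramebufferStatus)(GLenum target);
    void   (*PixelStorei)(GLenum pname, GLint param);
    void   (*ReadPixels)(GLint x, GLint y, GLsizei w, GLsizei h,
                         GLenum format, GLenum type, void *data);
    void   (*DeleteFramebuffers)(GLsizei n, const GLuint *ids);
} GLReadbackFns;

/* Reads the w x h region at (x, y) of mip level 0 of `tex` into `out`,
   which must hold w * h floats. Returns 1 on success, 0 on failure. */
static int read_tex_region(const GLReadbackFns *gl, GLuint tex,
                           GLint x, GLint y, GLsizei w, GLsizei h, float *out)
{
    GLuint fbo = 0;
    int ok = 0;

    /* Reject bad arguments before touching any GL state. */
    if (gl == NULL || out == NULL || w <= 0 || h <= 0)
        return 0;

    /* Attach the texture to a temporary FBO bound as the read framebuffer. */
    gl->GenFramebuffers(1, &fbo);
    gl->BindFramebuffer(GL_READ_FRAMEBUFFER, fbo);
    gl->FramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                             GL_TEXTURE_2D, tex, 0);

    if (gl->CheckFramebufferStatus(GL_READ_FRAMEBUFFER) == GL_FRAMEBUFFER_COMPLETE) {
        gl->PixelStorei(GL_PACK_ALIGNMENT, 4); /* float rows are 4-byte aligned */
        gl->ReadPixels(x, y, w, h, GL_RED, GL_FLOAT, out);
        ok = 1;
    }

    /* Rebinds the default framebuffer for reading; a real helper might
       restore whatever was bound before instead. */
    gl->BindFramebuffer(GL_READ_FRAMEBUFFER, 0);
    gl->DeleteFramebuffers(1, &fbo);
    return ok;
}
```

If the texture is already attached to an FBO used for rendering, the generate/attach/delete steps can be dropped and that FBO bound for reading directly.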

Did you try using an FBO to read the texture data? I.e., attach the texture to an FBO, bind it as the read FBO, then call glReadPixels() on the area you want to retrieve.

The thing is, if they’re doing shader work to compute this texture, then either:

A: It’s already bound to an FBO, since they’re rendering to it to compute its data.

B: They’re using Image Load/Store. In which case, they could be writing to a buffer texture, thus giving them much better access to the data (ie: being able to map the buffer for reading).

This is probably why the ARB hasn’t bothered to add such a function.

Considering the point in time at which it would have made sense to add GetTexSubImage, I strongly doubt that.
I would have liked to see the “new extension” written against the 1.0 spec.

It would probably be faster to keep a copy of the heightmap data on the CPU and modify it there instead. Readbacks suck.
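One way to read that suggestion: keep a system-memory mirror of the heightmap and apply the same brush to it whenever the GPU copy is edited, so physics and raycasting never wait on a readback. A minimal sketch, assuming a float heightmap and a simple radial brush with quadratic falloff; Heightmap and apply_brush are hypothetical names, and the CPU brush would have to match whatever the editing shader actually computes.

```c
typedef struct {
    int width, height;
    float *data; /* row-major, width * height samples */
} Heightmap;

/* Raises the terrain around (cx, cy) by up to `strength`, fading to zero
   at `radius` (which must be positive). Only the brush's bounding box,
   clamped to the heightmap, is visited. */
static void apply_brush(Heightmap *hm, float cx, float cy,
                        float radius, float strength)
{
    int x0 = (int)(cx - radius), x1 = (int)(cx + radius) + 1;
    int y0 = (int)(cy - radius), y1 = (int)(cy + radius) + 1;
    float r2 = radius * radius;

    if (x0 < 0) x0 = 0;
    if (y0 < 0) y0 = 0;
    if (x1 > hm->width - 1)  x1 = hm->width - 1;
    if (y1 > hm->height - 1) y1 = hm->height - 1;

    for (int y = y0; y <= y1; ++y) {
        for (int x = x0; x <= x1; ++x) {
            float dx = (float)x - cx, dy = (float)y - cy;
            float d2 = dx * dx + dy * dy;
            if (d2 < r2) /* inside the brush circle */
                hm->data[y * hm->width + x] += strength * (1.0f - d2 / r2);
        }
    }
}
```

The catch is that the two copies only stay in sync if the CPU and GPU brushes produce the same values, which may or may not be practical for a given editor.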

Maybe. Doesn’t have to be. But that’s not an argument for not including it in the first place.

Copy (http://www.opengl.org/registry/specs/ARB/copy_image.txt) the region of interest into a separate texture and then GetTexImage that?

Why would you do that? It’d probably be faster to just get the whole image and pick out the part you want.
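For reference, the “pick out the part you want” step is a row-by-row copy once the whole level has been fetched (for example with glGetTexImage). A minimal sketch, assuming tightly packed rows of floats; extract_region is a hypothetical helper, not a GL call.

```c
#include <string.h>

/* Copies the w x h rectangle at (x, y) out of a tightly packed, row-major
   image that is src_width samples wide. `dst` must hold w * h floats. */
static void extract_region(const float *src, int src_width,
                           int x, int y, int w, int h, float *dst)
{
    for (int row = 0; row < h; ++row) {
        const float *from = src + (size_t)(y + row) * (size_t)src_width + (size_t)x;
        memcpy(dst + (size_t)row * (size_t)w, from, (size_t)w * sizeof(float));
    }
}
```

The copy itself is cheap; the cost being argued about is the transfer in front of it. As an illustrative figure, a 4096 x 4096 float heightmap is 64 MiB to fetch in full, while a 64 x 64 edit region is 16 KiB.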

Don’t forget that there’s an extension for sparse textures, which may be impractical to transfer as a whole. The bind-to-FBO/ReadPixels workaround may also fail, because FBOs usually can’t handle all texture formats, e.g. compressed textures.

Don’t forget that there’s an extension for sparse textures, which may be impractical to transfer as a whole.

Sure. But… if you wanted that data, you should have kept some of it, instead of deleting the memory after the upload. Asking for stuff you already sent OpenGL is a waste of time. Generally speaking, sparse textures are not used as render targets or the output of processes that compute information. Sure, they could be, but I can’t really imagine why one would want to.

The bind-to-FBO/ReadPixels workaround may also fail, because FBOs usually can’t handle all texture formats, e.g. compressed textures.

If it’s a compressed texture, then that data could only have gotten there in one of two ways:

1: You uploaded it. Again, if you wanted it, you should have kept it instead of deleting the memory.

2: You wrote to an unsigned integer texture and used texture copying to copy the bits into a compressed texture.

The likelihood of doing the latter case and needing to read from it on the CPU? It seems rather unlikely.

Plus, getting subimages of compressed formats is… difficult. Even more so since glCompressedTexSubImage* themselves are only ever guaranteed to work if you upload the entire mipmap level. Otherwise, the implementation can throw a GL_INVALID_OPERATION error for arbitrary, unspecified reasons related to the format.
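Part of the difficulty is that block-compressed formats store whole blocks rather than individual texels, so for any specific format a sub-image only makes sense on block boundaries. A minimal sketch, assuming S3TC-style 4x4 blocks (8 bytes per block for DXT1, 16 for DXT5) and the alignment rules as I understand them from the S3TC extension; the helper names are hypothetical.

```c
#define BLOCK_DIM 4 /* S3TC blocks are 4x4 texels */

/* Returns 1 if the region starts on a block boundary and either ends on one
   or runs exactly to the edge of the mipmap level, 0 otherwise. */
static int region_is_block_aligned(int x, int y, int w, int h,
                                   int level_w, int level_h)
{
    if (x % BLOCK_DIM != 0 || y % BLOCK_DIM != 0)
        return 0;
    if (w % BLOCK_DIM != 0 && x + w != level_w)
        return 0;
    if (h % BLOCK_DIM != 0 && y + h != level_h)
        return 0;
    return 1;
}

/* Bytes occupied by the blocks that cover a w x h region. */
static unsigned long region_byte_size(int w, int h, int bytes_per_block)
{
    unsigned long blocks_x = (unsigned long)(w + BLOCK_DIM - 1) / BLOCK_DIM;
    unsigned long blocks_y = (unsigned long)(h + BLOCK_DIM - 1) / BLOCK_DIM;
    return blocks_x * blocks_y * (unsigned long)bytes_per_block;
}
```

Reading “a few pixels” out of such a texture therefore means reading at least the blocks that contain them and decoding those on the CPU.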

Sure, it’s a waste of time, but it’s not a waste of memory the way keeping every single texture in memory two or three times is. It comes down to what one wants, and to judging the means by the specific needs. And this is where the OpenGL API is a bit one-eyed: the presumption is that processing speed is the only criterion that matters, which reflects the state of the hardware as it was until maybe a few years ago, namely that there are tons of memory available when compared to the computing power of the graphics hardware and the transfer speed of the bus systems. In an environment that isn’t too time-critical, I wouldn’t mind wasting some time, compared to a super-efficient method, if it eased the implementation a lot. But requesting the whole texture image when only a few pixels are needed sounds like bad taste to me. Assuming that some round-tripping of textures between system and GPU memory will most likely take place anyway, I’m confident that the driver isn’t all that likely to drop all texture data from system memory, so it would be accessible without too much hassle, if only the API allowed getting a sub-image…

namely that there are tons of memory available when compared to the computing power of the graphics hardware and the transfer speed of the bus systems.

I’m not sure why you brought that up, since it’s an argument for you to keep a copy of the image data. You claim it’s a “waste of memory” for you to keep a copy, but then claim there’s “tons of Memory available”. It can’t be both, so which is it?

Assuming that some round-tripping of textures between system and GPU memory will most likely take place anyway

Why should we assume that? In the best-case performance scenario, that’s not supposed to happen. Textures live on the GPU and should only be evicted when there’s no more room. And since modern OS’s more or less guarantee a process’s GPU memory, textures aren’t ephemeral like they used to be. So drivers don’t need to keep backups of them around.

I see no reason to assume that every texture has a backup copy in system memory on a modern OS. I’m not saying it’s not true; I’m saying that we shouldn’t assume that it is.

That depends heavily on whether I’m sitting at my desktop, my old laptop or my phone. The aforementioned is hardly an assumption. My laptop uses system memory for GPU stuff anyway, so there won’t be any difference. As for my desktop, I’m pretty sure of it if I plug in a board that has 64 MB of memory and use about 90 MB of textures: it would be pretty stupid to drop the textures from system memory if round-tripping is required. If it isn’t, there’s still something one can have faith in, namely the driver developers trying to find a clever solution. If a texture is known to be read by API calls, it isn’t all that complicated to flag it and keep it for some time. The spec, of course, will never include statements about such an assumption: it defines the semantics of the API, not the details of its implementation. Implementation details only get in when they are required to rule out incorrect usage that cannot be deduced directly from the semantics themselves but follows from certain hardware restrictions etc. Avoiding assumptions about such things would require reading notes from the individual driver developers, not the general OpenGL sites.

This likely does not apply to the poster’s original wants, but for Intel GPUs (no laughing here please), there is http://www.opengl.org/registry/specs/INTEL/map_texture.txt which allows one to map a texture (there are various limit-issues on what can be mapped though).

I think a glGetTexSubImage is not necessarily a bad idea. Though one can get equivalent functionality via FBO and glReadPixels, it seems silly to do it that way. As for it always being slow, one may want to dump it into a buffer object and go further from there… As a side note, something like that was done in GL2 days with pixel buffer objects to essentially simulate transform feedback.

It would be nice to be able to bind the pixel data of textures directly to some buffer, separating it from its metadata.
That is, a glTexImage that does not copy the data but simply makes it a binding. This way, access to the texture data, as well as storage-type hinting etc., would follow the generic buffer API.

A lot of things would be nice to be able to do. That doesn’t mean we can or should be able to do them. A good abstraction needs to actually be abstract, so as to allow implementations across a variety of hardware.

Just look at Intel’s map extension. They basically wave their hands about what formats are “natively supported by the GPU hardware”. Thus ensuring that you have absolutely no idea whether a particular sized format will be mappable. That’s not an abstraction.

This is true in that not all texture formats are necessarily suitable for such a mapping. The spec as it stands leaves a certain degree of freedom in the internal representation of most formats, which means that only some formats would be available for such a mapping unless vendor-specific extensions are used.
I do not really understand the principal point you state at the beginning of your post. I guess I should be able to - as an example - render to a texture with a well-defined format and use the pixels as vertex attributes without the need to copy them. I do not know anything too specific about GPUs of course. But seeing APIs like OpenCL being widely available it simply cannot be that those things would not be possible to do. Of course I don’t know what you should or shouldn’t do; I’m not familiar with the specific needs of your business. As a developer using an API one can wonder about the obvious absence of basic functionality or one does not. One size fits all is an illusion, in the sense that a piece of software cannot dictate its users’ needs. What it can dictate, of course, are the usage patterns resulting from its design. Again the question is if the API should be designed in a way that ensures optimal performance at the cost of usability.

I guess I should be able to - as an example - render to a texture with a well-defined format and use the pixels as vertex attributes without the need to copy them.

I’ll ignore the question of why you would even want to do that these days with transform feedback, image load/store, and SSBOs available. So instead I’ll focus on how that would work.

OpenGL has no concept of a “well-defined format”. It has formats of particular pixel sizes. But that says nothing about the important questions of swizzling, internal storage row alignment, and so forth. So you want to now expand image formats into being able to answer and control these questions? How would that work? And what about hardware that can’t implement certain combinations of stuff?

But seeing APIs like OpenCL being widely available it simply cannot be that those things would not be possible to do.

I admit that I’m not exactly up on OpenCL, but I’m pretty sure that OpenCL images can’t do what you’re wanting either. The OpenCL concept of buffers is different from image buffers, just like the OpenGL concept of buffer objects is different from textures. You can’t shove an image buffer in OpenCL when a buffer pointer is expected, and vice versa.

So I’m not seeing your point.

As a developer using an API one can wonder about the obvious absence of basic functionality or one does not

The ability to pretend that an image is a buffer object is not “basic functionality” by any reasonable definition of that term.

Again the question is if the API should be designed in a way that ensures optimal performance at the cost of usability.

Usability is in the eye of the beholder. And not being able to use textures for sources of vertex data is hardly limiting in terms of usability. And yes, a well-designed performance API should be designed in a way that prevents you from doing things that lower performance needlessly.

Also, this conversation is very confusing. We’ve gone from a fairly simple, not-entirely-unreasonable request to be able to read parts of images back to nonsense like binding images as buffer objects and handing OpenGL random pointers that it’s expected to use as buffers and images. These things have nothing to do with one another.

I would call this a wise decision.

It has formats of particular pixel sizes. But that says nothing about the important questions of swizzling, internal storage row alignment, and so forth.

You must have selectively forgotten about the sized internal formats. Granted, and this is a cheap shot, they do not include restrictions on row alignment, which seems to make things impossible for you to even imagine.

I admit that I’m not exactly up on OpenCL, but I’m pretty sure that OpenCL images can’t do what you’re wanting either.

What? Being a sequence of numbers that define colors? You must be kidding.

OpenGL concept of buffer objects is different from textures

If textures aren’t series of numbers in one’s view - of course.

The ability to pretend that an image is a buffer object is not “basic functionality” by any reasonable definition of that term.

Granted, that was aimed at GetTexSubImage, which is all too obviously missing when you look at its counterpart.

Usability is in the eye of the beholder. And not being able to use textures for sources of vertex data is hardly limiting in terms of usability.

Maybe that’s the case for you - which makes your statement a little too contradictory for my taste.

And yes, a well-designed performance API should be designed in a way that prevents you from doing things that lower performance needlessly.

In your notion of ‘needlessly’ you seem to cancel out the time it takes to write a code path around those holes in the definition. The notion of a performance API makes your standpoint even clearer. Maybe this is right for you, which seems a little strange, as I had the feeling you were reading the GL spec at breakfast, so that a lack of knowledge about certain things could hardly be a trap for you performance-wise.

And yes, one person’s nonsense is another’s labor saving. It comes down to what you make of it.

You must have selectively forgotten about the sized internal formats.

I said “formats of particular pixel sizes”. Sized internal formats only describe the sizes of pixels, not the arrangement of pixels in memory. And without being able to control that, you can’t use them as buffer objects, since the arrangement of the data in the texture is not well specified by the API.
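To illustrate the distinction: the same 32-bit texel can live at very different byte offsets depending on how the image is arranged in memory. A minimal sketch comparing a plain row-major layout with a Morton (Z-order) layout. Both are illustrative assumptions only; actual GPU layouts are implementation-specific and not exposed by the API, which is the point being made.

```c
#include <stdint.h>

/* Byte offset of texel (x, y) in a tightly packed row-major image. */
static uint32_t offset_linear(uint32_t x, uint32_t y,
                              uint32_t width, uint32_t bytes_per_texel)
{
    return (y * width + x) * bytes_per_texel;
}

/* Spreads the low 16 bits of v so they occupy the even bit positions. */
static uint32_t spread_bits(uint32_t v)
{
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

/* Byte offset of texel (x, y) in a Morton (Z-order) layout, where the bits
   of x and y are interleaved to form the texel index. */
static uint32_t offset_morton(uint32_t x, uint32_t y, uint32_t bytes_per_texel)
{
    return (spread_bits(x) | (spread_bits(y) << 1)) * bytes_per_texel;
}
```

For an 8-texel-wide image with 4 bytes per texel, texel (3, 2) sits at byte 76 in the linear layout and at byte 52 in the Morton one, even though the sized internal format would be identical in both cases.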

What? Being a sequence of numbers that define colors? You must be kidding.

Just because you believe that a texture is “a sequence of numbers that define colors” doesn’t mean that an API agrees, OpenGL or OpenCL. You can think of them as that all you want. That will not change the objective reality of the situation (FYI: they are not), nor will it change the objective definitions of OpenGL and OpenCL’s APIs.

Granted, that was aimed at GetTexSubImage, which is all too obviously missing when you look at its counterpart.

… huh? It’s hard to have a discussion when you keep jumping back and forth between different points. Which idea for OpenGL functionality are we talking about: your desire to use textures as buffer objects, your desire to just hand them a pointer and expect textures to work with that as their storage, or your desire to read an arbitrary region from a texture? Because you’ve mentioned all of these in this thread.