Lock API

If you make the new API return an int64, it can directly return the GPU address, giving you all the benefits of the bindless extensions.

Except that we don’t want that low-level access to things. We want to maintain the abstraction, not break it.

You’ll still need MakeBuffer[Non]Resident. The driver can’t guess which buffers you’ll need and when; you have to tell it.

But if you’re going to do that, you’re just locking the buffer with a different API call. You may as well keep the current behavior until you lock the buffer (or make it resident).
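For reference, the existing bindless flow already looks roughly like this (GL_NV_shader_buffer_load; error handling omitted, and size/data stand in for whatever you’d pass to glBufferData anyway):

// A rough sketch of the existing NV bindless flow.
GLuint buf;
glGenBuffers(1, &buf);
glBindBuffer(GL_ARRAY_BUFFER, buf);
glBufferData(GL_ARRAY_BUFFER, size, data, GL_STATIC_DRAW);

// Tell the driver the buffer must stay GPU-accessible ("resident"),
// then query the 64-bit GPU address it ended up at.
glMakeBufferResidentNV(GL_ARRAY_BUFFER, GL_READ_ONLY);
GLuint64EXT addr;
glGetBufferParameterui64vNV(GL_ARRAY_BUFFER, GL_BUFFER_GPU_ADDRESS_NV, &addr);

// Later, when the buffer may be moved again:
glMakeBufferNonResidentNV(GL_ARRAY_BUFFER);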

My API is designed to make the transition from what we currently have smooth, since the ARB isn’t interested in large rewrites of behavior. Changing the default behavior of buffer objects would likely break a lot of code.

Your Lock API looks fine, it just feels like we’re working around another issue.

It certainly seems that VAOs need to be modified so they don’t contain a reference to the buffer, but an index id instead. Then a set of buffers can be bound separately (this mirrors the DirectX 10 API).

OpenCL’s buffer management looks better. You can’t force it to reallocate the buffer; it’s one size for the entirety of its existence. It also seems like OpenCL has a better image API than OpenGL. If we could get an extension to access the fixed functionality of the GPU (rasteriser, interpolators, etc…) from OpenCL, it would make a much better API, since it matches the hardware pretty closely.
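For example, a minimal sketch (assuming a valid cl_context named context):

#include <CL/cl.h>

cl_int err;
cl_mem buf = clCreateBuffer(context, CL_MEM_READ_WRITE,
                            1024 * 1024,   // size is fixed at creation
                            nullptr, &err);
// There is no equivalent of calling glBufferData again to resize it;
// to "grow" the buffer you create a new one and release the old:
clReleaseMemObject(buf);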

Again, I really don’t think there’s anything wrong with your idea, I just think we’re digging the hole we’re in even deeper.

Regards
elFarto

It certainly seems that VAOs need to be modified so they don’t contain a reference to the buffer, but an index id instead.

What is an index id?

Also, in order for this division to matter, we would need some evidence that part of the bottleneck is the setting of the vertex format. NV_vertex_buffer_unified_memory includes the GPU address as part of the VAO state. So it would require profiling the extension to see how much longer rendering takes when using VAOs vs. changing the vertex format as needed.

Go read this, and specifically the D3D10_INPUT_ELEMENT_DESC page linked from it. The index id I’m referring to is called an “InputSlot”.

When you’ve set up the format of your slots, you call IASetVertexBuffers to attach a set of buffers to them, kind of like the way uniform blocks work in OpenGL.
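A sketch based on the D3D10 docs; device, the two buffers, and the vertex shader bytecode are assumed to already exist:

#include <d3d10.h>

// The layout describes the format once; each element's InputSlot field
// (0 and 1 here) says which vertex stream feeds that attribute.
const D3D10_INPUT_ELEMENT_DESC layoutDesc[] = {
    { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D10_INPUT_PER_VERTEX_DATA, 0 },
    { "TEXCOORD", 0, DXGI_FORMAT_R32G32_FLOAT,    1, 0, D3D10_INPUT_PER_VERTEX_DATA, 0 },
};
ID3D10InputLayout* layout = nullptr;
device->CreateInputLayout(layoutDesc, 2, vsBytecode, vsBytecodeSize, &layout);
device->IASetInputLayout(layout);

// Attach (or swap) the actual buffers without touching the layout.
ID3D10Buffer* buffers[2] = { positionBuf, texcoordBuf };
UINT strides[2] = { 12, 8 };
UINT offsets[2] = { 0, 0 };
device->IASetVertexBuffers(0, 2, buffers, strides, offsets);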

Regards
elFarto

I don’t see the need for such a remapping layer. Just because D3D10 does it that way doesn’t mean OpenGL should.

I think you are being a bit closed-minded in your ideas on this topic, Alfonse. From my point of view, the thread is losing interest.

elFarto didn’t link Direct3D to argue that because “D3D10 does it that way, OpenGL should too”. He referred to it to give an example of some kind of ‘slot’ for buffers, so that we could bind a VAO and freely change the buffers attached to it.

Bind one VAO, bind N*M array buffers, and draw N times at least.
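Something like this, where glBindVertexBufferSlot is a made-up entry point purely for illustration:

// Hypothetical sketch: the VAO stores only the format and a slot index
// per attribute; buffers are bound to the slots separately.
glBindVertexArray(vao);                    // format + slot mapping only
for (int i = 0; i < N; ++i) {
    // Swap the buffers feeding the slots without re-specifying the format.
    glBindVertexBufferSlot(0, positionBufs[i], 0 /*offset*/, 12 /*stride*/);
    glBindVertexBufferSlot(1, texcoordBufs[i], 0 /*offset*/, 8 /*stride*/);
    glDrawArrays(GL_TRIANGLES, 0, vertexCount);
}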

Bind one VAO, bind N*M array buffers, and draw N times at least.

To suggest such a thing, one needs evidence that it is beneficial. The lock API has clear benefits, based on what I pointed out in the first post. As I suggested, there needs to be some benchmarking done to see if that would be worthwhile.

It should also be noted that NVIDIA had the chance to create this separation with bindless graphics. But they did not. They specifically made the GPU address part of VAO state.
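Concretely (a sketch; vao, addr, and size assumed from earlier setup):

glBindVertexArray(vao);
glEnableClientState(GL_VERTEX_ATTRIB_ARRAY_UNIFIED_NV);
// The raw GPU address becomes the attribute source, and it is captured
// as part of the currently bound VAO's state:
glBufferAddressRangeNV(GL_VERTEX_ATTRIB_ARRAY_ADDRESS_NV, 0, addr, size);
// Binding a different VAO restores whatever address that VAO recorded,
// so buffer identity is baked into the VAO rather than a separate slot.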