Allocating a buffer of more than 2GB?

Hi,
Is there any way to create a buffer of more than 2GB with OpenGL?
Given that most graphics cards usually have more than 4GB or even 8GB of memory, it seems weird to me that OpenGL reports a limit of 2GB for the following fields (using a recent nVidia card):

GL_MAX_TEXTURE_BUFFER_SIZE
GL_MAX_SHADER_STORAGE_BLOCK_SIZE

Isn’t there a way to create a very large, single, contiguous buffer?

Thanks,
Fred

The size of a buffer allocation is not the same thing as the size of data that a shader can access. The former is the largest size you can give to glBufferData (i.e., the allocation size). The latter is governed by the limitation you’re talking about. You can (probably?) allocate more than 2GB in a buffer; you simply can’t glBindBufferRange all of it for SSBO usage.
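
Roughly, the distinction looks like this in code (a sketch only; the 3GB size and binding index 0 are placeholders, and a GL 4.3+ context with a loader such as GLEW or glad is assumed):

// Allocating the storage: limited only by what the driver/OS will give you.
GLuint buf;
glGenBuffers(1, &buf);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, buf);
glBufferData(GL_SHADER_STORAGE_BUFFER, (GLsizeiptr)3 << 30, NULL, GL_STATIC_DRAW);

// Exposing it to a shader: the range bound to one SSBO index is what
// GL_MAX_SHADER_STORAGE_BLOCK_SIZE is about, so stay within that limit here.
GLint64 maxBlock = 0;
glGetInteger64v(GL_MAX_SHADER_STORAGE_BLOCK_SIZE, &maxBlock);
glBindBufferRange(GL_SHADER_STORAGE_BUFFER, 0, buf, 0, (GLsizeiptr)maxBlock);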

Though it’s interesting to note that many Vulkan implementations allow you to use more than 2GB for SSBOs. I have no idea why.

Good question. If there is, I don’t know about it.

An interesting, related question is which desktop GPUs support indexing buffers on the GPU with 64-bit offsets and addresses.

Related to NVIDIA and 64-bit shader access into buffers, there appears to be a tie-in with the presence of NV_gpu_shader5 support and 64-bit integer support in shaders. That support seems to go back to at least the GTX 4xx days (Fermi). A rough sketch of what that access looks like follows the extension list below.

NV_shader_buffer_store (3/2010)

NV_shader_buffer_load (8/2009)

NV_vertex_buffer_unified_memory (6/2009)
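
For completeness, a very rough, untested sketch of what access through those extensions looks like (function and enum names are from the extension specs; ‘buf’ and ‘prog’ stand in for an existing buffer and linked program):

// Requires NV_shader_buffer_load; GPU addresses are 64-bit, so the 2GB
// range limits of SSBO/UBO binding points don't come into play.
GLuint64EXT gpuAddr = 0;
glBindBuffer(GL_ARRAY_BUFFER, buf);
glMakeBufferResidentNV(GL_ARRAY_BUFFER, GL_READ_ONLY);
glGetBufferParameterui64vNV(GL_ARRAY_BUFFER, GL_BUFFER_GPU_ADDRESS_NV, &gpuAddr);

// Pass the raw address to the shader; with NV_gpu_shader5 the GLSL side can
// declare something like:  uniform ivec4 *data_ptr;  and index it directly.
glUseProgram(prog);
glUniformui64NV(glGetUniformLocation(prog, "data_ptr"), gpuAddr);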

Thanks for your replies.

Dark Photon, using the old NV* extension functions (good find!) may work, but it’s impractical in my situation.

Alfonse, you identified something I hadn’t thought about. There are three things:

  • the allocation size
  • the glBindBufferRange maximum size
  • the size accessible by shaders

While the allocation size can indeed go beyond 2GB (hmmm, need to double check that…), and while I may work around the glBindBufferRange limitation (easy enough, by calling the function multiple times), doesn’t GL_MAX_SHADER_STORAGE_BLOCK_SIZE limit the maximum size my shader will have access to? For this, I can’t do anything, can I?
What do you think?

These are the same thing. When you call glBindBufferRange, you are saying “the shader can use this range of this memory”. You’re not allowed to call this with a size greater than what the shader can access.

No, you can’t.

If you call glBindBufferRange to the same index of the same binding target, you will be overwriting that index with a new buffer range. If you use a different index, that means you have two SSBOs in the shader, since each index maps to a specific SSBO in your shader.

Ok, so in this case, assuming I somehow manage to allocate more than 2GB, how could I concretely map the buffer contents for updating? Quite frankly, I don’t see how.

All at once from a single SSBO in your shader? You can’t.

You can bind any 2GB portion of the buffer’s storage to any (appropriate) SSBO binding point. But the size of the range for any binding call must be within the limitation you cited.
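
A sketch of what that looks like with a single buffer split across two binding points (the split sizes and block names are just illustrative; the offset must respect GL_SHADER_STORAGE_BUFFER_OFFSET_ALIGNMENT):

// One buffer bound twice: each range stays within the block-size limit.
// 'buf' is the existing buffer of totalSize bytes (assumed > 2GB).
GLsizeiptr half = totalSize / 2;
glBindBufferRange(GL_SHADER_STORAGE_BUFFER, 0, buf, 0, half);
glBindBufferRange(GL_SHADER_STORAGE_BUFFER, 1, buf, half, totalSize - half);

// GLSL side (sketch): two blocks, one per binding point.
//   layout(std430, binding = 0) buffer LoHalf { ivec4 lo[]; };
//   layout(std430, binding = 1) buffer HiHalf { ivec4 hi[]; };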

OK, great, thank you, that seems like a workable path. I will definitely try that. I won’t avoid having two binding points, but I can avoid having two different buffers.
Thanks again Alfonse and Dark Photon.

I have just noticed that Intel HD adapters have a limit of 128MB for GL_MAX_SHADER_STORAGE_BLOCK_SIZE and a limit of 16 binding points. That’s a maximum of 16*128MB=2GB worth of memory accessible by a shader at any one time.
There seems to be nothing that I can do to access more than 2GB of memory from a shader on Intel adapters, which is a shame.
I noticed the same limitation applies for Vulkan.
The only way to use more memory from GPU code is basically to use OpenCL (and forget about rasterization of course).
:(
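
For reference, the limits in question can be queried like this (a minimal sketch; the variable names are mine):

// Needs <stdio.h> for printf.
GLint64 maxBlockSize = 0;
GLint maxBindings = 0, maxTexBufTexels = 0;
glGetInteger64v(GL_MAX_SHADER_STORAGE_BLOCK_SIZE, &maxBlockSize);
glGetIntegerv(GL_MAX_SHADER_STORAGE_BUFFER_BINDINGS, &maxBindings);
glGetIntegerv(GL_MAX_TEXTURE_BUFFER_SIZE, &maxTexBufTexels);   // counted in texels, not bytes
printf("block size: %lld bytes, SSBO bindings: %d, texel buffer: %d texels\n",
       (long long)maxBlockSize, maxBindings, maxTexBufTexels);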

Hmmm. I wonder…

From: https://opengl.gpuinfo.org/

I meant, “There seems to be nothing that I can do to access more than 2GB of read/write memory from a shader on Intel adapters, which is a shame.”
GL_MAX_TEXTURE_BUFFER_SIZE is 134217728 on my Intel HD Graphics 520 adapter, meaning I can access only up to 512MB of read/write memory through texture buffers.

https://www.khronos.org/registry/OpenGL/extensions/ARB/ARB_shader_storage_buffer_object.txt

Here is what the spec says:

“The total amount of buffer object storage that can be accessed in any shader storage block is subject to an implementation-dependent limit. The maximum amount of available space, in basic machine units, can be queried by calling GetIntegerv with the constant MAX_SHADER_STORAGE_BLOCK_SIZE. If the amount of storage required for any shader storage block exceeds this limit, a program will fail to link”

Also of interest, this remark from user BDO on stackoverflow:

“Since you do not declare the shader storage block with a fixed size, I wouldn’t know how the glsl compiler should check for this”

I’m not sure how that’s interesting. He’s just pointing out that the shader compiler cannot verify a limitation determined at runtime.

Ignoring GL_MAX_SHADER_STORAGE_BLOCK_SIZE, like the OP in the stackoverflow thread, I did some tests. Just to be precise, my SSBO is declared with an unsized array of ivec4’s (i.e., { ivec4 data[]; } my_data); a fuller shader sketch follows the list below.

  1. on a GeForce GTX 1080, I can indeed allocate a 3GB buffer, glMapBufferRange it, and read the buffer in my shader. ‘int’ indexing works through my_data.data[index][offset] and lets me read the whole 3GB (that’s the advantage of ivec4’s, obviously, i.e. to access up to 8GB)
  2. on an Intel HD Graphics 520, I tried to do the same with a 1GB buffer, and it works. It failed with an OpenGL out of memory error 1285 with a 2GB buffer when calling glTexImage2D, so that’s an unrelated error.
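
For reference, a minimal version of the test shader, as mentioned above (everything except the { ivec4 data[]; } my_data block is illustrative):

// Fragment shader used for the read test, as a C string literal.
static const char *fs_src =
    "#version 430\n"
    "layout(std430, binding = 0) buffer Data { ivec4 data[]; } my_data;\n"
    "uniform int index;\n"   // which ivec4 to read
    "uniform int offset;\n"  // which of its 4 components (0..3)
    "out vec4 color;\n"
    "void main() {\n"
    "    int v = my_data.data[index][offset];\n"
    "    color = vec4(float(v));\n"
    "}\n";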

So this makes me think that either:

  • we here do not understand the limitations and spec wording for values such as GL_MAX_SHADER_STORAGE_BLOCK_SIZE,
  • the OpenGL spec is unreadable or impossible to understand,
  • or both Intel and nVidia fail to honor the limitations advertised by their implementations, which seems unlikely to me.

After seeing my answer to the question you linked to, I did some digging into the standard. I found “Table 6.5: Indexed buffer object limits and binding queries”. In the section of the table for SSBOs, it specifically says:

size restriction: none

As such, what you’re doing is perfectly valid. It seems that the size restriction is only for the static size of an SSBO’s block definition… in OpenGL.

It should be noted again that Vulkan does have a specific limitation on the buffer range you can use. So it seems clear that despite the OpenGL specification, there is a hardware limitation on the appropriate range.

If it’s at all possible, I would suggest doing something weird. Check what the Vulkan limitation is (just by querying info from the VkPhysicalDevice; no need to create a VkDevice) in your OpenGL program and adhere to that.
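
Something along these lines, roughly (error handling and instance extensions omitted; only the first enumerated GPU is queried):

#include <vulkan/vulkan.h>
#include <stdio.h>

int main(void)
{
    VkInstanceCreateInfo ici = { .sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO };
    VkInstance inst;
    vkCreateInstance(&ici, NULL, &inst);

    uint32_t count = 1;
    VkPhysicalDevice dev;
    vkEnumeratePhysicalDevices(inst, &count, &dev);   // just grab the first GPU

    VkPhysicalDeviceProperties props;
    vkGetPhysicalDeviceProperties(dev, &props);
    printf("maxStorageBufferRange = %u\n", props.limits.maxStorageBufferRange);

    vkDestroyInstance(inst, NULL);
    return 0;
}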

I saw the discussion on GitHub as well:

I did a few more tests on my Intel laptop:

  1. maxStorageBufferRange is 4294967295 on Vulkan (interestingly, maxTexelBufferElements remains at 134217728, but we don’t care here)
  2. I tried to compile and link a program with a huge static SSBO array size (e.g. { ivec4 data[100000000]; }, which is about 1.6GB of data) and it succeeds. glLinkProgram just takes an enormous amount of time to complete, but does not complain at all. This means that the Intel implementation does not even look at GL_MAX_SHADER_STORAGE_BLOCK_SIZE… I wonder where this limit is actually checked in their code…

Do you know what PDaniell means by “Accesses beyond this limit are covered under the out-of-bounds access behavior defined in the GL spec”?
How can I deterministically, programmatically check that buffer accesses are working from the shader? Shall I enable robust buffer access? Theoretically, if I enable robust buffer access, the implementation must return 0 for data read operations beyond either specification or hardware limits, right?
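
One quick way to check whether the context you ended up with actually has robust access enabled (a sketch; requires GL 4.5 or KHR_robustness):

GLint flags = 0;
glGetIntegerv(GL_CONTEXT_FLAGS, &flags);
if (flags & GL_CONTEXT_FLAG_ROBUST_ACCESS_BIT) {
    // With robust buffer access, out-of-bounds reads are defined to return
    // zero or values from elsewhere within the bound buffer, and the access
    // cannot terminate the application.
}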

On a GTX 1080 with drivers 471.41, attempting to compile and link a fragment shader with:

{ ivec4 data[67108864]; } my_data;

causes a crash. The glLinkProgram call does return, but calling glGetError afterwards (or glGetProgramiv with GL_LINK_STATUS) results in a crash. I am sure no errors are reported earlier, because I call glGetError after every single GL call in my code.

67108864*16=1,073,741,824 (1GB of data). Doing this should work, because the nVidia driver reports GL_MAX_SHADER_STORAGE_BLOCK_SIZE=2,147,483,647.

I just tested with robustness enabled in the GL context, and it does not change anything.

Similar behavior here on NVIDIA, though for me glLinkProgram() hangs for 28 seconds and then trips an Access Violation down in the driver.

I get the same behavior whether I try to shovel this large an explicit array dimension (64M * 16 = 1GB) into a UBO or an SSBO (…ignoring whether it’s even valid to do so).

It does seem like the NVIDIA GLSL compiler should probably trip an error here rather than just hang+crash in glLinkProgram().


I do all my reading from a giant QUAD primitive, processed through a glDrawElements(). I am just making sure I don’t go beyond the various limitations involved (GL_MAX_TEXTURE_SIZE, GL_MAX_VIEWPORT_DIMS, GL_MAX_RENDERBUFFER_SIZE and the like).

Reading a 3GB SSBO from a shader seems to be no problem on nVidia hardware.

When reading a 5GB SSBO, all creation steps succeed (buffer allocation, shader compilation and linking), and glDrawElements is called and returns normally, but the shader reads unexpected data beyond exactly the 2GB mark. This makes little sense to me, since reading a 3GB buffer, as I said, works fine from start to end.