Hi!
I have posted on NVIDIA forums about this, but wanted to ask here as well: is the memory size for vkAllocateMemory forcibly truncated to 32 bits on Windows, and if so, why?
We use Vulkan to process large amounts of data for scientific applications. The data sets are usually much larger than the available VRAM: our systems typically have 64-128 GB of RAM and 8-12 GB of VRAM, while the data to be processed is usually about 32-64 GB. Some algorithms require all the data to be accessible to the GPU at the same time. Obviously, such access goes over the PCIe bus, which is not very fast, but it is acceptable for our purposes.
To achieve such behavior in CUDA we formerly used pinned memory and everything worked fine. Now we have moved to Vulkan and tried more or less the same approach - allocating aligned memory in RAM by OS functions and then using VK_KHR_external_memory extension to make it accessible to the GPU via the PCIe bus. To access this memory from the compute shader we use uint64_t addresses (provided by VK_KHR_buffer_device_address extension), so the buffer range limitations are not a problem for us.
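For reference, the host-side allocation step can be sketched roughly as below. This is only an illustration, not our actual code: it assumes the import path imposes a power-of-two alignment on both the pointer and the size (as, for example, minImportedHostPointerAlignment does for VK_EXT_external_memory_host, which is an assumption about the setup), and the names align_up and alloc_importable are hypothetical. It uses C11 aligned_alloc, so on Windows a different OS function would be needed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Round size up to the next multiple of alignment.
 * alignment must be a non-zero power of two. */
static uint64_t align_up(uint64_t size, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: allocate host memory whose address and size are
 * both multiples of alignment, so it could later be handed to an import
 * call. Assumes a 64-bit size_t. The padded size is returned through
 * padded_out; the caller releases the memory with free(). */
static void *alloc_importable(uint64_t requested, uint64_t alignment,
                              uint64_t *padded_out)
{
    uint64_t padded = align_up(requested, alignment);
    *padded_out = padded;
    return aligned_alloc((size_t)alignment, (size_t)padded);
}
```

The padded size, not the originally requested one, is what would then be passed as allocationSize when importing.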
On Linux this approach actually works fine and we can access any amount of memory up to the total amount of RAM available, although the validation layers give the following message:
vkAllocateMemory(): pAllocateInfo->allocationSize (7516192768) is larger than maxMemoryAllocationSize (4292870144). While this might work locally on your machine, there are many external factors each platform has that is used to determine this limit. You should receive VK_ERROR_OUT_OF_DEVICE_MEMORY from this call, but even if you do not, it is highly advised from all hardware vendors to not ignore this limit.
It is our opinion that, logically, the allocation size limit should not apply in this case (that is, the validation layers should not report anything), since we’ve already allocated the memory, and want to merely map it for access from the GPU.
On Windows we observed rather strange behavior. A maximum of 4 GB of memory can be imported, and the allocation size seems to be truncated to 32 bits. For example, if 7 GB of memory is allocated and then imported, the vkAllocateMemory call that performs the import returns no error, but only 3 GB can then be accessed from the GPU. If 5 GB is allocated, only 1 GB can be accessed, and so on. This makes us think that only the lower 32 bits of the requested size are used for the import and the upper 32 bits are ignored, even though allocationSize is a 64-bit value. In our opinion this is unrelated to maxMemoryAllocationSize (although maxMemoryAllocationSize is also 4 GB in our case). We are aware that on Windows only 50% of RAM can be mapped to the GPU, but the amount of memory we are trying to import in our tests does not exceed this limit, so the 50% limitation is not related to our problem either.
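To make the arithmetic behind this guess concrete, here is a minimal sketch in plain C (no Vulkan calls). It only shows that keeping the low 32 bits of a 64-bit size reproduces the numbers we see; it does not prove that the driver actually does this. The names GIB, low_32_bits and check_observations are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* One gibibyte (2^30 bytes) as a 64-bit constant. */
#define GIB (UINT64_C(1) << 30)

/* Size that remains if only the lower 32 bits of a 64-bit size are kept. */
static uint64_t low_32_bits(uint64_t size)
{
    return size & UINT64_C(0xFFFFFFFF);
}

/* Check that 32-bit truncation matches the sizes observed on Windows. */
static void check_observations(void)
{
    assert(low_32_bits(7 * GIB) == 3 * GIB); /* 7 GB requested, 3 GB usable */
    assert(low_32_bits(5 * GIB) == 1 * GIB); /* 5 GB requested, 1 GB usable */
    /* The exact size from the validation message (7 GiB in bytes). */
    assert(low_32_bits(UINT64_C(7516192768)) == UINT64_C(3221225472));
}
```

If this guess is right, a request of exactly 4 GB would truncate to zero bytes, which might be a useful extra test case.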
On Linux (Ubuntu 22.04) we use the NVIDIA 570 proprietary drivers. On Windows we use the latest Game drivers installed by NVIDIA Center (as of 29.04.2025, version 576.02). Do you know the reason for this behavior? Can anything be done about it?