vkCmdFillBuffer and its use case

Hello,
I was playing with learning the stuff I can do with buffers, and I tried the fill buffer command.
But I got odd results. Maybe because I was using Python, but…

When I made the VkBuffer I performed the fill on, I passed in a numpy.array of type float32 as the src for the memory move (memcpy). Then I did another move/copy into a new zeroed numpy.array of float32 and printed that to the console to test whether the VkDeviceMemory and the initial np.array match. And they did. Then I tested the fill command, and since I noticed the spec says it only takes a uint32_t for data, I gave it that.
Well, when I moved/copied the VkDeviceMemory again, I got 0s where I specified the fill to happen. So I tried to figure out what happened, and long story short: if I set an array of int32 as the dst of the move I can see the int values of the fill, but then the floats are all messed up. If I use a float32 array I get my floats, but the fill is just 0s (or actually a super small decimal number). I suppose in terms of bits the data is there. I was printing the result in bits and I got my 1s and 0s. Now, to my knowledge ints and floats might both be 4 bytes but their bits are ordered differently, thus I was wondering:

Does vkCmdFillBuffer work only with data that will be read as int32?

And in the case of a buffer that will be read as float32, is there any use case except filling a desired area with 0s?

Also considering that we can map a specific region of a buffer’s memory and directly move a prefilled buffer with the number we desire, of the type we desire…is there any use for Fill Buffer?

VkBuffer is an array of bytes. Nothing more, nothing less. It is untyped for all intents and purposes. It will read back whatever you write to it.
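To make that concrete, here is a small sketch in plain Python. There is no Vulkan in it: a bytearray stands in for the buffer memory, the slice assignment only imitates what a fill would do, and all the names are made up for illustration. It reproduces the "super small decimal number" from the question: the same four bytes are 5 when read as uint32 and a tiny denormal when read as float32.

```python
import struct

# Hypothetical stand-in for a 16-byte VkBuffer: just raw bytes,
# initially holding four float32 values.
buffer_bytes = bytearray(struct.pack("<4f", 1.0, 2.0, 3.0, 4.0))

# Imitate a fill of the two middle 32-bit words with the uint32 value 5.
buffer_bytes[4:12] = struct.pack("<I", 5) * 2

# Read the very same bytes back under two different interpretations.
as_floats = struct.unpack("<4f", buffer_bytes)
as_uints = struct.unpack("<4I", buffer_bytes)

print(as_floats)  # the filled words show up as tiny denormal floats
print(as_uints)   # the untouched floats show up as large integers
```

Neither view is wrong. The bytes are the same, and only the type chosen on the reading side differs.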

You don’t have to “figure out what happens”. The specification already tells you what happens.

Mapping requires mappable (host-visible) memory. Writing to mappable memory (barring some hardware bus compression) requires all of the data to move across the bus. vkCmdFillBuffer does not require mappable memory, and replicating a single value over the whole buffer might not require moving it N times over the bus. Additionally, vkCmdFillBuffer is an asynchronous, recorded command. It lets execution stay on the GPU with no interaction with the CPU. For example, if there are two uses of the buffer in sequence that each require the buffer to be zeroed at the start, then two fills can simply be submitted to the queue. Doing the same with mapped memory would require synchronization with the host.

Thanks @krOoze, I think this bit sums it up nicely:

The error in my understanding was that buffers have types. But now that I look at it, it totally makes sense, and it explains my results. Actually I reached that conclusion myself with my “figuring out”. I just needed a confirmation.
In that light, yes, the spec tells me what happens (although not explicitly). But it makes sense only if one knows that the buffer is untyped. A bit in my defense, the spec says for VkBuffer:

Nowhere does it say it’s untyped or just bytes. I suppose the part where it says “used for various purposes” is hinting at that. But now I know, so thanks for clearing this up.

On the use cases: looking at your comment, can I assume that the best use case for vkCmdFillBuffer is zeroing parts of a buffer, since the only value whose bytes or bits are the same for both floats and ints is 0?

Thanks for explaining a use case where vkCmdFillBuffer will be better than mapping the memory.

A specification generally does not define things by the negative. It does not list all the things something isn’t; it says directly what something is. If buffers were typed, the specification would say so, and there would be a parameter to vkCreateBuffer to set the type (but there is only a size in bytes). Nevertheless there is this:

Buffers are essentially unformatted arrays of bytes whereas images contain format information, can be multidimensional and may have associated metadata.

Yes, it can be used for zeroing the buffer, just as well as for setting it to ones, writing a debug pattern, or anything else that can be achieved by repeating a uint32_t value.
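That includes float values: since the command just repeats a 32-bit pattern, one way to fill a region with some float32 value is to pass that float's bit pattern as the uint32_t data argument. A minimal sketch of the conversion in plain Python (the helper name is hypothetical and not part of any Vulkan binding):

```python
import struct

def float_to_fill_word(value: float) -> int:
    """Return the uint32 whose bits equal the IEEE 754 float32 encoding of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]

def fill_word_to_float(word: int) -> float:
    """Inverse view: interpret a uint32 bit pattern as a float32."""
    return struct.unpack("<f", struct.pack("<I", word))[0]

one_bits = float_to_fill_word(1.0)
print(hex(one_bits))                  # 0x3f800000
print(fill_word_to_float(one_bits))   # 1.0
```

Passing 0x3f800000 as the fill data would then read back as 1.0 from a float32 view of the buffer, assuming the reader uses the same byte order as the device.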

The only limitation is that the buffer size needs to be a multiple of 4; otherwise the excess bytes at the end are unreachable by the command.
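As a rough illustration of that limitation, here is a sketch of how the reachable size works out. It assumes the rules as I read them in the spec: dstOffset and size must be multiples of 4, and VK_WHOLE_SIZE rounds the remaining range down to the nearest multiple of 4. The helper itself is hypothetical and only does the arithmetic; it does not call Vulkan.

```python
# Value of VK_WHOLE_SIZE (~0ULL) from the Vulkan headers.
VK_WHOLE_SIZE = 0xFFFFFFFFFFFFFFFF

def effective_fill_size(buffer_size: int, dst_offset: int,
                        size: int = VK_WHOLE_SIZE) -> int:
    """Number of bytes a fill would actually touch (hypothetical helper)."""
    if dst_offset % 4 != 0:
        raise ValueError("dstOffset must be a multiple of 4")
    if dst_offset >= buffer_size:
        raise ValueError("dstOffset must be less than the buffer size")
    remaining = buffer_size - dst_offset
    if size == VK_WHOLE_SIZE:
        # The remaining range is rounded down to a multiple of 4.
        return remaining - remaining % 4
    if size == 0 or size % 4 != 0:
        raise ValueError("size must be a non-zero multiple of 4")
    if size > remaining:
        raise ValueError("size must fit between dstOffset and the buffer end")
    return size

print(effective_fill_size(10, 0))  # 8: the last 2 bytes are unreachable
```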


Ah, I suppose when I read the spec from the PDF I glossed over that tiny bit and it didn’t stick.
And when I revisited the spec, I looked at the VkBuffer page that comes up when I search for the structure in a search engine. Somehow the full spec page just doesn’t load for me. Sometimes it does, but most of the time the page hangs on “loading”, and instead of waiting I go to the page I get from the search engine, since it’s faster. I guess I had the wrong impression that the proper description of what a VkBuffer is wouldn’t be outside of the page dedicated to the structure. I’ll take note to check the full spec as well from now on.

Anyway, thanks for clearing things up and pointing to good uses for the fill. I actually found it quite useful for reusing a buffer in my exercises.