Recording Command Buffers every frame slow as hell

Twanks123 · May 30, 2016, 5:27am

Hello guys,
I implemented view-frustum culling in my Vulkan rendering engine and realized that, for this to work, I need to constantly rebuild my command buffers.
Until now I pre-recorded them, but as far as I know this is not possible with view-frustum culling. I have to constantly check whether every object is within the view frustum, right?
It works, but the performance becomes ridiculously bad the more objects I have in the scene (for example, ~100 objects gives 100 FPS, versus ~4400 FPS without rebuilding).
Do you know what I'm doing wrong? Do I need a separate thread for rebuilding the command buffers, since it seems like the CPU is the bottleneck and not the GPU?

Thanks in advance

ratchet_freak · May 30, 2016, 7:25am

[QUOTE=Twanks123;40308]Hello guys,
I implemented view-frustum culling in my Vulkan rendering engine and realized that, for this to work, I need to constantly rebuild my command buffers.
Until now I pre-recorded them, but as far as I know this is not possible with view-frustum culling. I have to constantly check whether every object is within the view frustum, right?
It works, but the performance becomes ridiculously bad the more objects I have in the scene (for example, ~100 objects gives 100 FPS, versus ~4400 FPS without rebuilding).
Do you know what I'm doing wrong? Do I need a separate thread for rebuilding the command buffers, since it seems like the CPU is the bottleneck and not the GPU?

Thanks in advance[/QUOTE]

Or group objects into “tiles” and prerecord those tiles in secondary command buffers. Then you frustum test against the bounding box of the tile.
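
A minimal sketch of the per-tile test described above, in plain C++ with no Vulkan calls. The names `Plane`, `Aabb`, `Frustum`, `tileVisible`, and `visibleTiles` are hypothetical, and the sketch assumes the six frustum planes are already available with their normals pointing into the frustum. Extracting the planes from the camera matrices is left out.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Plane in the form dot(n, p) + d = 0. Assumption: n points into the frustum.
struct Plane { float nx, ny, nz, d; };

// Axis-aligned bounding box enclosing all objects of one tile.
struct Aabb { float min[3]; float max[3]; };

using Frustum = std::array<Plane, 6>;

// Conservative test: returns false only when the box lies entirely
// behind at least one frustum plane.
bool tileVisible(const Aabb& box, const Frustum& frustum) {
    for (const Plane& p : frustum) {
        // Pick the box corner that lies farthest along the plane normal.
        const float x = p.nx >= 0.0f ? box.max[0] : box.min[0];
        const float y = p.ny >= 0.0f ? box.max[1] : box.min[1];
        const float z = p.nz >= 0.0f ? box.max[2] : box.min[2];
        // If even that corner is behind the plane, the whole box is outside.
        if (p.nx * x + p.ny * y + p.nz * z + p.d < 0.0f) {
            return false;
        }
    }
    return true;
}

// Returns the indices of the tiles that pass the frustum test this frame.
std::vector<std::size_t> visibleTiles(const std::vector<Aabb>& tiles,
                                      const Frustum& frustum) {
    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < tiles.size(); ++i) {
        if (tileVisible(tiles[i], frustum)) {
            result.push_back(i);
        }
    }
    return result;
}
```

The returned indices would select which prerecorded secondary command buffers to pass to `vkCmdExecuteCommands` when recording the primary command buffer, so the per-frame recording work scales with the number of tiles rather than the number of objects.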

krOoze · May 30, 2016, 7:32am

So, is the CPU a bottleneck or not? Profile.

BTW, how do you do that? Are you using some hierarchical data structure at least? Do you do it for all triangles or only for objects?

It should be possible to prebake command buffers in a lot of cases. It’s just annoying sometimes. I haven’t really tried that much yet… You can control a lot by (not) submitting a CMB, and also by changing the parts that can change (memory, etc.).

Alfonse_Reinheart · May 30, 2016, 7:43am

Also, when you took performance measurements, did you turn off your debugging layers?

Twanks123 · May 30, 2016, 7:46am

[QUOTE=krOoze;40310]So, is the CPU a bottleneck or not? Profile.

BTW, how do you do that? Are you using some hierarchical data structure at least? Do you do it for all vertices or only for objects?
BTW, a lot of those can be done on the GPU too, if you need the speed.

It should be possible to prebake command buffers in a lot of cases. It’s just annoying sometimes. I haven’t really tried that much yet… You can control a lot by (not) submitting a CMB, and also by changing the parts that can change (memory).[/QUOTE]

I do it for every object in the scene. So far I have one command buffer for each swapchain image, but controlling what gets rendered by (not) submitting a CMB seems like a good solution to me.
I will try to use a CMB for every object in the scene and only submit those that are visible. This could work, and all CMBs would be prebaked. Or I group objects into tiles, make a CMB only for those, and check them against the frustum.
What do you think? Sascha Willems used one command buffer for each object in his “Multithreading” example with view-frustum culling. Is there any disadvantage to having thousands of CMBs?

Twanks123 · May 30, 2016, 8:04am

I totally forgot about that! I tested it now, and in release mode with debugging layers disabled I only lose around 100 FPS when rebuilding the command buffers at a fixed time interval (60x per second).
For the time being I will leave it at that. Losing 100 FPS is not a big deal.

ratchet_freak · May 30, 2016, 8:50am

[QUOTE=Twanks123;40313]I totally forgot about that! I tested it now, and in release mode with debugging layers disabled I only lose around 100 FPS when rebuilding the command buffers at a fixed time interval (60x per second).
For the time being I will leave it at that. Losing 100 FPS is not a big deal.[/QUOTE]

Invert the FPS numbers please; milliseconds per frame is a much more direct and intuitive number to reason about.
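
As a rough illustration of that conversion, here is a small sketch with hypothetical helper names (`frameTimeMs`, `addedCostMs`). With the numbers from the first post, 4400 FPS is about 0.23 ms per frame and 100 FPS is 10 ms per frame, so rebuilding cost roughly 9.8 ms per frame in that configuration. The baseline for the later "100 FPS lost" figure is not stated in the thread; if it were 4400 FPS (an assumption), the drop to 4300 FPS would be only about 0.005 ms per frame, whereas the same 100 FPS lost from a 200 FPS baseline would be 5 ms.

```cpp
// Frame time in milliseconds for a given frame rate.
constexpr double frameTimeMs(double fps) {
    return 1000.0 / fps;
}

// Extra milliseconds per frame when the frame rate drops
// from fpsBefore to fpsAfter.
constexpr double addedCostMs(double fpsBefore, double fpsAfter) {
    return frameTimeMs(fpsAfter) - frameTimeMs(fpsBefore);
}
```

The same FPS difference maps to very different per-frame costs depending on the baseline, which is why the millisecond figure is easier to compare.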