Unexpected GPU Speedup when CPU is Fully Loaded on mobile devices using Vulkan

Hey everyone,

I ran into a weird performance issue while benchmarking a Vulkan-based dense matrix multiplication program on my OnePlus phone (with an Adreno™ 740 GPU).
I wrote the program in Rust using the Vulkano library. It is a very naive dense matrix multiplication program.

I used unified shared memory between CPU and GPU.

Here’s the setup:

I first ran the benchmark on the GPU alone, measuring the execution time.
Then, I ran the same GPU code while keeping all the CPU cores at 100% utilization (I spawned extra threads that kept the CPUs busy with other work) and recorded the time.
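
For anyone who wants to reproduce this, here is a minimal sketch of how the two measurements can be structured. It is not the actual benchmark code: `time_once` and `time_under_cpu_load` are hypothetical helper names, the busy loop is just one assumed way of pinning cores at 100%, and the closure passed in stands in for the Vulkano dispatch plus the wait for its completion.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Runs `work` once and returns its wall-clock duration.
/// `work` is a placeholder for "submit the GPU job and wait for it to finish".
fn time_once<F: FnOnce()>(work: F) -> Duration {
    let start = Instant::now();
    work();
    start.elapsed()
}

/// Times `work` while `n_threads` busy-looping threads keep CPU cores occupied.
/// (Assumption: a plain spin loop is used as the CPU load.)
fn time_under_cpu_load<F: FnOnce()>(n_threads: usize, work: F) -> Duration {
    let stop = Arc::new(AtomicBool::new(false));

    // Spawn the load threads; each spins until `stop` is set.
    let burners: Vec<_> = (0..n_threads)
        .map(|_| {
            let stop = Arc::clone(&stop);
            thread::spawn(move || {
                let mut x: u64 = 0;
                while !stop.load(Ordering::Relaxed) {
                    // black_box keeps the compiler from optimizing the loop away.
                    x = std::hint::black_box(x.wrapping_add(1));
                }
                x
            })
        })
        .collect();

    let elapsed = time_once(work);

    // Stop the load threads and wait for them to exit.
    stop.store(true, Ordering::Relaxed);
    for b in burners {
        b.join().expect("load thread panicked");
    }
    elapsed
}
```

Usage would be something like `time_once(|| run_gpu_matmul())` for the baseline and `time_under_cpu_load(n_cores, || run_gpu_matmul())` for the loaded case, where `run_gpu_matmul` is a stand-in for the real dispatch and `n_cores` could come from `std::thread::available_parallelism()`.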

As for the numbers, I ran the test multiple times and the results were consistent.

  • GPU alone: ~25.44 ms
  • GPU with CPU cores at max : ~9.78 ms
  • That’s about a 2.6x speedup when both the CPU and GPU are running.
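
To make the comparison concrete, here is a small sketch of how repeated timings could be summarized and the ratio computed. The helper names `median_ms` and `speedup` are made up for illustration, and using the median as the summary is an assumption, since the post doesn't say how the runs were aggregated.

```rust
/// Median of a set of timings in milliseconds (panics on an empty slice).
fn median_ms(samples: &[f64]) -> f64 {
    assert!(!samples.is_empty(), "need at least one sample");
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        // Even count: average the two middle values.
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    }
}

/// Speedup of the loaded run relative to the baseline (values above 1.0 mean faster).
fn speedup(baseline_ms: f64, loaded_ms: f64) -> f64 {
    baseline_ms / loaded_ms
}
```

With the two figures above, `speedup(25.44, 9.78)` comes out to roughly 2.6.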

I was expecting the performance to be worse when the CPU was fully loaded since both the CPU and GPU would be busy.

Anyone have any clue why this might be happening? Is this something related to how Android handles CPU/GPU workloads? Or could it be some Vulkan quirk on mobile hardware?

Any insights would be much appreciated!