What does the GPU do between the last call to gl.drawArrays() and gl.readPixels()?


I posted this to Stack Overflow but got no answers, and it is still a mystery to us. (https://stackoverflow.com/questions/52124738/what-happens-in-the-gpu-between-the-call-to-gl-drawarrays-to-g-readpixels)
We’d like to know whether we have wired up the API calls incorrectly in a way that causes this inexplicable delay.
Even if we read just a single pixel after our last call to gl.finish(), the read still takes a huge amount of time, more than the whole program took up to gl.finish().
If we let the CPU sit idle for a while between the last call to gl.finish() and the first call to gl.readPixels(), then gl.readPixels() returns almost instantaneously.
To illustrate, the sequence looks like this:

// start timer
// upload data (gl.texImage2D)

// bind uniforms and textures

// draw() + gl.finish()
// record elapsed time: a few ms
// bind texture to framebuffer
// readPixels (a trivial amount)
// record elapsed time: thousands of ms!
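
For anyone who wants to reproduce the measurement, the sequence above could be instrumented roughly as follows. This is a minimal sketch, not the actual code: `timeStages`, the `steps` callbacks (`upload`, `bind`, `draw`, `attach`) and the injectable `now` clock are hypothetical names introduced for illustration, and it assumes a framebuffer that can be read back as RGBA / UNSIGNED_BYTE. It reports the two intervals separately rather than as a running total.

```javascript
// Hypothetical helper: times the two phases of the sequence above.
// `gl` is a WebGL rendering context; `steps` holds caller-supplied callbacks
// that perform the actual upload, binding, draw and framebuffer attachment.
function timeStages(gl, steps, now = () => performance.now()) {
  const t0 = now();                 // start timer

  steps.upload(gl);                 // e.g. gl.texImage2D(...)
  steps.bind(gl);                   // uniforms and textures
  steps.draw(gl);                   // e.g. gl.drawArrays(...)
  gl.finish();

  const t1 = now();                 // end of the draw phase

  steps.attach(gl);                 // e.g. gl.framebufferTexture2D(...)
  const pixel = new Uint8Array(4);  // one RGBA pixel
  gl.readPixels(0, 0, 1, 1, gl.RGBA, gl.UNSIGNED_BYTE, pixel);

  const t2 = now();                 // end of the readback phase

  return { drawMs: t1 - t0, readMs: t2 - t1, pixel };
}
```

With this, `drawMs` corresponds to the "a few ms" figure and `readMs` to the "thousands of ms" figure in the outline above.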