The GL commands you send are not processed synchronously: the driver queues them, and the card can execute them whenever it wants.
To guarantee that all GL commands have actually been processed, call glFinish(), which blocks until they have completed.
So to benchmark, you should only measure the time between a pair of glFinish() calls: one before you start the timer and one before you stop it.
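
As a rough sketch of that pattern (not a definitive implementation), a timing helper could look like the following. The name `time_gl_ms` is made up for illustration, and the `finish` parameter is an assumption so that the snippet compiles without GL headers; in a real program you would pass `glFinish` with a current GL context.

```cpp
#include <chrono>

// Hypothetical helper: returns the wall-clock time of `draw` in milliseconds,
// measured between two `finish` calls.
// Assumption: in a real program `finish` is glFinish and a GL context is current.
template <typename Finish, typename Draw>
double time_gl_ms(Finish finish, Draw draw) {
    finish();  // drain previously queued commands so they are not counted
    auto start = std::chrono::steady_clock::now();
    draw();    // issue the GL commands you want to benchmark
    finish();  // block until those commands have actually completed
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Usage would be something like `double ms = time_gl_ms(glFinish, [] { draw_scene(); });`, where `draw_scene` stands in for your own rendering code. Averaging over many frames should give steadier numbers than a single measurement.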
EDIT: but of course this comes at the price of losing some parallelism between GPU and CPU. glFlush() only tells the driver to start processing the queued commands and returns immediately, without waiting for them to finish.