What the spec says is:
The command
enum GetError( void );
is used to obtain error information. Each detectable error is assigned a numeric code. When an error is detected, a flag is set and the code is recorded. Further errors, if they occur, do not affect this recorded code. When GetError is called, the code is returned and the flag is cleared, so that a further error will again record its code. If a call to GetError returns NO_ERROR, then there has been no detectable error since the last call to GetError (or since the GL was initialized).
To allow for distributed implementations, there may be several flag-code pairs. In this case, after a call to GetError returns a value other than NO_ERROR, each subsequent call returns the non-zero code of a distinct flag-code pair (in unspecified order), until all non-NO_ERROR codes have been returned. When there are no more non-NO_ERROR error codes, all flags are reset. This scheme requires some positive number of pairs of a flag bit and an integer. The initial state of all flags is cleared and the initial value of all codes is NO_ERROR.
The wording is identical in 1.5 and 4.4, so I would assume that it’s the same in all versions in between.
Also, it doesn’t say that the implementation must actually be “distributed” to behave like this, or even what that term means.
Sort of, except that it’s potentially one error per “unit”, which isn’t defined, and there’s no glGet() query to determine the number of units. So in practical terms, the implementation will record the first error plus some (effectively random, possibly empty) collection of additional errors. When you call glGetError() in a loop until it returns GL_NO_ERROR (as in the sketch below), it yields the collected errors in no particular order; after that, the next error to occur (“next” in terms of the sequence in which the GL commands are issued, not necessarily executed) will be among those reported by the next “batch” of glGetError() calls.
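In C, that loop looks something like the following. This is just a minimal sketch: drain_gl_errors is a name I’ve made up, and it assumes you have a current context and have included the GL header (or whatever your loader provides):

    #include <GL/gl.h>   /* or your loader's header, e.g. glad/glew */
    #include <stdio.h>

    /* Drain every recorded error: there may be one per flag-code pair,
       and the spec says they come back in no particular order. */
    static void drain_gl_errors(const char *where)
    {
        GLenum err;
        while ((err = glGetError()) != GL_NO_ERROR) {
            fprintf(stderr, "GL error 0x%04X after %s\n", err, where);
        }
    }

You’d typically call it after a suspect command, e.g. drain_gl_errors("glDrawArrays"), and also at a known-good point beforehand so that stale errors from earlier commands don’t get attributed to the wrong call.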
I suppose that would depend upon your definition of “major”. I would assume that the language about distributed implementations was included for a reason. The most likely candidate would be the SGI systems where the geometry engine was a bank of i860s, but that’s not what most developers would consider a “major” platform nowadays.
As for typical PC hardware … I have no idea. I could conceive of implementations having separate CPU-side and GPU-side error flags. If it’s possible for errors to be flagged by both the CPU and GPU, then there’s no reason for the implementation to concern itself with which one happened first, given that the spec explicitly doesn’t require this.
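Just to illustrate the flag-code-pair wording, something like the model below would be conforming. To be clear, this is purely a sketch of the spec’s semantics, not any actual driver’s code; UNIT_CPU, UNIT_GPU, record_error and get_error are all names I’ve invented for the example:

    #include <stddef.h>

    /* Hypothetical implementation with two flag-code pairs: one for
       CPU-side command validation, one for a GPU-side back end. */
    enum { UNIT_CPU, UNIT_GPU, UNIT_COUNT };

    typedef struct {
        int      flag;   /* set once this unit has recorded an error */
        unsigned code;   /* first error seen by this unit, else 0 (NO_ERROR) */
    } ErrorPair;

    /* Initial state per the spec: all flags clear, all codes NO_ERROR. */
    static ErrorPair pairs[UNIT_COUNT];

    /* Each unit records only its *first* error; later ones are discarded. */
    static void record_error(int unit, unsigned code)
    {
        if (!pairs[unit].flag) {
            pairs[unit].flag = 1;
            pairs[unit].code = code;
        }
    }

    /* Returns the code of some set flag-code pair (in unspecified order)
       and clears it; returns 0 (NO_ERROR) once all pairs are clear. */
    static unsigned get_error(void)
    {
        for (size_t i = 0; i < UNIT_COUNT; ++i) {
            if (pairs[i].flag) {
                unsigned code = pairs[i].code;
                pairs[i].flag = 0;
                pairs[i].code = 0;
                return code;
            }
        }
        return 0; /* NO_ERROR */
    }

Note that nothing in this model tracks which unit flagged its error first, which is exactly the freedom the spec grants.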