That’s the perspective division, which is necessary, well, because we need perspective correction. The vertex shader outputs clip coordinates (x, y, z, w). The GPU then divides by w to get the normalized device coordinates (x / w, y / w, z / w). For vertices inside the view frustum, x / w and y / w land in [-1 … 1], as does z / w, which is used in the depth test.
Unrelated, but a performance tip you should keep in mind:
Your simple line:
gl_Position = gl_ProjectionMatrix * cameraModelViewMatrix * gl_ModelViewMatrix * gl_Vertex;
will do two matrix * matrix multiplications and one matrix * vector multiplication, which takes 16 + 16 + 4 = 36 instructions. Bad.
If you bracket the code so that it resolves to only matrix * vector multiplications like this:
gl_Position = gl_ProjectionMatrix * (cameraModelViewMatrix * (gl_ModelViewMatrix * gl_Vertex));
it will only take 4 + 4 + 4 = 12 instructions!
That’s 24 instructions saved in a single GLSL line. Nice, eh?
And no, a compiler cannot do that reassociation for you: GLSL’s * operator is left-associative, so the unbracketed line means ((projection * camera) * modelView) * vertex, and the compiler has to evaluate it in that order — with floating-point math it generally can’t prove the reassociated form gives bit-identical results, so it won’t reorder it.