Ok, so the divide by w has to be performed by the perspective projection. The viewport transformation ‘expects’ the input to be in NDC otherwise it wont work.
In that case, does the following perspective transformation produces Normalized Device Coordinates?:
void perspective(GLfloat *matrix, GLfloat fov, GLfloat aspect, GLfloat nearz, GLfloat farz)
{
GLfloat range;
range = tan(fov * 0.00872664625) * nearz; /* 0.00872664625 = PI/360 */
memset(matrix, 0, sizeof(GLfloat) * 16);
matrix[0] = (2 * nearz) / ((range * aspect) - (-range * aspect));
matrix[5] = (2 * nearz) / (2 * range);
matrix[10] = -(farz + nearz) / (farz - nearz);
matrix[11] = -1;
matrix[14] = -(2 * farz * nearz) / (farz - nearz);
}