I am trying to solve the classic problem of converting from screen space back to view space. I have researched this extensively and read the similar topics on here, but I am still unable to make progress. Say I have a point (0,0,0) in world space. When I click on this point on screen with my cursor, the result should read (0,0,0). Instead, I am getting other values, such as (1, 7, 100).
My strategy is the following:
1. Convert the mouse position to NDC.
2. Obtain the inverse of the (Projection * View) matrix.
3. Multiply the results of steps 1 and 2 to obtain a vector in homogeneous coordinates.
4. Divide the resulting vector by its w component.
I feel that my issue is in step 1: I am unsure how to get the z value for NDC, so I have just set it to 1.0 for now.
You can’t transform <x,y> to <x,y,z>. Given screen-space <x,y>, you can get the equation of a line through the viewpoint. If you want a specific point, including the Z coordinate, you either have to read the depth from the depth buffer or find where the line intersects the geometry which makes up the scene.
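To illustrate the "line through the viewpoint" idea, here is a minimal sketch of the eye-space pick ray. It assumes a symmetric perspective projection (the kind glm::perspective builds) with vertical field of view fovyRadians and the given aspect ratio; the function names are made up for this example. In eye space the viewpoint is the origin, so the ray is just a direction.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Eye-space direction of the pick ray through NDC (ndcX, ndcY).
// Assumes a symmetric perspective projection; the camera looks down -Z.
Vec3 eyeRayDirection(double ndcX, double ndcY, double fovyRadians, double aspect) {
    assert(aspect > 0.0);
    const double tanHalf = std::tan(fovyRadians / 2.0);
    const Vec3 d = {ndcX * aspect * tanHalf, ndcY * tanHalf, -1.0};
    const double len = std::sqrt(d[0] * d[0] + d[1] * d[1] + d[2] * d[2]);
    return {d[0] / len, d[1] / len, d[2] / len};
}

// Point on the ray at distance t from the viewpoint (the eye-space origin).
Vec3 pointOnRay(const Vec3& dir, double t) {
    return {dir[0] * t, dir[1] * t, dir[2] * t};
}
```

Every value of t gives a point that projects to the same pixel, which is why a depth value or a ray/geometry intersection is needed to pick one of them.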
You’re passing GL_FLOAT but the variable is of type double. The call will store a single-precision float in the first four bytes (low 32 bits) and leave the last four bytes (high 32 bits) untouched.
Change zpos to float (you can’t retrieve double-precision values from glReadPixels).
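The effect can be reproduced without OpenGL. In this sketch, writeDepthAsFloat is a hypothetical stand-in for a glReadPixels call with GL_DEPTH_COMPONENT and GL_FLOAT: like the real call, it writes exactly sizeof(float) bytes to the destination pointer, whatever the type of the variable behind it.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for glReadPixels(..., GL_DEPTH_COMPONENT, GL_FLOAT, dst):
// it writes one single-precision float (4 bytes) to dst.
void writeDepthAsFloat(float depth, void* dst) {
    assert(dst != nullptr);
    std::memcpy(dst, &depth, sizeof(float));
}

// Wrong: the destination is a double, so only half of its bytes are
// overwritten and the result is not the depth value.
double readIntoDouble(float depth) {
    double zpos = 0.0;
    writeDepthAsFloat(depth, &zpos);
    return zpos;
}

// Right: the destination type matches GL_FLOAT.
float readIntoFloat(float depth) {
    float zpos = 0.0f;
    writeDepthAsFloat(depth, &zpos);
    return zpos;
}
```

With a depth of 0.75f, readIntoFloat returns 0.75f, while readIntoDouble returns a value that is nowhere near 0.75.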
I have one other question. I’ve noticed that if I rotate around the points (as in, move the camera around), the same point will have a different vertex value depending on my camera position. Is there any technique to preserve the original positions under any type of transformation?
Picking gives you a point in NDC, which you can then transform to any other coordinate system using the inverse of the appropriate combination of matrices (plus projective division). E.g. transforming by the inverse of the projection matrix will give you a position in eye space. Transforming by inverse(projection * view) will give you a position in world space and by inverse(projection * view * model) in object space.
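As a rough sketch of that, the following transforms an NDC point back to world space with inverse(projection * view) followed by the projective division. It is plain C++ without GLM so that it stands alone; Mat4, perspective, translation and ndcToWorld are names invented for this example, and the matrices are stored row-major (m[row][col]) and applied to column vectors, so the products read the same way as the GLM expressions.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<std::array<double, 4>, 4>;  // row-major: m[row][col]

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k) r[i][j] += a[i][k] * b[k][j];
    return r;
}

Vec4 transform(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int k = 0; k < 4; ++k) r[i] += m[i][k] * v[k];
    return r;
}

// Gauss-Jordan elimination with partial pivoting; assumes m is invertible.
Mat4 inverse(Mat4 m) {
    Mat4 inv{};
    for (int i = 0; i < 4; ++i) inv[i][i] = 1.0;
    for (int col = 0; col < 4; ++col) {
        int pivot = col;
        for (int r = col + 1; r < 4; ++r)
            if (std::fabs(m[r][col]) > std::fabs(m[pivot][col])) pivot = r;
        std::swap(m[col], m[pivot]);
        std::swap(inv[col], inv[pivot]);
        const double d = m[col][col];
        assert(d != 0.0);
        for (int j = 0; j < 4; ++j) { m[col][j] /= d; inv[col][j] /= d; }
        for (int r = 0; r < 4; ++r) {
            if (r == col) continue;
            const double f = m[r][col];
            for (int j = 0; j < 4; ++j) {
                m[r][j] -= f * m[col][j];
                inv[r][j] -= f * inv[col][j];
            }
        }
    }
    return inv;
}

// Standard OpenGL-style perspective projection (NDC z in [-1, 1]).
Mat4 perspective(double fovyRadians, double aspect, double zNear, double zFar) {
    const double t = 1.0 / std::tan(fovyRadians / 2.0);
    Mat4 p{};
    p[0][0] = t / aspect;
    p[1][1] = t;
    p[2][2] = -(zFar + zNear) / (zFar - zNear);
    p[2][3] = -2.0 * zFar * zNear / (zFar - zNear);
    p[3][2] = -1.0;
    return p;
}

// Simple view matrix for the example: a pure translation.
Mat4 translation(double x, double y, double z) {
    Mat4 t{};
    for (int i = 0; i < 4; ++i) t[i][i] = 1.0;
    t[0][3] = x; t[1][3] = y; t[2][3] = z;
    return t;
}

// NDC -> world: multiply by inverse(projection * view), then divide by w.
Vec4 ndcToWorld(const Mat4& invProjView, double x, double y, double z) {
    const Vec4 h = transform(invProjView, {x, y, z, 1.0});
    return {h[0] / h[3], h[1] / h[3], h[2] / h[3], 1.0};
}
```

Passing inverse(projection) instead gives eye space, and inverse(projection * view * model) gives object space. If many points are transformed, the inverse can be computed once and reused for all of them.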
How “not the original value” are we talking about? It’s 3 floating-point values; you’re never going to get binary-identical values. Are you getting something reasonably close?
I tried this; however, the values are very bizarre and clearly incorrect. What you have said makes logical sense, but I feel I may have implemented the idea incorrectly. Do you see any issues with this implementation?
No, unfortunately the variance is too large. I understand that due to precision and estimation issues I will get back (original value + epsilon), but this is not what I am getting.
You’re applying (most of) the inverse transformations twice.
glm::unProject transforms window coordinates to object coordinates, applying the inverse of the viewport, projection and model-view transformations. cursorPosition will be in object space.
The second line does roughly the same thing as glm::unProject except that it doesn’t include the viewport transformation, so you’d need to convert window coordinates to NDC yourself.
If you only need to transform a single point, you should probably just use glm::unProject. If you want to transform many points, using glm::unProject is inefficient as it calculates the inverse of the matrix each time.
Right, so that is what I had before, and it works as long as I don’t apply any camera rotations. Like you said glm::unProject returns the object space coordinates based upon how I set it up. However, regardless of whether I rotate the camera, the model coordinates should be the same. What I am thoroughly confused by is, how can the same point correspond to different values when the ray is hitting the exact same point, albeit from a different position?
Is it hitting the exact same point? How large is the difference in position?
A couple of things to consider:
First, the Y position should be SCR_HEIGHT - 1 - ypos. The top-left pixel will have mouse coordinates (0,0) but OpenGL window coordinates (0,SCR_HEIGHT-1).
Second, the X and Y components of screen_coords should probably be (xpos+0.5, SCR_HEIGHT-ypos-0.5). Those are the window coordinates of the centre of the pixel under the mouse cursor, which will be slightly more accurate than using the lower-left corner (which is what’s required for glReadPixels).
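Both adjustments can be sketched as small helpers. The function names are invented for this example, and the sketch assumes integer mouse pixel coordinates, a viewport covering the whole window from (0,0), and the default depth range of [0,1]. The last helper is the window-to-NDC conversion needed when not going through glm::unProject.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;

// Row to pass to glReadPixels: mouse Y counts from the top, OpenGL from the bottom.
int readPixelsRow(int mouseY, int screenHeight) {
    return screenHeight - 1 - mouseY;
}

// Window coordinates of the centre of the pixel under the cursor,
// with the depth value read from the depth buffer as the third component.
Vec3 pixelCentreWindowCoords(int mouseX, int mouseY, int screenHeight, double depth) {
    return {mouseX + 0.5, screenHeight - mouseY - 0.5, depth};
}

// Window coordinates -> NDC, assuming viewport (0, 0, width, height)
// and depth range [0, 1].
Vec3 windowToNdc(const Vec3& win, int screenWidth, int screenHeight) {
    assert(screenWidth > 0 && screenHeight > 0);
    return {2.0 * win[0] / screenWidth - 1.0,
            2.0 * win[1] / screenHeight - 1.0,
            2.0 * win[2] - 1.0};
}
```

For a 600-pixel-high window, the top-left mouse pixel (0,0) maps to row 599 for glReadPixels and to window coordinates (0.5, 599.5) for unprojection.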
Also: how good is the depth resolution? If the near distance used for the projection matrix is too small, depth resolution will be poor for most of the scene. The near distance should be as large as you can get away with.
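To get a feel for the numbers, this sketch computes the depth-buffer value for a point at a given distance in front of the camera. It assumes the standard OpenGL perspective projection and the default depth range [0,1]; windowDepth is a name made up for this example.

```cpp
#include <cassert>

// Window-space depth in [0, 1] of a point `distance` units in front of the
// camera, for a standard OpenGL perspective projection.
double windowDepth(double distance, double nearPlane, double farPlane) {
    assert(distance > 0.0 && nearPlane > 0.0 && farPlane > nearPlane);
    const double ndcZ = (farPlane + nearPlane) / (farPlane - nearPlane)
                      - 2.0 * farPlane * nearPlane / ((farPlane - nearPlane) * distance);
    return 0.5 * ndcZ + 0.5;
}
```

With a far distance of 1000, a point 10 units away has a depth of roughly 0.999 when the near distance is 0.01, but roughly 0.901 when the near distance is 1. In the first case everything beyond 10 units is squeezed into the last 0.1% of the depth range, so the depth read back under the cursor is very coarse.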