Some basic OpenGL questions

Hi All,

I’ve been looking into OpenGL programming over the past few days and there are some concepts I’m having trouble grasping. Are all of the following observations correct? I want to make sure I’m not missing anything obvious/not doing everything completely wrong.

  1. As far as I can tell OpenGL doesn’t do anything for you as far as creating objects or storing object data. It only draws whatever lines or triangles you tell it to, when you tell it. So all objects have to be redrawn if something needs to be updated. There are no methods for deleting drawn “objects”. The closest thing to saving the drawn screen is push/pop matrix.

For example, if I draw a box and then draw a circle, and then I redraw the circle translated some distance away, I will have to also redraw the box in the original position.

  2. All translations/rotations are done with respect to the 0,0,0 point. When something is drawn and translated, it has to be added to the current position.

  3. All positions are in respect to a percentage of the screen width. There is no way to offset by pixels.

  4. Translations and rotations are relative and cumulative. When I call gluOrtho2D multiple times, the calls seem to compound. For example, if you do gluOrtho2D(-1.10,1.10,-1.10,1.10) it makes the visible area 10% larger in all directions, but calling it again makes it 10% larger with respect to the current size, effectively making it 121% of the original size.

Any help would be greatly appreciated. My question was already kicked from stack overflow for being too broad.

None of those are questions. But I will still correct some of the “observations”.

Nobody’s really forcing you to clear any images between frames, you get to keep them (some double-buffering details aside). But in practice, a 3D scene is redrawn from scratch every time.

Those operations, from a mathematical standpoint, have no explicit notion of a “0, 0, 0 point”; it can’t make sense any other way. But yes.

OpenGL’s normalized device coordinate space goes from -1 to 1 in all directions. The notion of a percentage is hardly applicable, but yes, most stages of rendering do not care about the output image’s resolution.

Of course there is: just multiply the offset by the relative size of a pixel. Whether that makes any sense to do depends on the context of what you’re doing.
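As a rough sketch of that multiplication (the function name and the 800x600 viewport are made up for illustration): normalized device coordinates span 2 units across the viewport, so one pixel is 2/width wide and 2/height tall.

```python
def pixels_to_ndc_offset(dx_px, dy_px, viewport_w, viewport_h):
    """Convert a pixel offset into a normalized-device-coordinate offset.

    NDC runs from -1 to 1, i.e. 2 units across the viewport, so the
    relative size of one pixel is 2 / viewport_w by 2 / viewport_h.
    """
    return (dx_px * 2.0 / viewport_w, dy_px * 2.0 / viewport_h)


# Hypothetical 800x600 viewport: move something 10 px right and 5 px up.
offset = pixels_to_ndc_offset(10, 5, 800, 600)
print(offset)
```

This only holds if the coordinates you are offsetting are already in NDC (no further projection applied afterwards).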

When you call those functions, an appropriate projection/translation/rotation/etc matrix gets multiplied on top of an internal “current” matrix that is stored by the driver. (In fact, this “current” matrix is the top of the stack that is manipulated by the Push and Pop functions you mentioned). Observe any effects of this fact. Note that in modern OpenGL (which has been called that for, what, 15 years now?) it is heavily advised to do all of that math yourself, ignoring this functionality. (And also to simulate the matrix stack yourself, but it will likely turn out that you don’t really need it.)
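If you do end up wanting a matrix stack of your own, a minimal sketch might look like the following. The class and helper names are hypothetical, and matrices are plain row-major nested lists applied to column vectors; a real program would more likely use a math library.

```python
def identity():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


def matmul(a, b):
    """Return the 4x4 product a * b."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def translation(tx, ty, tz):
    m = identity()
    m[0][3], m[1][3], m[2][3] = tx, ty, tz
    return m


class MatrixStack:
    """Mimics the legacy push/pop/multiply behaviour in user code."""

    def __init__(self):
        self._stack = [identity()]

    @property
    def top(self):
        return self._stack[-1]

    def push(self):
        # Duplicate the current matrix, like glPushMatrix.
        self._stack.append([row[:] for row in self.top])

    def pop(self):
        # Discard the current matrix, like glPopMatrix.
        if len(self._stack) == 1:
            raise IndexError("matrix stack underflow")
        self._stack.pop()

    def mult(self, n):
        # Post-multiply the current matrix: top = top * n.
        self._stack[-1] = matmul(self.top, n)


stack = MatrixStack()
stack.mult(translation(1.0, 0.0, 0.0))
stack.push()
stack.mult(translation(0.0, 2.0, 0.0))
inner = stack.top          # carries both translations
stack.pop()
outer = stack.top          # back to only the first translation
```

Note that push/pop save and restore a matrix, not anything that was drawn.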

Thank you very much. That is exactly what I was looking for.

That isn’t really accurate, IMHO. The contents of the back buffer (if you have one) become undefined whenever you swap buffers. The contents of the front buffer become undefined at the discretion of the window system.

It’s possible to update a scene incrementally so long as the front buffer doesn’t get invalidated (if and when this happens depends upon window system specifics such as compositing, backing store or save-unders), but ultimately you need to be able to redraw any part of the scene on demand if the region does get invalidated, and without double buffering you’re going to get flicker. Incremental updates used to be relatively common in CAD software with complex models, slow hardware, and no need for smooth animation, but they aren’t used much (if at all) nowadays.

If you want to perform reliable incremental updates with modern OpenGL, render to a framebuffer object (FBO). But they’re much less useful with perspective projections than orthographic projections (2D). One of the advantages of incremental updates with 2D is that you can translate the view by simply translating the rendered image, but that doesn’t work with a perspective projection due to parallax.

So normally each frame is drawn “from scratch”.

Correct. OpenGL is an “immediate mode” API, not a “retained mode” API like e.g. Inventor or VRML.

In modern OpenGL, coordinate transformations are the responsibility of the program. The matrix functions (glRotate etc) don’t exist in the OpenGL 3+ core profile.

In legacy OpenGL, most matrix operations (other than those with Load in the name) generate a matrix and postmultiply the current matrix (the one on the top of the stack) with it. I.e. M=M·N where M is the current matrix and N is the rotation/scale/translation matrix generated by the command. The effect is that the origin and axes of the transformation are those established by previous transformations.
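To make M=M·N concrete, here is a small sketch that mimics two consecutive gluOrtho2D(-1.1, 1.1, -1.1, 1.1) calls without a glLoadIdentity in between. The helper names are mine; the matrix is the one documented for glOrtho with near = -1 and far = 1, which is what gluOrtho2D uses.

```python
def identity():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


def matmul(a, b):
    """Return the 4x4 product a * b."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def ortho2d(left, right, bottom, top):
    """Matrix generated by a gluOrtho2D-style call (near = -1, far = 1)."""
    m = identity()
    m[0][0] = 2.0 / (right - left)
    m[1][1] = 2.0 / (top - bottom)
    m[2][2] = -1.0
    m[0][3] = -(right + left) / (right - left)
    m[1][3] = -(top + bottom) / (top - bottom)
    return m


n = ortho2d(-1.1, 1.1, -1.1, 1.1)

current = identity()
current = matmul(current, n)   # first call:  M = M * N
current = matmul(current, n)   # second call: M = M * N again

# The x scale is now (1 / 1.1) squared, so the visible half-width is
# about 1.21 instead of 1.1.
visible_half_width = 1.0 / current[0][0]

# Resetting first (the glLoadIdentity equivalent) avoids the compounding.
reset = matmul(identity(), n)
reset_half_width = 1.0 / reset[0][0]
```

This is why legacy code normally calls glLoadIdentity before setting up the projection.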

After all transformations have been applied, the geometry is clipped to the signed unit cube [-1,1]³. This range is then mapped to the current viewport. If you want to work in pixel coordinates, you typically set the projection matrix to an orthographic transformation which is essentially the inverse of the viewport transformation.
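A sketch of that pixel-coordinate setup, assuming an 800x600 viewport with its origin at the bottom-left (the sizes and function names are only for illustration): an orthographic matrix spanning 0..width and 0..height sends pixel positions to the [-1,1] range, and the viewport mapping sends them straight back.

```python
def ortho(left, right, bottom, top, near=-1.0, far=1.0):
    """Orthographic projection matrix in the form documented for glOrtho."""
    return [
        [2.0 / (right - left), 0.0, 0.0, -(right + left) / (right - left)],
        [0.0, 2.0 / (top - bottom), 0.0, -(top + bottom) / (top - bottom)],
        [0.0, 0.0, -2.0 / (far - near), -(far + near) / (far - near)],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(m, v):
    """Apply a 4x4 matrix to a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


width, height = 800, 600
projection = ortho(0.0, width, 0.0, height)

pixel = [200.0, 150.0, 0.0, 1.0]
ndc = transform(projection, pixel)        # roughly (-0.5, -0.5, 0, 1)

# Viewport transformation for a viewport at (0, 0) of size width x height.
window_x = (ndc[0] + 1.0) * width / 2.0
window_y = (ndc[1] + 1.0) * height / 2.0  # back to roughly (200, 150)
```

With this projection, vertex positions can be given directly in pixels.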

The pixel grid is ignored by all steps prior to rasterisation. Object coordinates use whatever coordinate system is convenient. These are then transformed to clip coordinates (either by the shader program in modern OpenGL or by the model-view and projection matrices in legacy OpenGL), which are converted to normalised device coordinates (by dividing by the W component), then to window coordinates by the viewport transformation.
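The last two steps of that chain can be sketched as follows. The clip-space input and the 800x600 viewport are arbitrary illustrative values, and the function names are my own; the depth range defaults to the usual [0, 1].

```python
def clip_to_ndc(clip):
    """Perspective divide: clip coordinates to normalised device coordinates."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


def ndc_to_window(ndc, viewport, depth_range=(0.0, 1.0)):
    """Viewport transformation: NDC to window coordinates.

    viewport is (x, y, width, height); depth_range is (near, far).
    """
    vx, vy, vw, vh = viewport
    near, far = depth_range
    xw = vx + (ndc[0] + 1.0) * vw / 2.0
    yw = vy + (ndc[1] + 1.0) * vh / 2.0
    zw = near + (ndc[2] + 1.0) * (far - near) / 2.0
    return (xw, yw, zw)


clip = (1.0, -0.5, 0.5, 2.0)       # e.g. what a vertex shader might output
ndc = clip_to_ndc(clip)            # (0.5, -0.25, 0.25)
window = ndc_to_window(ndc, (0, 0, 800, 600))
print(ndc, window)                 # window is (600.0, 225.0, 0.625)
```

Only the final window coordinates relate to pixels; everything before that is resolution-independent.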

Correct; see point 2 above.

With modern OpenGL, the program is responsible for transformations (the GLM library provides equivalents for all of the legacy matrix operations). The vertex shader outputs clip-space coordinates; OpenGL doesn’t care how they were generated.

The fixed-function matrix transformations weren’t particularly useful in substantial applications as the program usually needed to perform those transformations itself for collision detection, culling, etc.

This is, strictly speaking, correct, but it’s also the case that the starting point for most chains of transformations is, by convention, an identity matrix: i.e. untranslated, no rotation, no scaling, no other transforms. There’s nothing in legacy OpenGL that absolutely requires this, of course, other than the specified initial values of the fixed-pipeline matrices (and that initialization happens when the context is created, not per-frame).