Which is faster?

I am currently working on object movement in my scene. Would it be faster to use OpenGL’s translation and rotation functions to move objects around, or would it be faster to implement my own? Currently I move objects on the CPU: first moving them to the origin, rotating them with a sin/cos equation, and then moving them back to their original location, plus the movement of the latest move.

I’m not sure if the OpenGL functions are actually calculated on the video card or on the CPU. Thanks in advance.

You probably won’t notice a difference either way. OpenGL generates the translation and rotation matrices on the CPU.

Interesting. Then I wonder why people use it? I mean, if you are looking at a pure clock-by-clock count of efficiency. That is the reason I decided to stay with doing it on the CPU. Take for instance a model of an airplane. Say it has 18,000 polys (I know that’s high, but this is purely hypothetical). Now if you have to move the plane to the same location each time, with OpenGL that is done EVERY frame; if you do it on the CPU, and ACTUALLY change the verts, it’s done once. That seems smarter to me.

I think we have a disconnect here. Nitro meant that GL generates its matrices on the CPU. The transforms are either done in HW or SW. Most modern cards are doing the transforms of the vertices on the card. The reason most people don’t transform their objects and cache the transformed version is that the eye point or other properties are changing. Since it is as cheap to transform from object space to clip space as it is to transform from world space to clip space, it is pointless to pay for the extra bookkeeping and keep the extra data around. On the other hand, for doing things like deformable objects, it is often desirable to do them on the CPU, since you may perform more interesting algorithms on them.


I decided to do all my transformations in software and then use glMultMatrix. If you use OpenGL to transform your matrices it’s said that retrieving the matrix is slow. I should probably do a benchmark to see if doing transformations in software is faster than transforming with OpenGL and retrieving the resulting matrix.

Hmmmm, I think I see what you’re saying, LostInTheWoods. You’re telling us you’re transforming your real vertex positions. Here’s an example of why you wouldn’t want to do it that way. Let’s say you have a world full of trees. With your method you’d have to either A: transform your tree into the world and back for every tree, or B: have several tree models transformed into the world.

If you choose A, you’re transforming twice for each tree, which is slower than just letting OpenGL do it once. If you choose B, you would be using too much system memory. So the best choice is to let OpenGL take care of everything.
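To put rough numbers on option B, here is a small sketch that compares the storage for duplicated world-space meshes against one shared origin-based mesh plus a matrix per instance. The function names and the positions-only layout are assumptions made for illustration, not anything from a real engine.

```cpp
#include <cstddef>

// Bytes needed when every instance keeps its own world-space copy of the
// mesh (positions only: 3 floats per vertex).
constexpr std::size_t bytesDuplicated(std::size_t instances, std::size_t vertsPerMesh)
{
    return instances * vertsPerMesh * 3 * sizeof(float);
}

// Bytes needed when all instances share one origin-based mesh and each
// instance stores only a 4x4 object-to-world matrix (16 floats).
constexpr std::size_t bytesShared(std::size_t instances, std::size_t vertsPerMesh)
{
    return vertsPerMesh * 3 * sizeof(float) + instances * 16 * sizeof(float);
}
```

With 1000 trees of 500 vertices each (made-up figures) and 4-byte floats, that is 6,000,000 bytes duplicated versus 70,000 bytes shared.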

You never want to actually modify your model vertices unless you’re deforming sections of the model with a skeleton.

No, we had this discussion before, and lots of people benchmarked it: it’s slower to use OpenGL’s glMultMatrix than a home-grown optimised one in your own exe, and this is probably due to the API function call overhead (calling a function in another DLL).
Grow your own is the motto.

I don’t understand why you want to move the object “from somewhere” to the origin.

Generate and store all your objects based on the origin. Then generate a matrix which will rotate an object around the origin, and translate it out to its position. Load this matrix into the modelview before you draw your (origin based) object.

Update the input parameters (“facing” for rotation and “position” for translation) every frame, and re-generate the matrix every frame, if one of them changed.
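As a rough sketch of what such a matrix could look like, the helper below builds it from a "facing" angle and a "position". It assumes the rotation is about the Y axis only and uses the column-major layout OpenGL expects; `buildObjectMatrix` and `transformPoint` are hypothetical names for illustration.

```cpp
#include <cmath>

// Hypothetical helper: builds a 4x4 object-to-world matrix in the
// column-major layout used by glLoadMatrixf / glMultMatrixf.
// "facing" is a rotation about the Y axis in radians (an assumption);
// (px, py, pz) is the object's world position.
void buildObjectMatrix(float facing, float px, float py, float pz, float m[16])
{
    const float c = std::cos(facing);
    const float s = std::sin(facing);

    // Column 0: where the object's X axis ends up.
    m[0] = c;    m[1] = 0.0f; m[2] = -s;   m[3] = 0.0f;
    // Column 1: the Y axis is unchanged by a rotation about Y.
    m[4] = 0.0f; m[5] = 1.0f; m[6] = 0.0f; m[7] = 0.0f;
    // Column 2: where the object's Z axis ends up.
    m[8] = s;    m[9] = 0.0f; m[10] = c;   m[11] = 0.0f;
    // Column 3: translation out to the object's position.
    m[12] = px;  m[13] = py;  m[14] = pz;  m[15] = 1.0f;
}

// Applies a column-major matrix to a point, the same way the vertices
// would be transformed; handy for checking the matrix on the CPU.
void transformPoint(const float m[16], const float in[3], float out[3])
{
    for (int r = 0; r < 3; ++r)
        out[r] = m[r] * in[0] + m[4 + r] * in[1] + m[8 + r] * in[2] + m[12 + r];
}
```

The resulting array can be handed to glMultMatrixf (or glLoadMatrixf) before drawing the origin-based object; it only needs rebuilding when facing or position changes.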

Here is how I currently do it. Let me know if this sounds too intensive to work well. I currently actually create each item, meaning each item has unique vertices. I find this best because I can do faster collision detection (with my algorithm anyhow), and I am also implementing an emboss-style bump mapping. And if the model is actually at the point it is located at, I can cut out a lot of the math work involved in bringing the light into model space.

I don’t know if having a separate model set for each item will use too many system resources. I mean, take for instance a level in Quake3: they can have a max of 100,000 polys just for the level itself. That is A LOT of float values. Now you add in the characters and such, and you wind up with A LOT of polys, and memory. I am currently designing my game to be VERY interactive, meaning I want to be able to shoot, and move, EVERYTHING. Since I’m doing this, I find it easier to work with real verts instead of projected ones.

The only problem I see is that I might be using too much memory. But how many floats can fit in a meg of RAM anyhow?

EDIT: Question: what is the memory difference if you load 100 squares on the screen, or if you have OpenGL project 100 copies of the same square? Doesn’t the info have to be located somewhere? Or is it just created, then instantly destroyed?

EDIT: OK, I just did the math: something like 260,000 floats will fit into one meg of RAM. Now on a halfway decent machine (that will even play a game like this) we are looking at 128 megs. That’s about 33 million float values. That is A LOT of floats. I don’t think I could even reach a fourth of that number by making all my verts real, even if I did 3 LOD settings, along with a collision set. But what do you think?
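The arithmetic can be checked with a couple of lines. This is only a sketch, assuming 4-byte floats and 1 MB = 1024 * 1024 bytes; `floatsInMegabytes` is a made-up helper name.

```cpp
#include <cstddef>

// Number of floats that fit in the given number of megabytes,
// taking 1 MB as 1024 * 1024 bytes.
constexpr std::size_t floatsInMegabytes(std::size_t megabytes)
{
    return megabytes * 1024 * 1024 / sizeof(float);
}
```

Under those assumptions one megabyte holds 262,144 floats and 128 megabytes hold 33,554,432, which matches the rough figures above.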

[This message has been edited by LostInTheWoods (edited 10-21-2002).]

If I understand correctly what’s wanted, here’s my guess.

If you have an object in object space, like WhatEver mentioned, a tree, say you wanted to draw this tree 1000x (a small forest).

I can imagine it will be quicker to just keep one copy of the tree’s vertices and then, before each tree, just do a glMultMatrix on today’s hardware. Though of course, if your app needs vertices in world space anyway, e.g. for collision detection, you might want to convert all the trees’ verts to world space and then just send those (i.e. don’t do a glMultMatrix).
…ain’t this in the FAQ?

(BTW, normally you separate collision detection from rendering.)

Forgive me if you already know this, LostInTheWoods, but OpenGL does transformations like this: if you draw a triangle with glBegin(GL_TRIANGLES), each time you call glVertex the vertex is transformed by the current MODELVIEW matrix and stored in a cache until 3 vertices have been entered. That cache is replaced every time a new triangle is input. (I know that there is a bigger cache than that, but I wanted to keep it simple.)

So that means the only place your vertices reside is in system memory.

If I read that correctly, zed, you actually want to transform the ray for collision detection from world space into model space, not the model vertices to world space.

Storing and updating every mesh instance in world space doesn’t seem like a winner to me. If things move a lot (and they should, for an interesting world!) you’ll be touching all that data. Every frame. The speed of your program is usually inversely proportional to the amount of memory it touches.

I suggest you re-code your collision detector to take object-to-world matrices for each thing you’re colliding against. All the “big name” engines do it that way (and some of the smaller ones, too) :)

I fail to see how doing any work (translation / rotation) on the CPU could help. If you think about it, OpenGL (=> the video card) will always do the transformation, so if you do some world operations on the CPU, it will do the transform anyway; hence you’re doing twice the work for nothing!

  • translate/rotate objects in world space on CPU
  • transform the vertices on GPU


  • generate the world transforms matrix on CPU
  • transform the vertices on GPU


Generate and store all your objects based on the origin. Then generate a matrix which will rotate an object around the origin, and translate it out to its position. Load this matrix into the modelview before you draw your (origin based) object.

But you need to apply the camera matrix first, so you can’t just load the matrix of the object. You’ll have to multmatrix it with the camera matrix, won’t you? Is there a better way?

[This message has been edited by zen (edited 10-22-2002).]

Normally you use the matrix stack. At the beginning of the frame you load your camera matrix into the modelview matrix. Then, for each object, you push the modelview matrix on the stack, create your object-to-world matrix (or use a stored one) and multiply it into the modelview matrix. Render your object and pop back the unchanged camera matrix.

If you have a hierarchical structure this is also very handy.
For example, you push the camera matrix, create a matrix which positions your spaceship, multiply it into the modelview and draw the ship. Then for each turret on the spaceship you push the current matrix, multiply the turret’s matrix into the modelview, render it and pop the ship’s matrix back.
After all turrets are rendered you pop back your original camera matrix.
The matrix stack should be very fast (I think it is hardware accelerated on some hardware).
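The push / multiply / draw / pop pattern can be modelled on the CPU to see what ends up on top of the stack at each step. This is only a sketch: `MatrixStack` is a made-up class that mimics glLoadMatrixf, glPushMatrix, glMultMatrixf and glPopMatrix, and the transforms are plain translations to keep it short.

```cpp
#include <array>
#include <vector>

using Mat4 = std::array<float, 16>;  // column-major, like OpenGL

Mat4 identity()
{
    Mat4 m{};  // value-initialised to all zeros
    m[0] = m[5] = m[10] = m[15] = 1.0f;
    return m;
}

Mat4 translation(float x, float y, float z)
{
    Mat4 m = identity();
    m[12] = x; m[13] = y; m[14] = z;
    return m;
}

// Returns a * b for column-major matrices.
Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            for (int k = 0; k < 4; ++k)
                r[col * 4 + row] += a[k * 4 + row] * b[col * 4 + k];
    return r;
}

// Hypothetical CPU-side stand-in for the modelview matrix stack.
class MatrixStack {
public:
    MatrixStack() : stack_(1, identity()) {}
    void load(const Mat4& m) { stack_.back() = m; }                           // like glLoadMatrixf
    void push() { Mat4 copy = stack_.back(); stack_.push_back(copy); }        // like glPushMatrix
    void mult(const Mat4& m) { stack_.back() = multiply(stack_.back(), m); }  // like glMultMatrixf
    void pop() { if (stack_.size() > 1) stack_.pop_back(); }                  // like glPopMatrix
    const Mat4& top() const { return stack_.back(); }
private:
    std::vector<Mat4> stack_;
};
```

Loading a camera matrix, then doing push, mult(ship), push, mult(turret) leaves camera * ship * turret on top; each pop restores the parent’s matrix, and the last pop restores the bare camera matrix.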


OK, this is the revised work I have been doing since last night. Tell me what you think.

All my like objects share a common set of OpenGL vertices, normals, and such. The object is based at the origin. When the object is rendered, it is rotated and pushed out to the point as needed (thus cutting out all the duplicate data). The one thing I have left as unique for ALL the individual objects is my collision detection models. I haven’t reworked this yet; I want to know what you all think about this one.

I feel there is a trade-off here. I can either:

1: Have a unique set of vert information for each collision model. Basically a small array of variables for each model (collision triangles, vectors and such), and move those as needed when the object moves.

2: Simply use the actual verts of the item, PROJECTED out into space, by rotating and translating each point out each time and using that temporary point to do my work.

In the first situation, I use memory. Let’s say for a 1000 poly model I have a 200 poly collision model, thus 200*3 + however many normals of floats for EACH model.

In the second situation, I would use more CPU power, because EACH time I wanted to test an object, I would have to move its verts.

Not sure if I understood everything, but in 2) you actually proposed doing the transformations temporarily to do the collision tests, am I right? Then think about this: couldn’t you, instead of keeping your player at its current location and transforming all the vertices of the objects, UNtransform the player’s position with respect to the tested object?

That’s what people suggested when they said to move your player to object space. You don’t translate / rotate the object’s vertices into the world; you UNtranslate / UNrotate the player’s position into the object’s local space.
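A minimal sketch of that "UNtransform" step, assuming the object-to-world matrix is column-major and contains only a rotation and a translation (no scale or shear), so the inverse rotation is just the transpose. `worldToObject` is a hypothetical name.

```cpp
// Moves a world-space point (e.g. the player's position) into an object's
// local space. Assumes m is a column-major object-to-world matrix made of
// rotation and translation only.
void worldToObject(const float m[16], const float world[3], float local[3])
{
    // Undo the translation first.
    const float d[3] = { world[0] - m[12], world[1] - m[13], world[2] - m[14] };

    // Undo the rotation: for a pure rotation the inverse is the transpose,
    // so each local coordinate is the dot product of d with one column of m.
    for (int i = 0; i < 3; ++i)
        local[i] = m[i * 4 + 0] * d[0] + m[i * 4 + 1] * d[1] + m[i * 4 + 2] * d[2];
}
```

This is one point transformed per object tested, instead of every vertex of the object’s collision model.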


[This message has been edited by Ysaneya (edited 10-22-2002).]

NOW that is an idea I never thought of. The only problem I foresee is object-object collisions, and possibly object-world collisions. But the world collisions will actually be quite easy to convert over. And object-object collisions will happen so infrequently per frame that I don’t think it will matter that I have to project one. Thanks for the idea.

You guys rock.

Originally posted by WhatEver:
If I read that correctly, zed, you actually want to transform the ray for collision detection from world space into model space, not the model vertices to world space.

Some of us do tri->tri collision detection, none of this pansy ray->box stuff.
Sorry though, collision detection was a bad example.

Though what you’re saying is a good idea, especially for ray->model things, e.g. backface tests from a certain direction, or dot3 lighting.

Personally I first convert everything into world space, every model, every frame (unless they’re flagged as already being in world space).
It gets worse: after that I convert every world-space vertex into camera space every frame. Though this is an extreme example, it just suits the program I’m making.

Nowadays you would be better off offloading a lot of the stuff that I do on the CPU to the GPU (use vertex_program). I’m not too sure what calculations are accelerated, but pretty soon I’m sure even mat44->mat44 ones will be. But I’m old and have gotten used to doing everything on the CPU.