Use of C++ structures with OpenGL?

Ok, this is my setup.

//Structures
struct PointStruct
{
float X, Y, Z;
};

struct VectorStruct
{
float X, Y, Z;
};

//Then I declare pointers for my data.
//They point to arrays dynamically
//allocated later on

int *Index;
PointStruct *VertexArray;
VectorStruct *NormalArray;

//Then I use my variables like so.
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_FLOAT, 0, &VertexArray);

glEnableClientState(GL_NORMAL_ARRAY);
glNormalPointer(GL_FLOAT, 0, &NormalArray);

glDrawElements(GL_TRIANGLES, 0, GL_UNSIGNED_INT, &Index);

Will that work? I mean, if I create a structure for my points and vectors like that, is the information stored sequentially in memory, as if I had simply created an array of floats?

It will work if you remove the &-sign in gl*Pointer. With the sign, you’re passing the address of the pointer, but OpenGL expects the pointer itself.
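To see why the & matters, here is a minimal sketch that does not touch OpenGL at all. `looksAtVertexData` is a hypothetical stand-in for a gl*Pointer-style call (it is not part of any GL header); it only checks whether the address it was handed is the start of the vertex data. The same reasoning applies to `&Index` in the glDrawElements call.

```cpp
#include <cassert>

struct PointStruct { float X, Y, Z; };

// Hypothetical stand-in for a gl*Pointer-style call: all it receives is an
// untyped address, so it cannot tell what that address really refers to.
inline bool looksAtVertexData(const PointStruct* vertices, const void* passed) {
    assert(vertices != nullptr);
    return passed == static_cast<const void*>(vertices);
}

// Passing the pointer itself hands over the address of the first vertex.
inline bool demoPassPointer() {
    PointStruct* VertexArray = new PointStruct[2];
    const bool ok = looksAtVertexData(VertexArray, VertexArray);
    delete[] VertexArray;
    return ok;
}

// Passing &VertexArray hands over the address of the pointer variable,
// which lives somewhere else entirely (here: on the stack).
inline bool demoPassAddressOfPointer() {
    PointStruct* VertexArray = new PointStruct[2];
    const bool ok = looksAtVertexData(VertexArray, &VertexArray);
    delete[] VertexArray;
    return ok;
}
```

With the real API the corrected calls would read glVertexPointer(3, GL_FLOAT, 0, VertexArray) and glNormalPointer(GL_FLOAT, 0, NormalArray).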

Alright, thanks. I have been working on an engine for a while, and just realized how many times I’m doing the same thing over and over, so I am simplifying my life. lol. I will do that, thanks.

glVertexPointer() copies the POINTER VALUE that you pass into it.

When you later call DrawElements() (or similar friends), GL will read memory from the address that this pointer value specified.

If you change the pointer variable after you’ve passed it into glVertexPointer(), there’s no way that glVertexPointer() can know about that change, as it copies the VALUE of the pointer, just like with any function argument in the C calling convention.
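A rough model of that behaviour, assuming client-side vertex arrays: `FakeVertexState` below is a hypothetical stand-in for the GL client state (not a real GL type). It stores only the pointer value and reads through it later, at "draw" time.

```cpp
#include <cassert>

// Hypothetical stand-in for GL client state: remembers the pointer value only.
struct FakeVertexState {
    const float* pointer = nullptr;

    // Copies the VALUE of p, like any by-value function argument.
    void setVertexPointer(const float* p) { pointer = p; }

    // Reads memory through the stored address, as a draw call would.
    float firstComponent() const {
        assert(pointer != nullptr);
        return pointer[0];
    }
};

// Re-pointing the caller's variable afterwards is invisible to the state.
inline float demoReassignAfterCall() {
    static const float meshA[3] = {1.0f, 2.0f, 3.0f};
    static const float meshB[3] = {7.0f, 8.0f, 9.0f};
    const float* current = meshA;
    FakeVertexState state;
    state.setVertexPointer(current);
    current = meshB;  // the state still holds the address of meshA
    (void)current;
    return state.firstComponent();
}

// Changing the data that the stored address refers to IS visible.
inline float demoModifyDataAfterCall() {
    float mesh[3] = {1.0f, 2.0f, 3.0f};
    FakeVertexState state;
    state.setVertexPointer(mesh);
    mesh[0] = 5.0f;
    return state.firstComponent();
}
```

So switching to another mesh means calling glVertexPointer() again with the new address.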

What you’re doing is dangerous!!!

You can’t make any assumptions about how the structures you defined are laid out in memory. What you’re doing might work with one compiler but not with another. (There might be gaps between the floats in your structures.)

You should simply make an array of floats.
If you wanna use C++, you should use std::vector<GLfloat> and use &(MyVector[0]) to get the address of the first element for glVertexPointer() and other functions.
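A minimal sketch of the flat-array approach. It uses plain float, assuming GLfloat is a typedef for float as it is in the usual GL headers; the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Three vertices stored as x, y, z, x, y, z, ... in one contiguous vector.
inline std::vector<float> makeTriangle() {
    return {0.0f, 0.0f, 0.0f,
            1.0f, 0.0f, 0.0f,
            0.0f, 1.0f, 0.0f};
}

// Address of the first element, suitable for a gl*Pointer-style call.
// An empty vector has no first element, so return null in that case.
inline const float* firstElement(const std::vector<float>& v) {
    return v.empty() ? nullptr : &v[0];
}

// Number of complete 3-component vertices in the vector.
inline std::size_t vertexCount(const std::vector<float>& v) {
    assert(v.size() % 3 == 0);
    return v.size() / 3;
}
```

The pointer is only valid until the vector reallocates (for example after push_back), so it has to be fetched again after the vector grows.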

Quite true, GPSnoopy. You should use sizeof(VectorStruct) for the stride, not 0.

If you specify 0, GL assumes the elements are tightly packed (size * sizeof(type) bytes apart), but the compiler might’ve put extra padding in there.
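As a sketch, the stride and a packing check can be expressed as compile-time constants. The constant names are made up for illustration.

```cpp
#include <cstddef>

struct PointStruct { float X, Y, Z; };

// Stride to pass to glVertexPointer: distance in bytes between the start of
// one vertex and the start of the next, including any padding.
constexpr std::size_t kPointStride = sizeof(PointStruct);

// True when the struct has no padding, i.e. when a stride of 0 would
// describe exactly the same layout.
constexpr bool kPointIsTightlyPacked =
    sizeof(PointStruct) == 3 * sizeof(float);

// Optional guard: uncomment to refuse to build on a compiler that pads.
// static_assert(kPointIsTightlyPacked, "PointStruct is padded; stride 0 is wrong");
```

The call would then be glVertexPointer(3, GL_FLOAT, sizeof(PointStruct), VertexArray), which stays correct whether or not the compiler pads the struct.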

Nutty

Mmm, if the stride didn’t equate to 3 floats, then GL would have to unpad the data, which would make the driver copy a lot slower, no?
I’m sure there are compiler pragmas for not padding specific structures?

Just curious, how many compilers today will pad those structures? Not many, I guess.

knackered, why would it be a lot slower? Given the start address of the vertex array, it’s pretty easy to calculate the address of any vertex. Just multiply the index of the vertex by the size of a vertex and you have the offset. If there’s padding between vertices you just multiply the index by another value (size of the vertex + padding). It doesn’t involve any extra work. There may be some problems with the cache though since the size of a vertex is larger (including the padding), but certainly not a lot slower.
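The address calculation described above can be sketched like this. `PaddedVertex` and `vertexAt` are hypothetical names; the extra `unused` member plays the role of the padding.

```cpp
#include <cstddef>

// A vertex with 4 extra bytes after the position, standing in for padding.
struct PaddedVertex {
    float position[3];
    float unused;
};

// Address of vertex number `index`: base + index * stride, in bytes.
// The cost is the same multiply and add whatever the stride is.
inline const float* vertexAt(const void* base, std::size_t strideBytes,
                             std::size_t index) {
    const unsigned char* bytes = static_cast<const unsigned char*>(base);
    return reinterpret_cast<const float*>(bytes + index * strideBytes);
}
```

Only the stride value changes between the padded and unpadded cases.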

C is a systems programming language. As such, each compiler defines the alignment of structures well, and/or allows you to specify how to do alignment. Not doing so would make C useless as a systems programming language.

Thus, as long as you know what you’re doing, using structs for vertices is absolutely safe and absolutely the right thing to do.

If you don’t know what you’re doing, well, that’s what Visual Basic’s for.

Originally posted by Bob:
knackered, why would it be a lot slower? Given the start address of the vertex array, it’s pretty easy to calculate the address of any vertex. Just multiply the index of the vertex by the size of a vertex and you have the offset. If there’s padding between vertices you just multiply the index by another value (size of the vertex + padding). It doesn’t involve any extra work. There may be some problems with the cache though since the size of a vertex is larger (including the padding), but certainly not a lot slower.

I’m assuming the driver won’t want to send redundant bytes over the bus, in which case it would have to strip out the padding - but I’ve just realized that it wouldn’t do this because the stride could be being used to skip other attribute data addressed by another gl***pointer() instruction.
There’s a point - how does the driver deal with interleaved vertex arrays?
glVertexPointer(pointer=base+0, stride=6)
glColorPointer(pointer=base+3, stride=6)
glDrawElements()…
At drawelements time, does it:
a) run down each array pointed to copying the elements to video ram, skipping over 6 elements as it goes,
or does it
b) do 2 block copies, therefore copying redundant colour bytes when doing the verts, and redundant vertex bytes when doing the colours?
Surely it does a). In which case it doesn’t matter what the stride is, it can’t do a straight block copy…

I’ve always assumed that the sum of attribute sizes is checked against the vertex stride in interleaved vertex array setups. If both are equal the driver can shove it down the bus back to back (hello, Mr DMA transfer).
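For reference, this is roughly what such an interleaved layout looks like as a struct, together with the check described above (sum of attribute sizes against the stride). Note that the real gl*Pointer calls take the stride in bytes, whereas the pseudocode earlier counts in floats. All names here are made up for illustration, and whether a driver really performs this check is the poster's assumption, not something the sketch can show.

```cpp
#include <cstddef>

// Position followed by colour, matching base+0 and base+3 floats.
struct InterleavedVertex {
    float position[3];
    float color[3];
};

// Values that would be handed to glVertexPointer / glColorPointer.
constexpr std::size_t kStrideBytes = sizeof(InterleavedVertex);
constexpr std::size_t kPositionOffset = offsetof(InterleavedVertex, position);
constexpr std::size_t kColorOffset = offsetof(InterleavedVertex, color);

// Sum of the attribute sizes, to compare against the stride.
constexpr std::size_t kAttributeBytes = sizeof(float[3]) + sizeof(float[3]);

// True when the attributes fill the whole vertex with no gaps.
constexpr bool hasNoGaps() { return kStrideBytes == kAttributeBytes; }
```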

If you have gaps (real gaps, ie padding), I don’t think the driver can initiate a clean transfer, no matter what. It will have to reshuffle data and that’s bound to be slow.

Of course you won’t notice any difference if you only use glArrayElement but then your performance will suck anyway.

As long as your classes or structs do not have polymorphic members, they are kept in memory tightly packed, in the order their members are declared. I use it in my work (writing operating systems and drivers) all the time.

[This message has been edited by OldMan (edited 11-10-2002).]

Originally posted by OldMan:
As long as your classes or structs do not have polymorphic members, they are aligned in memory. I use it in my work (writing operating systems and drivers) all the time.

By polymorphic members you mean something like struct { bool a; int b; float c; };? ('cause in that case, b isn’t just next to a in memory)

If what you’re saying is right, then I didn’t know that the variables in a structure were contiguous in memory when they’re all of the same type.

I still think that, even if you’re right, it’s dangerous to rely on such low level mechanisms when you can do it in another way. (If you’re not sure about what you’re doing, don’t do it.)

[This message has been edited by GPSnoopy (edited 11-10-2002).]

What’s polymorphic about bool, int, float?

Although ideally you shouldn’t use “bool” in cross-platform (or cross-library) code, as I’ve seen it have different sizes in different versions of Visual C++.

I thought by polymorphic, he meant virtual items, which require the use of a v-table in the class/struct, which causes the sum of the members to not equal the size of the struct/class.

AFAIK member variables/class fields are always in the same order that they are declared in, or it would be impossible to reference the same struct from 2 different compiled pieces of code.

The issue at hand is whether 3 floats in a row (Vector3 struct/class) get padded to a larger byte boundary because of “compiler optimizations” and the target hardware. They shouldn’t at this moment, but it’s not illegal for the compiler to do so.

Nutty
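A small sketch of the virtual-function point, using the standard `std::is_standard_layout` trait. The struct names are made up. The size comparison reflects what mainstream compilers do (they store a hidden v-table pointer in the object); the standard itself does not spell out that mechanism.

```cpp
#include <type_traits>

// Plain struct: three floats and nothing else.
struct PlainVector { float X, Y, Z; };

// Same members plus a virtual function, which makes the type polymorphic.
struct VirtualVector {
    float X, Y, Z;
    virtual ~VirtualVector() {}
};

// Standard-layout types have a layout compatible with a C struct.
constexpr bool kPlainIsStandardLayout =
    std::is_standard_layout<PlainVector>::value;
constexpr bool kVirtualIsStandardLayout =
    std::is_standard_layout<VirtualVector>::value;

// On mainstream compilers the hidden v-table pointer makes the object bigger.
constexpr bool kVirtualIsLarger = sizeof(VirtualVector) > sizeof(PlainVector);
```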

By polymorphic I mean virtual functions.

If you have a struct{float a;byte b;float c;}
it will use exactly the same space as 2 floats and a byte (all together in memory). Only by forcing alignment will you get everything on 32-bit boundaries (each compiler has its own strange flags to do that).

It is illegal for the compiler to do it without an explicit command, since doing so would violate the standard. If a compiler decided to do it on its own, all cross-platform capability of C++ would be gone.

And that is not dangerous… or how do you think drivers are made (on modern hardware)?

Just something important: a struct {} will not get size 0, it will get size 1, since no 2 objects can be held in exactly the same space in memory.
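The empty-struct remark is easy to check. A minimal sketch (names made up); the standard only promises a non-zero size, and 1 is simply the usual value.

```cpp
struct Empty {};

// Two elements of an array must have different addresses, so each Empty
// object has to occupy at least one byte.
inline bool elementsHaveDistinctAddresses() {
    Empty pair[2];
    return &pair[0] != &pair[1];
}
```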

directly from the C++ book (3rd Edition), p.102

“The size of an object of a structure type is not necessarily the sum of the sizes of its members. This is because many machines require objects of certain types to be allocated on architecture dependent boundaries or handle such objects much more efficiently if they are.”

As I understand it, the standard doesn’t make any guarantee on the size of a structure. And struct { char a; float b; }; might have a size of 5 or 8 depending on the compiler/platform.
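Rather than guessing, the padding a given compiler inserts can be measured with sizeof and offsetof. A sketch with made-up constant names:

```cpp
#include <cstddef>

struct CharThenFloat { char a; float b; };

// What the members add up to versus what the compiler actually allocates.
constexpr std::size_t kSumOfMembers = sizeof(char) + sizeof(float);
constexpr std::size_t kActualSize = sizeof(CharThenFloat);
constexpr std::size_t kPaddingBytes = kActualSize - kSumOfMembers;

// Where the float really starts inside the struct.
constexpr std::size_t kOffsetOfB = offsetof(CharThenFloat, b);
```

Printing kPaddingBytes on each target compiler shows whether that compiler pads this particular struct.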

I knew about the “empty” structure size.

Anyway, I still maintain it’s a bad idea.
I mean, is it really that hard to use a simple array of floats in that case?

Every member of a struct will be padded to match the platform the compiler is targeting. This way, every member’s address will be aligned properly.

Example with a 32-bit compiler:

struct{
float a; // 32 bits
byte b; // 32 bits, although it only uses 8
float c; // 32 bits
}

Example with a 64-bit compiler:

struct{
float a; // 64 bits
byte b; // 64 bits, although it only uses 8
float c; // 64 bits
}

So if you’re programming on x86 and using floats or doubles, you won’t be padded.

At least this is what I’ve been assuming, and this may not be the standard way of doing things.

If you’re using MSVC just use the #pragma pack stuff and you can be sure…
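A sketch of what that looks like. `#pragma pack(push, 1)` / `#pragma pack(pop)` is a compiler extension, supported by MSVC and also by GCC and Clang; the struct names are made up, and `unsigned char` stands in for the `byte` used earlier in the thread.

```cpp
#include <cstddef>

// Default layout: the compiler is free to pad after b so that c is aligned.
struct DefaultLayout {
    float a;
    unsigned char b;
    float c;
};

// Packed layout: alignment forced to 1 byte, so no padding is inserted.
#pragma pack(push, 1)
struct PackedLayout {
    float a;
    unsigned char b;
    float c;
};
#pragma pack(pop)
```

The trade-off is that `c` in the packed version is no longer naturally aligned, which can be slower on some hardware and is not tolerated at all on others.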

If you have gaps (real gaps, ie padding), I don’t think the driver can initiate a clean transfer, no matter what. It will have to reshuffle data and that’s bound to be slow.

Even if there’s data “not being used” between your vertex elements, why would this mean the driver has to re-pack the data? The card can just DMA a large chunk, and pick out the parts it needs. As long as there’s data on a physical page of memory, you know that it’s safe to read that entire physical page.

I wonder if the GL standard actually requires you to have padding in the LAST vertex, if you specify padding. Else you could set up a contrived case where it wasn’t safe to read the padding “after” the last vertex in an array, because that might be on a different page boundary than the actual vertex data. Gack!

I guess this is why driver writers either make us specify exactly what memory range we’re dealing with (vertex_array_range, vertex_array_object) OR just build their hardware so that it fetches power-of-two-aligned cache lines (and only fetches those cache lines that actually contain data, not just padding).

Regarding struct padding, most architectures will pad “as you expect” when you sort your members from largest to smallest. Or, if your members are all of the same type, most architectures will pack the members fully. The only one I know about that doesn’t is a DSP which puts everything on a 32-bit boundary (even if it’s a char) because that’s the only type it actually implements…

LostInTheWoods: On a 32-bit system, you’re probably safe; on a 64-bit system, you may well get an extra 32 bits of padding at the end of the structure. To be safe, use sizeof(PointStruct) etc as the stride.

t0y: The situation is a bit more complex than your example suggests. If you have fields of differing sizes, then a smaller field may be followed by sufficient padding to ensure that the following field is aligned accordingly.

However, consecutive fields of the same type typically won’t have any padding between them. E.g. assuming a 32-bit system, for:

struct {
	char c;
	int i;
}

there would typically be 3 bytes of padding between c and i, but for:

struct {
	char c1, c2, c3, c4;
	int i;
}

there wouldn’t be any padding.
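The two cases can be checked with offsetof. A sketch assuming a typical platform where int is 4 bytes with 4-byte alignment (the names are made up for illustration):

```cpp
#include <cstddef>

struct CharThenInt {
    char c;
    int i;
};

struct FourCharsThenInt {
    char c1, c2, c3, c4;
    int i;
};

// Bytes of padding between the last char and the int in each struct.
constexpr std::size_t kGapAfterC =
    offsetof(CharThenInt, i) - sizeof(char);
constexpr std::size_t kGapAfterC4 =
    offsetof(FourCharsThenInt, i) - 4 * sizeof(char);
```

On such a platform kGapAfterC comes out as 3 and kGapAfterC4 as 0; the same check on three floats in a row shows no gap either.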

struct{
float a; // 32 bits
byte b; // 32 bits, although it only uses 8
float c; // 32 bits
}

For sure that is not correct with Visual C++, GCC or the Intel C++ compiler… or all the drivers I have made until today would not work.

Architecture-dependent boundaries on x86 are 8 bits… not 32…

32-bit chunks are faster to move than 24-bit ones… but not faster than 16-bit or 8-bit ones on an x86 processor.

[This message has been edited by OldMan (edited 11-10-2002).]